Abstract: This work deals with the challenging task of activity recognition in unconstrained videos. Standard methods encode low-level video features using Fisher Vectors or Bag of Features. However, these approaches reduce every sequence to a single vector of fixed dimensionality that lacks any long-term temporal information, which may be important for recognition, especially of complex activities. This work proposes a novel framework with two main technical novelties: first, a video encoding method that maintains the temporal structure of sequences, and second, a Time Flexible Kernel that allows comparison of sequences of different lengths and random alignment. Results on challenging benchmarks and comparisons to previous work demonstrate the applicability and value of our framework.
Language: English
DOI: 10.1016/j.imavis.2015.12.006
Year: 2016
Published in: IMAGE AND VISION COMPUTING 48-49 (2016), 26-36
ISSN: 0262-8856
JCR impact factor: 2.671 (2016)
JCR category: COMPUTER SCIENCE, THEORY & METHODS, rank 19 / 104 = 0.183 (2016) - Q1 - T1
JCR category: COMPUTER SCIENCE, SOFTWARE ENGINEERING, rank 17 / 106 = 0.16 (2016) - Q1 - T1
JCR category: OPTICS, rank 26 / 92 = 0.283 (2016) - Q2 - T1
JCR category: ENGINEERING, ELECTRICAL & ELECTRONIC, rank 75 / 260 = 0.288 (2016) - Q2 - T1
JCR category: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE, rank 37 / 133 = 0.278 (2016) - Q2 - T1
SCImago impact factor: 0.953 - Computer Vision and Pattern Recognition (Q1) - Signal Processing (Q1) - Electrical and Electronic Engineering (Q1)
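The abstract describes a kernel that compares temporally structured video encodings of different lengths. As a purely illustrative aid, the sketch below shows one generic way such a comparison could be set up: each video is represented as a variable-length sequence of per-segment descriptors (e.g., one Fisher Vector per temporal chunk), and a base kernel between segments is weighted by their proximity on a normalized time axis. This is an assumption-laden sketch, not the paper's actual Time Flexible Kernel; the function name temporal_sequence_kernel and the bandwidth parameter are hypothetical.

```python
# Illustrative sketch only: the paper's Time Flexible Kernel is NOT reproduced
# here. This shows a generic way to compare two videos encoded as
# variable-length sequences of per-segment descriptors, using a linear base
# kernel between segments down-weighted by normalized temporal distance.
import numpy as np


def temporal_sequence_kernel(seq_a, seq_b, bandwidth=0.2):
    """Compare two descriptor sequences of possibly different lengths.

    seq_a: (n_a, d) array, one d-dimensional descriptor per temporal segment.
    seq_b: (n_b, d) array.
    bandwidth: hypothetical parameter controlling how strongly temporally
        distant segments are down-weighted (time axis normalized to [0, 1]).
    """
    n_a, n_b = len(seq_a), len(seq_b)
    # Normalized segment positions so sequences of different lengths align.
    t_a = np.linspace(0.0, 1.0, n_a)
    t_b = np.linspace(0.0, 1.0, n_b)
    # Temporal weighting: Gaussian in the normalized time difference.
    dt = t_a[:, None] - t_b[None, :]
    w = np.exp(-(dt ** 2) / (2.0 * bandwidth ** 2))
    # Base similarity between segment descriptors (linear kernel here).
    s = seq_a @ seq_b.T
    # Aggregate over all segment pairs and normalize by sequence lengths.
    return float((w * s).sum() / (n_a * n_b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video_1 = rng.normal(size=(12, 64))   # 12 segments, 64-dim descriptors
    video_2 = rng.normal(size=(20, 64))   # 20 segments, same descriptor dim
    print(temporal_sequence_kernel(video_1, video_2))
```

Such a kernel matrix could then be fed to any kernel classifier (e.g., an SVM); how the paper actually defines and normalizes its kernel should be taken from the article itself.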