TY - CHAP

T1 - Video activity recognition by luminance differential trajectory and aligned projection distance

AU - Zheng, Haomian

AU - Li, Zhu

AU - Fu, Yun

AU - Katsaggelos, Aggelos K.

AU - You, Jia

PY - 2013/1/1

Y1 - 2013/1/1

N2 - Video content analysis and understanding are active research topics in modern visual computing and communication. In this context, a particularly challenging problem that attracts much attention is human action recognition. In this chapter, we propose a new methodology to solve the problem using geometric statistical information. Two new approaches, Differential Luminance Field Trajectory (DLFT) and Luminance Aligned Projection Distance (LAPD), are proposed. Instead of extracting the object or using interest points as a representation, we treat each video clip as a trajectory in a very high-dimensional space and extract useful statistical geometric information for action recognition. For DLFT, we take advantage of differential signals, which preserve both temporal and spatial information, and then classify the action by supervised learning. For the LAPD approach, we generate a trajectory for each video clip and compute a distance metric to describe similarity for classification. The decision is made by applying a K-Nearest Neighbor classifier. Since DLFT is more sensitive in the temporal domain while LAPD can handle more variance in the appearance of the luminance field, a fusion of the two methods would yield more desirable properties. Experimental results demonstrate that the methods work effectively and efficiently. Their performance is comparable to or better than that of conventional methods, and more robust.

AB - Video content analysis and understanding are active research topics in modern visual computing and communication. In this context, a particularly challenging problem that attracts much attention is human action recognition. In this chapter, we propose a new methodology to solve the problem using geometric statistical information. Two new approaches, Differential Luminance Field Trajectory (DLFT) and Luminance Aligned Projection Distance (LAPD), are proposed. Instead of extracting the object or using interest points as a representation, we treat each video clip as a trajectory in a very high-dimensional space and extract useful statistical geometric information for action recognition. For DLFT, we take advantage of differential signals, which preserve both temporal and spatial information, and then classify the action by supervised learning. For the LAPD approach, we generate a trajectory for each video clip and compute a distance metric to describe similarity for classification. The decision is made by applying a K-Nearest Neighbor classifier. Since DLFT is more sensitive in the temporal domain while LAPD can handle more variance in the appearance of the luminance field, a fusion of the two methods would yield more desirable properties. Experimental results demonstrate that the methods work effectively and efficiently. Their performance is comparable to or better than that of conventional methods, and more robust.

KW - Action recognition

KW - Differential luminance field trajectory

KW - Gesture recognition

KW - Luminance aligned projection distance

UR - http://www.scopus.com/inward/record.url?scp=84878060631&partnerID=8YFLogxK

U2 - 10.1016/B978-0-444-53859-8.00012-6

DO - 10.1016/B978-0-444-53859-8.00012-6

M3 - Chapter

T3 - Handbook of Statistics

SP - 301

EP - 325

BT - Handbook of Statistics

ER -