TY  - CONF
T1  - Learning action dictionaries from video
T2  - Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
Y1  - 2008
A1  - Turaga, P.
A1  - Chellappa, Rama
KW  - automated surveillance systems
KW  - computer vision
KW  - image segmentation
KW  - independent action-phrases
KW  - learning (artificial intelligence)
KW  - learning action dictionaries
KW  - spatial transforms
KW  - video segment decomposition
KW  - video sequence
KW  - video surveillance
AB  - Summarizing the contents of a video containing human activities is an important problem in computer vision and has important applications in automated surveillance systems. Summarizing a video requires one to identify and learn a 'vocabulary' of action-phrases corresponding to specific events and actions occurring in the video. We propose a generative model for dynamic scenes containing human activities as a composition of independent action-phrases - each of which is derived from an underlying vocabulary. Given a long video sequence, we propose a completely unsupervised approach to learn the vocabulary. Once the vocabulary is learnt, a video segment can be decomposed into a collection of phrases for summarization. We then describe methods to learn the correlations between activities and sequentiality of events. We also propose a novel method for building invariances to spatial transforms in the summarization scheme.
JA  - Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
M3  - 10.1109/ICIP.2008.4712102
ER  -