TY - CONF
T1 - Vehicle detection and tracking using acoustic and video sensors
T2 - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Y1 - 2004
A1 - Chellappa, Rama
A1 - Qian, Gang
A1 - Zheng, Qinfen
KW - acoustic
KW - applications;
KW - audio
KW - audio-visual
KW - beam-forming
KW - Carlo
KW - chain
KW - density
KW - detection;
KW - direction-of-arrival
KW - DOA
KW - empirical
KW - estimation;
KW - framework;
KW - functions;
KW - fusion
KW - fusion;
KW - joint
KW - Markov
KW - methods;
KW - Monte
KW - moving
KW - multimodal
KW - object
KW - optical
KW - posterior
KW - probability
KW - probability;
KW - processes;
KW - processing;
KW - sensing;
KW - sensor
KW - sensors;
KW - signal
KW - Surveillance
KW - surveillance;
KW - systems;
KW - target
KW - techniques;
KW - tracking;
KW - vehicle
KW - video
AB - Multimodal sensing has attracted much attention in solving a wide range of problems, including target detection, tracking, classification, activity understanding, speech recognition, etc. In surveillance applications, different types of sensors, such as video and acoustic sensors, provide distinct observations of ongoing activities. We present a fusion framework using both video and acoustic sensors for vehicle detection and tracking. In the detection phase, a rough estimate of target direction-of-arrival (DOA) is first obtained using acoustic data through beam-forming techniques. This initial DOA estimate designates the approximate target location in video. Given the initial target position, the DOA is refined by moving target detection using the video data. Markov chain Monte Carlo techniques are then used for joint audio-visual tracking. A novel fusion approach has been proposed for tracking, based on different characteristics of audio and visual trackers. Experimental results using both synthetic and real data are presented. Improved tracking performance has been observed by fusing the empirical posterior probability density functions obtained using both types of sensors.
JA - Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
VL - 3
M3 - 10.1109/ICASSP.2004.1326664
ER -