Lecture 1

CMSC828D: Algorithms and systems for capture and playback of spatial audio.

Time: MW from 3:30-4:45; Spring 2006.

Instructor: Ramani Duraiswami (ramani@umiacs.umd.edu http://www.umiacs.umd.edu/~ramani )

Prerequisites: The course will be somewhat mathematical, and you should be comfortable or willing to work with some differential equations and do some (Matlab) programming.

Exams: There will be a mid-term and a final exam. In addition, there will be homework that involves either short problem sets or reading research papers.

Credit: For computer science students, the class will count as a PhD qualifying course and as an MS qualifying course in Visual and Geometric Computing. The MS comp grades will be based on the mid-term and final exam.

Short outline:

Audio is a fundamental mode of human perception and is becoming increasingly important in machine perception. The principles by which humans perceive space through audition are beginning to be understood. Similarly, machine systems that track and locate objects via the sound they emit are becoming practical and widely deployed. This course will serve as a broad introduction to graduate work in the field.

The course will survey the field of audio capture, processing and playback. A series of introductory lectures by the instructor will provide the physical, mathematical and signal processing basis for the course. The course will then move on to a discussion of papers and systems dealing with various aspects of spatial audio.

Detailed outline:

A survey of the field and applications;

Some basic principles of physical acoustics;

Partial differential equations governing acoustic wave propagation

Notions of frequency and the Fourier transform;

An introduction to the human auditory system

An introduction to signal processing

Source localization and beamforming with arrays of microphones.

Human spatial hearing: The physical and psychoacoustical basis of sound localization and space perception.

Room acoustics: sound propagation in rooms. Modeling. The influence of short and long term reverberation.

Head related transfer functions

Modeling Room impulse responses and head related impulse responses.

An introduction to commercial systems for surround sound and spatial audio.

Emerging Spatial Audio Playback systems: Wave Field Synthesis. Ambisonics.

Research systems being developed at the University of Maryland.

Selected Research topics

Policy: Honor code http://www.studenthonorcouncil.umd.edu/code.html

Grading: Homework 40%, Mid-Term 25%, Final 35%

*DATE*	*LECTURE*	*CONTENTS*
01/25/2006	Lecture 1	Introduction to the course and audio in computing
01/30/2006	Lecture 2	Physical Acoustics
02/01/2006	Lecture 3	The wave equation. Helmholtz equation. Boundary Conditions. Properties of solutions.
02/06/2006	Lecture 4	(Guest lecture by Dr. Elena Grassi). Using Matlab to do digital audio. Files: Analog_in.m, makesignal.m, Analog_out.m, filters.m, Fig1.m, Figure2.m
02/08/2006	Lecture 5	Separation of Variables. Fourier Series.
02/13/2006	Lecture 6 Book chapter	FFT Homework Problems 1, 2, 8
02/15/2006	Lecture 7	Introduction to Signal Processing (based on material from CERN)
02/20/2006	Lecture 8	Convolution, Impulse Response (based on Berkeley EE course)
02/22/2006	Lecture 9	Fourier analysis by the auditory system. http://www.doc.ic.ac.uk/~phwl/teaching/mm/cochleaWeb3.mov Homework question: Plot spectrograms of different sounds (“words”).
02/27/2006	Lecture 10	Dimensions of “auditory” space. Capacity of humans to detect intensity, pitch, location. Based on a tutorial lecture by Prof. Simon Carlile of Sydney, at ICAD 2002.
03/01/2006	Lecture 11	A continuation of material in lecture 10. Microphone arrays. Reading material: Chapters 1 and 2 of Michael Brandstein’s Ph.D. thesis (1995).
03/06/2006	Lecture 12	Time delay estimation Beamforming paper
03/08/2006	Lecture 13	Beamforming
03/13/2006	Lecture 14	Spherical arrays Homework pi02.wav pi11.wav
03/15/2006	Lecture 15 (see previous class notes)	Spherical arrays
03/20/2006 03/22/2006	No class	Spring break
03/27/2006	Lecture 16	Head Related Transfer functions. Papers: Spherical Model: Duda and Martens, 1998; Head and Torso Models: Algazi et al. 2002
03/29/2006	Lecture 17 (see previous class notes)	Recreation of Spatial Audio: Zotkin et al. 2004; The CIPIC HRTF Database: Algazi et al. 2001; Brown and Duda (1998); Homework 4; Allen and Berkley, 1979; Bee sound; Dmitry’s Snowman HRTF code shrtf.c
04/03/2006	Lecture 18	Plane-wave representation HRTF and spherical array based playback Duraiswami et al. (2005)
04/05/2006	Lecture 19	Transaural rendering. Commercial speaker-based spatial audio systems.
04/10/2006	Lecture 20	Room Acoustics
04/12/2006	Lecture 21	Room Acoustics HRTF Measurement
04/17/2006
04/19/2006	Lecture 22	Wave Field Synthesis: Theile and Wittek (2004); Rabenstein and Spors (2005); de Vries and Boone (1999)
04/24/2006	Lecture 23	Cortical Model: Zotkin et al. (2005); Chi et al. (2004)
04/26/2006	Lecture 24	Projects: 1. Independent Component Analysis for Audio 2. Automatic Echo Cancellation, Noise removal and noise suppression 3. Creating room transfer functions for graphically prescribed models 4. Others in the lecture
05/01/2006	Lecture 25	Graphics based algorithms for Architectural Acoustics: Stettner and Greenberg (1989); Takala and Hahn (1992); Funkhouser et al. (Siggraph 98, 99); Tsingos et al. (Siggraph 2001); Course notes Siggraph 2002 (Funkhouser et al.)
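
Example for the Lecture 9 homework (spectrograms): the sketch below is not course material. It is a minimal illustration, written in Python with NumPy rather than the Matlab used in class, of how a spectrogram can be computed as a Hann-windowed short-time Fourier transform. The function name spectrogram, the frame length, the hop size and the synthetic two-tone test signal are all assumptions chosen for illustration; a recorded word would replace the test signal.

```python
import numpy as np

def spectrogram(signal, fs, frame_len=256, hop=128):
    """Hypothetical helper: power spectrogram via a short-time Fourier transform.

    Returns (freqs, times, power), where power has shape (n_freqs, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT of every frame; squared magnitude gives power per bin.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)            # bin centres in Hz
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs  # frame centres in s
    return freqs, times, power.T

# Assumed test "sound": 0.5 s of a 1 kHz tone followed by 0.5 s of a 2 kHz tone.
fs = 8000
t = np.arange(fs // 2) / fs
tone = np.concatenate([np.sin(2 * np.pi * 1000 * t),
                       np.sin(2 * np.pi * 2000 * t)])

freqs, times, power = spectrogram(tone, fs)
peak_hz = freqs[np.argmax(power, axis=0)]  # dominant frequency in each frame
```

To view the result as an image, one option is matplotlib's pcolormesh(times, freqs, 10 * np.log10(power + 1e-12)), which plots power in dB against time and frequency. Shorter frames sharpen the time resolution at the cost of frequency resolution, which is the trade-off the homework is meant to expose.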