CMSC828D: Algorithms and systems for capture and playback of spatial audio.


Time: MW from 3:30-4:45; Spring 2006.

Instructor: Ramani Duraiswami


Prerequisites: The course will be somewhat mathematical. You should be comfortable with, or willing to work with, some differential equations, and able to do some (Matlab) programming.


Exams: There will be a mid-term and a final exam. In addition, there will be homework that involves either short problem sets or the reading of research papers.


Credit: For computer science students, the class will count as a PhD qualifying course and as an MS qualifying course in Visual and Geometric Computing. The MS comp grades will be based on the mid-term and final exam.


Short outline:

Audio is a fundamental mode of human perception, and is becoming increasingly important in machine perception. The principles by which humans perceive the space around them through hearing are beginning to be understood. Similarly, machine systems that track and locate objects via the sound they emit are becoming practical and widely deployed. This course will serve as a broad introduction to graduate work in the field.


The course will survey the field of audio capture, processing and playback. A series of introductory lectures by the instructor will provide the physical, mathematical and signal processing basis for the course. The course will then move on to a discussion of papers and systems dealing with various aspects of spatial audio.


Detailed outline:

A survey of the field and applications

Some basic principles of physical acoustics

Partial differential equations governing acoustic wave propagation

Notions of frequency and the Fourier transform

An introduction to the human auditory system

An introduction to signal processing

Source localization and beamforming with arrays of microphones.

Human spatial hearing: The physical and psychoacoustical basis of sound localization and space perception.

Room acoustics: sound propagation in rooms, modeling, and the influence of short- and long-term reverberation.

Head-related transfer functions

Modeling room impulse responses and head-related impulse responses.

An introduction to commercial systems for surround sound and spatial audio.

Emerging spatial audio playback systems: Wave Field Synthesis and Ambisonics.

Research systems being developed at the University of Maryland.

Selected research topics

Policy: Honor code

Grading: Homework 40%, Mid-Term 25%, Final 35%
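As a worked example of how the weights above combine, here is a minimal sketch in Python. It assumes each component is scored on a 0-100 scale and that the course score is a plain weighted sum. The function name, the scale, and the sample scores are illustrative assumptions, not part of the stated course policy.

```python
# Weights come from the grading line above; everything else is illustrative.
WEIGHTS = {"homework": 0.40, "midterm": 0.25, "final": 0.35}


def weighted_score(scores):
    """Return the weighted course score for component scores on a 0-100 scale.

    Assumption: the course score is a plain weighted sum of the three
    components, with no curving or dropped assignments.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "midterm": 80.0, "final": 70.0}
print(round(weighted_score(example), 2))
```

For these sample scores the weighted sum is 36 + 20 + 24.5 = 80.5.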






Lecture 1

Introduction to the course and audio in computing


Lecture 2

Physical Acoustics


Lecture 3

The wave equation. Helmholtz equation. Boundary Conditions. Properties of solutions.


Lecture 4

Guest lecture by Dr. Elena Grassi: Using Matlab for digital audio

Analog_in.m  makesignal.m  Analog_out.m  filters.m  Fig1.m  Figure2.m


Lecture 5

Separation of Variables. Fourier Series.


Lecture 6

Book chapter



Homework: Problems 1, 2, 8


Lecture 7

Introduction to Signal Processing (based on material from CERN)


Lecture 8

Convolution, Impulse Response (based on Berkeley EE course)


Lecture 9

Fourier analysis by the auditory system

Homework question: Plot spectrograms of different sounds (“words”)


Lecture 10

Dimensions of “auditory” space. Capacity of humans to detect intensity, pitch, and location. Based on a tutorial lecture by Prof. Simon Carlile of Sydney, at ICAD 2002.


Lecture 11

A continuation of material in lecture 10.

Microphone arrays. Reading material: Chapters 1 and 2 of Michael Brandstein’s Ph.D. thesis (1995).


Lecture 12

Time delay estimation


Beamforming paper


Lecture 13



Lecture 14

Spherical arrays


Homework pi02.wav pi11.wav


Lecture 15

(see previous class notes)

Spherical arrays


No class

Spring break


Lecture 16

Head-Related Transfer Functions


Spherical Model: Duda and Martens, 1998

Head and Torso Models: Algazi et al. 2002


Lecture 17

(see previous class notes)

Recreation of Spatial Audio: Zotkin et al. 2004

The CIPIC HRTF Database: Algazi et al. 2001

Brown and Duda (1998)

Homework 4  Allen and Berkley, 1979

Bee sound

Dmitry’s Snowman HRTF code shrtf.c


Lecture 18

Plane-wave representation

HRTF and spherical array based playback

Duraiswami et al. (2005)


Lecture 19

Transaural rendering

Commercial speaker-based spatial audio systems


Lecture 20

Room Acoustics


Lecture 21

Room Acoustics

HRTF Measurement





Lecture 22

Wave Field Synthesis

Theile and Wittek (2004)

Rabenstein and Spors (2005)

de Vries and Boone (1999)


Lecture 23

Cortical Model

Zotkin et al. (2005)

Chi et al. (2004)


Lecture 24


1. Independent Component Analysis for audio

2. Automatic echo cancellation, noise removal and noise suppression

3. Creating room transfer functions for graphically prescribed models

4. Others in the lecture


Lecture 25

Graphics-based algorithms for architectural acoustics

Stettner and Greenberg (1989)  Takala and Hahn (1992)

Funkhouser et al. (Siggraph 98, 99)  Tsingos et al. (Siggraph 2001)

Course notes Siggraph 2002 (Funkhouser et al.)














Useful Links


MATLAB resources:

  Introductory Tutorials

  Slightly more advanced Tutorials

MATLAB tutorial from University of New Hampshire

MATLAB tutorial/reference from University of Florida

MATLAB tutorial from Michigan Technological University

  More complete references/tutorials/FAQs

MATLAB tutorial from University of Maryland

The MathWorks home page for MATLAB