CMSC828D: Algorithms and systems for capture and playback of spatial audio.

 

Time: MW from 3:30-4:45; Spring 2006.

Instructor: Ramani Duraiswami (ramani@umiacs.umd.edu http://www.umiacs.umd.edu/~ramani )

 

Prerequisites: The course will be somewhat mathematical. You should be comfortable with, or willing to work with, some differential equations and do some (Matlab) programming.

 

Exams: There will be a mid-term and a final exam. In addition, there will be homework consisting of either short problem sets or readings of research papers.

 

Credit: For computer science students, the class will count as a PhD qualifying course and as an MS qualifying course in Visual and Geometric Computing. The MS comp grades will be based on the mid-term and final exams.

 

Short outline:

Audio is a fundamental mode of human perception, and is becoming increasingly important in machine perception. The principles by which humans perceive the space around them through hearing are beginning to be understood. Similarly, machine systems that track and locate objects via the sound they emit are becoming practical and widely deployed. This course will serve as a broad introduction to graduate work in the field.

 

The course will survey the field of audio capture, processing and playback. A series of introductory lectures by the instructor will provide the physical, mathematical and signal processing basis for the course. The course will then move on to a discussion of papers and systems dealing with various aspects of spatial audio.

 

Detailed outline:

A survey of the field and applications

Some basic principles of physical acoustics

Partial differential equations governing acoustic wave propagation

Notions of frequency and the Fourier transform

An introduction to the human auditory system

An introduction to signal processing

Source localization and beamforming with arrays of microphones.

Human spatial hearing: The physical and psychoacoustical basis of sound localization and space perception.

Room acoustics: sound propagation in rooms. Modeling. The influence of short and long term reverberation.

Head-related transfer functions

Modeling room impulse responses and head-related impulse responses.

An introduction to commercial systems for surround sound and spatial audio.

Emerging spatial audio playback systems: Wave Field Synthesis and Ambisonics.

Research systems being developed at the University of Maryland.

Selected Research topics

Policy: Honor code http://www.studenthonorcouncil.umd.edu/code.html

Grading: Homework 40%, Mid-Term 25%, Final 35%

 

DATE        LECTURE     CONTENTS

01/25/2006  Lecture 1   Introduction to the course and audio in computing

01/30/2006  Lecture 2   Physical Acoustics

02/01/2006  Lecture 3   The wave equation. Helmholtz equation. Boundary Conditions. Properties of solutions.

02/06/2006  Lecture 4   (Guest lecture by Dr. Elena Grassi.) Using Matlab to do digital audio
                        Analog_in.m  makesignal.m  Analog_out.m  filters.m  Fig1.m  Figure2.m

02/08/2006  Lecture 5   Separation of Variables. Fourier Series.

02/13/2006  Lecture 6   Book chapter
                        FFT
                        Homework Problems 1, 2, 8

02/15/2006  Lecture 7   Introduction to Signal Processing (based on material from CERN)

02/20/2006  Lecture 8   Convolution, Impulse Response (based on Berkeley EE course)

02/22/2006  Lecture 9   Fourier analysis by the auditory system
                        http://www.doc.ic.ac.uk/~phwl/teaching/mm/cochleaWeb3.mov
                        Homework question: Plot spectrograms of different sounds (“words”)

02/27/2006  Lecture 10  Dimensions of “auditory” space. Capacity of humans to detect intensity, pitch, location. Based on a tutorial lecture by Prof. Simon Carlile of Sydney, at ICAD 2002.

03/01/2006  Lecture 11  A continuation of material in Lecture 10.
                        Microphone arrays: Reading material Chapters 1 and 2 of Michael Brandstein’s Ph.D. thesis (1995).

03/06/2006  Lecture 12  Time delay estimation
                        Beamforming paper

03/08/2006  Lecture 13  Beamforming

03/13/2006  Lecture 14  Spherical arrays
                        Homework pi02.wav pi11.wav

03/15/2006  Lecture 15  (see previous class notes)
                        Spherical arrays

03/20/2006  No class    Spring break
03/22/2006

03/27/2006  Lecture 16  Head-Related Transfer Functions
                        Papers:
                        Spherical Model: Duda and Martens, 1998
                        Head and Torso Models: Algazi et al., 2002

03/29/2006  Lecture 17  (see previous class notes)
                        Recreation of Spatial Audio: Zotkin et al., 2004
                        The CIPIC HRTF Database: Algazi et al., 2001
                        Brown and Duda (1998)
                        Homework 4  Allen and Berkley, 1979
                        Bee sound
                        Dmitry’s Snowman HRTF code shrtf.c

04/03/2006  Lecture 18  Plane-wave representation
                        HRTF and spherical array based playback
                        Duraiswami et al. (2005)

04/05/2006  Lecture 19  Transaural rendering
                        Commercial speaker-based spatial audio systems

04/10/2006  Lecture 20  Room Acoustics

04/12/2006  Lecture 21  Room Acoustics
                        HRTF Measurement

04/17/2006

04/19/2006  Lecture 22  Wave Field Synthesis
                        Theile and Wittek (2004)
                        Rabenstein and Spors (2005)
                        de Vries and Boone (1999)

04/24/2006  Lecture 23  Cortical Model
                        Zotkin et al. (2005)
                        Chi et al. (2004)

04/26/2006  Lecture 24  Projects:
                        1. Independent Component Analysis for Audio
                        2. Automatic Echo Cancellation, Noise removal and noise suppression
                        3. Creating room transfer functions for graphically prescribed models
                        4. Others in the lecture

05/01/2006  Lecture 25  Graphics-based algorithms for Architectural Acoustics
                        Stettner and Greenberg (1989)  Takala and Hahn (1992)
                        Funkhouser et al. (Siggraph 98, 99)  Tsingos et al. (Siggraph 2001)
                        Course notes Siggraph 2002 (Funkhouser et al.)

Useful Links

 

MATLAB resources:

  Introductory Tutorials

  Slightly more advanced Tutorials

MATLAB tutorial from University of New Hampshire

MATLAB tutorial/reference from University of Florida

MATLAB tutorial from Michigan Technological University

  More complete references/tutorials/FAQs

MATLAB tutorial from University of Maryland

The MathWorks home page for MATLAB