


 

Speech Communication

Zheng-Hua Tan

Tel: +45 9940-8686
Email: zt@es.aau.dk

Office: Room A6-319, Niels Jernes Vej 12


 

Course description:

Speech is the most natural means of communication between humans. As computing machines become more capable and widespread, demand is growing for speech as a key component of the human-machine interface. This course gives students a basic understanding of the methods and models used in speech communication systems.

Course outline:

·         Speech production and acoustic phonetics

·         Speech perception

·         Speech coding

·         Speech synthesis

·         Speech and speaker recognition

Literature:

·         Deller, Hansen, Proakis, Discrete-Time Processing of Speech Signals, 2nd Edition, Wiley-IEEE Press, 1999. 


Lecture notes:

·         Lecture 1 Slides (Introduction, speech production and acoustic phonetics) 

o    Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, pp. 81-85, 99-150.

o    Assignment 1: 2.1, 2.6 and 2.17, and experiment with the speech tool Speech Filing System.

o    Speech files can be retrieved from ftp://archive.egr.msu.edu/pub/jojo/DPHTEXT/

o    Sounds for downloading

·         Lecture 2 Slides  (Speech analysis)

o    Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapter 6, or L.R. Rabiner and R.W. Schafer, Digital Processing of Speech Signals, Chapters 4 and 8.

o    Assignment 2: Matlab for speech analysis

·         Lecture 3 Slides (Speech coding and synthesis)

o    Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapter 7, or Huang, Acero and Hon, Spoken Language Processing, Chapters 7 and 16.

o    Assignment 3: Assignment on speech coding and synthesis

·         Lecture 4 Slides (Speech recognition, Part I)

o    Readings (papers can be downloaded from http://ieeexplore.ieee.org/ or requested from me by email):

§  J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapters 10-12.

§  Rabiner, L.R., "A tutorial on hidden Markov models and selected applications in speech recognition", Proceedings of the IEEE, 77 (2), 1989, pp. 257 - 286.

§  Sakoe, H. and Chiba, S., "Dynamic programming algorithm optimization for spoken word recognition", IEEE Trans. Acoustics, Speech, and Signal Processing, 26(1), 1978, pp. 43 - 49.

o    Assignment 4: Use Matlab to implement Dynamic Time Warping to compare speech signals.

·         Lecture 5 Slides (Speech recognition, Part II)

o    Readings: 

§  Steve Young, "A review of large-vocabulary continuous-speech recognition", IEEE Signal Processing Magazine, Sep 1996. Alternatively, Huang, Acero and Hon, Spoken Language Processing, Chapter 9.

§  Steve Young, et al., "The HTK Book" (optional).

o    Assignment 5:

§  Hidden Markov Model and Viterbi decoding

§  HTK Demo
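
As a rough illustration of the Dynamic Time Warping idea behind Assignment 4, the sketch below computes an accumulated alignment cost between two short sequences. It is written in Python rather than Matlab, and it is not the assignment solution: the function name dtw_distance is hypothetical, the inputs are plain scalar sequences rather than frames of speech features, and the step pattern is the simplest unweighted one, without the slope constraints and weighting discussed by Sakoe and Chiba.

```python
import math


def dtw_distance(x, y):
    """Accumulated DTW cost between two scalar sequences (illustrative only)."""
    n, m = len(x), len(y)
    # D[i][j] holds the minimal accumulated cost of aligning x[:i] with y[:j].
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance; for speech this would be a distance between
            # feature vectors of two frames (an assumption, not shown here).
            cost = abs(x[i - 1] - y[j - 1])
            # Unweighted step pattern: vertical, horizontal or diagonal move.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


if __name__ == "__main__":
    # The second sequence is a time-stretched copy of the first.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path may repeat elements, a time-stretched copy of a sequence aligns with zero cost, which is the property that makes DTW useful for comparing utterances spoken at different rates.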

 
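
For the Viterbi decoding topic in Assignment 5, a minimal sketch for a discrete-observation hidden Markov model is given below. It is an illustration under stated assumptions, not the assignment solution and not how HTK implements decoding: the function name viterbi, the argument layout (lists indexed by state and by observation symbol) and the toy probabilities in the usage example are all hypothetical. Log probabilities are used to avoid numerical underflow on long observation sequences.

```python
import math


def _log(p):
    # Logarithm that maps zero probability to minus infinity.
    return math.log(p) if p > 0 else -math.inf


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best state path, its log probability) for a non-empty obs list.

    start_p[s]    : probability of starting in state s
    trans_p[r][s] : probability of moving from state r to state s
    emit_p[s][o]  : probability of emitting symbol o in state s
    """
    n_states = len(start_p)
    # Initialisation with the first observation.
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    # Recursion: keep the best predecessor of every state at every time step.
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: delta[r] + _log(trans_p[r][s]))
            ptr.append(best)
            new_delta.append(delta[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
        delta = new_delta
        backptr.append(ptr)
    # Termination and backtracking.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    # Toy two-state model with three observation symbols (made-up numbers).
    start = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
    print(viterbi([0, 1, 2], start, trans, emit))
```

In a speech recogniser the emission probabilities would normally come from continuous densities over acoustic feature vectors instead of a discrete table; the recursion and backtracking stay the same.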


Contact Information:

Zheng-Hua Tan


Department of Electronic Systems
Aalborg University
Niels Jernes Vej 12
DK-9220, Aalborg
Denmark
Tel: +45 9940-8686
Email: zt@es.aau.dk