Speech Communication Course, Zheng-Hua Tan

Speech Communication

Zheng-Hua Tan

Tel: +45 9940-8686
Email: zt@es.aau.dk

Office: Room A6-319, Niels Jernes Vej 12

Course description:

Speech is the most natural means of communication between humans. As computing machines become more capable and widespread, there is an increasing demand to include speech as a key component of human-machine interfaces. This course aims to give students a basic understanding of the methods and models used in speech communication systems.

Course outline:

· Speech production and acoustic phonetics

· Speech perception

· Speech coding

· Speech synthesis

· Speech and speaker recognition

Literature:

· Deller, Hansen, Proakis, Discrete-Time Processing of Speech Signals, 2nd Edition, Wiley-IEEE Press, 1999.

Lecture notes:

· Lecture 1 Slides (Introduction, speech production and acoustic phonetics)

o Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, pp. 81-85, 99-150.

o Assignment 1: 2.1, 2.6, and 2.17; also experiment with the speech tool Speech Filing System.

o Speech files can be retrieved from ftp://archive.egr.msu.edu/pub/jojo/DPHTEXT/

o Sounds for downloading

· Lecture 2 Slides (Speech analysis)

o Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapter 6, or L.R. Rabiner and R.W. Schafer, Digital Processing of Speech Signals, Chapters 4 and 8.

o Assignment 2: Matlab for speech analysis

· Lecture 3 Slides (Speech coding and synthesis)

o Readings: J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapter 7, or Huang, Acero and Hon, Spoken Language Processing, Chapters 7 and 16.

o Assignment 3: Assignment on speech coding and synthesis

· Lecture 4 Slides (Speech recognition, Part I)

o Readings (download the papers from http://ieeexplore.ieee.org/ or email me to request them):

§ J. Deller, J. Hansen, J. Proakis, Discrete-Time Processing of Speech Signals, Chapters 10-12.

§ Rabiner, L.R., "A tutorial on hidden Markov models and selected applications in speech recognition", Proceedings of the IEEE, 77 (2), 1989, pp. 257 - 286.

§ Sakoe, H. and Chiba, S., "Dynamic programming algorithm optimization for spoken word recognition", IEEE Trans. Acoustics, Speech, and Signal Processing, 26(1), 1978, pp. 43 - 49.

o Assignment 4: Use Matlab to implement Dynamic Time Warping to compare speech signals.

· Lecture 5 Slides (Speech recognition, Part II)

o Readings:

§ Steve Young, "A review of large-vocabulary continuous-speech recognition", IEEE Signal Processing Magazine, Sep 1996. Alternatively, Huang, Acero and Hon, Spoken Language Processing, Chapter 9.

§ Steve Young, et al., "The HTK Book" (optional).

o Assignment 5:

§ Hidden Markov Model and Viterbi decoding

§ HTK Demo.
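As a rough orientation for Assignment 4, the core of Dynamic Time Warping can be sketched in a few lines. The sketch below is in Python rather than Matlab and is only illustrative: the function name dtw_distance and its dist parameter are assumptions made for this example, and it uses the plain symmetric step pattern without the slope weights or adjustment window discussed by Sakoe and Chiba.

```python
import math

def dtw_distance(x, y, dist=lambda a, b: abs(a - b)):
    """Accumulated DTW cost between two sequences (illustrative sketch).

    x, y : sequences of samples or feature frames
    dist : local distance between one element of x and one of y
           (hypothetical parameter; defaults to absolute difference)
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(x[i - 1], y[j - 1])
            # allowed predecessors: vertical, horizontal, diagonal step
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

For speech, x and y would typically be sequences of feature vectors (one per frame) rather than raw samples; in that case a Euclidean local distance such as math.dist (Python 3.8+) can be passed as dist.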
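For the Viterbi decoding part of Assignment 5, the following is a minimal sketch of the recursion for a discrete-observation HMM, following the general formulation in Rabiner's tutorial. It is written in Python and works in the log domain to avoid underflow; the function name viterbi and the dictionary-based model layout (log_start, log_trans, log_emit) are assumptions made for this example, not part of HTK or of the assignment hand-out. It assumes a non-empty observation sequence and nonzero probabilities (so that every log value is finite).

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for a discrete HMM (illustrative sketch).

    obs       : non-empty list of observation symbols
    states    : list of state names
    log_start : log_start[s]    = log P(first state is s)
    log_trans : log_trans[r][s] = log P(next state s | current state r)
    log_emit  : log_emit[s][o]  = log P(symbol o | state s)
    Returns (best_path, log_probability_of_best_path).
    """
    # delta[s] = best log-probability of any path ending in state s
    delta = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []  # one dict per time step t >= 1: state -> best predecessor
    for o in obs[1:]:
        prev, delta, ptr = delta, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            delta[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            ptr[s] = best
        backptrs.append(ptr)
    # backtrack from the best final state
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]
```

To try it, build the three dictionaries by taking math.log of the probabilities of a small toy model and compare the returned path with a hand calculation on a short observation sequence.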

Contact Information:

Zheng-Hua Tan

Department of Electronic Systems
Aalborg University
Niels Jernes Vej 12
DK-9220, Aalborg
Denmark
Tel: +45 9940-8686
Email: zt@es.aau.dk