Postgraduate Course: Speech Processing (LASC11065)
Course Outline
School | School of Philosophy, Psychology and Language Sciences |
College | College of Arts, Humanities and Social Sciences |
Credit level (Normal year taken) | SCQF Level 11 (Postgraduate) |
Availability | Available to all students |
SCQF Credits | 10 |
ECTS Credits | 5 |
Summary | A foundation course in speech processing for students of linguistics, informatics, and related subjects. |
Course description |
The course is delivered as a combination of lectures, flipped classrooms, an online forum, short videos, readings, and practical exercises in the lab. The first hour of each lecture is generally devoted to foundation material, making the course accessible to students from a wide variety of backgrounds, including Linguistics, Informatics, and Music Technology.
Students deciding whether to take this course should visit the lecturer's blog, http://www.speech.zone, where much of the course material can be found.
In the lab, students investigate speech signals, experiment with a text-to-speech system, and build their own simple automatic speech recognition system, using industry-standard tools.
Fundamentals of speech processing: familiarity with waveforms, spectra, spectrograms, resonance, formants, human speech production and perception, perceptually motivated frequency scales, time vs. frequency representations and conversion between the two, the Fourier transform, the source-filter model of speech, hands-on experience.
Automatic speech recognition: components of a typical recogniser, parameterisation of the speech signal, dynamic time warping, distance measures, the Hidden Markov Model, the generative model paradigm, simple probability theory, conditional and joint probabilities, Bayes' theorem, Gaussian probability density function, continuous density HMMs, monophone models with Gaussian observation densities, Viterbi algorithm for recognition, training from fully labelled data, Viterbi training, bigram language models.
Text-to-speech synthesis: components of a typical text-to-speech synthesiser, text analysis, phonology, finite-state automata, POS tagging, lexicon, phrasing, accents, F0, learning from data, CART models, waveform generation, concatenative methods, TD-PSOLA and linear prediction, F0 and duration modification.
|
Entry Requirements (not applicable to Visiting Students)
Pre-requisites | |
Co-requisites | |
Prohibited Combinations | |
Other requirements | None |
Additional Costs | None |
Information for Visiting Students
Pre-requisites | None |
High Demand Course? | Yes |
Course Delivery Information
Academic year 2019/20, Available to all students (SV1) | Quota: None |
Course Start | Semester 1 |
Timetable |
Learning and Teaching activities (Further Info) |
Total Hours: 100 (Lecture Hours 27, Feedback/Feedforward Hours 1, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 70) |
Assessment (Further Info) |
Written Exam 50%, Coursework 50%, Practical Exam 0% |
Feedback |
After Assignment 1 - whole-class formative feedback event based on examples taken from students' submitted work.
After Assignment 2 & the Exam - all students will have the opportunity during the second semester for an individual 15-minute summative feedback session with the lecturer, covering either or both of these items of assessment.
Comments will be provided on both items of submitted coursework and structured marking schemes will be used. |
Exam Information |
Exam Diet | Paper Name | Hours & Minutes |
Main Exam Diet S1 (December) | | 2:00 |
Learning Outcomes
On completion of this course, the student will be able to:
- give an overview of the components of state-of-the-art speech recognition and speech synthesis systems
- understand the main concepts, explain what each component does, and describe a simple version of each component
- identify the difficult problems in recognition and synthesis, and use tools for visualising and manipulating speech waveforms
- experiment with two state-of-the-art speech technology systems and put experimental methodology into practice
- see how knowledge and skills from different areas come together in an interdisciplinary field
|
Reading List
http://resourcelists.ed.ac.uk/courses/lasc11065sv1sem1.html |
Additional Information
Course URL |
http://www.speech.zone/courses/speech-processing/ |
Graduate Attributes and Skills |
Main Graduate Attributes:
- ability to use standard speech processing packages including Wavesurfer, Praat, Festival and HTK
- basic shell scripting
- scientific writing
- experimental design |
Additional Class Delivery Information |
Attend all lectures as scheduled.
Students only need to attend ONE of the two lab sessions. You will be assigned a lab session by the lecturer. |
Keywords | automatic speech recognition, text-to-speech synthesis, speech signal processing |
Contacts
Course organiser | Prof Simon King
Tel: (0131 6)51 1725
Email: Simon.King@ed.ac.uk |
Course secretary | Miss Toni Noble
Tel: (0131 6)51 3188
Email: Toni.noble@ed.ac.uk |