THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2021/2022

Information in the Degree Programme Tables may still be subject to change in response to Covid-19

Postgraduate Course: Speech Processing (LASC11158)

Course Outline
School: School of Philosophy, Psychology and Language Sciences
College: College of Arts, Humanities and Social Sciences
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Availability: Available to all students
SCQF Credits: 20
ECTS Credits: 10
Summary: A foundation course in speech processing for students of linguistics, informatics, and related subjects.

Enrolments for students outwith Philosophy, Psychology and Language Sciences must be approved by the Course Organiser.
Course description The course is delivered as a combination of lectures, flipped classrooms, an online forum, short videos, readings, and practical exercises in the lab. The first hour of each lecture is generally devoted to foundation material, making the course accessible to students from a wide variety of backgrounds, including Linguistics, Informatics, and Music Technology.

Students deciding whether to take this course should visit the lecturer's blog at http://www.speech.zone, where much of the course material can be found.

In the lab, students investigate speech signals, experiment with a text-to-speech system, and build their own simple automatic speech recognition system, using industry-standard tools.

Fundamentals of speech processing: waveforms, spectra, spectrograms, resonance, formants, human speech production and perception, perceptually-motivated frequency scales, time vs. frequency representations and conversion between the two, the Fourier transform, the source-filter model of speech, hands-on experience.
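
As an illustration of converting between time and frequency representations, the following minimal Python sketch (not part of the course materials) computes a discrete Fourier transform of a synthetic tone and reads off the strongest frequency. The function names, the 440 Hz test tone, the 8 kHz sampling rate, and the mel formula used here (one common variant of a perceptually-motivated frequency scale) are all assumptions chosen for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum for bins 0..N/2, using the direct O(N^2) DFT definition."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def peak_frequency(samples, sample_rate):
    """Frequency in Hz of the DFT bin with the largest magnitude."""
    mags = dft_magnitudes(samples)
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / len(samples)


def hz_to_mel(freq_hz):
    """One common mel-scale formula (an assumption; other variants exist)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


# Synthetic example: a 440 Hz sine sampled at 8 kHz for 400 samples,
# so each DFT bin is 8000 / 400 = 20 Hz wide and 440 Hz falls exactly on a bin.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(400)]
peak_hz = peak_frequency(tone, sample_rate)
peak_mel = hz_to_mel(peak_hz)
```

Real speech analysis would normally apply a window to short frames and use a fast Fourier transform; the direct form is used here only because it mirrors the definition.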

Automatic speech recognition: components of a typical recogniser, parameterisation of the speech signal, dynamic time warping, distance measures, the Hidden Markov Model, the generative model paradigm, simple probability theory, conditional and joint probabilities, Bayes' theorem, Gaussian probability density function, continuous density HMMs, monophone models with Gaussian observation densities, Viterbi algorithm for recognition, training from fully labelled data, Viterbi training, bigram language models.
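
To give a flavour of dynamic time warping with a local distance measure, here is a minimal Python sketch (not taken from the course materials). It aligns two sequences of feature vectors by dynamic programming and returns the accumulated distance along the best path. The function names, the Euclidean local distance, the three allowed path steps, and the toy one-dimensional "features" are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Accumulated distance of the cheapest alignment of seq_a with seq_b.

    cost[i][j] holds the best total distance for aligning the first i frames
    of seq_a with the first j frames of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, in seq_b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy data: a template, a slower rendition of it, and a reversed pattern.
template = [(0.0,), (1.0,), (2.0,)]
slow = [(0.0,), (0.0,), (1.0,), (1.0,), (2.0,)]
reversed_pattern = [(2.0,), (1.0,), (0.0,)]

same_word = dtw_distance(template, slow)
different_word = dtw_distance(template, reversed_pattern)
```

In this toy case the slower rendition aligns to the template at zero cost, while the reversed pattern does not, which is the property a template-matching recogniser relies on.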

Text-to-speech synthesis: components of a typical text-to-speech synthesiser, text analysis, phonology, finite-state automata, POS tagging, lexicon, phrasing, accents, F0, learning from data, CART models, waveform generation, concatenative methods, TD-PSOLA and linear prediction, F0 and duration modification.
Entry Requirements (not applicable to Visiting Students)
Pre-requisites Co-requisites
Prohibited Combinations Students MUST NOT also be taking Speech Processing (LASC11065) AND Speech Processing (Hons) (LASC10061)
Other requirements Enrolments for students outwith Philosophy, Psychology and Language Sciences must be approved by the Course Organiser.
Information for Visiting Students
Pre-requisites: None
High Demand Course? Yes
Course Delivery Information
Academic year 2021/22, Available to all students (SV1) Quota: 60
Course Start Semester 1
Learning and Teaching activities (Further Info) Total Hours: 200 (Lecture Hours 20, Supervised Practical/Workshop/Studio Hours 30, Online Activities 60, Feedback/Feedforward Hours 2, Summative Assessment Hours 4, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 80)
Assessment (Further Info) Written Exam 0 %, Coursework 100 %, Practical Exam 0 %
Additional Information (Assessment) Assignment 1: 20%
Assignment 2: 30%
Assignment 3: 50%
Feedback Whole-class feedback on the first coursework, in the form of a written document and/or video
No Exam Information
Learning Outcomes
On completion of this course, the student will be able to:
  1. understand human speech production and perception, including the use of tools for visualising and manipulating speech
  2. give an overview of the components of automatic speech recognition and speech synthesis systems and describe a simple version of each component
  3. understand what the difficult problems are in automatic speech recognition and speech synthesis
  4. perform experiments with speech technology systems and relate theory to practice
  5. see how knowledge and skills from different areas come together in an interdisciplinary field
Reading List
http://resourcelists.ed.ac.uk/courses/lasc11065sv1sem1.html
Additional Information
Graduate Attributes and Skills Ability to use standard speech processing packages including Wavesurfer, Praat, Festival and HTK
Basic shell scripting
Scientific writing
Experimental design
Keywords: automatic speech recognition, text-to-speech synthesis, speech signal processing, phonetics
Contacts
Course organiser: Dr Catherine Lai
Tel: (0131 6)50 2698
Email: C.Lai@ed.ac.uk
Course secretary: Miss Toni Noble
Tel: (0131 6)51 3188
Email: Toni.noble@ed.ac.uk