THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2023/2024

Timetable information in the Course Catalogue may be subject to change.

Postgraduate Course: Automatic Speech Recognition (INFR11033)

Course Outline
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Year 4 Undergraduate)
Availability: Available to all students
SCQF Credits: 10
ECTS Credits: 5
Summary: This course covers the theory and practice of automatic speech recognition (ASR), with a focus on the statistical approaches that comprise the state of the art. The course introduces the overall framework for speech recognition, including speech signal analysis, acoustic modelling using hidden Markov models, language modelling and recognition search.

Advanced topics covered will include speaker adaptation, robust speech recognition and speaker identification. The practical side of the course will involve the development of a speech recognition system using a speech recognition software toolkit.
Course description:
Signal analysis for ASR
Statistical pattern recognition (Bayes decision theory, Learning algorithms, Evaluation methods, Gaussian mixture model, and EM algorithm)
Hidden Markov Models (HMM)
Context-dependent models
Discriminative training
Language models for LVCSR (large vocabulary continuous speech recognition)
Decoding
Robust ASR (Robust features, Noise reduction, Microphone arrays)
Adaptation (Noise adaptation, Speaker adaptation/normalization, Language model adaptation)
Speaker recognition
History of speech recognition
Advanced topics (Using prosody for ASR, Audio-visual ASR, Indexing, Bayesian networks)
Speech recognition applications (including privacy implications)

Relevant QAA Computing Curriculum Sections: Artificial Intelligence, Natural Language Computing
Entry Requirements (not applicable to Visiting Students)
Pre-requisites: It is RECOMMENDED that students have passed Speech Processing (LASC11065)
Co-requisites:
Prohibited Combinations: Students MUST NOT also be taking Automatic Speech Recognition (UG) (INFR11219)
Other requirements: MSc students must register for this course, while Undergraduate students must register for INFR11219 instead.

This course is open to all Informatics students including those on joint degrees. For external students where this course is not listed in your DPT, please seek special permission from the course organiser.

Some general mathematical ability is essential: the special functions log and exp are fundamental, mathematical notation (such as sums) is used throughout, and some calculus is needed. Probability theory is used extensively: joint and conditional probabilities, Gaussian and multinomial distributions.

Programming using Python or shell scripting is required for the practicals and coursework.
Information for Visiting Students
Pre-requisites: None
High Demand Course? Yes
Course Delivery Information
Academic year 2023/24, Available to all students (SV1)
Quota: None
Course Start: Semester 2
Learning and Teaching activities: Total Hours: 100 (Lecture Hours 15, Seminar/Tutorial Hours 8, Supervised Practical/Workshop/Studio Hours 5, Feedback/Feedforward Hours 6, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 62)
Assessment: Written Exam 50%, Coursework 50%, Practical Exam 0%
Additional Information (Assessment):
Exam 50%
Coursework 50%

Assessed coursework will be worth 50% of the grade of the course. This will consist of:
- Five short weekly practical assignments (1-2 hours each), worth 10% in total;
- A longer practical and written assignment (expected to take around 30 hours of work), worth 40%.

Both sets of coursework will use Python and other standard software toolkits to develop a speech recognition system. They will be marked in compliance with the Common Marking Scheme.
Feedback: Not entered
Exam Information
Exam Diet: Main Exam Diet S2 (April/May)
Paper Name:
Hours & Minutes: 2:00
Learning Outcomes
On completion of this course, the student will be able to:
  1. describe the statistical framework used for automatic speech recognition
  2. understand the weaknesses of simplified speech recognition systems and demonstrate knowledge of more advanced methods that overcome them
  3. describe speech recognition as an optimization problem in probabilistic terms
  4. relate individual terms in the mathematical framework for speech recognition to particular modules of the system
  5. build a large vocabulary continuous speech recognition system, using a standard software toolkit
Reading List
John N. Holmes, Wendy J. Holmes, "Speech Synthesis and Recognition", Taylor & Francis (2001), 2nd edition
Xuedong Huang, Alex Acero and Hsiao-Wuen Hon, "Spoken Language Processing: A Guide to Theory, Algorithm, and System Development", Prentice Hall (2001)
Lawrence R. Rabiner and Biing-Hwang Juang, "Fundamentals of Speech Recognition", Prentice Hall (1993)
B. Gold, N. Morgan, "Speech and Audio Signal Processing: Processing and Perception of Speech and Music", John Wiley and Sons (1999)
Additional Information
Graduate Attributes and Skills: Not entered
Keywords: theory, automatic speech recognition, artificial intelligence, natural language computing
Contacts
Course organiser: Dr Peter Bell
Tel: (0131 6)51 3284
Email: peter.bell@ed.ac.uk
Course secretary: Miss Yesica Marco Azorin
Tel: (0131 6)50 5113
Email: ymarcoa@ed.ac.uk