THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2025/2026

Timetable information in the Course Catalogue may be subject to change.

Undergraduate Course: Foundations of Natural Language Processing (INFR10078)

Course Outline
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 10 (Year 3 Undergraduate)
Availability: Available to all students
SCQF Credits: 20
ECTS Credits: 10
Summary: This course covers some of the foundations of natural language processing (NLP) and equips students for more advanced NLP courses in year 4. We focus on what makes automatic processing of language unique and challenging: its statistical properties, complex structure, and pervasive ambiguity. We cover a range of architectures and algorithms for NLP. The course starts with simple models for text classification and generation. We will then discuss neural models to represent the meaning of words and model language, such as Recurrent Neural Networks and Transformers.

Students will gain insight into the technology behind contemporary Large Language Models, including pre-training and supervised fine-tuning techniques. As part of the course, we will also introduce methodological and ethical considerations (e.g., evaluation, data collection, algorithmic bias) that are important for working in the field.
Course description The course will first introduce simple and interpretable models, such as n-gram and bag-of-words models and logistic regression, to illustrate a range of NLP tasks (language modelling, classification, and generation), as well as the basic framework for NLP experiments (training, evaluation, baselines). We will also introduce classic approaches to predicting linguistic representations (HMMs and PCFGs).

Next, we will cover neural architectures for NLP (such as Multi-Layer Perceptrons, Recurrent Neural Networks, and Transformers), which are more opaque and data-hungry, but also achieve better performance on NLP tasks.

Finally, we will discuss the framework of transfer learning, which leverages large amounts of unlabelled data, and the training pipeline for Large Language Models.

Throughout the course, we will introduce concepts and findings from linguistics as a way to understand the challenges of this type of data. We will discuss the strengths and weaknesses of different approaches, including both technical and ethical challenges (such as bias and interpretability). We will illustrate how NLP models can be used for specific applications (e.g., translation and summarisation).
Entry Requirements (not applicable to Visiting Students)
Pre-requisites Students MUST have passed: Informatics 2 - Introduction to Algorithms and Data Structures (INFR08026) OR Informatics Research Review (INFR11136)
Co-requisites
Prohibited Combinations Students MUST NOT also be taking Accelerated Natural Language Processing (INFR11125)
Other requirements Open to MSc students, so long as they have not taken ANLP, and they have the following expertise:

Understanding of basic probability, e.g., Bayes' rule.
Familiarity with basic computational processes, e.g., recursion and dynamic programming.
Ability to code in Python.
Basic knowledge of linguistic categories, e.g., noun and verb.
Familiarity with first-order logic.
Information for Visiting Students
Pre-requisites: Understanding of basic probability, e.g., Bayes' rule.
Familiarity with basic computational processes, e.g., recursion and dynamic programming.
Ability to code in Python.
Basic knowledge of linguistic categories, e.g., noun and verb.
Familiarity with first-order logic.
High Demand Course? Yes
Course Delivery Information
Academic year 2025/26, Available to all students (SV1) Quota: None
Course Start Semester 2
Learning and Teaching activities Total Hours: 200 (Lecture Hours 30, Seminar/Tutorial Hours 5, Supervised Practical/Workshop/Studio Hours 5, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 156)
Assessment Written Exam 75%, Coursework 25%, Practical Exam 0%
Additional Information (Assessment) The coursework will include two practical assignments with written reports, in which parts of an NLP system will be implemented and the results analysed.
Feedback Tutorial exercises will be pen and paper (e.g., using an algorithm to analyse a toy example step by step). Students will prepare answers in advance of the tutorial, then present their analyses and get feedback on them during the tutorial.

Labs will consist of a small amount of programming: implementing algorithms taught in the lectures, running them on corpora, and evaluating the results, with demonstrators available for guidance.

Coursework will involve more extensive implementation of the algorithms and models taught in lectures. Feedback will be a raw grade plus qualitative feedback.
Exam Information
Exam Diet: Main Exam Diet S2 (April/May)
Paper Name: Foundations of Natural Language Processing (INFR10078)
Minutes: 120
Learning Outcomes
On completion of this course, the student will be able to:
  1. explain and provide examples illustrating some of the main challenges facing machine learning approaches to natural language data, including issues arising from properties of language (e.g., long sequence modelling; variation across languages, domains, and genres) and from social and ethical concerns (e.g., algorithmic bias and discrimination, interpretability)
  2. describe NLP models for classification and generation tasks; the experimental setup for training and testing; and how these models address some of the technical challenges described above
  3. for a range of NLP applications, outline possible approaches, including standard data sets, models, and evaluation methods. Discuss potential strengths and weaknesses of the suggested approaches (including both technical and ethical issues, where appropriate), and provide examples to illustrate
  4. implement parts of an NLP system with the help of appropriate support code and/or tools. Evaluate and interpret the results of implemented methods on natural language data sets
Reading List
REQUIRED: Dan Jurafsky and James Martin, Speech and Language Processing (3rd edition online, and 2009 2nd edition for chapters that aren't yet updated in the 3rd edition).

RECOMMENDED: Bird, S., E. Klein and E. Loper, Natural Language Processing with Python (2009), O'Reilly Media.
Additional Information
Course URL https://opencourse.inf.ed.ac.uk/fnlp
Graduate Attributes and Skills Cognitive skills: critical thinking (via tutorials, labs and assessed work), detecting and handling ambiguity (via the study of linguistic ambiguity in this course).

Responsibility, autonomy and effectiveness: self-awareness and reflection (via learning to perceive linguistic ambiguity that people do not normally notice when processing language), independent learning (via the labs, required reading and preparation for tutorials), exploration and testing of evidence towards (or against) a hypothesis (via the labs and tutorials), time management (via coursework).

Communication: written communication.
Additional Class Delivery Information 30 lectures, 5 tutorials and 5 labs. The tutorials and labs will occur in alternate weeks.
Keywords: natural language, corpus-based methods, machine learning
Contacts
Course organiser: Dr Ivan Titov
Tel: (0131 6)51 3092
Email: ititov@exseed.ed.ac.uk
Course secretary: Miss Rose Hynd
Tel: (0131 6)50 5194
Email: rhynd@ed.ac.uk