Undergraduate Course: Foundations of Natural Language Processing (INFR10078)
Course Outline
School | School of Informatics |
College | College of Science and Engineering |
Credit level (Normal year taken) | SCQF Level 10 (Year 3 Undergraduate) |
Availability | Available to all students |
SCQF Credits | 20 |
ECTS Credits | 10 |
Summary | This course covers some of the foundations of natural language processing (NLP) and equips students for more advanced NLP courses in year 4. We focus on what makes automatic processing of language unique and challenging: its statistical properties, complex structure, and pervasive ambiguity. We cover a range of architectures and algorithms for NLP. The course starts with simple models for text classification and generation. We will then discuss neural models to represent the meaning of words and model language, such as Recurrent Neural Networks and Transformers.
Students will gain insight into the technology behind contemporary Large Language Models, including pre-training and supervised fine-tuning techniques. As part of the course, we will also introduce methodological and ethical considerations (e.g., evaluation, data collection, algorithmic bias) that are important for working in the field. |
Course description |
The course will first introduce simple and interpretable models, such as n-gram and bag-of-words models and logistic regression, to illustrate a range of NLP tasks (language modelling, classification, and generation), as well as the basic framework for NLP experiments (training, evaluation, baselines). We will also introduce classic approaches to predicting linguistic representations (HMMs and PCFGs).
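As a rough illustration (not part of the official course materials), an n-gram language model of the kind mentioned above can be sketched in a few lines of Python. The toy corpus and the choice of add-one (Laplace) smoothing are assumptions made purely for this example:

```python
import math
from collections import Counter

# Tiny toy corpus, tokenised by hand for illustration.
corpus = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["the", "dog", "sat"],
]

# Pad each sentence with start/end markers, then count
# histories (unigrams) and bigrams.
BOS, EOS = "<s>", "</s>"
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = [BOS] + sent + [EOS]
    unigrams.update(tokens[:-1])            # history counts
    bigrams.update(zip(tokens, tokens[1:]))  # (history, next-word) counts

# Possible next words: all corpus words plus the end marker.
vocab = {w for s in corpus for w in s} | {EOS}
V = len(vocab)

def bigram_logprob(sentence):
    """Log probability of a sentence under the add-one-smoothed bigram model."""
    tokens = [BOS] + sentence + [EOS]
    logp = 0.0
    for h, w in zip(tokens, tokens[1:]):
        logp += math.log((bigrams[(h, w)] + 1) / (unigrams[h] + V))
    return logp
```

Even this toy model assigns a higher probability to word orders attested in its training data than to scrambled versions of the same words, which is the basic idea behind using language models for generation.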
Next, we will cover neural architectures for NLP (such as Multi-Layer Perceptrons, Recurrent Neural Networks, and Transformers), which are more opaque and data-hungry, but also achieve better performance on NLP tasks.
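To make this concrete, here is a minimal sketch (again illustrative only, not course material) of scaled dot-product attention, the core operation inside the Transformer architecture mentioned above; the vectors below are tiny hand-picked toy values rather than learned parameters:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each query produces a weighted
    average of the value vectors, with weights given by softmaxed
    query-key dot products, scaled by sqrt(key dimension)."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

A query that aligns closely with one key draws its output almost entirely from the corresponding value vector, which is how attention lets a model focus on the relevant parts of a long input sequence.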
Finally, we will discuss the framework of transfer learning, which leverages large amounts of unsupervised data, and the training pipeline for Large Language Models.
Throughout the course, we will introduce concepts and findings from linguistics as a way to understand the challenges of this type of data. We will discuss the strengths and weaknesses of different approaches, including both technical and ethical challenges (such as bias and interpretability). We will illustrate how NLP models can be used for specific applications (e.g., translation and summarisation). |
Information for Visiting Students
Pre-requisites | Understanding of basic probability (e.g., Bayes' rule).
Familiarity with basic computational techniques (e.g., recursion, dynamic programming).
Ability to code in Python.
Basic knowledge of linguistic categories (e.g., noun, verb).
Familiarity with first-order logic. |
High Demand Course? | Yes |
Course Delivery Information
Academic year 2025/26, Available to all students (SV1)
Quota | None |
Course Start | Semester 2 |
Timetable |
Learning and Teaching activities (Further Info) | Total Hours: 200 (Lecture Hours 30, Seminar/Tutorial Hours 5, Supervised Practical/Workshop/Studio Hours 5, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 156) |
Assessment (Further Info) | Written Exam 75%, Coursework 25%, Practical Exam 0% |
Additional Information (Assessment) |
The coursework will include two practical assignments with written reports, in which parts of an NLP system will be implemented and the results analysed. |
Feedback |
Tutorial exercises will be pen-and-paper (e.g., using an algorithm to analyse a toy example step by step). Students will prepare answers in advance of the tutorial, then present their analyses and receive feedback on them during the tutorial.
Labs will involve a small amount of programming: implementing algorithms taught in the lectures, running them on corpora, and evaluating the results, with demonstrators available for guidance.
Coursework will involve more extensive implementation of the algorithms and models taught in lectures. Feedback will consist of a grade plus qualitative comments. |
Exam Information
Exam Diet | Paper Name | Minutes |
Main Exam Diet S2 (April/May) | Foundations of Natural Language Processing (INFR10078) | 120 |
Learning Outcomes
On completion of this course, the student will be able to:
- explain and provide examples illustrating some of the main challenges facing machine learning approaches to natural language data, including issues arising from properties of language (e.g., long sequence modelling; variation across languages, domains, and genres) and from social and ethical concerns (e.g., algorithmic bias and discrimination, interpretability)
- describe NLP models for classification and generation tasks; the experimental setup for training and testing; and how these models address some of the technical challenges described above
- for a range of NLP applications, outline possible approaches, including standard data sets, models, and evaluation methods; discuss potential strengths and weaknesses of the suggested approaches (including both technical and ethical issues, where appropriate), and provide illustrative examples
- implement parts of an NLP system with the help of appropriate support code and/or tools. Evaluate and interpret the results of implemented methods on natural language data sets
|
Reading List
REQUIRED: Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd edition online; 2009 2nd edition for chapters not yet updated in the 3rd edition).
RECOMMENDED: Bird, S., E. Klein and E. Loper, Natural Language Processing with Python (2009), O'Reilly Media. |
Additional Information
Course URL |
https://opencourse.inf.ed.ac.uk/fnlp |
Graduate Attributes and Skills |
Cognitive skills: critical thinking (via tutorials, labs and assessed work), detecting and handling ambiguity (via the study of linguistic ambiguity in this course).
Responsibility, autonomy and effectiveness: self-awareness and reflection (via acquiring the skill of perceiving linguistic ambiguity that people do not normally notice in everyday language processing), independent learning (via the labs, required reading and preparation for tutorials), exploration and testing of evidence for or against a hypothesis (via the labs and tutorials), time management (via coursework).
Communication: written communication. |
Additional Class Delivery Information |
30 lectures, 5 tutorials and 5 labs. The tutorials and labs will occur in alternate weeks. |
Keywords | natural language, corpus-based methods, machine learning |
Contacts
Course organiser | Dr Ivan Titov
Tel: (0131 6)51 3092
Email: ititov@exseed.ed.ac.uk |
Course secretary | Miss Rose Hynd
Tel: (0131 6)50 5194
Email: rhynd@ed.ac.uk |