THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2024/2025

Timetable information in the Course Catalogue may be subject to change.


Postgraduate Course: Machine Learning at Scale (EPCD11013)

Course Outline
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Course type: Online Distance Learning
Availability: Not available to visiting students
SCQF Credits: 10
ECTS Credits: 5
Summary: This course aims to teach the skills required to take machine learning, specifically deep neural networks (DNNs), from simple examples up to deployment on very large datasets or models at scale. It looks at the implementation and optimization of large-scale machine learning solutions, considering both training and inference, and targeting high-performance hardware such as GPUs and TPUs. The course will consider both using common machine learning frameworks and writing standalone implementations.
Course description: Machine learning occurs at a range of scales, from small networks with small datasets or parameter counts through to extremely large networks with millions of parameters and terabyte-scale datasets. Machine learning also has two very distinct phases of operation: training and inference. Working quickly and efficiently with very large networks or very large datasets requires powerful computing hardware. When using significant amounts of computational hardware, there are challenges in ensuring that applications run efficiently and effectively at scale.

This course will provide the practical skills and knowledge required to run machine learning on large-scale HPC systems and hardware to deliver trained models and inference as quickly and efficiently as possible. We will work with a range of common machine learning frameworks, examining how to run them efficiently in parallel and on a range of hardware. We will also use parallel programming skills to develop our own implementations of machine learning functionality and to augment existing framework solutions, where appropriate. The course will evaluate a range of real-world examples where researchers have scaled machine learning up to very large computers, and draw lessons from the current state of the art in distributed machine learning training and inference.
Entry Requirements (not applicable to Visiting Students)
Pre-requisites:
Co-requisites:
Prohibited Combinations:
Other requirements: None
Course Delivery Information
Academic year 2024/25, Not available to visiting students (SS1)
Quota: None
Course Start: Semester 2
Learning and Teaching activities: Total Hours: 100 (Online Activities 30, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 68)
Assessment: Written Exam 0%, Coursework 100%, Practical Exam 0%
Additional Information (Assessment): 100% coursework, split into two assignments:
1) Traditional coursework exercise (50%)
2) Exam-style short-answer questions, to be taken over a maximum of 1-2 weeks (50%)
Feedback: Provided via live-session discussion of material and practical exercises, and on assessed work.
No Exam Information
Learning Outcomes
On completion of this course, the student will be able to:
  1. Efficiently deploy machine learning on CPUs, GPUs, and other accelerators on a single node
  2. Understand impacts on I/O for training and inference systems and how to efficiently exploit parallel filesystems
  3. Diagnose and mitigate bottlenecks in scaling machine learning up to large datasets or large numbers of nodes and computational resources
  4. Develop custom machine learning applications
  5. Efficiently exploit and evaluate pre-existing machine learning frameworks
Reading List
Provided via Learn/Leganto and live sessions based on discussion topics raised
Additional Information
Graduate Attributes and Skills: Problem solving and analytical thinking; knowledge integration; planning and time management; situational awareness
Special Arrangements: This is the online learning version of the on-campus course EPCC11013 Machine Learning at Scale. On-campus students should refer to that course.
Keywords: HPC, Machine Learning, Parallelism, Deep Neural Networks, Imaging and Vision, Data Science, Big Data
Contacts
Course organiser: Mr Adrian Jackson
Tel:
Email: Adrian.Jackson@ed.ac.uk
Course secretary: Mr James Richards
Tel: (0131 6)51 3578
Email: J.Richards@epcc.ed.ac.uk