THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2020/2021

Information in the Degree Programme Tables may still be subject to change in response to Covid-19

Undergraduate Course: Informatics 2 - Foundations of Data Science (INFR08030)

Course Outline
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 8 (Year 2 Undergraduate)
Availability: Not available to visiting students
SCQF Credits: 20
ECTS Credits: 10
Summary: This course introduces students to a core set of knowledge, skills, and ways of thinking that are needed for data science. It brings together several strands: mathematical and computational techniques from statistics and machine learning; practical work with toolchains for data wrangling, analysis, and presentation; critical thinking and writing skills needed to evaluate and present claims; and case studies prompting discussion of the real-world implications of data science.

*This course replaces "Informatics 2B - Learning" (INFR08028) from 2020/21.*
Course description: The course will be delivered through a combination of lectures, workshops, and practical labs; students will be expected to complete both pencil-and-paper and programming-based exercises on their own time as well as during workshops and scheduled labs. Students will complete a data science mini-project to assess their practical and writing skills, and will also sit an exam. Technical topics in the course will be covered in three sections, with indicative topics listed below. Practical aspects of these will use a Python-based ecosystem.

1. Data wrangling and exploratory data analysis
- Working with tabular data
- Descriptive statistics and visualisation
- Linear regression and correlation
- Clustering

2. Supervised machine learning
- Classification
- More on linear regression; logistic regression
- Generalization and regularization

3. Statistical inference
- Randomness, simulation and sampling
- Confidence intervals, law of large numbers
- Randomized studies, hypothesis testing

Interleaved with these will be topics focusing on real-world implications (often using case studies), critical thinking, and working and writing skills. These may be introduced in lectures but will often include a workshop discussion and/or peer review of written work. Indicative topics include:

A. Implications:
- Where does data come from? (Sample bias, data licensing and privacy issues)
- Visualisation: misleading plots, accessible design
- Machine learning: algorithmic bias and discrimination

B. Thinking, working, and writing:
- Claims and evidence: what can we conclude; analysis of errors
- Reproducibility; programming "notebooks" vs modular code
- Scientific communication; structure of a lab report
- Reading and critique of data science articles
Entry Requirements (not applicable to Visiting Students)
Pre-requisites: Students MUST have passed: Informatics 1 - Introduction to Computation (INFR08025) AND Informatics 1 - Object Oriented Programming (INFR08029) AND Introduction to Linear Algebra (MATH08057)
It is RECOMMENDED that students have passed Calculus and its Applications (MATH08058)
Co-requisites: Students MUST also take: Discrete Mathematics and Probability (INFR08031) OR Probability (MATH08066)
Prohibited Combinations: (none listed)
Other requirements: Only available to Informatics students, including those on joint degrees
Course Delivery Information
Academic year 2020/21, Not available to visiting students (SS1). Quota: None
Course Start: Full Year
Learning and Teaching activities: Total Hours: 200 (Lecture Hours 30, Supervised Practical/Workshop/Studio Hours 27, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 137)
Assessment: Written Exam 0%, Coursework 100%, Practical Exam 0%
Additional Information (Assessment): The course will not have a final exam, but rather a number of pieces of assessed coursework, including:

* Exercise in data visualisation
* 1 or 2 timed short tests (possibly multiple choice)
* An essay describing and evaluating a data science academic paper or news article
* A data science project report and presentation
Feedback: Students will receive feedback from instructors and/or peers during workshop discussions and on at least one formative assessment similar to the final written assignment.
No Exam Information
Learning Outcomes
On completion of this course, the student will be able to:
  1. Describe and apply good practices for storing, manipulating, summarising, and visualising data.
  2. Use standard packages and tools, such as Python and LaTeX, for data analysis and for describing this analysis.
  3. Apply basic techniques from descriptive and inferential statistics and machine learning; interpret and describe the output from such analyses.
  4. Critically evaluate data-driven methods and claims from case studies, in order to identify and discuss a) potential ethical issues and b) the extent to which stated conclusions are warranted given evidence provided.
  5. Complete a data science project and write a report describing the question, methods, and results.
Reading List
None
Additional Information
Graduate Attributes and Skills: Not entered
Special Arrangements: Only available to Informatics students, including those on joint degrees.
Keywords: data science, statistics, machine learning
Contacts
Course organiser: Dr David Sterratt
Tel: (0131 6)51 1739
Email: David.C.Sterratt@ed.ac.uk
Course secretary: Ms Kendal Reid
Tel: (0131 6)51 3249
Email: kr@inf.ed.ac.uk