Undergraduate Course: Informatics 2 - Foundations of Data Science (INFR08030)
Course Outline
School | School of Informatics |
College | College of Science and Engineering |
Credit level (Normal year taken) | SCQF Level 8 (Year 2 Undergraduate) |
Availability | Not available to visiting students |
SCQF Credits | 20 |
ECTS Credits | 10 |
Summary | This course introduces students to a core set of knowledge, skills, and ways of thinking that are needed for data science. It brings together several strands: mathematical and computational techniques from statistics and machine learning; practical work with toolchains for data wrangling, analysis, and presentation; critical thinking and writing skills needed to evaluate and present claims; and case studies prompting discussion of the real-world implications of data science.
*This course replaces "Informatics 2B - Learning" (INFR08028) from 2020/21.* |
Course description |
The course will be delivered through a combination of lectures, workshops, and practical labs; students will be expected to complete both pencil-and-paper and programming-based exercises on their own time as well as during workshops and scheduled labs. Students will complete a data science mini-project to assess their practical and writing skills, and will also sit an exam. Technical topics in the course will be covered in three sections, with indicative topics listed below. Practical aspects of these will use a Python-based ecosystem.
1. Data wrangling and exploratory data analysis
- Working with tabular data
- Descriptive statistics and visualisation
- Linear regression and correlation
- Clustering
2. Supervised machine learning
- Classification
- More on linear regression; logistic regression
- Generalization and regularization
3. Statistical inference
- Randomness, simulation and sampling
- Confidence intervals, law of large numbers
- Randomized studies, hypothesis testing
Interleaved with these topics will be topics focusing on real-world implications (often using case studies), critical thinking, and working and writing skills. These may be introduced in lectures but will often include a workshop discussion and/or peer review of written work. Indicative topics include:
A. Implications:
- Where does data come from? (Sample bias, data licensing and privacy issues)
- Visualisation: misleading plots, accessible design
- Machine learning: algorithmic bias and discrimination
B. Thinking, working, and writing:
- Claims and evidence: what can we conclude; analysis of errors
- Reproducibility; programming "notebooks" vs modular code
- Scientific communication; structure of a lab report
- Reading and critique of data science articles
|
Course Delivery Information
|
Academic year 2020/21, Not available to visiting students (SS1)
|
Quota: None |
Course Start |
Full Year |
Timetable |
Learning and Teaching activities (Further Info) |
Total Hours: 200 (Lecture Hours 30, Supervised Practical/Workshop/Studio Hours 27, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 137)
|
Assessment (Further Info) |
Written Exam 0 %, Coursework 100 %, Practical Exam 0 %
|
Additional Information (Assessment) |
The course will not have a final exam; instead, it will be assessed through a number of pieces of coursework, including:
* Exercise in data visualisation
* 1 or 2 timed short tests (possibly multiple choice)
* An essay describing and evaluating a data science academic paper or news article
* A data science project report and presentation
|
Feedback |
Students will receive feedback from instructors and/or peers during workshop discussions and on at least one formative assessment similar to the final written assignment. |
No Exam Information |
Learning Outcomes
On completion of this course, the student will be able to:
- Describe and apply good practices for storing, manipulating, summarising, and visualising data.
- Use standard packages and tools, such as Python and LaTeX, to analyse data and to describe that analysis.
- Apply basic techniques from descriptive and inferential statistics and machine learning; interpret and describe the output from such analyses.
- Critically evaluate data-driven methods and claims from case studies, in order to identify and discuss a) potential ethical issues and b) the extent to which stated conclusions are warranted given the evidence provided.
- Complete a data science project and write a report describing the question, methods, and results.
|
Additional Information
Graduate Attributes and Skills |
Not entered |
Special Arrangements |
Only available to Informatics students, including those on joint degrees. |
Keywords | data science, statistics, machine learning |
Contacts
Course organiser | Dr David Sterratt
Tel: (0131 6)51 1739
Email: David.C.Sterratt@ed.ac.uk |
Course secretary | Ms Kendal Reid
Tel: (0131 6)51 3249
Email: kr@inf.ed.ac.uk |