Undergraduate Course: Informatics 2 - Foundations of Data Science (INFR08030)
|School||School of Informatics
||College||College of Science and Engineering
|Credit level (Normal year taken)||SCQF Level 8 (Year 2 Undergraduate)
||Availability||Not available to visiting students
|Summary||This course introduces students to a core set of knowledge, skills, and ways of thinking that are needed for data science. It brings together several strands: mathematical and computational techniques from statistics and machine learning; practical work with toolchains for data wrangling, analysis, and presentation; critical thinking and writing skills needed to evaluate and present claims; and case studies prompting discussion of the real-world implications of data science.
*This course replaces "Informatics 2B - Learning" (INFR08028) from 2020/21.*
The course will be delivered through a combination of lectures, workshops, and practical labs; students will be expected to complete both pencil-and-paper and programming-based exercises on their own time as well as during workshops and scheduled labs. Students will complete a data science project to assess their practical and writing skills. Technical topics in the course will be covered in three sections, with indicative topics listed below. Practical aspects of these will use a Python-based ecosystem.
1. Data wrangling and exploratory data analysis
- Working with tabular data
- Descriptive statistics and visualisation
- Linear regression and correlation
2. Supervised machine learning
- More on linear regression; logistic regression
- Generalisation and regularisation
3. Statistical inference
- Randomness, simulation and sampling
- Confidence intervals, law of large numbers
- Randomised studies, hypothesis testing
Interleaved with these technical topics will be topics focusing on real-world implications (often using case studies) and on critical thinking, working, and writing skills. These may be introduced in lectures but will often include a workshop discussion and/or peer review of written work. Indicative topics include:
A. Real-world implications:
- Where does data come from? (Sample bias, data licensing and privacy issues)
- Visualisation: misleading plots, accessible design
- Machine learning: algorithmic bias and discrimination
B. Thinking, working, and writing:
- Claims and evidence: what can we conclude; analysis of errors
- Reproducibility; programming "notebooks" vs modular code
- Scientific communication; structure of a lab report
- Reading and critique of data science articles
Course Delivery Information
|Academic year 2021/22, Not available to visiting students (SS1)
|Learning and Teaching activities (Further Info)
Lecture Hours 30,
Supervised Practical/Workshop/Studio Hours 27,
Summative Assessment Hours 2,
Programme Level Learning and Teaching Hours 4,
Directed Learning and Independent Learning Hours
|Assessment (Further Info)
|Additional Information (Assessment)
||The course will not have a final exam, but rather a number of pieces of assessed coursework including:
* Exercise in data wrangling and data visualisation
* 1 or 2 timed short tests (possibly multiple choice)
* A structured critical evaluation of a data science academic paper or news article
* A data science project report and presentation
||Students will receive feedback from instructors and/or peers during workshop discussions and on at least one formative assessment similar to the final written assignment.
|No Exam Information
On completion of this course, the student will be able to:
- Describe and apply good practices for storing, manipulating, summarising, and visualising data.
- Use standard packages and tools, such as Python and LaTeX, to analyse data and to describe that analysis.
- Apply basic techniques from descriptive and inferential statistics and machine learning; interpret and describe the output from such analyses.
- Critically evaluate data-driven methods and claims from case studies, in order to identify and discuss a) potential ethical issues and b) the extent to which the stated conclusions are warranted given the evidence provided.
- Complete a data science project and write a report describing the question, methods, and results.
|Graduate Attributes and Skills
||Only available to Informatics students, including those on joint degrees.
|Keywords||data science,statistics,machine learning
|Course organiser||Dr David Sterratt
Tel: (0131 6)51 1739
|Course secretary||Miss Kerry Fernie
Tel: (0131 6)50 5194