THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2020/2021

Information in the Degree Programme Tables may still be subject to change in response to Covid-19

Undergraduate Course: Introduction to Data Science (MATH08077)

Course Outline
School	School of Mathematics	College	College of Science and Engineering
Credit level (Normal year taken)	SCQF Level 8 (Year 1 Undergraduate)	Availability	Not available to visiting students
SCQF Credits	20	ECTS Credits	10
Summary	This is an introductory level course on data science and statistical thinking. Students will learn to explore, visualize, and analyze data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, and do so in a reproducible and shareable manner. In doing so, they will gain experience in data collection, wrangling, and visualization, exploratory data analysis, predictive modelling, and effective communication of results while working on problems and case studies inspired by and based on real-world questions. The course will focus on the R statistical computing language. No statistical or computing background is necessary.
Course description	This course comprises three learning units. Unit 1 - Collecting and exploring data: This unit focuses on data visualization, wrangling, and collection. Specifically, we cover fundamentals of data and data visualization, confounding variables, and Simpson's paradox, as well as the concept of tidy data, data import, data rectangling and cleaning, and data collection. We end the unit with web scraping and introduce the idea of iteration in preparation for the next unit. This unit also introduces students to the toolkit: R, RStudio, R Markdown, Git, GitHub, etc. Unit 2 - Modelling and prediction: This unit introduces simple and multiple linear regression models, with a focus on interpretations, visualizing interactions, model selection, prediction, and model validation. Unit 3 - Making rigorous conclusions: This unit introduces statistical inference for making data-based conclusions from a simulation-based perspective, focusing on bootstrapping and randomization.

Entry Requirements (not applicable to Visiting Students)
Pre-requisites		Co-requisites
Prohibited Combinations		Other requirements	None

Course Delivery Information

Academic year 2020/21, Not available to visiting students (SS1)		Quota: 300
Course Start	Semester 1
Learning and Teaching activities (Further Info)	Total Hours: 200 ( Lecture Hours 22, Seminar/Tutorial Hours 22, Supervised Practical/Workshop/Studio Hours 11, Summative Assessment Hours 3, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 138 )
Assessment (Further Info)	Written Exam 0 %, Coursework 100 %, Practical Exam 0 %
Additional Information (Assessment)	Homework: 50% - Individually completed. Lab: 20% - Completed in teams, designed to foster engagement among students and prepare them for the project. Project: 20% - Completed in teams. Quiz: 10% - Completed individually, designed to provide weekly check-ins for covered material.
Feedback	Homework: Partially auto-marked, partially marked by tutors. Lab: Partially auto-marked, partially marked by the course organiser. Project: Marked during presentations; extensive feedback is not returned to students. Quiz: Auto-marked.
No Exam Information

Learning Outcomes
On completion of this course, the student will be able to:
- Employ all stages of a modern data science pipeline, including import, tidy, transform, visualize, model, and communicate.
- Critique data-based claims and evaluate data-based decisions.
- Interpret results correctly, effectively, and in context without relying on statistical jargon.
- Complete a research project on a dataset of their choosing, demonstrating mastery of the data science pipeline.
- Use the statistical computing language R to perform fully reproducible data analyses that are version controlled.

Reading List
There is no compulsory course text. The following books are useful complements to parts of the course for those who prefer learning from textbooks. Both books are freely available online.
- R for Data Science - Grolemund, Wickham. O'Reilly, 1st edition, 2016
- OpenIntro: Introduction to Modern Statistics - Çetinkaya-Rundel, Hardin. CreateSpace, Preliminary Edition, 2020

Additional Information
Graduate Attributes and Skills	Not entered
Keywords	IDS

Contacts
Course organiser	Dr Mine Cetinkaya-Rundel Tel: (0131 6)50 5060 Email: mine.cetinkaya-rundel@ed.ac.uk	Course secretary	Mrs Frances Reid Tel: (0131 6)50 4883 Email: f.c.reid@ed.ac.uk
