THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2016/2017


Postgraduate Course: Practical Introduction to Data Science (PGPH11092)

Course Outline
School: School of Physics and Astronomy
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Course type: Online Distance Learning
Availability: Not available to visiting students
SCQF Credits: 20
ECTS Credits: 10
Summary: This course is ONLY available to students on the online DSTI programmes (please see below* for alternatives).

This online course will provide a practical introduction to data science. It will have two broad themes, namely Data Management and Data Analytics.

Data Science is an emerging field that is becoming very important in research, business and industry. More data is being generated and stored than ever before. This brings challenges in how you work with the data and, importantly, rewards in the form of new insight gained from analysing it.

The course is practical in the sense that you will have the chance to use R and Python to explore the techniques and ideas described in the course videos.

The course will be delivered entirely online, and videos, notes and exercises will be released as the course progresses.

* The course is also available as PGPH11100 for non-DSTI students, but it still has a non-standard duration (20 weeks), which will be incompatible with many programmes.
Course description: Data Science means different things to different people. In this course, we interpret the term fairly broadly, and look at the various aspects of the process of extracting knowledge from data. This course is intended to give a broad introduction to the topic but it will get into sufficient detail to provide practical, hands-on experience of some of the tools and techniques used widely in academic research and in commercial environments.

Data Science is a very interdisciplinary field and so the course will expose students to aspects of computer science, software engineering, maths and statistics. It is designed to be accessible whether you come from one of these backgrounds, or whether you come from an applications area (be that in business, science, or elsewhere).

The course will have two main, intertwined strands: "looking after data" so that it can be used for analysis, and the actual processing of this data to provide insight and answers to specific questions.

The course will cover:
- Why managing data better matters, and why it's hard
- Data formats: structuring data and keeping them useful
- Metadata: describing data and keeping them useful
- Research data management planning
- Publication and citation of research data
- Persistence, preservation and provenance of research data
- Licensing, copyright and access rights: some things researchers need to know
- Key data analytical techniques, such as classification, optimisation and unsupervised learning
- Key parallel patterns, such as MapReduce, for implementing analytical techniques
- Practical introductions to key Data Science tools and their application to data science problems, e.g., R, Python
- Case studies from academia and business
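
As an illustration of the MapReduce pattern named in the list above, here is a minimal single-process sketch in Python. It is not taken from the course materials: the word-count task, the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are all assumptions chosen for illustration, and a real deployment would distribute the map and reduce steps across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine each key's list of values into a single total."""
    return {
        key: reduce(lambda a, b: a + b, values)
        for key, values in grouped.items()
    }


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    docs = ["data science is practical", "managing data matters"]
    print(word_count(docs))
```

Because each call to `map_phase` depends only on its own document, and each key is reduced independently, both steps could in principle run in parallel; that independence is what makes the pattern useful for large data sets.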

Entry Requirements (not applicable to Visiting Students)
Pre-requisites Co-requisites
Prohibited Combinations: Students MUST NOT also be taking Practical Introduction to Data Science (non-DTSI) (PGPH11100)
Other requirements: None
Course Delivery Information
Academic year 2016/17, Not available to visiting students (SS1)
Quota: None
Course Start: Semester 2
Course Start Date: 16/01/2017
Learning and Teaching activities: Total Hours: 200 (Lecture Hours 35, Seminar/Tutorial Hours 10, Online Activities 20, Summative Assessment Hours 60, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 71)
Assessment: Written Exam 0%, Coursework 100%, Practical Exam 0%
Additional Information (Assessment):
** Formative
1. Peer assessment of answers to set exercises;
2. Self-assessment via online multiple-choice questions to reinforce key topics.

** Summative
1. Report/Essay on the importance of data management and associated concepts (25%)
2. Report/Essay addressing appropriate data analytic techniques for a given problem (25%)
3. Write-Up of an Exercise involving the analysis of an online data set (or sets) (50%)

The first report will be due half-way through the course. This will ensure that all students engage in the course from an early stage.

The reports will be based on core course material, but for an excellent mark, the student will be expected to illustrate and expand on their answer using examples from existing literature or online materials produced by others.

The write-up will show the results of an exercise designed to test the practical skills of the students. This will involve some programming and/or use of analysis tools. The exercise will use an existing online, open data set. Some degree of flexibility will be incorporated to allow the student to demonstrate that they can extract interesting information from the dataset which is not specified in advance.
Feedback: Not entered
No Exam Information
Learning Outcomes
On completion of this course, the student will be able to:
  1. Have knowledge of common, popular and important data analytics techniques and the types of compute and data infrastructures used for data analytics
  2. Understand what data analytics, data science and big data are, and the importance of data management both in general and in relation to their own potential futures as data professionals
  3. Understand the importance of structuring research data and of good metadata, and have knowledge of best practice for publishing, citing and preserving data
  4. Write programs in R and Python to undertake basic data processing and analysis
  5. Identify and apply appropriate data analytic techniques to a problem and critically evaluate the analytical performance of a data analytic technique
Learning Resources
There is no compulsory course text. The book "Doing Data Science" (O'Neil and Schutt; O'Reilly, 2013; ISBN 978-1-4493-5865-5) is a useful complement to parts of the course for those who prefer learning from textbooks.
Additional Information
Course URL: http://www.epcc.ed.ac.uk/online-courses
Graduate Attributes and Skills:
- Critical thinking
- Communication of complex ideas in accessible language
- Working in an interdisciplinary field
- Programming and Scripting
Keywords: Not entered
Contacts
Course organiser: Dr Adam Carter
Tel: (0131 6)50 6009
Email: A.carter@epcc.ed.ac.uk
Course secretary: Ms Joan Strachan
Tel: (0131 6)50 5030
Email: Joan.Strachan@ed.ac.uk
© Copyright 2016 The University of Edinburgh - 3 February 2017 4:58 am