Undergraduate Course: Applied Data Science 2 (IBMS08015)
|School||Deanery of Biomedical Sciences
||College||College of Medicine and Veterinary Medicine
|Credit level (Normal year taken)||SCQF Level 8 (Year 2 Undergraduate)
||Availability||Not available to visiting students
|Summary||ADS2 is an introduction to statistics for informaticians. Statistics is the art of inferring information about very large populations from comparatively small datasets. Computers let us simulate these large populations and so develop our intuition about how statistical methods work. Using a simulation-based approach, this course introduces common methods from frequentist inferential statistics, bootstrapping, Bayesian inference and machine learning. Concepts in R programming, data handling, organisation and presentation are covered along the way.
ADS2 is an introduction to statistics and data science using tools from informatics. Statistics is the art of inferring information about very large populations from comparatively small datasets. Computers let us simulate these large populations and so develop our intuition about how statistical methods work. Using a simulation-based approach, you will learn common methods from frequentist inferential statistics, bootstrapping, Bayesian inference and machine learning. You will also learn to handle, describe, analyse and present data in the statistical software R.
Entry Requirements (not applicable to Visiting Students)
||Other requirements|| None
Course Delivery Information
|Academic year 2022/23, Not available to visiting students (SS1)
|Learning and Teaching activities (Further Info)
Lecture Hours 14,
Supervised Practical/Workshop/Studio Hours 28,
Formative Assessment Hours 14,
Programme Level Learning and Teaching Hours 4,
Directed Learning and Independent Learning Hours
|Assessment (Further Info)
|Additional Information (Assessment)
||Semester 1 Open-book timed coding challenge (30%)
Data analysis group project (30%)
Semester 2 Open-book timed coding challenge (40%)
||Summative feedback is given on the three components of in-course assessment: two coding challenges and a group data analysis project.
For the group project, you will be asked to submit an outline and data analysis plan a few weeks before the deadline, and will be given formative feedback on that.
For the coding challenges, you are given weekly problem sets with formative exercises. These will help you prepare for weekly practicals, revise lecture content, and practise for the end-of-semester coding challenges.
Continuous opportunities for formative feedback, both from instructors and peers, are built into the weekly practical sessions.
|No Exam Information
On completion of this course, the student will be able to:
- Critically evaluate statistical representations in the scientific literature, as well as popular media
- Describe common methods for statistical inference and hypothesis testing, understand which datasets they can be applied to, and perform and interpret common hypothesis tests
- Understand the components of a dataset, handle and prepare raw data for further analysis, and display and describe datasets in meaningful ways, while considering ethical implications of data gathering, storage, analysis, and presentation
- Understand the probabilistic underpinnings of frequentist and Bayesian statistics
- Name and describe common Machine Learning methods and implement simple machine learning tasks
|Graduate Attributes and Skills
||This course covers the following core areas of technological knowledge for Biomedical Informatics, as identified by the American Medical Informatics Association (Kulikowski et al., J Am Med Inform Assoc 19, 2012): information documentation, storage, and retrieval; machine learning, including data mining; representation of logical and probabilistic knowledge and reasoning; and simulation and modelling.
In addition, the course develops the following core graduate skills and attributes for graduates in Bioinformatics, as identified by Welch et al. (2014) PLoS Comp Biol 10(3).
Statistics and Mathematics:
Application of statistics in the contexts of molecular biology and genomics, mastery of relevant statistical and mathematical modelling methods (including experimental design, descriptive and inferential statistics, probability theory)
Programming, machine learning, ability to use scientific and statistical analysis software packages
Analysis of biological data, ability to manage, interpret, and analyse large datasets
Time management, project management, independence, curiosity, self-motivation, ability to synthesize information, ability to complete projects, critical thinking, dedication, analytical reasoning, collaborative ability
|Keywords||Data science, statistics, machine learning, R
|Course organiser||Dr Duncan MacGregor
Tel: (0131 6)50 3273
|Course secretary||Miss Natasha Goldie