THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2024/2025

Timetable information in the Course Catalogue may be subject to change.

Postgraduate Course: Using R for Data Science (PGBI11122)

Course Outline
School: School of Biological Sciences
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Availability: Not available to visiting students
SCQF Credits: 10
ECTS Credits: 5
Summary: R is an environment and a language for data analysis and statistics. R provides a generic set of tools that can be applied to problems in many areas of data science as well as in related areas such as bioinformatics and genomics. This course explores the rich set of tools that R provides and how in practice these tools can be applied to solve real complex biological problems. As part of the course we introduce a biological model problem in detail, currently a problem from regulatory genomics, and show how we can build complex analysis processes using R and apply machine learning techniques to model this data.
Course description: The course is taught from first principles and no previous experience of R is required. The course begins with an introduction to R and RStudio, the biological background to the model question, and the specific machine learning methods used on the course. It will go on to explore the R programming model and how in practice scripts and workflows are written in R. The course will then explore how interactive graphical applications can be built using R and Shiny, and how complex relational data can be exploited using R. It will explore how data can be imported and processed in R (data cleaning and wrangling) and ultimately how interactive workflows can be run using cluster computing (using Apache Spark). Finally, the course will explore detailed data visualisation and plotting of results (e.g. using ggplot2) and how the analysis outcome can be interpreted in the context of the motivating biological problem.

This course is designed to be complementary to other existing courses that make extensive use of R such as Statistics and Data Analysis (PGBI11003) or Functional Genomic Technologies (PGBI11040). There is a strong focus on developing real practical generic skills in R that can then be applied to a wide range of biological problems.
Entry Requirements (not applicable to Visiting Students)
Pre-requisites:
Co-requisites:
Prohibited Combinations:
Other requirements: None
Course Delivery Information
Academic year 2024/25, Not available to visiting students (SS1)
Quota: 30
Course Start: Semester 1
Learning and Teaching activities: Total Hours: 100 (Lecture Hours 30, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 68)
Assessment: Written Exam 50%, Coursework 50%, Practical Exam 0%
Additional Information (Assessment): In-course assessment (50%) and exam (50%). The in-course assessment will be a generalised analysis task in R that applies methodologies taught in the course. The exam will be made up of three questions: one compulsory and two optional.
Feedback: Assignment marks and written feedback will be provided fifteen working days after submission.
Exam marks and written feedback will be provided after mark ratification at the semester 2 Board of Examiners.
Exam Information
Exam Diet: Main Exam Diet S1 (December)
Paper Name:
Hours & Minutes: 2:00
Learning Outcomes
On completion of this course, the student will be able to:
  1. Implement in software a complex data analysis task in R.
  2. Pick the appropriate analysis strategy to address a particular analysis question.
  3. Interpret a complex analysis output in terms of the experiment hypothesis and specific biological context.
Reading List
R for Data Science, Hadley Wickham & Garrett Grolemund
Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Editors: Robert Gentleman, Vincent J. Carey, Wolfgang Huber, Rafael A. Irizarry, Sandrine Dudoit
Additional Information
Graduate Attributes and Skills:
SCQF Level 11, Characteristic 2 - Practice: Applied knowledge, skills and understanding. For example, develop original and creative responses to problems and issues.
SCQF Level 11, Characteristic 3 - Generic cognitive skills. For example, knowledge that covers and integrates most, if not all, of the main areas of the subject/discipline/sector including their features, boundaries, terminology and conventions.
SCQF Level 11, Characteristic 4 - Communication, Numeracy and ICT skills. For example, use a wide range of ICT applications to support and enhance work at this level and adjust features to suit purpose.
SCQF Level 11, Characteristic 5 - Autonomy, Accountability and Working with Others. For example, exercise substantial initiative in professional and equivalent activities and take responsibility for own work.
Keywords: Bioinformatics, R, Data Science
Contacts
Course organiser: Dr Simon Tomlinson
Tel: (0131 6)51 7252
Email: simon.tomlinson@ed.ac.uk
Course secretary: Mr Alex Ramsay
Tel:
Email: gramsay3@ed.ac.uk