# DEGREE REGULATIONS & PROGRAMMES OF STUDY 2024/2025

### Timetable information in the Course Catalogue may be subject to change.


# Postgraduate Course: Extended Statistical Programming (MATH11242)

- **School:** School of Mathematics
- **College:** College of Science and Engineering
- **Credit level (Normal year taken):** SCQF Level 11 (Postgraduate)
- **Availability:** Not available to visiting students
- **SCQF Credits:** 20
- **ECTS Credits:** 10

**Summary**

The course covers the fundamentals of Statistical Programming, using the R language for practical work. The aims are:

1. To teach good programming practice: design, structure, documentation/commenting, testing, debugging, version control and reproducibility.
2. To teach the key programming skills and methods required for statistics and data science. These are stochastic simulation, visualization, data handling, matrix computation and linear modelling, likelihood and optimization, bootstrapping and Bayesian stochastic simulation.

**Course description**

This course is designed for MSc students on the Statistics with Data Science MSc. It prepares students for the practical computational aspects of the MSc and future work in Statistics and Data Science. The aim is for students to learn structured, reproducible programming using the R statistical computing language, and to acquire a basic skill set in the core elements of statistical computing.

The outline content of the syllabus is:

- **Git and GitHub:** Setting up a repo on GitHub; using Git; making a local working copy of the repo; modifying work and synchronising with the GitHub repo; simple work cycle; simple conflict resolution; adding and deleting files with Git; more advanced use.
- **Programming for statistical data analysis:** Basic principles; what is programming; what makes an analysis statistical.
- **Getting started with R:** A first R session; dissecting a simple programming example; data are not always numbers.
- **A more systematic look at R:** Objects, classes, attributes; data structures; attributes; operators; loops and conditional execution; functions; '...' in R; pipes; planning and coding; vectorization; useful built-in functions; apply.
- **Simulation I:** Random sampling building blocks: sampling data, sampling from distributions; simulation from stochastic models; statistical simulation studies.
- **Reading and storing data in files:** Working directories; reading code; reading and writing text data; reading and writing binary files; reading from other sources.
- **Data re-arrangement and tidy data:** Concept of tidy data; data tidying; regular expressions.
- **Statistical Modelling: linear models:** Linear model as a prototype statistical model; basic model concepts; interactions; computing with linear models; model matrix; model formulae; fitting linear models in R.
- **R classes:** Concepts of classes and object orientation; S3 methods in R.
- **Matrix computation:** Ordering and efficiency; general solution of linear systems; Cholesky, forward and backsolve; QR; pivoting and triangular factorizations; symmetric eigen decomposition and PCA; SVD.
- **Design, Debug, Test and Profile:** Design before you code; worked example; testing; debugging and debuggers; profiling flops and memory.
- **Maximum Likelihood Estimation:** Concept; statement of large sample results; what they mean.
- **Numerical Optimization:** Newton's method; quasi-Newton; Nelder-Mead; R optimization functions; simple constraints on parameters; getting derivatives.
- **Graphics:** Systematic look at graphics already encountered; scatterplots in ggplot and base R; plotting to file; univariate plots; boxplots etc; 3D plotting.
- **Simulation II: simulation for inference:** Bootstrapping: generating from distributions and the nonparametric bootstrap, bootstrapping multivariate data; MCMC for Bayesian inference; Metropolis-Hastings; Gibbs; graphical models, DAGs and automatic Gibbs; JAGS.
- **R markdown:** Basic overview of R markdown for reproducible data analysis.
- **Pre-requisites:** (none listed)
- **Co-requisites:** (none listed)
- **Prohibited Combinations:** Students MUST NOT also be taking Statistical Programming (MATH11176)
- **Other requirements:** This course is only available to students on a Mathematics MSc or MSc Data Science (Informatics). Note that PGT students on School of Mathematics MSc programmes are not required to have taken pre-requisite courses, but they are advised to check that they have studied the material covered in the syllabus of each pre-requisite course before enrolling.
- **Academic year 2024/25:** Not available to visiting students (SS1)
- **Quota:** None
- **Course Start:** Semester 1
- **Learning and Teaching activities:** Total Hours: 200 (Lecture Hours 22, Supervised Practical/Workshop/Studio Hours 33, Summative Assessment Hours 60, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 81)
- **Assessment:** Written Exam 0%, Coursework 100%, Practical Exam 0%
- **Additional Information (Assessment):** Coursework: 100%; Examination: 0%
- **Feedback:** Not entered
- **Exam Information:** No Exam Information
On completion of this course, the student will be able to:

1. Write reasonably efficient, well structured and documented computer programs in R.
2. Write efficient implementations of statistical methods.
3. Process data effectively, in particular preparing data for analysis and visualizing data.
4. Show appreciation of reliable and reproducible computational methods, and the nature of a statistical analysis.
5. Demonstrate expertise in commonly used statistical computing methods.