Postgraduate Course: Design and Sampling for Data Science (MATH11245)
Course Outline
School | School of Mathematics |
College | College of Science and Engineering |
Credit level (Normal year taken) | SCQF Level 11 (Postgraduate) |
Availability | Not available to visiting students |
SCQF Credits | 10 |
ECTS Credits | 5 |
Summary | This course provides instruction on methodology for efficient and statistically sound data collection and generation procedures in three core areas: classic statistical sampling, experiment design, and observational studies. Optional topics, which vary between course deliveries, include spatial sampling, designs for computer experiments, and the problematic handling of convenience, found, or transactional data, e.g., citizen science data and web-scraped data. |
Course description |
Core topics, always included in the course, are:
- Discussion of general data quality issues, such as bias and precision, and their effects on the quality of inference.
- Fundamental statistical sampling designs, including simple random, systematic, stratified, cluster, and multi-stage sampling.
- Fundamental experimental designs, including completely randomised, randomised block, and repeated measures designs.
- Observational studies, such as cross-sectional, prospective, retrospective, and case-control studies, and potentially before-after-control-impact (BACI) designs.
Other potential topics, from which a selection is made and which can vary between course deliveries, are:
- Spatial and spatio-temporal sampling designs.
- Design for analysis of computer experiments including space filling designs.
- The handling of opportunistic and found data, such as citizen science data.
- Data collection for massive data sets and very wide data sets, such as those used by machine learning algorithms.
|
Entry Requirements (not applicable to Visiting Students)
Pre-requisites |
Students MUST have passed:
Statistical Methodology (MATH10095) AND
Statistical Computing (MATH10093)
|
Co-requisites | |
Prohibited Combinations | |
Other requirements | Note that PGT students on School of Mathematics MSc programmes are not required to have taken pre-requisite courses, but they are advised to check that they have studied the material covered in the syllabus of each pre-requisite course before enrolling. |
Course Delivery Information
|
Academic year 2024/25, Not available to visiting students (SS1)
|
Quota: None |
Course Start |
Semester 2 |
Timetable |
Learning and Teaching activities (Further Info) |
Total Hours: 100 (Lecture Hours 22, Seminar/Tutorial Hours 6, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 68)
|
Assessment (Further Info) |
Written Exam 80%, Coursework 20%, Practical Exam 0%
|
Additional Information (Assessment) |
Coursework: 20%
Examination: 80% |
Feedback |
Not entered |
Exam Information |
Exam Diet | Paper Name | Hours & Minutes |
Main Exam Diet S2 (April/May) | | 2:00 |
Learning Outcomes
On completion of this course, the student will be able to:
- Formulate and implement several classic sampling designs and carry out population level inference for simple parameters.
- Formulate and implement several classic experimental designs and carry out common statistical estimation and testing procedures.
- Describe and distinguish major classes of observational studies, discuss their advantages and disadvantages, and outline some analysis procedures.
- Identify limitations and potential problems with data from observational studies and with found or convenience data, as well as approaches for minimising biases and controlling for confounding factors.
- Understand and implement procedures for sample size and experiment size determination to achieve desired levels of precision or statistical power.
|
Reading List
Possible textbooks for the following topics:
1. Sampling: Sampling and Estimation from Finite Populations (2019, Tillé);
Sampling, 3rd edition (2012, Thompson);
Sampling Theory for the Ecological and Natural Resource Sciences (2019, Hankin et al.)
2. Experiment Design: Design of Experiments: A Modern Approach (2020, Jones and Montgomery);
Design of Comparative Experiments (2008, Bailey);
Design and Analysis of Experiments and Observational Studies Using R (2022, Taback)
3. Observational Study Design: Design of Observational Studies (2010, Rosenbaum);
Design and Analysis of Experiments and Observational Studies Using R (2022, Taback)
4. Computer Experiments: The Design and Analysis of Computer Experiments, 2nd edition (2018, Santner et al.)
Papers:
1. The Accuracy of Citizen Science Data: A Quantitative Review (https://doi.org/10.1002/bes2.1336)
2. BACI: Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration (https://doi.org/10.1007/s10661-016-5526-6)
3. Big Data, Wide Data: Statistics for big data: A perspective (https://doi.org/10.1016/j.spl.2018.02.016) |
Additional Information
Graduate Attributes and Skills |
Not entered |
Keywords | DSDS, Data Science |
Contacts
Course organiser | Dr Ken Newman
Tel: (0131 6)50 4899
Email: ken.newman@ed.ac.uk |
Course secretary | Miss Kirstie Paterson
Tel:
Email: Kirstie.Paterson@ed.ac.uk |
|
|