THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2020/2021

Information in the Degree Programme Tables may still be subject to change in response to Covid-19

University Homepage
DRPS Homepage
DRPS Search
DRPS Contact
DRPS : Course Catalogue : School of Informatics : Informatics

Postgraduate Course: Foundations of Databases (INFR11200)

Course Outline
SchoolSchool of Informatics CollegeCollege of Science and Engineering
Credit level (Normal year taken)SCQF Level 11 (Postgraduate) AvailabilityAvailable to all students
SCQF Credits10 ECTS Credits5
SummaryThis course is a revised and simplified version of the course Advanced Topics in Foundations of Databases (INFR11122) which was offered until 2019-20.

Data is everywhere, coming in different shapes and volumes, and needs to be stored and managed using appropriate data management technologies. The basic software package that supports the management of data is called a database management system (DBMS). The main goal of this course is to explain some of the underlying theoretical principles and characteristics of DBMSs. More precisely, this course will explain how real-life concepts (such as a database and a query) and phenomena (such as incompleteness and inconsistency of data), can be abstracted from their physical implementation and mathematically formalised using tools coming from other areas such as computational logic. This will pave the way towards the study of query evaluation, that is, the central task of extracting meaningful information from (possibly incomplete and inconsistent) data by means of queries, following a mathematically rigorous approach. This analysis will expose the source of complexity in evaluating a query over a database, which in turn provides ideas and tools on how to devise more efficient query evaluation algorithms.
Course description The course will cover the following topics:
- Relational model: data model, relational algebra, relational calculus (first-order queries), first-order query evaluation, static analysis of first-order queries (satisfiability and containment).

- Conjunctive queries (CQs): syntax and semantics (via the notion of homomorphism), CQ evaluation, static analysis of CQs (satisfiability, containment and the Homomorphism Theorem), minimisation of CQs.

- Fast conjunctive query evaluation: acyclic CQs, evaluating acyclic CQs (Yannakaki's algorithm), semantically acyclic CQs and their evaluation, size bounds for joins (AGM bound), worst-case optimal join algorithms.

- Adding recursion - Datalog: inexpressibility of recursive queries, syntax and semantics of Datalog, Datalog query evaluation, static analysis of Datalog queries, Datalog vs. first-order queries.

- Uncertainty - reasoning over possible worlds: incomplete databases, inconsistent databases, probabilistic databases, knowledge-enriched databases.

Student will be assessed 100% by in-course assessment:

- Essay 1 (formative): students will choose a research paper (from a given list) on the relational model, and present (i) a summary of the paper, and (ii) analysis and critical thoughts (criticism of the paper, and discussion on follow-up papers that show how the ideas of the paper under review have influenced the field). The length of the essay must be 5 to 7 pages (including references). Each essay should be clearly written in sentences with appropriate punctuation, display of formulae, appropriate use of "Definition", "Lemma", "Theorem", "Proof", etc. The work should be properly and adequately referenced in the text, with the full list of references at the end of the essay, following any of the standard labelling conventions as technical papers (e.g., numerical, or by abbreviated name). The essay will be marked on its clarity and technical accuracy, which are equally important criteria. It must be understood by someone who has not read the paper. Of course, in reality, it will be marked by someone who has read the paper, but still this is an important criterion that will be used in marking.

- Essay 2 (50%): as for Essay 1, but with the difference that students will choose a research paper on a topic covered during the lectures (not only on the relational model as for Essay 1). The essay will be marked on its clarity (25%) and technical accuracy (25%).

- Take-home test (50%): students will complete a take-home test at the end of the course, which will consist of three problems on relational queries, Datalog queries, and uncertain data. The problems will be of variable difficulty (easy - 10%, medium - 15%, hard - 25%). For the hard problem, a hint that will guide the student towards the final solution will be given. The completion of the take-home assignment will take no more than 25 hours.
Entry Requirements (not applicable to Visiting Students)
Pre-requisites It is RECOMMENDED that students have passed Database Systems (INFR10070) OR Introduction to Databases (INFR10080)
Co-requisites
Prohibited Combinations Other requirements While there are no formal prerequisites, it is recommended that students have passed an introductory course in
Databases such as the undergraduate course Database Systems (INFR10070) or Introduction to Databases (INFR10080); in particular, some familiarity with the relational model, the main relational query languages (calculus and algebra), and integrity constraints is welcome. It is also recommended that students have some basic familiarity with complexity theory (standard complexity classes such as PTIME and NP, and the notion of completeness). In any case, this course is self-contained, and all the necessary tools will be properly introduced and explained during the lectures.
Information for Visiting Students
Pre-requisitesSame as other requirements
High Demand Course? Yes
Course Delivery Information
Not being delivered
Learning Outcomes
On completion of this course, the student will be able to:
  1. Abstract relational data and relational queries from their physical implementation, and formalise them in a rigorous way.
  2. Analyse the complexity of querying relational data, and isolate the source of complexity.
  3. Explain the semantics of Datalog queries, analyse the complexity of evaluating Datalog queries, and model real-life queries in a declarative way.
  4. Formalise uncertain data, analyse the complexity of querying uncertain data, and explain the reasons that lead to intractability.
  5. Read and summarise research papers.
Reading List
- Abiteboul, Hull, Vianu, Foundations of Databases, 1995
- Libkin, Elements of Finite Model Theory, 2012
- Bertossi, Database Repairing and Consistent Query answering, 2011
- Suciu, Olteanu, Re, Koch, Probabilistic Databases, 2011
Additional Information
Graduate Attributes and Skills Problem-solving, analytical thinking, independent learning, written communication.
Special Arrangements While there are no formal prerequisites, it is recommended that students have passed an introductory course in Databases such as the undergraduate course Introduction to Databases; in particular, some familiarity with the relational model, the main relational query languages (calculus and algebra), and integrity constraints is welcome. It is also recommended that students have some basic familiarity with complexity theory (standard complexity classes such as PTIME and NP, and the notion of completeness). In any case, this course is self-contained, and all the necessary tools will be properly introduced and explained during the lectures.
KeywordsRelational data,Relational queries,Query evaluation,Static analysis of queries,Fast query evaluation
Contacts
Course organiserDr Andreas Pieris
Tel: (0131 6)51 5606
Email: apieris@inf.ed.ac.uk
Course secretaryMs Lindsay Seal
Tel: (0131 6)50 2701
Email: lindsay.seal@ed.ac.uk
Navigation
Help & Information
Home
Introduction
Glossary
Search DPTs and Courses
Regulations
Regulations
Degree Programmes
Introduction
Browse DPTs
Courses
Introduction
Humanities and Social Science
Science and Engineering
Medicine and Veterinary Medicine
Other Information
Combined Course Timetable
Prospectuses
Important Information