Postgraduate Course: Topics in Distributed Databases (INFR11025)
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Availability: Available to all students
Summary: This course covers not only the basic technology required for distributed databases, but also some of the emerging technology of database integration, data cleaning, schema matching/mapping and peer-to-peer technology for highly distributed databases.
Topics to be covered:
* Parallel and Distributed Databases
* Distributed Query Optimisation and Evaluation
* Integrating data from distributed sources
* Schema matching and mapping
* Cleaning integrated data
* Propagation analysis of data quality rules via views
Relevant QAA Computing Curriculum Sections: Computer Networks, Databases, Distributed Computer Systems, Information Systems, Web-based Computing
Entry Requirements (not applicable to Visiting Students)
Other requirements: This course assumes knowledge of database systems comparable to that covered in Database Systems. Students who have not taken Database Systems must obtain permission to take the course from the course organiser.
Information for Visiting Students
Course Delivery Information
Academic year 2014/15, Available to all students (SV1)
Learning and Teaching activities
Lecture Hours 20,
Summative Assessment Hours 1,
Programme Level Learning and Teaching Hours 2,
Directed Learning and Independent Learning Hours
Assessment
Additional Information (Assessment): This is a research seminar module. Each student is required to read research papers, complete a practical project, and write and present a final report for the project.
The project (70%) deals in more depth with a topic covered in the class. It should consist of algorithm design, prototype implementation, and experimental study for developing a practical tool. Example projects include: SQL techniques for detection of data inconsistencies based on integrity constraints, a tool for schema matching/mapping, or a tool for repairing dirty databases.
The presentation (30%) should report and demonstrate the tool developed in the project.
If delivered in semester 1, this course will offer semester 1-only visiting undergraduate students the option of assessment before the end of the calendar year.
No Exam Information
1 - Describe emerging issues in distributed databases: data integration, schema matching, schema mapping, data cleaning, distributed query evaluation and optimisation.
2 - Describe the problems faced in data integration, as well as model solutions to these problems.
3 - Translate data between example XML schemas without loss of information.
4 - Describe the need for data cleaning in data integration, and approaches to improving the quality of integrated data.
5 - Detect inconsistencies in data using integrity constraints.
6 - Repair dirty databases based on integrity constraints.
7 - Propagate data quality rules via data transformation/integration.
8 - Demonstrate the issues involved in data integration for distributed query processing.
9 - Describe the issues involved in distributed query optimisation regarding cost modelling and algorithms for query evaluation.
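As an illustration of outcome 5, the following is a minimal sketch of SQL-based inconsistency detection, assuming the integrity constraint is a functional dependency (here zip -> city). The `customer` table, its columns, the sample rows, and the helper `find_fd_violations` are hypothetical and are not taken from the course materials.

```python
import sqlite3


def find_fd_violations(conn, table, lhs, rhs):
    """Return (lhs value, count) pairs where one lhs value maps to several rhs values.

    Each returned row is a violation of the functional dependency lhs -> rhs.
    Table and column names are interpolated directly, so pass trusted names only.
    """
    query = (
        f"SELECT {lhs}, COUNT(DISTINCT {rhs}) AS n "
        f"FROM {table} GROUP BY {lhs} "
        f"HAVING COUNT(DISTINCT {rhs}) > 1 "
        f"ORDER BY {lhs}"
    )
    return conn.execute(query).fetchall()


def demo():
    # Hypothetical sample data: the postcode G12 8QQ appears with two cities.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (name TEXT, zip TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [
            ("Ann", "EH8 9AB", "Edinburgh"),
            ("Bob", "EH8 9AB", "Edinburgh"),
            ("Cat", "G12 8QQ", "Glasgow"),
            ("Dan", "G12 8QQ", "Edinburgh"),
        ],
    )
    return find_fd_violations(conn, "customer", "zip", "city")


if __name__ == "__main__":
    print(demo())
```

Grouping by the left-hand side and counting distinct right-hand values flags which groups are inconsistent, but not which tuple in a group is wrong; deciding that is the repair problem named in outcome 6.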
* An introduction to Database Management Systems by Raghu Ramakrishnan (Chapters 16-18, 22).
* Research papers
Course organiser: Prof Wenfei Fan
Tel: (0131 6)51 3818
Course secretary: Ms Katey Lee
Tel: (0131 6)50 2701
© Copyright 2014 The University of Edinburgh - 12 January 2015 4:11 am