Postgraduate Course: Applied Databases (INFR11015)
School: School of Informatics
College: College of Science and Engineering
Credit level (Normal year taken): SCQF Level 11 (Postgraduate)
Availability: Available to all students
Summary: The course gives an introduction to relational database systems and SQL, as well as to modern "NoSQL" database systems. Students learn how to search over heterogeneous data, both exactly and approximately, and practise this in hands-on programming assignments. The course also covers large-scale data analytics using stream processors and statistical programming languages. The course content is dynamic and will be updated regularly to reflect state-of-the-art database systems.
* Introduction to RDBMS: Data models and ER diagrams. How to design a database schema. How to populate a database table. How to express queries using SQL. How to speed up query evaluation with indexes.
* Introduction to Storing and Searching Heterogeneous Data: Heterogeneous data models, such as text, hierarchical, and graph-shaped data. How can they be mapped into an RDBMS? How are they supported by newer systems such as NoSQL databases? What kind of consistency guarantees do NoSQL systems provide?
* Similarity Search: How can we capture that two items are "similar"? How can we efficiently search for "similar items"? How can these insights be used to build, for example, recommendation systems such as those used by Amazon? What are the challenges when looking for similar complex items, such as images or videos?
* Data Analytics: How exactly does data analytics differ from conventional database queries? What kinds of analytics can be supported very efficiently, for example through stream processing, where data is not stored locally? Where are the limits? What kinds of analytics can be carried out efficiently over large data sets using statistical programming languages?
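As an illustration of the Similarity Search topic above, the following is a minimal sketch of one common way to quantify how "similar" two text items are: Jaccard similarity over word shingles. The choice of measure, the function names (`shingles`, `jaccard`, `most_similar`) and the sample data are assumptions made for this example; the course page does not state which techniques are taught. The search shown is a plain linear scan, which is the naive baseline that more efficient methods aim to improve on.

```python
def shingles(text: str, k: int = 2) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (contiguous word windows) of text."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def most_similar(query: str, items: list[str], k: int = 2) -> tuple[str, float]:
    """Linear scan over a non-empty list: return the best-matching item and its score."""
    q = shingles(query, k)
    scored = [(item, jaccard(q, shingles(item, k))) for item in items]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    catalogue = [
        "introduction to relational database systems",
        "large scale data analytics",
        "similarity search over images",
    ]
    print(most_similar("relational database systems and SQL", catalogue))
```

The scan compares the query against every item, so its cost grows linearly with the collection size; this is the cost that the question about searching "efficiently" refers to.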
Relevant QAA Computing Curriculum Sections: Databases, Middleware
The course gives a practical introduction to databases. The first focus is on conventional relational database systems and on SQL as a query language. Then, modern systems such as NoSQL systems are introduced. Two particularly important topics are covered: how to search for "similar items", and how to carry out large-scale data analytics.
Entry Requirements (not applicable to Visiting Students)
Students MUST have passed: Database Systems (INFR10055)
Other requirements: The course assumes good programming skills in Java.
For Informatics PG and final year MInf students only, or by special permission of the School.
Information for Visiting Students
High Demand Course?
Course Delivery Information
Academic year 2015/16, Available to all students (SV1)
Learning and Teaching activities: Lecture Hours 20, Seminar/Tutorial Hours 6, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours
Assessment
Additional Information (Assessment)
For proper evaluation, students must be presented with real problems, rather than "toy" ones that can be solved in a very limited time. The focus of this course is on large-scale problem solving and critical thinking. To that end, students will pick a mini-project, which is assessed through two linked assignments.
Exam: worth 70%
The students will deliver their work in two instalments:
* a refinement of the project description and rough prototype (worth 15%);
* a final presentation and a demonstration of their work at the end of the semester (worth 15%).
If delivered in semester 1, this course will offer an option for visiting undergraduate students who attend in semester 1 only, with assessment provided before the end of the calendar year.
Main Exam Diet S2 (April/May): 2:00 (Hours & Minutes)
On completion of this course, the student will be able to:
- Describe how database systems work and their application in Informatics
- Analyse data and describe it using common description methods such as ER diagrams
- Populate a relational database and run queries over it; load heterogeneous data into a NoSQL database and run queries over it
- Design and implement similarity search in a database
- Run basic statistical queries over large data sets
"Web Data Management" - Abiteboul, Manolescu, Rigaux, Rousset, Senellart, published by Cambridge University Press, 2011.
"Mining of Massive Datasets" - Anand Rajaraman, Jeffrey David Ullman, published by Cambridge University Press, 2011.
Please note that these books are available online, free of charge.
Course organiser: Dr Sebastian Maneth
Tel: (0131 6)51 5642
Course secretary: Miss Maree Matheson
Tel: (0131 6)50 9989
© Copyright 2015 The University of Edinburgh - 18 January 2016 4:13 am