Undergraduate Course: Scalable Data Management Systems (INFR11123)
|School||School of Informatics
|College||College of Science and Engineering
|Credit level (Normal year taken)||SCQF Level 11 (Year 4 Undergraduate)
|Availability||Available to all students
|Summary||The course focuses on the core systems-building aspects of data management. Its material is driven by technology, techniques, and architectures. A key aim of the course is to weave technology together with algorithms and systems, presenting fundamental data-centric computing challenges in light of the systems being built to address them. The course content is dynamic and continuously updated to cover the state of the art in scalable data management systems.
Background: Fundamental challenges introduced by data-centric computation; where current practices stop and where the need for new techniques arises; how new technology addresses this need and what challenges arise with the introduction of this new technology; how the hardware and software substrates of computing platforms come together in addressing these challenges.
* Technology: Existing technology is not enough to deal with contemporary data needs, both at the single-server level and at the distributed computation level. We will first deal with advances in memory technology, specifically flash, solid-state, and non-volatile memory. We will introduce massively parallel processors and discuss the programming models and performance implications of incorporating them into the systems stack.
* Techniques: We will then introduce rack-scale computing and remote direct memory access as mechanisms by which future deployments will leverage the newly available power. We will discuss techniques for ensuring high performance in storing and accessing data, algorithms for data management, techniques for ensuring data consistency, and protocols for data coherence.
* Architectures: We will focus on new architectures and programming substrates for building systems that access and manipulate large volumes of data efficiently. We will discuss NoSQL and cluster-based solutions, and introduce the notion of high-level languages for managing data on clusters.
The course will take a practical approach towards introducing scalable data management systems. We will start from the ground up, first focusing on cutting-edge technology, moving on to data structures and algorithms, and then building systems for large-scale deployments.
Entry Requirements (not applicable to Visiting Students)
|Other requirements||None
Information for Visiting Students
|High Demand Course?
Course Delivery Information
|Not being delivered
On completion of this course, the student will be able to:
- Describe and justify the differences between large-scale data management and general distributed computing and the need for customisation.
- Deploy state-of-the-art hardware concepts such as persistent memory in a stand-alone application or in the context of a data management system and justify that the deployment is appropriate.
- Build scalable data management applications using multi-core and heterogeneous architectures to provide single-server massively parallel deployments.
- Implement state-of-the-art query processing techniques (e.g. code generation) in either a stand-alone application or in the context of a managed runtime and be able to identify appropriate contexts for such techniques.
- Implement state-of-the-art distributed data management techniques in either a stand-alone application or in the context of a data processing system and describe the storage and query processing algorithms involved.
|Graduate Attributes and Skills
|Keywords||Database systems, Data management, Scalability, Hardware, Software Engineering
|Course organiser||Dr Stratis Viglas
Tel: (0131 6)50 5183
|Course secretary||Mr Gregor Hall
Tel: (0131 6)50 5194