Postgraduate Course: Querying Large Graphs (INFR11121)
|School||School of Informatics
||College||College of Science and Engineering
|Credit level (Normal year taken)||SCQF Level 11 (Postgraduate)
||Availability||Available to all students
|Summary||The course will focus on querying large graphs in general, and social networks in particular, an emerging area of research. Topics covered by the course include queries, complexity and algorithms for graph-structured data, parallel programming models for graphs beyond MapReduce, techniques for making large graphs small, and approximation techniques for querying large graphs with performance guarantees. Applications of graph queries and graph query systems will also be discussed. The course content is dynamic and continuously updated to cover the state-of-the-art in the area of querying large graphs.
* Graph queries: reachability, trajectory queries, regular path queries, kNN join, keyword search in graph databases, and graph pattern matching based on subgraph isomorphism, graph simulation and their revisions for social network analyses; the complexity and sequential algorithms for evaluating these queries.
* Parallel models and algorithms for answering graph queries: MapReduce, Bulk Synchronous Parallel model (BSP), the vertex-centric model of GraphLab, and GRAPE by combining data-partitioned parallelism, partial evaluation, message passing and incremental computation; Parallel algorithms and complexity for graph query evaluation.
* Techniques for querying large graphs, by making large graphs small: boundedly evaluable graph queries, query-preserving graph compression, query answering using graph views, and bounded incremental graph query answering.
* Approximation techniques for querying large graphs: query-driven approximation; resource-bounded data-driven approximation; top-k diversified algorithms for answering graph queries.
* Applications of graph queries: social media marketing, social community detection, knowledge fusion, and entity linking in knowledge bases.
* Graph query engines: Neo4j, Pregel, Giraph++, GraphLab, and GraphX
This course is to study current research and development issues in connection with querying large graphs in general, and social networks in particular. It will cover the following topics.
Entry Requirements (not applicable to Visiting Students)
|Prohibited Combinations|| Students MUST NOT also be taking
Querying and Storing XML (INFR11103)
||Other requirements|| The course assumes a strong computer science background (algorithm design, graph theory, proofs of intractability). An emphasis on data management is welcome (query languages).
Information for Visiting Students
Course Delivery Information
|Not being delivered|
On completion of this course, the student will be able to:
- Demonstrate an understanding of various graph queries on real-life graphs, their complexity and algorithms.
- Demonstrate an understanding of cutting-edge techniques for querying large graphs, including bounded evaluability of queries, parallel programming models for graph queries beyond MapReduce, query preserving graph compression, query answering using graph views, and bounded incremental query answering.
- Demonstrate an understanding of state-of-the-art approximation techniques for querying large graphs with performance guarantees, such as query-driven and data-driven approximation.
- Design, implement and evaluate an algorithm for querying large graphs.
- Write a project report and present the project in class.
|As querying large graphs is an emerging research area, no textbook on the subject is yet available. As such, the course will provide lecture notes and a list of research papers, both prepared by the lecturer.|
|Graduate Attributes and Skills
|Keywords||Database systems,Data management,Social networks,large graphs,parallel programming models,query
|Course organiser||Prof Wenfei Fan
Tel: (0131 6)51 3818
|Course secretary||Ms Katey Lee
Tel: (0131 6)50 2701