| 
 Postgraduate Course: Accelerated Natural Language Processing (INFR11125)
Course Outline
| School | School of Informatics | College | College of Science and Engineering |  
| Credit level (Normal year taken) | SCQF Level 11 (Postgraduate) | Availability | Available to all students |  
| SCQF Credits | 20 | ECTS Credits | 10 |  
 
| Summary | The course will synthesise ideas from linguistics and computer science to provide students with a fast-paced introduction to the field of natural language processing. The course will cover the most widely used theoretical and computational models of language, including both statistical and non-statistical approaches. The course will familiarise students with a wide range of linguistic phenomena, with the aim of appreciating both the complexity and the systematic behaviour of natural languages like English, the pervasiveness of ambiguity, and the challenges this presents for natural language processing. In addition, the course introduces the most important algorithms and data structures commonly used to solve NLP problems.
 The course will cover formal models for representing and analysing the syntax and semantics of words, sentences, and discourse. Students will learn how to analyse sentences algorithmically, using hand-crafted and automatically induced treebank grammars, and how to build interpretable semantic representations. The course will also cover a number of standard models and algorithms that are used throughout language processing. Examples include n-gram and Hidden Markov Models, the EM algorithm, and dynamic programming algorithms such as chart parsing.
 
 ***This course replaces INFR11059 Advanced Natural Language Processing***
 |  
| Course description | Part I: Words
 * Inflectional and derivational morphology
 * Finite state methods and regular expressions
 * Word classes and parts of speech
 * Sequence models (n-gram and Hidden Markov models, smoothing)
 * The Viterbi algorithm, forward-backward, EM
 
 Part II: Syntax
 * Syntactic concepts (e.g., constituency, subcategorisation, bounded and unbounded dependencies, feature representations)
 * Analysis in CFG: greedy algorithms (shift-reduce parsing)
 * Divide-and-conquer algorithms (CKY)
 * Chart parsing
 * Lexicalised grammar formalisms (e.g., TAG, CCG, dependency grammar)
 * Statistical parsing (PCFGs, dependency parsing)
 
 Part III: Semantics, Discourse, Dialogue and Applications
 * Logical semantics and compositionality
 * Semantic derivations in grammar
 * Lexical semantics (e.g., word senses, semantic roles)
 * Discourse and dialogue (e.g., anaphora, speech acts)
 * Text classification and sentiment analysis
 * Other applications (e.g., machine translation, question answering)
 
 Methodological topics, interspersed throughout:
 * Issues in annotation and evaluation
 * Machine learning approaches (e.g., Maximum Entropy models, neural networks)
 
 Relevant QAA Computing Curriculum Sections: Not yet available
 |  
Entry Requirements (not applicable to Visiting Students)
| Pre-requisites |  | Co-requisites |  |  
| Prohibited Combinations | Students MUST NOT also be taking    
Foundations of Natural Language Processing (INFR09028) OR   
Foundations of Natural Language Processing (INFR10078) 
 | Other requirements | Students with little or no previous programming experience must also register for Computer Programming for Speech and Language Processing. 
 Labs and assignments require programming in Python at a level designed for students who are learning Python simultaneously with this course.
 |  
Information for Visiting Students 
| Pre-requisites | None |  
		| High Demand Course? | Yes |  
Course Delivery Information
|  |  
| Academic year 2020/21, Available to all students (SV1) | Quota:  None |  | Course Start | Semester 1 |  Timetable | Timetable | 
| Learning and Teaching activities (Further Info) | Total Hours: 200 (Lecture Hours 30, Seminar/Tutorial Hours 5, Supervised Practical/Workshop/Studio Hours 8, Summative Assessment Hours 2, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 151) |  
| Assessment (Further Info) | Written Exam 40 %, Coursework 60 %, Practical Exam 0 % |  
 
| Additional Information (Assessment) | The coursework component of the assessment will consist of: 
 - Brief weekly assignments and other engagement criteria (e.g., short answer peer-reviewed questions, discussion summaries)
 - One or two timed tests
 - Two longer assignments in which parts of an NLP system will be implemented and the results analysed.
 |  
| Feedback | Not entered |  
| Exam Information |  
    | Exam Diet | Paper Name | Hours & Minutes |  |  
| Main Exam Diet S1 (December) |  | 2:00 |  |  
 
Learning Outcomes 
| On completion of this course, the student will be able to: 
 - Identify, construct, and analyse examples of different kinds of ambiguity in natural language (e.g., ambiguity in part-of-speech, word sense, syntactic attachment). Explain how ambiguity presents a problem for computational analysis, and some of the ways it can be addressed.
 - Describe and apply standard sequence and classification models; describe parsing and search algorithms for different levels of analysis (e.g., morphology, syntax, and semantics) and simulate each algorithm step-by-step with pen and paper.
 - For a range of NLP tasks, outline a processing pipeline for that task, including standard data sets, models, algorithms, and evaluation methods. Given a particular pipeline or part of the pipeline, identify potential strengths and weaknesses of the suggested dataset/method (including both technical and ethical issues, where appropriate), and provide examples to illustrate.
 - Implement parts of the NLP pipeline with the help of appropriate support code and/or tools. Evaluate and interpret the results of implemented methods on natural language data sets.
 - Recognize the interdisciplinary nature of the field and constructively engage in both self-study and peer-learning with other students from diverse backgrounds. |  
Reading List 
| Jurafsky and Martin, Speech and Language Processing, 2nd edition, 2008. |  
Contacts 
| Course organiser | Dr Sharon Goldwater Tel: (0131 6)51 5609
 Email: s.goldwater@ed.ac.uk
 | Course secretary | Ms Lindsay Seal Tel: (0131 6)50 2701
 Email: lindsay.seal@ed.ac.uk
 |   |  |