THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2026/2027

Timetable information in the Course Catalogue may be subject to change.


Postgraduate Course: Text Mining for Social Research (Online) (EFIE11487)

Course Outline
School	Edinburgh Futures Institute	College	College of Arts, Humanities and Social Sciences
Credit level (Normal year taken)	SCQF Level 11 (Postgraduate)
Course type	Online Distance Learning	Availability	Available to all students
SCQF Credits	20	ECTS Credits	10
Summary	There is increasing demand for researchers and policy-makers to have basic familiarity with the skills required to conduct text mining and analysis. This is a hands-on programming course that requires no previous experience or skills, taking novices from different disciplines to a point where they can carry out useful analyses of textual data. As part of the Future Governance pathway, this is a data-led course using contemporary or archived materials, such as political speeches, public records or media sources.
Course description	During this course you will learn from scratch the theory and practice of analysing text documents with code. The course is suitable for participants who have no prior experience of text analysis or programming in Python. By the end of the course, students will know the basics of the theory behind text mining and will have the skills to prepare, search, analyse and create visualisations from text documents at scale. The contexts and examples we use will be relevant to social research. The practical parts of this course are taught in the programming language Python. Initially, core Python skills are introduced and students are taught how to set up their virtual programming environment. Participants will then learn how to read in text files and carry out the initial processing required for text manipulation. The course will also cover concordances, frequency distributions, lexical dispersions, collocations, part-of-speech tagging, named entity recognition and network creation, drawing on sample datasets relevant to social and political research. These foundations allow us to introduce the more complex analysis and visualisation techniques required to extract information from large text datasets. Through worked examples, pair programming exercises and a final project, learners will produce original pieces of work using the practical skills they have acquired.
Edinburgh Futures Institute (EFI) - Online Hybrid Course Delivery Information: The Edinburgh Futures Institute will teach this course in a way that enables online and on-campus students to study together. To enable this, the course will use technologies to record and live-stream student and staff participation during teaching and learning activities. Students should note that their interactions may be recorded and live-streamed (see the Lecture Recording and Virtual Classroom policies for more details). There will, however, be options to control whether or not your video and audio are enabled. You will need access to a personal computing device for this course. Most activities will take place in a web browser, unless otherwise stated. We recommend using a device with a screen, a physical keyboard, and internet access.
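As an informal illustration of three techniques named in the course description (frequency distributions, concordances and collocations), the following is a minimal sketch using only the Python standard library. It is not taken from the course materials: the function names and the sample text are invented for illustration, and the course itself may use dedicated libraries instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_distribution(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def concordance(tokens, keyword, width=3):
    """Return each occurrence of keyword with up to `width` tokens of context on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


def bigram_counts(tokens):
    """Count adjacent token pairs, a crude starting point for finding collocations."""
    return Counter(zip(tokens, tokens[1:]))


# Invented sample text standing in for a political speech (an assumption, not course data).
sample = (
    "The people want change. The people deserve change, "
    "and this government will deliver change for the people."
)

tokens = tokenize(sample)
freq = frequency_distribution(tokens)

print(freq.most_common(3))
for line in concordance(tokens, "change"):
    print(line)
print(bigram_counts(tokens).most_common(1))
```

Raw bigram counts only approximate collocations; in practice an association measure is usually applied so that frequent but uninformative pairs do not dominate.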

Entry Requirements (not applicable to Visiting Students)
Pre-requisites		Co-requisites
Prohibited Combinations		Other requirements	None

Information for Visiting Students
Pre-requisites	None
High Demand Course?	Yes

Course Delivery Information

Academic year 2026/27, Available to all students (SV1)		Quota: 0
Course Start	Semester 2
Timetable	Timetable
Learning and Teaching activities (Further Info)	Total Hours: 200 ( Lecture Hours 5, Seminar/Tutorial Hours 2, Supervised Practical/Workshop/Studio Hours 13, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 176 )
Assessment (Further Info)	Written Exam 0 %, Coursework 100 %, Practical Exam 0 %
Additional Information (Assessment)	The course will be assessed by means of the following component: 1) Individual Coursework (Text Analysis Project) (100%): a project and report of 1200 words in total, written as 2 blog posts of at most 600 words each and including a data analysis with visualisations, a testable hypothesis and results, plus the accompanying coding notebook. The submission includes a paragraph that reflects on the peer learning during the course. Learning Outcomes Assessed by Component: 1, 2, 3, 4, 5
Feedback	Feedback on any formative assessment may be provided in various formats, for example written, oral, video, face-to-face, whole-class or individual. The Course Organiser will decide which format is most appropriate in relation to the nature of the assessment. Feedback on both formative and summative in-course assessed work will be provided in time to be of use in subsequent assessments within the course. Feedback on the summative assessment(s) will be provided in written form via Learn, the University of Edinburgh's Virtual Learning Environment (VLE). Formative Feedback Opportunity: Formative feedback is ongoing feedback which monitors learning and is intended to improve performance in the same course, in future courses, and also beyond study. Students will receive the following feedback:
- Solutions to in-course programming tasks will be provided.
- In-person coding feedback.
- Written feedback by academic staff on the final project submission.
No Exam Information

Learning Outcomes
On completion of this course, the student will be able to:
1. Demonstrate a critical understanding of the main areas of study linked to the use of technology in text and data mining.
2. Explain and use key technologies and formats used in data analysis.
3. Develop original and creative responses to data driven problems.
4. Demonstrate their ability to deliver - in verbal and written form - coherent, balanced arguments surrounding the use of data.
5. Work in a peer relationship and make an identifiable contribution to change and development and/or new thinking.

Reading List
Essential Reading:
Ignatow, Gabe, and Rada Mihalcea. Text mining: A guidebook for the social sciences. Sage Publications, 2016. https://dx.doi.org/10.4135/9781483399782.n1
- Chapter 1: Social Science and the Digital Text Revolution (essential reading)
- Chapter 2: Research Design Strategies, Levels of Analysis, p. 18 (recommended reading)
- Chapter 5: Basic Text Processing (recommended reading)
- Chapter 7: Text Analysis Methods from the Humanities and Social Sciences, Visualisations Tools, pp. 83-86 (recommended reading)
- Chapter 12: Information Extraction, Entity Extraction, p. 130 (recommended reading)
Hovy, Dirk. Text analysis in Python for social scientists: prediction and classification. Cambridge University Press, 2020. https://www-cambridge-org.ezproxy.is.ed.ac.uk/core/elements/text-analysis-in-python-for-social-scientists/BFAB0A3604C7E29F6198EA2F7941DFF3 (you need to be logged into your university account or the university VPN to access this link online)
- Chapter 1: Prerequisites
- Chapter 2: What's in a Word
- Chapter 3: Regular Expressions
How to write an engaging blog, The University of Edinburgh, https://information-services.ed.ac.uk/learning-technology/learning-and-teaching-technologies/academic-blogging-service/introduction-to-5

Further Reading:
Lacey, Nichola. Python by Example: Learning to Program in 150 Challenges. Cambridge University Press, 2019. https://doi-org.ezproxy.is.ed.ac.uk/10.1017/9781108591942
Bird, Steven, Ewan Klein, and Edward Loper. Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc., 2009.
SpaCy API documentation, https://spacy.io/api

Additional Information
Graduate Attributes and Skills	Critical Thinking: The course fundamentally develops this through teaching students to question how computational methods shape research findings, evaluate the validity of computational text analysis and assess the importance and limitations of text mining methods. Problem Solving: This is central to the course, as students learn to translate social research questions into computational approaches, work within the constraints and uncertainties of messy real-world text data and synthesise technical methods with domain knowledge. Data and digital literacy: This is core to the course, teaching how to interpret and question text mining results, understand data quality issues and use appropriate digital tools for text analysis. Reflection: This is embedded in the Q&A sessions and the assignment, through students reviewing their analytical choices, considering methodological decisions and their implications, and recognising the strengths and limitations of different approaches. Curiosity: This is encouraged through exposure to new computational methods, allowing students to explore different analytical approaches and bridge technical and social research perspectives. Communication: This is developed through explaining technical concepts to non-technical audiences and presenting findings from text analysis. Adaptivity: This is required as students navigate new technical tools, manage the learning curve of programming/technical skills and work with other students from different backgrounds and expertise. Collaboration: The course uses paired work and encourages collaboration with peers throughout. Inclusivity and Individuality are supported and encouraged through the pedagogical approach of the teaching team and in-class discussions.
Keywords	Text and Data Mining, Natural Language Processing, Social Research, Programming

Contacts
Course organiser	Mrs Clare Llewellyn Email: cllewell@exseed.ed.ac.uk	Course secretary	Mr Matt Bryant Email: Matt.Bryant@ed.ac.uk
