Fundamentals of Quantitative Text Analysis for Social Scientists


What you'll learn


What you'll learn on this self-paced online course

Learn how to analyze large amounts of textual data by applying your R programming skills to an efficient, powerful and easy-to-use method: quantitative text analysis.

This course is perfect for social scientists who want to understand the theory and assumptions that underpin quantitative text analysis, whilst developing their R programming skills via practical examples of analysis with real texts.

Not familiar with R? Take our Introduction to R course first.

By the end of this course you will be able to:

  • Understand the theoretical basis for quantitative text analysis

  • Survey methods for systematically extracting quantitative information from text for social scientific purposes

  • Identify texts and units of texts for analysis

  • Convert texts into matrices for quantitative analysis

  • Analyze these matrices in order to generate inferences using quantitative or statistical methods
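To make the "texts into matrices" step concrete, here is a minimal sketch using the quanteda package that the course relies on. The two sentences are invented for illustration, and the exact steps taught in the course may differ.

```r
library(quanteda)  # assumes quanteda is already installed

# Two made-up sentences standing in for real documents
txts <- c(doc1 = "Taxes fund public schools.",
          doc2 = "Public schools need more funding.")

# Split each text into word tokens, dropping punctuation
toks <- tokens(txts, remove_punct = TRUE)

# Count tokens: one row per document, one column per (lowercased) word
dfmat <- dfm(toks)

print(dfmat)        # the sparse document-feature matrix
topfeatures(dfmat)  # the most frequent words across both documents
```

Once the texts are in this matrix form, standard quantitative methods can be applied to the word counts.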

10 hours to learn
3 months' access




Course modules


There are four modules in this course.

Introduction to Text Analysis and Conceptual Foundations

This module introduces students to the types of questions that text analysis can answer and the tools that the course will use to answer them. It also offers examples of analyses that treat text as data.

The Basics of Working with Textual Data

This module discusses how to obtain textual data and how to get it into a format suitable for analysis. It finishes with an example of using complexity and readability measures.

Examining Individual Word Occurrences

This module teaches students how to summarize texts in a corpus by looking at the occurrence of individual words using tools such as keywords in context and dictionaries.
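As a rough illustration of these two tools, the sketch below uses quanteda's `kwic()` and `dfm_lookup()` functions. The two sentences and the dictionary categories (`economy`, `services`) are hypothetical and are not taken from the course materials.

```r
library(quanteda)  # assumes quanteda is already installed

# Two invented sentences standing in for political speeches
txts <- c(speech1 = "We will cut taxes and balance the budget.",
          speech2 = "Higher taxes pay for schools and hospitals.")
toks <- tokens(txts, remove_punct = TRUE)

# Keywords in context: every match of "taxes" with two words on either side
kw <- kwic(toks, pattern = "taxes", window = 2)
print(kw)

# A toy dictionary mapping two hypothetical categories to glob patterns
dict <- dictionary(list(economy  = c("tax*", "budget"),
                        services = c("school*", "hospital*")))

# Count how many tokens in each document fall under each category
cats <- dfm_lookup(dfm(toks), dictionary = dict)
print(cats)
```

Keywords in context show how a word is actually used, while a dictionary reduces each document to counts for a handful of categories chosen by the researcher.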

Comparing across Texts

This final module introduces students to models that allow for comparison across texts within a corpus by examining their word usage, including building dictionaries and creating scales.










While our R scripting will be at a fairly basic level, you should have some familiarity with R in order to succeed in this course, as it will be challenging to learn R and text analysis at the same time. A basic knowledge of statistics would also be helpful.



Frequently asked questions

How is the course organized?

The course is organized into four interactive learning modules, which you should work through in order. Each module contains a number of topic pages. Each topic page includes a video that walks you through the concept, interactive text that reinforces what the video covered, quick questions and knowledge checks.

What other types of activities does the course include?

There are three additional types of activity throughout the course to facilitate deeper learning:

  1. Match: You attempt a task offline, then select the correct solution.
  2. Guided: A multi-part match activity. You complete one part of the task and submit your solution, which unlocks feedback on your attempt and the next part of the task.
  3. Structured: A more extended offline task, which you should attempt before seeing the Tutor’s solution.

The vast majority of topics in the course are fundamentally practical. You are strongly encouraged to recreate and run the code as you work through them, and complete knowledge checks and activities.

How long will I have access to the course for?

You will have 3 months' access to this course.

What software do I need for this course?

You will need to have R installed to work through this course and it is essential that your version of R is up to date. It should be version 3.5.2 or above.
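One way to check this from the R console is sketched below, using only base R functions. The variable name `r_ok` is just for illustration.

```r
# Compare the running R version against the 3.5.2 minimum stated above
r_ok <- getRversion() >= "3.5.2"

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the 3.5.2 requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than 3.5.2; please update R.")
}
```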

You will need to install the quanteda and quanteda.corpora packages. The quanteda package can be installed from CRAN and should be version 1.4 or above.
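A minimal sketch of installing quanteda from CRAN and confirming the installed version, assuming an internet connection and a default CRAN mirror:

```r
install.packages("quanteda")   # run once; downloads the package from CRAN

# TRUE if the installed version meets the 1.4 minimum
packageVersion("quanteda") >= "1.4"
```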


The quanteda.corpora package is not available on CRAN. Instead, install the devtools package from CRAN and then use it to install quanteda.corpora from GitHub. You can use the code below to do this.
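A minimal sketch of these two steps, assuming the package is hosted in the `quanteda/quanteda.corpora` repository on GitHub:

```r
install.packages("devtools")   # run once; devtools itself comes from CRAN

# Install quanteda.corpora from its GitHub repository
devtools::install_github("quanteda/quanteda.corpora")
```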



You should also install the readtext package from CRAN as we will be using that to read text files into R.
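As a rough sketch of what reading text files with readtext looks like, the example below writes two small files to a temporary folder and reads them back. The folder name and file contents are invented for illustration.

```r
# install.packages("readtext")   # run once to install from CRAN
library(readtext)

# Write two small files to a temporary folder so the example is self-contained
folder <- file.path(tempdir(), "example_texts")
dir.create(folder, showWarnings = FALSE)
writeLines("Taxes fund public schools.", file.path(folder, "doc1.txt"))
writeLines("Public schools need more funding.", file.path(folder, "doc2.txt"))

# Read every .txt file in the folder; the result is a data frame
# with one row per file and the columns doc_id and text
docs <- readtext(file.path(folder, "*.txt"))
print(docs$doc_id)
```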


Do I need to buy any of this software?

No. All of the software is either open source or has a free community version.

What do I need to participate on this course?

A desktop or laptop computer with the suggested software and a modern browser, e.g. Internet Explorer 10+ or the latest version of Chrome or Firefox.

Can I do this course on my mobile device?

You can access the course on a mobile device to go through the content and answer questions, but you will need a desktop or laptop computer to practice and complete the activities that require you to write or test code.
