Why choose Fundamentals of Quantitative Text Analysis for Social Scientists?

By Lily Mehrbod, SAGE Campus Development Editor

Have you ever needed to analyse hundreds of documents, spent days going through only a fraction of them, and then thrown your pencil down in despair, screaming “there must be a better way!”? If so, quantitative text analysis may be for you. It allows you to treat large amounts of text as data and trawl through vast quantities of literature with computers, at a scale that would be unimaginable for humans working alone.

Our Fundamentals of Quantitative Text Analysis course will give you the tools and theoretical basis you need to start processing large amounts of text for your own research project. 

Professor Jon Slapin, a political scientist at the University of Essex, will show you how to easily extract meaningful data and insights from your documents. The course uses the quanteda R package, developed by Ken Benoit at the London School of Economics, and each page of the course features a video that takes you step by step through all the code being used, guiding you through every stage of the process.

The course is tailored particularly to the work of social scientists and incorporates social science examples throughout. By the end, you’ll be able to answer questions such as:

  • How do I find out how often a word appears in a text, or a set of texts?

  • Was President Trump’s inaugural speech easier to understand for the average American than President Obama’s?

  • How can I identify nebulous concepts such as positivity or negativity from a vast range of texts?

  • Are some words more left-wing or right-wing than others?

  • How can I ensure context is not lost when running code on my texts?
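To give a flavour of the first question above: counting how often a word appears across a set of texts takes only a few lines. The course itself does this with quanteda in R; the snippet below is just a plain-Python sketch of the idea, with a deliberately crude tokeniser.

```python
from collections import Counter
import re

def word_frequencies(texts):
    """Count how often each word appears across a set of texts."""
    counts = Counter()
    for text in texts:
        # Lower-case and extract letter runs (a crude tokeniser for illustration)
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

freqs = word_frequencies(["The cat sat on the mat.", "The dog barked."])
print(freqs["the"])  # "the" appears three times across the two texts
```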

The course is designed to fit around your other commitments and is divided into four modules:

In Module 1 you’ll cover:

  • An introduction explaining the course’s goals and objectives

  • Conceptual foundations of text analysis

  • Quantitative text analysis as a field, and how that field developed

  • Logistics and software: required setup and work files

  • A basic example of performing a text analysis

In Module 2 you’ll cover:

  • Where to obtain textual data

  • Formatting and working with text files

  • Practical considerations of indexing and metadata

  • Strategies for selecting units of analysis

  • An overview of complexity and readability measures
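Readability measures of the kind covered in Module 2 are what let you compare, say, Trump’s and Obama’s inaugural speeches. One classic measure is Flesch Reading Ease, computed as 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words), where higher scores mean easier text. The sketch below is not the course’s implementation: it is a plain-Python illustration that approximates syllables by counting vowel groups, a rough heuristic.

```python
import re

def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier-to-read text.
    Syllables are approximated by counting vowel groups (a rough heuristic)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    n = len(words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

# Short, common words score high; long, polysyllabic words drag the score down
print(flesch_reading_ease("The cat sat on the mat."))
print(flesch_reading_ease("Comprehensibility necessitates consideration."))
```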

In Module 3 you'll cover:

  • Keywords in context: coverage and examples of KWIC

  • Consideration of concordance and dictionaries

  • Detecting and identifying collocations

  • Stemming: An in-depth discussion of text types, tokens, and equivalencies

  • Stop words and feature weighting
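Keywords in context (KWIC), the first topic in Module 3, is also one answer to the question of how to keep context when running code on texts: it lists every occurrence of a keyword together with a window of surrounding words so the analyst can check the word’s meaning. The course does this with quanteda in R; here is a minimal plain-Python sketch of the idea, using whitespace tokenisation for simplicity.

```python
def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = text.lower().split()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

for left, kw, right in kwic("the quick brown fox jumps over the lazy dog", "the"):
    print(f"{left} [{kw}] {right}")
```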

In Module 4 you’ll cover:

  • Euclidean distance and its use in comparing texts

  • Cosine similarity and its use in comparing texts

  • General principles and rationale for dictionaries

  • External dictionaries: How to add a third-party dictionary

  • How to create your own dictionary

  • Overview of wordscores

  • Implementing a basic model in R
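To illustrate the document-comparison idea behind Module 4: cosine similarity measures the angle between two word-count vectors, so documents with similar word distributions score near 1 regardless of their lengths, while documents sharing no words score 0. The course implements this in R with quanteda; the sketch below is a plain-Python illustration with naive whitespace tokenisation.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two word-count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    # Dot product over the shared vocabulary
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm

print(cosine_similarity("red red blue", "red red blue"))  # identical texts: similarity of 1
print(cosine_similarity("cat dog", "fish bird"))          # no shared words: similarity of 0
```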