Learn how to analyze large amounts of textual data by applying your R programming skills to an efficient, powerful and easy-to-use method: quantitative text analysis.
This course is perfect for social scientists who want to understand the theory and assumptions that underpin quantitative text analysis, whilst developing their R programming skills via practical examples of analysis with real texts.
Not familiar with R? Take our Introduction to R course first.
By the end of this course you will be able to:
Understand the theoretical basis for Quantitative Text Analysis
Survey methods for systematically extracting quantitative information from text for social scientific purposes
Identify texts and units of texts for analysis
Convert texts into matrices for quantitative analysis
Analyze these matrices in order to generate inferences using quantitative or statistical methods
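As a rough illustration of what converting texts into a matrix means, the sketch below builds a small document-feature matrix using only base R, so it runs without any packages. The three texts and the names `texts`, `tokens`, `vocab` and `doc_feature` are invented for this example; the course itself works with the quanteda package rather than this hand-rolled approach.

```r
# Three toy "documents" (invented for illustration)
texts <- c(d1 = "Taxes fund schools.",
           d2 = "Schools need teachers, and teachers need pay.",
           d3 = "Cut taxes; cut spending.")

# Tokenize: lower-case, then split on anything that is not a letter
tokens <- strsplit(tolower(texts), "[^a-z]+")
tokens <- lapply(tokens, function(x) x[nzchar(x)])

# Vocabulary: every distinct word across all documents
vocab <- sort(unique(unlist(tokens)))

# Count how often each vocabulary word occurs in each document,
# giving one row per document and one column per word
doc_feature <- t(vapply(
  tokens,
  function(x) tabulate(match(x, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(doc_feature) <- vocab

print(doc_feature)
```

Each row is a document and each column a word, so standard tools such as `rowSums()` or `colSums()` can then be applied to the counts.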
This module introduces students to the types of questions that text analysis can answer, the tools that the course will use to answer them, and offers examples of analyses using text-as-data.
This module discusses how to obtain textual data and how to get it into a format that is suitable to analyze. It finishes with an example of using complexity and readability measures.
This module teaches students how to summarize texts in a corpus by looking at the occurrence of individual words using tools such as keywords in context and dictionaries.
This final module introduces students to models that allow for comparison across texts within a corpus by examining their word usage, including building dictionaries and creating scales.
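To give a flavor of the keywords-in-context and dictionary tools mentioned in the module descriptions, here is a minimal sketch using quanteda. It assumes a recent quanteda release; argument names may differ in older versions, including the one listed in the course requirements. The two example sentences and the dictionary categories are invented for illustration and are not course material.

```r
library(quanteda)

# Two invented sentences standing in for real documents
txt <- c(speech1 = "We will cut taxes and protect jobs.",
         speech2 = "Higher taxes pay for schools and hospitals.")

# Split each text into word tokens, dropping punctuation
toks <- tokens(txt, remove_punct = TRUE)

# Keywords in context: each occurrence of "taxes" with two words either side
print(kwic(toks, pattern = "taxes", window = 2))

# A small hand-made dictionary mapping categories to words (hypothetical)
dict <- dictionary(list(economy  = c("taxes", "jobs"),
                        services = c("schools", "hospitals")))

# Count how many tokens in each document fall into each category
dict_counts <- dfm_lookup(dfm(toks), dictionary = dict)
print(dict_counts)
```

The dictionary step reduces the full word-count matrix to one column per category, which makes documents easier to compare.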
I enjoyed the course videos and resources, which were simple and knowledgeable, in spite of my very basic experience on R and text mining.
While our R scripting will be at a fairly basic level, you should have some familiarity with R in order to succeed in this course as it will be challenging to learn R and text analysis at the same time. A basic knowledge of statistics would also be helpful.
The course is organized into a set of four interactive learning modules, and you should work through the modules sequentially. The modules contain a number of topic pages, each including a video to walk you through the concept and interactive text to reinforce what was covered in the video, quick questions and knowledge checks.
There are three additional types of activity throughout the course to facilitate deeper learning:
The vast majority of topics in the course are fundamentally practical. You are strongly encouraged to recreate and run the code as you work through them, and complete knowledge checks and activities.
You will have 3 months' access to this course.
We recommend completing the course in the first 4 weeks, during which you will have access to learning support provided by Nicole, your subject matter expert (SME). She will be on hand to answer any questions, or help you if you get stuck.
After the learning support period, you’ll still have access to the course materials, but you won’t receive support from the SME. If there is a course forum, you will no longer be able to ask questions there. SAGE Campus will help you with any IT or platform issues you might have throughout the course.
You will need to have R installed to work through this course and it is essential that your version is 3.4.1 or above.
You will need to install the quanteda and quantedaData packages. quanteda can be installed from CRAN and should be version 0.9.9-65 or above. The quantedaData package is not available on CRAN: install the devtools package from CRAN, then use it to install quantedaData from GitHub. You can use the code below to do this.
install.packages("devtools")
devtools::install_github("kbenoit/quantedaData")
You should also install the readtext package from CRAN as we will be using that to read text files into R.
install.packages("readtext")
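After installing, a quick check along the following lines can confirm that your setup meets the requirements above. This is only a sketch: the variable names `required` and `installed` are chosen for this example, and it reports which packages can be found rather than installing anything.

```r
# Is the running version of R at least 3.4.1?
r_ok <- getRversion() >= "3.4.1"
print(r_ok)

# Which of the packages named above are installed?
required <- c("quanteda", "quantedaData", "readtext")
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# If quanteda is present, show its version for comparison with the minimum
if (installed[["quanteda"]]) {
  print(packageVersion("quanteda"))
}
```

Any package reported as `FALSE` still needs to be installed with the commands shown above.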
No, they are either open source or have free community versions.
A computer or laptop with the suggested software and a modern browser e.g. Internet Explorer 10+ or the latest versions of Chrome and Firefox.
While you can access the course on your mobile device, go through the content and answer questions, you will need a desktop or laptop computer to practice and complete the activities that require you to write and/or test code.
Jonathan Slapin is Professor in the Department of Government at the University of Essex and Director of the Essex Summer School in Social Science Data Analysis.
He joined Essex in 2015, having previously held faculty positions at the University of Houston, Trinity College, Dublin and the University of Nevada, Las Vegas. He holds a PhD from the University of California, Los Angeles and a BA from Rutgers University.
His main research and teaching interests are in quantitative comparative politics, political institutions, and quantitative text analysis. His research frequently employs formal theory and quantitative methods to explore legislative behaviour, political parties, and democratic representation.
His most recent book, co-authored with Sven-Oliver Proksch, is entitled “The Politics of Parliamentary Debate: Parties, Rebels and Representation” and is published by Cambridge University Press.
Other research has appeared in leading political science journals such as the American Journal of Political Science and the British Journal of Political Science.