This course is one of a suite of courses that make up our Fundamentals of Data Science for Social Scientists package for institutional purchase.
The bytesize courses in our Fundamentals of Data Science for Social Scientists package are a series of short courses that teach core data science skills to people who are eager to learn but short on time.
In Bytesize: Cleaning Data and Preprocessing you will learn how to prepare data so that it is in a format that functions in R or Python can work with.
By the end of this bytesize course, you will be able to:
Define what cleaning and preprocessing data are
Explain why these steps are necessary
Identify common cleaning tasks and possible solutions
Use regular expressions to standardize text
Join multiple data sources together
Incorporate best practices into your data science workflow
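As a rough illustration of the kind of task these outcomes describe, here is a minimal Python sketch, using only the standard library, that standardizes text with a regular expression and then joins two small data sources on the cleaned value. The data, the field names and the `standardize_name` helper are hypothetical and are not taken from the course materials.

```python
import re


def standardize_name(raw):
    """Trim, collapse runs of whitespace, and lower-case a free-text value."""
    collapsed = re.sub(r"\s+", " ", raw.strip())
    return collapsed.lower()


# Hypothetical survey responses with inconsistently typed country names.
responses = [
    {"respondent": 1, "country": "  United   Kingdom"},
    {"respondent": 2, "country": "united kingdom "},
    {"respondent": 3, "country": "FRANCE"},
]

# Hypothetical second data source, keyed by the cleaned country name.
regions = {
    "united kingdom": "Northern Europe",
    "france": "Western Europe",
}

# Clean the key first, then join each response to its region.
# This behaves like a left join: an unmatched response keeps region None.
joined = []
for row in responses:
    key = standardize_name(row["country"])
    joined.append(
        {
            "respondent": row["respondent"],
            "country": key,
            "region": regions.get(key),
        }
    )

for row in joined:
    print(row)
```

Cleaning the key before joining is the design choice that matters here: without it, the first two responses would fail to match the lookup table even though they refer to the same country.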
To successfully complete this course, you will need some knowledge of Python and/or R. We assume that participants can assign variables, direct the flow of control using conditionals, define their own functions and read files. These topics are all covered in the Introduction to Data Science with Python and Introduction to Data Science with R courses, which are also part of the Fundamentals of Data Science for Social Scientists package that the bytesize courses come in. If you complete one of those courses first, you will have the knowledge you need for this course.
This course comes in our Fundamentals of Data Science for Social Scientists package for institutional purchase, along with nine other courses. You can find out more about the full package, and enquire about purchasing it, here.
You can choose to purchase a 6- or 12-month subscription to the Fundamentals of Data Science for Social Scientists package that this course is part of. Learners will have full access to this and all of the other courses for the duration of the subscription.
Your institution’s subscription will be tailored to the number of learners you have. We can accommodate anything from 10 to 100 learners.
Yes! If you would like to see the courses before purchasing your institution’s subscription, you can try free demos of the courses. Simply ask for this when you make your enquiry.
Only if you want to! All that’s needed to complete the courses is a web browser and an internet connection. We do all our programming using JupyterHub, which means that you can code in your browser window.
Jupyter notebooks offer a seamless integration of code with explanatory markdown text. They allow you to read the narrative of the programming task and write code of your own to fit into the larger narrative. If you’d like to find out more about Jupyter notebooks, see our blog post 'What is a Jupyter notebook'.
However, if you wish to install R or Python, please do!
All of our courses offer a certificate of completion signed by your instructor. You will be able to download this certificate from the Learning Platform when you complete the course.