Cleaning Messy Data — Sage Campus

What you'll learn


Cleaning Messy Data

This course introduces the fundamentals of cleaning messy data. It explains what messy data sets are and why they need to be cleaned, and provides many practical examples of cleaning data sets.

This course will help learners to:

  • Recognize when data are messy and require cleaning

  • Apply cleaning methods to messy datasets

  • Understand how cleaning messy data contributes to good data management

  • Perform quality control of data

Language: English

Time to complete: 3 hours

Level: Beginner

Instructor: Dr. Alessandra Vigilante

How to access: Sage Campus is a digital library product. If you are a librarian, find out how to get Sage Campus for your university. If you are faculty, a researcher, or a student, recommend Sage Campus to your library.

Course modules


There are 3 modules in this course:

1. Help! My Data Are Messy

Even the most organized person can make mistakes when recording and saving data. At first, datasets can look clean and reproducible, but as soon as we try to add more data or use them for analysis or visualization, issues begin to arise, and we find ourselves needing to clean the data! In this module, you will learn what messy data are, and why it’s so important to recognize and clean them as soon as possible (and avoid them in the future!).

2. Why Clean Messy Data?

Messy data will waste your time, confuse your collaborators, and certainly harm your analysis and your research output. In this module, we’ll explain why it’s so important to have clean data you can trust, both to obtain reliable results and to create sustainable and interoperable datasets.

3. How Can I Clean My Messy Data?

Most of the time, quantitative data are recorded and saved in text files using a spreadsheet program. Excel isn’t the only spreadsheet program, but it’s arguably the most widely used. Free alternatives include LibreOffice Calc and, for Apple users, Apple Numbers. This module will provide background information on different spreadsheet programs and share key skills that can be used to manually clean messy data.

Try the course


Try it out

Students, researchers and faculty can try all Sage Campus courses today by signing up for a 7-day free trial below. 30-day institutional trials are set up via your institution’s library, so recommend us to your library to request a campus-wide trial.

Who it's for



This course is aimed at all learners, from undergraduates to early-career researchers, who work with large data sets that need to be cleaned and reformatted before processing.

Other courses


Browse our other data science skills courses
