Fundamentals of data science for social scientists

Suite of online courses for institutional purchase

Overview of course package




Full data science e-Learning solution for your institution

Upskilling a group of students, researchers, faculty or staff in data science can be daunting and time-consuming. Finding quality courses with relevant, applicable content that suit groups with varying skill sets and schedules is no easy feat.

Fundamentals of Data Science for Social Scientists is a full suite of online courses designed to meet your institution’s teaching needs and get your team or group practicing the latest computational methods. The learning path runs from the big picture of data science, through getting started with programming in R and Python, to more advanced ‘bytesize’ data management and analysis topics. There’s something for everyone.

SAGE Campus created the courses in partnership with the Social Science Data Lab (D-Lab) at the University of California, Berkeley, so you get courses taught by leading expert instructors, with the trusted SAGE quality and an unparalleled online learning experience.

English
The full package consists of 10 courses and takes 100+ hours to complete
6 months’ access

Partner institution
The D-Lab at the University of California, Berkeley

Course instructors
Dr. Claudia von Vacano
Dr. Christopher Hench
Geoff Bacon

Price
Pricing depends on the number of learners and your institution’s requirements. Contact us for a quote.


Enrollment is easy

Provide us with a list of learners and we’ll sort the rest! We provide learners with their own log-in details.

Courses available 24/7

Learners with busy or differing schedules and varying abilities can learn at their own pace.

Clear learning pathway

Learners just starting out can take their time and those who are more advanced can skip to intermediate courses.


Choice of programming language and topics
The choice of programming language and bytesize topics allows learners to focus on what applies to their research.

Prerequisites



Time to complete: The introductory section consists of one course on the theory of data science and its applicability to social science, and takes 5-9 hours to complete.

Prerequisites: Suitable for all. A basic foundation in statistics would be helpful but is not essential.

Time to complete: The beginner section consists of two courses, and learners choose the programming language they want to learn: Python (which takes 33 hours) or R (which takes 35 hours).

Prerequisites: No prior programming experience is required but completing the introductory course is helpful.

Time to complete: The intermediate section consists of 7 standalone “bytesize” courses, so learners can choose the topics most relevant to them. Together, the courses take 30 hours.

Prerequisites: Knowledge of R or Python is required. Learners can gain this knowledge in the beginner section.

All courses



1. Introduction to Data Science for Social Scientists (introductory)

This introductory course gives an understanding of data science methods and tools, all from a social science perspective. By the end of the course learners will:

  • Understand how data science is changing social science.
  • Have knowledge of data science tools used for social science research.
  • Understand the value of open-source programming languages, specifically R and Python.
  • Know about Jupyter notebooks, a browser-based tool for creating interactive documents with live code.

See the full syllabus at the course page here.

2. Introduction to Data Science with R (beginner)

This course will teach learners to program in the R programming language, master the fundamentals of R and learn practical skills that are directly applicable to social science research.

It covers Jupyter notebooks, variables, data types, data structures, plotting, statistical testing and more.

See the full syllabus at the course page here.

3. Introduction to Data Science with Python (beginner)

This course will teach learners to program in the Python programming language, master the fundamentals of Python and learn practical skills that are directly applicable to social science research.

It covers Jupyter notebooks, variable assignment, functions and variables, programming style and more.

See the full syllabus at the course page here.

4. Bytesize: Collecting Data from the Web (intermediate)

Teaches learners how to extract data from web resources appropriate to their research questions. Special attention will be given to how to obtain permission from hosts, and proper etiquette when using APIs and scraping. By the end of this bytesize course, learners will be able to:

  • Explain in simple terms how the internet works.
  • Define and use an API to collect data from the web.
  • Explain the difference between using an API and web scraping.
  • Recognize potential legal issues surrounding web scraping.
  • Use a programming language to collect web data.

See the full syllabus at the course page here.

5. Bytesize: Cleaning Data and Preprocessing (intermediate)

Teaches learners how to prepare data so that it is in a format that functions in R or Python can work with. By the end of this bytesize course, learners will be able to:

  • Define what cleaning and preprocessing data are.
  • Explain why these steps are necessary.
  • Identify common cleaning tasks and possible solutions.
  • Use regular expressions to standardize text.
  • Join multiple data sources together.
  • Incorporate best practices into your data science workflow.

See the full syllabus at the course page here.

6. Bytesize: Data Formats (intermediate)

Teaches learners what formats data comes in, and how they should structure their own data if they collect it themselves. By the end of this bytesize course, learners will be able to:

  • Name the most popular data formats used today.
  • Explain the difference between data formats.
  • Summarize the reasons for storing data in these different formats.
  • Decide which format a dataset should be kept in.
  • Read and write these formats with Python or R.

See the full syllabus at the course page here.

7. Bytesize: Network Analysis (intermediate)

Teaches learners how to model explicit relationships, how to examine the statistical properties of relationships in co-mention networks, and how to contextualize the statistical properties of a network. By the end of this bytesize course, learners will be able to:

  • Understand what constitutes a social network.
  • Identify and describe different levels of analysis.
  • Use the open-source software tool Gephi to do your own network analyses.
  • Correctly interpret network properties.

See the full syllabus at the course page here.

8. Bytesize: Data Visualization (intermediate)

Teaches learners effective presentation methods for various data types and variables, and how to create their own visualizations in Jupyter notebooks in Python or R. By the end of this bytesize course, learners will be able to:

  • Explain two popular visualization theories: Marks and Visual Variables, and Gestalt.
  • Enumerate a hierarchy of visual perception.
  • Apply these theories to assess the effectiveness of visualizations.
  • Describe how to convey an entire story using an infographic.
  • Use visualizations for exploratory data analysis.
  • Write code to produce geographic visualizations.

See the full syllabus at the course page here.

9. Bytesize: Machine Learning (intermediate)

Teaches learners the basics of machine learning and core organizational concepts of classification and regression, data preprocessing and fitting a model to a training dataset. By the end of this bytesize course, learners will be able to:

  • Identify machine learning applications.
  • Explain the importance of preprocessing your data for machine learning.
  • Explain the rationale for splitting data into training, cross-validation and test sets.
  • Understand and apply basic ideas about algorithm construction and configuration settings.
  • Situate ensemble methods in the broader machine learning environment.

See the full syllabus at the course page here.

10. Bytesize: Text Analysis (intermediate)

Teaches learners the building blocks that serve as the foundation for computational text analysis. By the end of this bytesize course, learners will be able to:

  • List and justify or criticize common preprocessing steps.
  • Explain the "bag of words" (BoW) model.
  • Define the TF-IDF value.
  • Define "n-gram" and explain how it improves our language model.
  • Create features suitable for a classification model.
  • Correctly interpret a topic model.

See the full syllabus at the course page here.

FAQs



How long will my institution have access to the courses?

Your institution can get a 6-month subscription to Fundamentals of Data Science for Social Scientists. Learners will have full access to all of the courses for the duration of the subscription.

How many learners can access the courses?

Your institution’s subscription will be tailored to the number of learners you have. We can accommodate anything from 10 to 100 learners.

How do I pay?

We will invoice you for your subscription.

How often will you communicate with learners from my institution?

We can arrange to send regular reminders to your learners about the courses and are able to tailor these emails to suit your institution’s needs. When you set up your subscription with us, we can discuss your communication needs and our recommendations for keeping learners engaged.

How do I know how my group is doing?

We can provide overall reports of how your group is doing at interim periods throughout your subscription. Alternatively, we can provide access to the learner platform where you can see learner reports and can pull the data you need.

Can I try out the courses first?

Yes! If you would like to see the courses before purchasing your institution’s subscription you can trial free demos of the courses. Simply ask for this when you make your enquiry.

Does my institution or our learners need to install any software?

Only if you want to! All that’s needed to complete the courses is a web browser and an internet connection. We do all our programming using JupyterHub, which means that you can code in your browser window.

Jupyter notebooks offer a seamless integration of code with explanatory markdown text. They allow you to read the narrative of the programming task and write code of your own to fit into the larger narrative. If you’d like to find out more about Jupyter notebooks, see our blog post 'What is a Jupyter notebook'.

However, if you wish to install R or Python, please do!


About SAGE Campus

SAGE is passionate about social science and is dedicated to finding new ways to support social science researchers. We understand that the rise of big data and new technology is set to revolutionize social science. Researchers of all kinds need an array of additional skills and experience to take advantage of this opportunity while maintaining the integrity of the research process.

SAGE Campus courses are designed to support the development of these new skills. We’ve delivered courses to nearly 600 academics and practitioners in the field of social science, so we’re well equipped to help your colleagues or trainees achieve important milestones within your institution.

About D-LAB

This course was developed by the Social Science Data Lab (D-Lab) at the University of California, Berkeley. The Jupyter notebooks and videos were developed and produced by the D-Lab. The JupyterHub was configured with support from the Jupyter Project, SAGE, and the D-Lab.

The D-Lab gives researchers access to the cutting edge of the data revolution. Operating as a hub for broad-ranging and multidisciplinary data-intensive social sciences, it advances research excellence by helping the community integrate the latest software, technology, and methods into their research practices.