This course is one of a suite of courses that make up our Fundamentals of Data Science for Social Scientists package for institutional purchase.
The bytesize courses in our Fundamentals of Data Science for Social Scientists package are a series of short courses that teach core data science skills to people who are eager to learn but short on time.
In Bytesize: Machine Learning, you will learn the basics of machine learning and the core organizational concepts of classification and regression, data preprocessing, and fitting a model to a training dataset.
By the end of this bytesize course, you will be able to:
Identify machine learning applications.
Explain the importance of preprocessing your data for machine learning.
Explain the rationale for splitting data into training, cross-validation and test sets.
Understand and apply basic ideas about algorithm construction and configuration settings.
Situate ensemble methods in the broader machine learning environment.
Instructors
Dr. Claudia von Vacano
Dr. Christopher Hench
Geoff Bacon
Price
This course is one of a suite of ten courses that make up our Fundamentals of Data Science for Social Scientists package for institutional purchase.
Find out more here.
I wish I had taken this course years ago when I started learning R. I'm much more confident now and I'm not even halfway through the course.
Thank You!
To successfully complete this course, you will need some knowledge of Python and/or R. We assume that participants can assign variables, direct the flow of control using conditionals, define their own functions, and read files. These topics are all covered in the Introduction to Data Science with Python and Introduction to Data Science with R courses, which are also part of the Fundamentals of Data Science for Social Scientists package that these bytesize courses come in. If you complete one of those courses first, you will have the knowledge you need for this course.
This course comes in our Fundamentals of Data Science for Social Scientists package for institutional purchase, along with nine other courses. You can find out more and enquire about purchasing the full package here.
You can choose to purchase a 6- or 12-month subscription to the Fundamentals of Data Science for Social Scientists package that this course is part of. Learners will have full access to this and all of the other courses for the duration of the subscription.
Your institution’s subscription will be tailored to the number of learners you have. We can accommodate anywhere from 10 to 100 learners.
Yes! If you would like to see the courses before purchasing your institution’s subscription, you can try free demos of the courses. Simply ask for this when you make your enquiry.
Only if you want to! All that’s needed to complete the courses is a web browser and an internet connection. We do all our programming using JupyterHub, which means that you can code in your browser window.
Jupyter notebooks integrate code with explanatory markdown text. They allow you to read the narrative of the programming task and write code of your own to fit into the larger narrative. If you’d like to find out more about Jupyter notebooks, see our blog post 'What is a Jupyter notebook'.
However, if you wish to install R or Python, please do!
All of our courses offer a certificate of completion signed by your instructor. You will be able to download this certificate from the Learning Platform when you complete the course.
Dr. Claudia von Vacano is the Executive Director of the D-Lab and the Digital Humanities at the University of California, Berkeley, and is on the board of the Social Science Matrix. She received a Master’s degree from Stanford University in Learning, Design, and Technology. Her doctorate is in Policy, Organizations, Measurement, and Evaluation from UC Berkeley. The Phi Beta Kappa Society, the Andrew W. Mellon Foundation, the Rockefeller Brothers Foundation, and the Thomas J. Watson Foundation, among others, have recognized her scholarly work and service contributions.
Christopher is the Program Development Lead for the D-Lab and Digital Humanities at the University of California, Berkeley. He teaches Python, Bash, and Git workshops and consults on text analysis and web scraping. He is a PhD Candidate in German Literature and Medieval Studies at UC Berkeley and a Data Science Fellow at the Berkeley Institute for Data Science (BIDS). Christopher is interested in computational approaches to formal analyses of lyric and epic poetry. His research has been supported by the Fulbright Program and the DAAD.
Geoff Bacon is a PhD student in the Language and Cognition Lab and a Graduate Student Researcher at the D-Lab at UC Berkeley. His research focuses on two questions: why languages look the way they do and how people learn to express temporal semantics. To study these, Geoff uses probabilistic models of language, including Bayesian and neural network models, which he programs in Python. As an undergraduate, Geoff studied linguistics, classics, and Arabic.