Learning a programming language like Python can seem daunting for social scientists and other researchers who don’t have a technical background. Dmitrijs Martinovs, our amazing assistant at SAGE Campus, tells us about his personal experience of switching from SPSS to Python to better conduct analyses when he was undertaking his research masters in social policy.
How I first heard about Python
I first heard about Python one dull winter evening, when a friend of mine came over for a good old chinwag before he set off to travel the world. This guy’s interests should not be understated; he is a self-taught software developer, who was working in a start-up at the time, and subsequently sold his shares when the start-up was acquired. I was studying statistics at the time, as part of my research masters in social policy, and he was the only friend who could relate to all the interesting things I was learning.
The critical moment came when I mentioned that I used SPSS statistical software to carry out all my analytical calculations. Looking at our conversation through his ‘software developer lens’, my friend asked why I didn’t learn a programming language to carry these out instead. He explained that using a programming language could be free, and that I’d have full control over all aspects of a particular analytical technique. This is how I found out about Python, “the most novice-friendly programming language”, as my friend told me.
How I found learning Python
Three months later, I took my friend’s advice and bought an introductory online course on Python. I had no prior knowledge of programming languages, so I needed to learn the basics as a prerequisite to a second course on Python specifically for data analytics, using Jupyter notebooks and pandas.
I chose the online course as it was relatively cheap, self-paced and came with downloadable material, so I could fit it in around my workload and lifestyle. It contained a number of projects and exercises throughout where I could practice and test what I’d learned. However, unlike the Campus course, it wasn’t tailored to social science, so the examples were hard to relate to and some of the exercises weren’t applicable to the kind of analysis I’d conduct in my research. Since joining Campus and continuing my learning with Campus’ online courses, I’ve found the unique social science focus crucial in truly grasping these skills.
Overall, my friend was right: I found that Python is not that hard to learn. It’s intuitive and pretty straightforward. I’ve learned fundamental concepts such as variables, operation ordering, floats, string methods, comparison operators, IF statements and more. And I’m now able to do tasks I had never imagined I’d be capable of doing! I can write basic code that makes the machine interact with a user and perform calculations. Furthermore, since taking the Campus course, I’ve actually gained practical skills that I can directly use in social science research.
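To give a flavour of those basics, here is a minimal sketch that touches each concept once. The variable names, the numbers and the describe_age function are all made up for illustration and are not taken from any particular course.

```python
# Variables and floats (all values invented for illustration)
respondents = 120          # an integer variable
response_rate = 0.75       # a float variable

# Operation ordering: multiplication is evaluated before addition
completed = respondents * response_rate + 10

# String methods: tidy up a messily typed name
raw_name = "  ada lovelace "
clean_name = raw_name.strip().title()

# Comparison operator and IF statement (hypothetical helper)
def describe_age(age):
    if age >= 18:
        return "adult"
    else:
        return "minor"

print(completed)           # 100.0
print(clean_name)          # Ada Lovelace
print(describe_age(17))    # minor
```

To make a script like this interact with a user, the values could come from Python's built-in input() function instead of being typed into the code.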
Benefits of using Python
For me, the most important benefit of learning Python was discovering the different open source libraries it has to offer, in particular libraries such as pandas that allow social scientists to carry out data analysis. Pandas lets you aggregate and summarise data and, combined with companion libraries such as SciPy and statsmodels, carry out simple and multiple linear regressions, logistic regressions, chi-squared tests, t-tests - you name it!
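As a rough idea of what aggregating data with pandas can look like, here is a small sketch. The survey table, its column names and its values are invented purely for illustration.

```python
import pandas as pd

# Hypothetical survey data, invented purely for illustration
survey = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "income": [21000, 25000, 30000, 28000, 32000],
})

# Aggregate: mean income and number of respondents per region
summary = survey.groupby("region")["income"].agg(["mean", "count"])
print(summary)
```

The result is a small table with one row per region, which is roughly what an aggregate or compare-means menu would produce in a point-and-click package.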
The beauty of this is that, if a research paper is based on a statistical analysis conducted in a programming language, it fulfills two of the main principles of any scientific research: reproducibility and transparency. By disclosing their code, a researcher allows others to run the very same code for their own analysis, meaning the study can be reproduced with high precision. The code also makes all the details of the analysis transparent, enabling you to see exactly how others came to their conclusions.
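One small illustration of reproducibility, under my own assumptions: when an analysis involves a random step, such as drawing a subsample, fixing the random seed in the shared code means anyone who reruns it gets the same numbers. The data, the function name and the seed value below are hypothetical.

```python
import random
import statistics

# Invented satisfaction scores, for illustration only
scores = [3, 5, 4, 2, 5, 4, 3, 1, 4, 5]

def mean_of_random_subsample(values, k=5, seed=2024):
    """Mean of k values drawn at random without replacement."""
    rng = random.Random(seed)        # fixed seed: same draw on every run
    subsample = rng.sample(values, k)
    return statistics.mean(subsample)

first_run = mean_of_random_subsample(scores)
second_run = mean_of_random_subsample(scores)
print(first_run == second_run)       # True
```

Because the seed is written into the code, a reader does not have to trust a description of the sampling; they can rerun it and check.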
Admittedly, this only benefits researchers who understand Python. Even so, instead of only stating what type of statistical analysis and what software was used in the research, researchers who share their code also reveal the mechanics behind their calculations. This opens the research up to more rigorous scrutiny by their peers. Code can also be adapted to the type of data at hand, allowing other researchers to carry out potentially more creative and sophisticated analyses as part of their own work. In addition, using a programming language in social research opens the door to the analysis of big data, which is becoming increasingly available in the modern digital world.
Would I recommend Python to other researchers?
Yes! All in all, my experience learning Python has been fulfilling and empowering. Learning Python truly makes me feel like I am able to direct the machine to do amazing tasks for me, rather than relying on software with a third-party user interface. I could never have imagined that one dull day could turn into such an eye-opener for my research.
By Dmitrijs Martinovs