We asked Phillip Brooker, an interdisciplinary researcher in the field of social media analytics, and social science expert on Introduction to Python for Social Scientists, for his advice on using data science methods in social science research.
Phillip has a background in sociology and sociological research methods, and co-convenes the Programming-as-Social-Science (PaSS) network, which explores computer programming as a subject and methodological tool for social research and teaching. So if you’re thinking about using computational social science methods, listen up: you’re in good hands!
Before we talk about how others can integrate data science with their social science research, can you tell us what inspired you to do what you do?
I'm from a sociological background, but I've always been an interdisciplinary researcher working between social science and computing in various ways. So when I fell into working on Chorus with Tim Cribbin and Julie Barnett in 2012, that sparked off a lot of thinking about digital social science and the role of social media data as a relatively new thing to deal with. One of the interesting things about digital data is that in some ways it resembles the kinds of data we've always worked with in social science, and in others it's almost completely new. So, there's always work to be done in terms of finding ways to make sense of this methodologically.
What advice would you give to someone new to data science?
I think the main thing is not to think of 'data science' as having its own set of core methods and techniques. Maybe some would argue with me on that, but I think the best work in this sort of area happens when people bring their own specific disciplinary knowledge and skills to it, rather than try to 'learn/do data science' as if it were a cohesive and tightly-bounded discipline. So, my advice would be to put some effort into figuring out what your perspective on data science would be, and what sense you can make of it from your own philosophical/conceptual/methodological background.
What's your favourite data science tool and why?
My answer is enormously biased; predictably, I'll say Chorus and Python :-)
Chorus because it opens out the research process so you can see it as a non-linear iterative space where research questions, analytic frameworks, platforms, tools, data and visualisation (and everything else!) are topicalised and made visible as the software forces you to confront them constantly (which is, in my mind, a very methodologically sound way of working). And Python because it's an open-ended toolkit within which there is scope for so much innovation (and which is relatively new as a research skill amongst the social science community, so there are lots of potential ways to take it).
What data science skills should researchers be building on to boost their research?
To my mind, actually acquiring technical skills is not really the biggest problem - it's certainly not trivial, but it's possible to learn how to do these things; this is especially so with the wealth of new courses and books that are coming out, which sort of shows how rapidly the field is developing (exciting!). I think the biggest problem is in thinking through how to work with digital data in sensible, robust ways. Of course, this is enormously context-dependent: what counts as sensible and robust for one disciplinary orientation might not for another.
So, I'd say the most important skills to focus on are the things that underpin the research you want to do with digital and/or 'Big' data. For me, this has involved probing such things as the philosophical backgrounds of my research, my discipline, and my methodological commitments (see for instance a recent paper called What Would Wittgenstein Say About Social Media?). And moreover, thinking about my research on this level has also shaped how I have picked up and learned the more technical aspects, like learning how to code for instance.
For me, foundational and technical issues are pretty much completely inseparable from one another. Which is to say that, in order to apply a skill like Python to my research, I need to see how it fits in with these broader pictures. For you, integrating technical and foundational knowledge may involve doing different things, but in a general sense, exploring these kinds of issues is an enormously important thing to do. Having an understanding of and a position on these issues will help you figure out how to do this kind of work your way, for the tasks you want to do.