
This post is a guest blog by Dr. James Allen-Robertson, instructor on our Collecting Social Media Data course. James is a Computational Social Scientist in the Department of Sociology, University of Essex. His work focuses on using data science tools, including large-scale text analysis, social network analysis and computer vision, as aids to exploring social science problems.


The methodology behind the method

When I first began dabbling in computational social science, particularly around studying social media, I turned first and foremost to the open-source community. Whether it was data collection, analysis, or visualisation, the internet was awash with tutorials designed to get you quickly and efficiently up to speed in implementing that linguistic model, pulling obscene amounts of data from Twitter, or getting the font on your x-axis label just right.

Whilst ‘Googling’ remains the greatest skill for day-to-day work in computational social science, these tutorials only got me so far. Often I would sigh as, having been guided step by step through implementing a technique, I found the result simply presented without any consideration of what it meant. Sometimes I wondered why we had even bothered, given that what was produced was seemingly devoid of any relationship to reality.

Don’t get me wrong: what these communities provide, as a form of service to their discipline, is of monumental value and still key to what I do now. What is missing is the methodology behind the method. I had learned how to gather data, but what kind of sample had I gathered? I had implemented a model, but was it rigorous, and were the parameters justifiable? I could gather massive amounts of data from a wide variety of sources, but was it ethical to collect it, and was it ethical to use it? Methodologically speaking, did any of this make sense at all?

Why do we need to consider these things, even if we are not academics?

  1. To port the knowledge of academia to business, policy and other sectors.

  2. To draw on expertise within the academy to inform the conclusions we draw.

  3. To make better decisions and head off our own assumptions.

These were of course questions of great importance for me as an academic researcher, but I have no doubt that they are important considerations beyond the academy as well. The methods of social research and the theory of research design are paramount to producing good, valid, and valuable results whatever sector you’re coming from and whatever the application of these skills. Being able to go beyond simply implementing a computational method, but to understand its value, to understand the biases inherent in the processes, and to approach them in a way that is ethically sound is key.

Collecting Social Media Data

To me, this is where courses like those on SAGE Campus fit into the ecosystem of learning about cutting-edge methods and techniques. The course that I produced in collaboration with SAGE Campus, Collecting Social Media Data, draws on the methodological expertise of academic researchers to help you better understand these deeper questions around research design and implementation when it comes to social media data.

The course begins with an in-depth consideration of how ethical standards apply in this space, and how social media platforms complicate the age-old questions around participant consent or even the notion of a participant at all. We discuss how the expectations of researchers, users, and the platforms converge to shape our understanding of ethical practice in this space and offer a framework for thinking through these questions in a systematic way.


Planning your data collection and approaches

The course then addresses the planning of a project and understanding what kinds of questions might be answerable through social media data. Whilst social media data can offer researchers a major opportunity for creative and insightful work, it is not without its problems and biases, which can affect the conclusions you can draw. Equally, whilst there is a vast range of different types of data available across different platforms, their utility is not infinite, and it is important to understand what kinds of data can be gleaned from social media and how a project might expand beyond the data these platforms provide.

Having done much thinking about the project, how then do we actually, practically, do the project? The act of data collection may be more diverse than you might imagine and we consider a range of approaches depending on the kind of project you are conducting. Perhaps the scale and meta-data provided by the platform API will perfectly supply your project with the kinds of data it needs. However, you may find a more considered manual approach to the collection will give you insights that the API just cannot provide down its pipeline. We consider multiple approaches, as well as how the approaches may shape the realities of your project, particularly when it comes to the design and limits of a platform API.
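To make the point about API limits concrete, here is a minimal sketch in Python of the kind of back-of-the-envelope sum worth doing before committing to an API-based collection. It is not taken from the course. The function name and every figure in it (posts per request, requests per window, window length) are hypothetical, so check your platform's own documentation for the real values.

```python
import math


def estimated_collection_hours(total_posts, posts_per_request,
                               requests_per_window, window_minutes):
    """Rough lower bound on the waiting time for a rate-limited collection.

    All parameters are illustrative assumptions, not the limits of any
    real platform.
    """
    # How many API calls are needed to retrieve every post.
    requests_needed = math.ceil(total_posts / posts_per_request)
    # How many rate-limit windows those calls are spread across.
    windows_needed = math.ceil(requests_needed / requests_per_window)
    # The first window's calls can be made straight away; every further
    # window means waiting for the limit to reset.
    waiting_windows = max(windows_needed - 1, 0)
    return waiting_windows * window_minutes / 60


# Hypothetical project: one million posts, 100 posts per request,
# 300 requests allowed every 15 minutes.
hours = estimated_collection_hours(1_000_000, 100, 300, 15)
print(f"At least {hours} hours of waiting on rate limits")
```

Even with made-up numbers, a sum like this shows quickly whether a planned sample is an afternoon's work or a multi-day job, which in turn may push you towards a smaller sample or a different approach to collection.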

Tools for social media research

Finally, we consider the tools available to the new social media researcher and how to judge them in relation to your project’s needs. There is a vast world of toolkits available to researchers, ranging in accessibility and in features. Some may want to take the programmatic approach, building their own solution from the ground up using a programming language like Python or R. Others may find that their questions can easily be answered by pre-built toolsets, offering a quicker entry into the field. However, not all tools are created equal, and we offer a range of points worth considering when you are selecting which tool to use, encompassing not just methodological factors but also practical ones.

For me this has been the steepest learning curve, particularly when it came to scale. I swiftly learned that perfectly good techniques and tools began to buckle and bend as my sample size grew, leading me to seek out different tools and approaches – this was in fact the impetus that got me into computational social science in the first place!


Find out about our Collecting Social Media Data course and sign up to our demo hub to try a free module.

Libraries can get a full 30-day institution-wide trial of SAGE Campus. Recommend us to your library or, if you are an administrator, request a trial via this form.