I’ve been spending time learning Python since January, and it’s creating new problems. For example, suddenly I want to do things with Python. I want to write a program to process titles and filenames of media archive records from an Excel spreadsheet and find the matching media files stored on a network drive. I need to read a few thousand PBCore XML records and convert them to JSON. I want to take the JSON output from Google Speech-to-Text transcripts and convert it to WebVTT files. But I can take on only one new project right now, and here it is.
I want to understand how social media data is harvested, processed, analyzed, and used for marketing and political communications. I’m motivated by recent revelations concerning the use of social media data in political campaigns using so-called psychographic techniques. While claims for the pragmatics of psychographics could be a subject for other research, I intend to focus on the details and methods of data processing that form the technical basis of using social media metadata for psychographic purposes. I am interested in the role of programming in accessing, collecting, processing, and using social media data, and the specific tools and workflow that enable this work.
An annotated bibliography may be useful as a starting point for further exploration of significant recent events, such as the operations of Cambridge Analytica in political campaigns in the United Kingdom, the United States, Africa, and elsewhere. I believe there is an interesting story to be told about how data was collected, processed, and used in these events, and specifically how computer science and programmers enabled this work. My hope is to present this story in a way non-programmers can understand, using multimedia storytelling techniques. While my research could result in a presentation in that form, it’s possible that an annotated bibliography is a better fit within the scope and timeline of this research project.
With that as a preamble, I’ll begin here with journal entries reflecting my research successes and stumbles. I’m going to focus primarily on Facebook Open Graph data and the process of (ahem) “working with it” to do “various things.” I don’t have the programming chops to write a useful toolkit, but obviously that work has already been done by others. I’m going to lean on my academic research skills and my own social networks (IRL) to figure out what I can. Let’s roll.