Critical data modeling and the basic representation model – Annotation & Notes

Chart showing Race disparities in US criminal justice system, late 2010s

Data models are foundational to information processing, and in the digital world they stand in for the real world. When machines are used to make algorithmically-informed decisions, their algorithms are informed by the data models they use. And the data structured by data models are numerical of necessity, since machines must perform logical operations, not creative interpretations. It follows that data used in machine operations are machine-language translations of real-world phenomena, expressed in a data model designed for efficient processing. It should not be surprising then that as information systems increasingly make decisions that affect people and communities, their operations are in a very direct sense an extension of the messy human world. This has resulted in information systems that reflect human racism, sexism, and many other -isms, with real-world harm to individuals and communities. But given the black-box nature of “machine learning” algorithms, how do we know what happens inside the black box? How can we document machine bias so as to design algorithms that don’t perpetuate social harms?

Annotation – A Bundle of Open Source Resources for Social Media Data Mining and Analysis

command line graphic of hand tools

We've reached the final annotation in our series on "Social Media Data Collection, Processing, and Use in Research, Marketing, and Political Communication." Toward the end of the project my research drifted from traditional academic sources to investigative journalism. We now veer further off-track into blog posts and GitHub repos. Some videos and a course syllabus on Data Science for Social Systems. Tools, documentation, and related sources that don't fit neatly into any particular box. This isn't so much an annotation as a grab bag of annotated links. I apologize in advance.

Annotation: A Survey on Sentiment Analysis and Opinion Mining Techniques

An elephant running

The phrase "sentiment analysis" is high on the list of search terms for anyone seeking to understand how to process social media data. It's a component of Natural Language Processing (NLP), in which a machine extracts (somewhat) accurate meaning from human language and textual information. This seems really hard, though I could be wrong, because the whole AI field, NLP included, seems to be moving forward fast once again. Here's an annotation of a journal article that provides a decent overview.

Annotation: Interview with Ian Brooks

Crimson Hexagon dashboard

I met Ian Brooks and his family through the theatre program at Champaign's Central High School, where our kids participated in numerous performances and the whole wonderful high school theatre thing. I knew he was doing research in the use of social media to inform public health interventions, as a faculty member of the University of Illinois iSchool and associate of the National Center for Supercomputing Applications. So when I took on this project he was the first person I wanted to consult for insight.