I posted this short piece in 2015 on Medium, where you can still find it. I'm republishing it here because, somewhat ironically given its topic of preservation, I'm less than fully confident that Medium will still be around in a few years, at least in its current "open" form. My exposure to archival practice came from my struggles as a media producer in the emerging digital age. I began designing websites with streaming and downloadable multimedia in 1997, and quickly realized that without an archival plan the situation was becoming hopeless. I saw how quickly technology was changing, and suspected that the media we published on the web at that time would be unplayable within a few years. And the challenge of preserving the audiovisual record has only grown larger since I wrote this in 2015.
The analysis presented here is based on my review of existing research on privacy expectations of people who create online content. This analysis concerns the full range of user interactions on what we used to call Web 2.0 platforms, focusing on social media systems like Facebook, Twitter, Reddit, Instagram, and Amazon. User interactions include posting original content (text, photos, videos, memes, etc.), and commenting on content posted by others. Reviews on Amazon and comments on news websites count as online content in this analysis. Photos uploaded to photo-sharing sites and original videos posted to YouTube also count. Anything in any format created by an individual from their own original thought and creative energy, and subsequently posted by the individual on social media platforms, counts as online content. In most instances the online content or interaction contains or is traceable to personally identifiable information, even if this is unintended by the content creator.
Today we have an abundance of information resources undreamed of in past centuries, but are exposed via the Internet to more disinformation than any previous generation. Digital media technologies are being massively leveraged to spread propagandistic messages designed to undermine trust in all forms of information, and to stimulate strongly affective responses and an entrenchment of political, cultural, and social divisions. The critical demands of the digital age have outpaced development of a corresponding information literacy. Meanwhile journalists are accused by authoritarian leaders of being “enemies of the people” while facing layoffs from newsrooms no longer supported by a sustainable business model. Short of reinvention, professional journalism will be increasingly endangered and the relevance of news organizations will continue to decline. In this paper I propose a new collaborative model for news production and curation combining the expertise of librarians, journalists, educators, and technologists, with the objectives of addressing today’s information literacy deficit, bolstering the credibility and verifiability of news, and restoring reasoned deliberation in the public sphere.
The digital artifact known as Early English Books Online (EEBO) is a resource for research on British history and literature between 1473 and 1700. EEBO is a collection of 146,000 mostly English works accessible via an online database, available by subscription from ProQuest. In this article I first review the history of EEBO, from the cataloging efforts that began more than a century ago through the processes that developed the online version used by so many scholars today. I then critically review its limitations, and discuss some of the challenges and drawbacks inherent in the transformation of analog source materials into digital form, including information distortion and loss, format obsolescence, and the challenges of digital preservation.
There's nothing surprising in the results of a 2016 study conducted for BuzzFeed by Echelon Insights and Hart Research. It shows that the majority of Americans who were likely voters in 2016 were under the age of 50, and that about half of them share news links on social media every week. But its results are consistent with a growing body of other research that shows a massive shift in how people discover news, and how trust works in a media system increasingly dominated by social media platforms.
So far in this project I've been annotating traditional academic sources. These sources explore methods of machine learning, Natural Language Processing, sentiment analysis, and the tools used to mine social media for research purposes. But the literature hasn't kept pace with the news, and social media data is being used for things other than academic research. Like maybe stealing elections. Here begins a series of three annotations of investigative reports by London's Channel 4 News. These are video stories about Cambridge Analytica and its methods and role in political campaigns in the U.S., Africa, Europe and beyond. These are of course non-traditional annotations. But I consider the source credible and, given the subject, important to include.