This is an academic research project and for the most part I’m focusing on academic resources. But I’m working to understand the specific tools and methods for mining social media data in order to effectively intervene using communications campaigns. As I said in my intro to this project, I’m motivated by recent revelations concerning the use of social media data in political campaigns using so-called psychographic techniques. But most of the academic literature doesn’t go there; it merely describes high-level concepts and variants of Natural Language Processing (NLP), text mining, and sentiment analysis.
The annotation offered here adds to these concepts by introducing “community mining” and techniques for analyzing key players, roles, and strong subgroup connections of communication and influence within a larger social media network. These are key concepts for understanding how opinions within a network are formed, shared, and spread. Or as someone might have once said, it’s about influencing a group by influencing the influencers.
The author of this paper, Martin Atzmueller, is a researcher in artificial intelligence and cognitive science. In addition to a useful discussion of community mining, he suggests that social media analysis could be extended by data from ubiquitous sensors in the physical environment, including smartphones and active RFID devices. Jeffrey Pomerantz calls this “data exhaust,” and each of us leaves a wide trail of it without much thought.
I’d suggest it’s worth thinking about more carefully.
Here’s today’s annotation of Martin Atzmueller’s article.
Atzmueller, Martin. “Mining Social Media: Key Players, Sentiments, and Communities.” WIREs: Data Mining & Knowledge Discovery, vol. 2, no. 5, Sept. 2012, pp. 411–19.
In this article for Wiley’s WIREs publications, cognitive science and artificial intelligence researcher Martin Atzmueller explores methods for extracting information, patterns, and knowledge from social media data combined with ubiquitous embedded systems including RFID-based applications, sensor networks, and mobile devices. He introduces the basic terms of his research, including Social Network Analysis (SNA) wherein communities are identified and mapped into sets of nodes, strongly connected by identifiable interests or needs. He then describes ways to identify and characterize “key players” in social networks, defined as “actors that are important for the network in terms of connectivity, number of contacts, and the paths that are passing through the corresponding node.” He introduces the term “role mining” as a method of discovering actor profiles with certain features such as prestige and community importance. Characterizing the communities and roles provides a map for understanding how communications moderate community attitudes and behavior.
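To make the “key players” definition concrete for myself, here is a minimal sketch in Python of two of the measures it mentions: number of contacts (degree) and the shortest paths passing through a node (betweenness centrality). The six-person network is invented for illustration, and the function names are my own. The betweenness routine follows Brandes’ well-known algorithm for unweighted graphs; the article does not prescribe this particular implementation.

```python
from collections import deque

# Hypothetical toy network: two tight triangles joined by one bridge (c-d).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> set-of-neighbours map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes) for an unweighted graph."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                         # nodes in the order BFS reaches them
        preds = {v: [] for v in adj}       # predecessors on shortest paths
        paths = {v: 0 for v in adj}        # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


adj = build_adjacency(EDGES)
degree = {v: len(neighbours) for v, neighbours in adj.items()}
between = betweenness(adj)
top = max(between.values())
key_players = sorted(v for v, s in between.items() if s == top)
print(key_players)
```

In this toy network the two people on either end of the bridge, c and d, come out as the key players: every shortest path between the two triangles has to pass through them.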
Atzmueller next describes methods of sentiment mining and analysis, which he defines as “extracting subjective information from textual data using NLP, linguistic methods, and text mining approaches.” He cites B. Liu’s key elements of sentiment analysis as the opinion holder, the object and features of the opinion, and the positive or negative opinion orientations. Machine learning techniques such as latent semantic analysis and support-vector machines (SVM) are commonly used for developing a sentiment classification for a given text corpus.
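As a way of picturing Liu’s elements, here is a small Python sketch that stores an opinion as a record with a holder, an object, a feature, and an orientation. To fill in the orientation I use a tiny word list, which is a much simpler stand-in for the machine learning classifiers (such as SVMs) the article mentions. The lexicon, field names, and sample sentences are all my own assumptions.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would learn this from labelled data.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


@dataclass
class Opinion:
    holder: str        # who expresses the opinion
    target: str        # the object being discussed
    feature: str       # the aspect of the object
    orientation: str   # "positive", "negative", or "neutral"


def orientation(text):
    """Classify text by counting lexicon hits (not an SVM, just a sketch)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


post = "The battery life is terrible."
opinion = Opinion(holder="user_42", target="phone", feature="battery",
                  orientation=orientation(post))
print(opinion)
```

Counting words like this ignores negation and sarcasm, which is one reason trained classifiers are preferred in practice.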
Following on the discussion of communities and key players, the author summarizes the concept of “community mining,” which looks for clusters or subgroups whose members communicate more with each other than with the rest of a larger network. Pattern mining and statistical approaches are used to identify these densely connected clusters.
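The annotation does not name a specific algorithm for finding densely connected clusters, so the following Python sketch is only my own illustration. It scores candidate groupings with modularity, a widely used measure of how much denser the links inside groups are than chance would predict, and then tries every two-way split of a tiny invented network. Exhaustive search like this only works on toy data; the network and function names are assumptions.

```python
from itertools import combinations

# Hypothetical toy network: two tight triangles joined by one bridge (c-d).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
NODES = sorted({n for edge in EDGES for n in edge})


def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for group in communities:
        inside = sum(1 for u, v in edges if u in group and v in group)
        total_degree = sum(degree[n] for n in group)
        q += inside / m - (total_degree / (2 * m)) ** 2
    return q


def best_two_way_split(edges, nodes):
    """Try every split into two non-empty groups; keep the highest modularity."""
    first, rest = nodes[0], nodes[1:]
    best_q, best_split = float("-inf"), None
    # Pin the first node to group A so each split is examined only once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            group_a = {first, *extra}
            group_b = set(nodes) - group_a
            q = modularity(edges, [group_a, group_b])
            if q > best_q:
                best_q, best_split = q, (group_a, group_b)
    return best_q, best_split


best_q, (group_a, group_b) = best_two_way_split(EDGES, NODES)
print(sorted(group_a), sorted(group_b), round(best_q, 3))
```

On this network the best split separates the two triangles, which matches the intuition that a community is a set of people who talk to each other more than to outsiders.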
The author then describes the open-source tool VIKAMINE and its use in the analysis and mining of social media communities and subgroups. He states that VIKAMINE has been effective in a broad range of social media analysis scenarios, including community mining, key actor and role characterization, and pattern mining and analytics. He concludes with a discussion of “reality mining,” wherein social media analysis is combined with “everyday” sensor information from sources such as smartphones and sensor networks.