Annotation – Content Marketing Through Data Mining on Facebook Social Network

So far I’ve been looking at how researchers analyze social media networks and perform tasks like opinion or sentiment analysis to understand how people feel and think about various subjects and entities. With this annotation we’re looking at content marketing, where influential users of a social network are first identified, then enlisted to spread messages through their network of influence.

This isn’t a new concept in marketing, as it’s long been understood that to move large numbers of people you need to move the key people who can move them. In the field of public relations, these key people are called opinion leaders. What’s different now is that with digital social networks like Facebook, whose 2.7 billion members all leave a massive trail of data with every post, comment, and like, it has become possible to programmatically find who is most influential on any given subject.

The following annotation makes reference to the Netvizz application for mining the Facebook Graph API, and Gephi software for social network analysis. Both are open source and free to use, but after the Cambridge Analytica scandal (still unfolding), Facebook has significantly narrowed the scope of Open Graph data that can be mined. I’m still working to get a handle on the extent of the available data, and the “best practices” for getting at it. But here’s a snapshot of some research done prior to the political events of 2016, which sheds some light on how this data is used to influence the influencers. Even in a huge network, a little bit of leverage in the right places can move products, and perhaps elections.


Forouzandeh, Saman, et al. “Content Marketing through Data Mining on Facebook Social Network.” Webology, vol. 11, no. 1, June 2014, pp. 1–11.

In this introduction to their research on content marketing, Forouzandeh, Soltanpanah, and Soltanpanah discuss potential advantages of using data mining techniques to identify user interests and behavior on social networks. They note that many advertising and marketing messages don’t account for user preferences and are therefore ignored. To address this disconnect, they introduce the concept of “content marketing,” which they define as “a marketing process of creating and properly distributing content in order to attract, make communication with, and understand other people so they can be motivated to do beneficial activities.” Content marketing techniques seek to understand how individuals communicate with each other, so that those individuals can be enlisted to distribute marketing information and influence one another.

The authors then report on a study they conducted of content marketing on Facebook. They used the Netvizz application to build a friendship graph, which they analyzed with Gephi software, and examined the communication behavior and interests of users using data mining techniques. (Note that Facebook recently implemented new limits on metadata that can be openly mined using some of these techniques, due to abuse of their terms of service by some actors.) A new Facebook account was created to connect with users, and Netvizz was used to extract Facebook Open Graph data such as Locale, Like_Count, Post_Count, and Post_Engagement_Count. The research determined that the latter two data fields were especially valuable in identifying users who wrote many posts, and whose posts attracted many likes and comments from other users. Influential users were identified in this manner. These users were then asked to distribute marketing messages to other users, resulting in a larger number of shares and customer conversions.

The authors provide additional details on their workflow in identifying influential users, such as the use of Gephi software to map their Facebook connections of friends, posts, comments, and likes. This data was downloaded as an Excel file for further text analysis, and the users were scored based on measures of popularity and frequency of likes and comments on their posts. Based on their connections and behaviors, the researchers determined that these individuals can be predicted to be most influential with other users in future marketing communications.
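To make the scoring idea concrete, here is a minimal sketch of how users might be ranked from the two fields the authors found most useful. The paper as summarized here does not give a formula, so the weighted mix, the `engagement_weight` parameter, the function names, and the sample numbers are all my own assumptions for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    name: str
    post_count: int             # how many posts the user wrote
    post_engagement_count: int  # likes and comments those posts attracted


def influence_score(user, max_posts, max_engagement, engagement_weight=0.7):
    """Hypothetical score: weighted mix of normalized activity and engagement."""
    posts = user.post_count / max_posts if max_posts else 0.0
    engagement = (
        user.post_engagement_count / max_engagement if max_engagement else 0.0
    )
    return (1 - engagement_weight) * posts + engagement_weight * engagement


def rank_influencers(users, top_n=2, engagement_weight=0.7):
    """Return the top_n (name, score) pairs, highest score first."""
    if not users:
        return []
    max_posts = max(u.post_count for u in users)
    max_engagement = max(u.post_engagement_count for u in users)
    scored = [
        (u.name, influence_score(u, max_posts, max_engagement, engagement_weight))
        for u in users
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up sample data, not taken from the study.
users = [
    UserStats("user_a", post_count=120, post_engagement_count=900),
    UserStats("user_b", post_count=40, post_engagement_count=1500),
    UserStats("user_c", post_count=10, post_engagement_count=30),
]
top = rank_influencers(users, top_n=2)
print(top)
```

In this toy data, user_b posts less often than user_a but attracts more engagement, so weighting engagement more heavily puts user_b first. A different weight would change the ordering, which is why the weight is exposed as a parameter rather than fixed.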

In the next phase of the research, the authors distributed a series of messages through the influencers to test how far the messages spread and how their networks reacted. Using data mining techniques, they determined which messages were most effective in generating positive reactions in the form of likes.
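As a rough illustration of comparing messages by the likes they generate, the sketch below averages like counts per message across the influencers who shared it. The annotation does not say which data mining techniques the authors used, so this simple average, the `rank_messages_by_likes` name, and the sample data are assumptions of mine.

```python
from collections import defaultdict


def rank_messages_by_likes(reactions):
    """Rank messages by average likes per share.

    reactions: iterable of (message_id, like_count) pairs,
    one pair for each time an influencer shared the message.
    """
    total_likes = defaultdict(int)
    share_count = defaultdict(int)
    for message_id, likes in reactions:
        total_likes[message_id] += likes
        share_count[message_id] += 1
    averages = {m: total_likes[m] / share_count[m] for m in total_likes}
    # Highest average first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data, not taken from the study.
reactions = [
    ("msg_1", 12),
    ("msg_2", 40),
    ("msg_1", 8),
    ("msg_3", 5),
    ("msg_2", 20),
]
ranking = rank_messages_by_likes(reactions)
print(ranking)
```

Averaging per share rather than summing keeps a message from ranking highly just because more influencers happened to post it.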

The authors suggest that content marketing can be combined with other marketing strategies such as viral marketing, where influential individuals are given products or services for free in exchange for influencing others. They conclude that content marketing has a comparative advantage over other forms of marketing in social networks, since the message is not presented as a direct commercial appeal. Instead the message is accepted from a trusted source, and thus overrides critical resistance to the marketing appeal.