Data Mining and Manipulation

Data mining refers to the processes involved in analyzing data to extract key themes, ideas, and models that can then be applied to benefit the system as a whole. It draws on a variety of disciplines, including mathematics, statistics, political science, and economics. Data mining tools, from powerful computers to specialized software, can uncover "big data" trends that can be applied to generate more revenue, cut costs, and improve user experiences. Newer data mining technologies revolve around algorithms and machine learning that can detect anomalies and trends automatically.
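
As a minimal illustration of the automated anomaly detection mentioned above, and not any particular vendor's system, the following Python sketch flags values that deviate sharply from a sample's mean using a z-score threshold; the metric and numbers are invented for the example.

<syntaxhighlight lang="python">
# Minimal z-score anomaly detection (illustrative only): flag values
# that sit far from the sample mean, a common first pass for spotting
# unusual points in a business metric.

def detect_anomalies(values, threshold=2.0):
    """Return indices of values more than `threshold` standard
    deviations away from the mean."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

# Hypothetical daily active-user counts with one obvious spike.
daily_active_users = [100, 102, 98, 101, 99, 103, 400, 100, 97]
print(detect_anomalies(daily_active_users))  # -> [6]
</syntaxhighlight>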

Data mining is especially prominent in consumer-facing industries, where functionality centers on the user. Financial companies have long used data mining to determine portfolio investments, predict stock movements, and inform clients of possible future decisions. More recently, Facebook has become a leader in the data mining industry simply by collecting data from its 1.23 billion active users. It has especially pioneered machine learning algorithms that analyze users' online behavior through their status updates, comments, likes, and groups. By drawing patterns from the sustained usage of millions of users, Facebook designers and developers get feedback on design changes and functionalities. This in turn helps Facebook create a more user-friendly social networking site that attracts and retains more users.
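
The feedback loop described here is essentially A/B testing: expose different user groups to different designs and compare an engagement metric. A minimal sketch, with hypothetical session data rather than anything from Facebook's actual tooling, might look like this:

<syntaxhighlight lang="python">
# Illustrative A/B comparison (hypothetical data): compare mean
# session length between users who saw the old design ("control")
# and users who saw a new one ("variant").

from statistics import mean

sessions = {
    "control": [12.1, 9.8, 11.4, 10.2, 13.0],   # minutes, old design
    "variant": [14.3, 12.9, 15.1, 13.6, 14.0],  # minutes, new design
}

lift = mean(sessions["variant"]) - mean(sessions["control"])
print(f"Mean session lift: {lift:.2f} minutes")  # -> 2.68
</syntaxhighlight>

A real analysis would also test whether the lift is statistically significant before shipping the change.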

Data mining thus walks a very fine ethical line. While it is easy to use analytical software that provides real-time feedback to help design a better user interface, data mining can also be used as a weapon that inflicts harm, intended or not, on millions of clients, as in the 2012 case in which Facebook manipulated its users' emotions.

== 2012 Case of Data Manipulation ==

=== Overview ===

Facebook found itself in hot water over a recent controversy involving data mining and manipulation. A team of Facebook data scientists, led by Adam Kramer, sought to learn whether displaying predominantly "negative" or "positive" statuses, pictures, news, and comments would alter a user's own behavior in the same direction. Were users flooded with inspirational quotes, ideas, pictures, and upbeat statuses more likely to conform to the mood of their environment and even adopt the same mindset? To find out, the team set out on a research project that altered the News Feeds of 689,003 users. Facebook did not need consent from these "guinea pigs," as they had already agreed to the terms and conditions outlined in Facebook's data use policy, which explicitly states that user information will be used "for internal operations, including troubleshooting, data analysis, testing, research and service improvement".
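
In practice, the manipulation amounted to filtering each user's feed by the emotional tone of posts; the published study classified a post as positive or negative if it contained at least one word from the LIWC sentiment lexicon. The Python sketch below imitates that word-list approach with tiny invented word lists; the real lexicon, and the study's probabilistic omission of posts, are simplified away.

<syntaxhighlight lang="python">
# Sketch of word-list sentiment filtering in the spirit of the
# experiment's design (the word lists and posts are invented
# stand-ins, not the actual LIWC lexicon).

POSITIVE = {"happy", "great", "love", "awesome"}
NEGATIVE = {"sad", "terrible", "hate", "awful"}

def classify(post):
    words = set(post.lower().split())
    if words & POSITIVE:
        return "positive"
    if words & NEGATIVE:
        return "negative"
    return "neutral"

def filter_feed(posts, suppress):
    """Remove posts of the suppressed sentiment, mimicking the
    reduced-positivity / reduced-negativity conditions."""
    return [p for p in posts if classify(p) != suppress]

feed = ["I love this song", "terrible day at work", "lunch was fine"]
print(filter_feed(feed, suppress="positive"))
# -> ['terrible day at work', 'lunch was fine']
</syntaxhighlight>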



Facebook is the best human research lab ever. There’s no need to get experiment participants to sign pesky consent forms, as they’ve already agreed to the site’s data use policy. A team of Facebook data scientists is constantly coming up with new ways to study human behavior through the social network. When the team releases papers about what it’s learned from us, we often learn surprising things about Facebook instead — such as the fact that it can keep track of the status updates we never actually post. Facebook has played around with manipulating people before — getting 60,000 people to vote in 2010 who theoretically wouldn’t have otherwise — but a recent study shows Facebook playing a whole new level of mind games with its guinea-pig users. As first noted by The New Scientist and Animal New York, Facebook’s data scientists manipulated the News Feeds of 689,003 users, removing either all of the positive posts or all of the negative posts to see how it affected their moods. If there was a week in January 2012 when you were only seeing photos of dead dogs or incredibly cute babies, you may have been part of the study. Now that the experiment is public, people’s mood about the study itself would best be described as “disturbed.”

The researchers, led by data scientist Adam Kramer, found that emotions were contagious. “When positive expressions were reduced, people produced fewer positive posts and more negative posts; when negative expressions were reduced, the opposite pattern occurred,” according to the paper published by the Facebook research team in the Proceedings of the National Academy of Sciences (PNAS). “These results indicate that emotions expressed by others on Facebook influence our own emotions, constituting experimental evidence for massive-scale contagion via social networks.”
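
The reported effect was measured in aggregate word rates rather than in individual posts: the researchers compared what percentage of the words users produced were positive or negative under each condition. The sketch below illustrates that style of comparison with invented numbers, not the study's actual data.

<syntaxhighlight lang="python">
# Toy measurement of "contagion": compare the rate of positive
# words users post under each condition (values invented for
# illustration; the real study analyzed millions of posts).

conditions = {
    # condition: (positive words posted, total words posted)
    "positivity_reduced": (4_600, 100_000),
    "control":            (5_200, 100_000),
}

for name, (pos, total) in conditions.items():
    print(f"{name}: {100 * pos / total:.2f}% positive words")
# A lower rate in the reduced condition is the contagion signal
# the researchers reported.
</syntaxhighlight>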

The experiment ran for a week — January 11–18, 2012 — during which the hundreds of thousands of Facebook users unknowingly participating may have felt either happier or more depressed than usual, as they saw either more of their friends posting ’15 Photos That Restore Our Faith In Humanity’ articles or despondent status updates about losing jobs, getting screwed over by X airline, and already failing to live up to New Year’s resolutions. “*Probably* nobody was driven to suicide,” tweeted one professor linking to the study, adding a “#jokingnotjoking” hashtag.

The researchers — who may not have been thinking about the optics of a “Facebook emotionally manipulates users” study — jauntily note that the study undermines people who claim that looking at our friends’ good lives on Facebook makes us feel depressed. “The fact that people were more emotionally positive in response to positive emotion updates from their friends stands in contrast to theories that suggest viewing positive posts by friends on Facebook may somehow affect us negatively,” they write.

They also note that when they took all of the emotional posts out of a person’s News Feed, that person became “less expressive,” i.e., wrote fewer status updates. So prepare to have Facebook curate your feed with the most emotional of your friends’ posts if it feels you’re not posting often enough.

So is it okay for Facebook to play mind games with us for science? It’s a cool finding, but manipulating unknowing users’ emotional states to get there puts Facebook’s big toe on that creepy line. Facebook’s data use policy — which I’m sure you’ve all read — says Facebookers’ information will be used “for internal operations, including troubleshooting, data analysis, testing, research and service improvement,” making all users potential experiment subjects. And users know that Facebook’s mysterious algorithms control what they see in their News Feed. But it may come as a surprise to users to see those two things combined like this. When universities conduct studies on people, they have to run them by an ethics board first to get approval — boards that were created because scientists were getting too creepy in their experiments, getting subjects to think they were shocking someone to death in order to study obedience and letting men live with syphilis for study purposes. A 2012 profile of the Facebook data team noted, “Unlike academic social scientists, Facebook’s employees have a short path from an idea to an experiment on hundreds of millions of people.” (Update, 6/30/14: Cornell University released a statement Monday morning saying its ethics board — which is supposed to approve any research on human subjects — passed on reviewing the study because the part involving actual humans was done by Facebook, not by the Cornell researcher involved in the study. The academic researchers did help design the study, as noted when it was published, so this seems a bit disingenuous.)

In its initial response to the controversy around the study — a statement sent to me late Saturday night — Facebook doesn’t seem to really get what people are upset about, focusing on privacy and data use rather than the ethics of emotional manipulation and whether Facebook’s TOS lives up to the definition of “informed consent” usually required for academic studies like this. “This research was conducted for a single week in 2012 and none of the data used was associated with a specific person’s Facebook account,” says a Facebook spokesperson. “We do research to improve our services and to make the content people see on Facebook as relevant and engaging as possible. A big part of this is understanding how people respond to different types of content, whether it’s positive or negative in tone, news from friends, or information from pages they follow. We carefully consider what research we do and have a strong internal review process. There is no unnecessary collection of people’s data in connection with these research initiatives and all data is stored securely.”

Ideally, Facebook would have a consent process for willing study participants: a box to check somewhere saying you’re okay with being subjected to the occasional random psychological experiment that Facebook’s data team cooks up in the name of science. As opposed to the commonplace psychological manipulation cooked up by advertisers trying to sell you stuff.
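
The opt-in process suggested above could be as simple as gating experiment enrollment on an explicit per-user flag. The sketch below is purely hypothetical; Facebook exposes no such setting or API.

<syntaxhighlight lang="python">
# Hypothetical opt-in gate for research enrollment (illustrative
# only; no such Facebook setting or API exists in this form).

def eligible_subjects(users):
    """Enroll only users who explicitly opted in to research."""
    return [u for u in users if u.get("research_opt_in", False)]

users = [
    {"id": 1, "research_opt_in": True},
    {"id": 2},                        # never checked the box
    {"id": 3, "research_opt_in": False},
]
print([u["id"] for u in eligible_subjects(users)])  # -> [1]
</syntaxhighlight>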

== Friend's Lists ==

== Ethical Concerns ==

=== 2012 Case ===

=== Friend's Lists ===

== See Also ==


== References ==


http://www.britannica.com/technology/data-mining