Difference between revisions of "Data Aggregation Online"

From SI410

Revision as of 21:20, 15 November 2011

Data aggregation is the gathering of information about some particular topic. This information can then be stored, analyzed, and used by way of statistical methods. The first instances of data aggregation came in the form of surveys, polls, interviews and public data pulls. As technology developed, data aggregation improved tremendously with the help of the Internet. All Internet users leave bits of their information floating around the Internet in many forms: cookies, sessions, user IDs, forum posts, and social networking information. With the right technology, a savvy user can aggregate this data, analyze it, and then apply it in business sectors such as marketing, advertising, search engine optimization, and even usability.


Description

Data aggregation is any form of gathering information. Purposes of data aggregation range from large-scale projects, like compiling demographic information about specific populations, to very small-scale ones, such as calculating a database's average boot-up time. Anything that can be statistically analyzed can be aggregated [1]. Traditionally, gathering information about people meant they had to be surveyed directly. There was no way of gathering certain demographic data, such as age, income, or marital status, other than obtaining it through the government. With the rise of the Internet and social media sites, more and more people are willingly putting their information in the public domain. Here, data miners and aggregators can gather all of one's information in one fell swoop, and data that once took much time and effort to gather can be collected in a few moments. Moreover, any type of institution can now do this easily - they don't have to hire an outside statistics company to do the work.
[Image: Personal data can be collected with simple code]

What is being aggregated

Some of the information gathered is information we know about. For instance, we know that any given website has at least our name and email address after we create a login with it. Other websites ask for age, gender, birthday, marital status and various other information. Facebook.com may even have information about who is in your family and who you are dating.

We purposefully leave pieces of our identity across the Internet almost without hesitation because doing so is so common. Beyond our usernames and names, we can now be tracked by cookies on our computers that record which websites we view and which items we look at. This data can be aggregated to create a user profile of our interests. The profile can then be sold to companies that want to advertise to us, or simply be kept as statistics. Either way, many people are not aware that this data is trackable.
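As a rough illustration of how tracked page views could be rolled up into an interest profile, here is a minimal Python sketch. The cookie IDs, categories, and function names are all hypothetical and invented for this example; real trackers are considerably more involved.

```python
from collections import Counter, defaultdict

def build_interest_profiles(page_views):
    """Group (cookie_id, category) page views into per-cookie interest counts."""
    profiles = defaultdict(Counter)
    for cookie_id, category in page_views:
        profiles[cookie_id][category] += 1
    return profiles

def top_interests(profiles, cookie_id, n=2):
    """Return the n most frequently viewed categories for one cookie ID."""
    counts = profiles.get(cookie_id, Counter())
    return [category for category, _ in counts.most_common(n)]

# Hypothetical tracking log: each entry is one page view tied to a cookie.
page_views = [
    ("cookie-123", "running shoes"),
    ("cookie-123", "running shoes"),
    ("cookie-123", "travel"),
    ("cookie-456", "cooking"),
]

profiles = build_interest_profiles(page_views)
print(top_interests(profiles, "cookie-123"))
```

Nothing here requires the person's name: the cookie ID alone is enough to accumulate a picture of someone's interests over time.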

Pros, Cons and their Ethical Implications

Pros of online data aggregation

Each time a user signs up for a website, they have to remember another username, password and/or PIN. Companies are now using data aggregation to consolidate all of this data (from banks, airlines, e-mail accounts, and various reward programs) so that users can access all their information in one convenient place. Online bill pay and stock tracking can also be offered in the same place [2]. This becomes more useful as the average user signs up for more sites and as traditionally offline services (such as banking and financial services) become the norm online. It also makes the host site an attractive place to navigate to while online, which is what draws businesses to this approach. The potential ethical problem arises in giving one site all of a person's personal data. Although seeing bill pay, bank statements and e-mail all in the same place is convenient, it means that if one password is cracked, hackers have access to everything as opposed to just one thing. The user also has to give a third-party site access to their personal data, which means that more than just the user can access it.

There is also the argument that aggregating public information saves businesses and researchers time and money and that, because the data is public, they have full rights to use it. Anyone has access to it, and gaining access is not illegal. A data aggregation program seeing a piece of information is the same as a friend seeing it online, and the friend can also pull information from other users, so there is not a big difference between what a friend does with the information and what an aggregation program does with it. This pro could also be seen as a con when analyzing the ethical implications. Phone books have been around for decades and include everyone with a home phone, so anyone in the area can look up a phone number and address. However, a person could opt out of the phone book, and each book was limited to a small geographic area. When this information is put online, there is no guarantee the data is ever gone if the user wants to opt out, and the information can be looked up by anyone with an Internet connection. The user loses control of their personal data when it is transferred to the online world.

Cons of online data aggregation

As previously mentioned, users of the Internet knowingly and unknowingly leave pieces of themselves across multiple sites. Data aggregators can combine all of this information if it is public and/or if a clause in a site's terms of agreement informs the user that their data could be sold or given to another company. In essence, a third-party website that you have never heard of (let alone signed up for) could create a profile for you with your name, your family members (pulled from having them confirmed on Facebook), your address and home phone number (from yellowpages.com, which is public information), your age and birthday (from signing up to get a free surprise on your birthday from another site), plus data on your interests (from tracking cookies). Such a profile is exactly what data aggregation produces (gathered information about someone), but when all the pieces of data are put together, it becomes glaringly obvious how much is shared online and how invasive it can feel. This can completely eliminate any feeling of control over your personal data on the Internet: even if you monitor who you let into your social networking circle, the information you keep there can still get out. The ethical implications of this example follow those of the phone book example in the previous section. Users lose the ability to control who has their data as sites take their public data and make multiple digital copies of it. If a user wants to remove their public data or opt out of being in the 21st-century phone book, there is no way to be certain that all of it has been erased from the Internet (or erased in general, not just from technological mediums).
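To make the combining step concrete, the following Python sketch merges records from several sources into a single profile per person. It is only an illustration under stated assumptions: the source names, field names, and sample values are all made up, and matching people purely by name is a simplification that a real aggregator would not rely on.

```python
def merge_profiles(sources):
    """Combine per-source records into one profile per person.

    `sources` maps a source name to a list of record dicts. Records are
    matched on a normalized "name" field (a simplifying assumption).
    """
    merged = {}
    for source_name, records in sources.items():
        for record in records:
            key = record["name"].strip().lower()
            profile = merged.setdefault(key, {"sources": []})
            profile["sources"].append(source_name)
            for field, value in record.items():
                if field != "name":
                    # Keep the first value seen for each field.
                    profile.setdefault(field, value)
    return merged

# Hypothetical data: each source alone reveals only a little.
sources = {
    "social_network": [{"name": "Jane Doe", "family": ["John Doe"]}],
    "phone_directory": [{"name": "jane doe ", "phone": "555-0100"}],
    "birthday_club": [{"name": "Jane Doe", "birthday": "1990-04-01"}],
}

merged = merge_profiles(sources)
print(merged["jane doe"])
```

Each source on its own holds one or two facts, but the merged profile holds family, phone number, and birthday together, which is the effect described above.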

Another con of online data aggregation is what can be done with the information once it has been pulled from various sources and placed in one spot. An instance of this comes from George Mason University grad student Sean Gorman. Gorman's thesis used data aggregation to create a map of the United States' entire fiber-optic grid and where each business and industry connected to it. This let him see where all the major hubs were in the United States and which location would suffer the most damage if someone took an axe and cut through the fiber-optic cables. All his data came from public sources, so he did nothing illegal to obtain it, and anyone who wanted to make the same map could do so. The United States government, however, saw it as a terrorist threat and threatened to block publication of the dissertation, for if it were published, anyone could take that axe and cause major problems for United States businesses and the economy [3]. This illustrates the potential dangers of allowing data aggregation. Even though all the data used was public information, the pieces put together can pose a major threat to the country, and this in itself is an ethical problem.


Data aggregating companies

Data Direct

Techtra


References

[1] http://searchsqlserver.techtarget.com/definition/data-aggregation

[2] http://en.wikipedia.org/wiki/Data_aggregator#Role_of_the_Internet

[3] http://si110.cms.si.umich.edu/sites/si110.cms.si.umich.edu/files/In_Out_and_Beyond/Dangerous_Dissertation.pdf