Difference between revisions of "Data Mining"

From SI410
Line 4: Line 4:
 
 
'''Data mining''' is the analysis of data from various perspectives and its summarization into useful information. It combines aspects of artificial intelligence, machine learning, statistics and database systems. Data mining software is one of many analytical tools used to analyze data: it lets users view data from many different angles, categorize it, and summarize the relationships it contains. More technically, data mining is the process of finding correlations or patterns among the fields of large relational databases.<ref>Palace, Bill. "What Is Data Mining?" Data Mining. Anderson Graduate School of Management at UCLA, Mar. 1996. Web. 16 Dec. 2011 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.</ref>
 
 
 
==Example==
 
*A simple example of data mining is analyzing the population of the University of Michigan and making inferences and correlations from the resulting information, such as categorizing students by ethnic background.
 
 
*Before the term data mining came into use, many businesses had already implemented its technology. They used high-tech computers to comb through quantitative data from supermarket scanners, and analyzed the resulting data for market research purposes. This process has greatly increased the precision of analysis while decreasing the cost of research.<ref>Palace, Bill. "What Is Data Mining?" Data Mining. Anderson Graduate School of Management at UCLA, Mar. 1996. Web. 16 Dec. 2011 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.</ref>
 
  
 
==Process==
 
Line 21: Line 14:
 
===Association Rule Learning===  
 
 
Searching for general relationships between variables.
 
 +
 
===Clustering===  
 
 
Detecting groups or structures within the data that are similar.
Line 30: Line 24:
 
Finding a function to model the data with the least error.
 
  
===Summarization=== Providing context and reporting findings.
+
===Summarization===
 +
Providing context and reporting findings.
  
 
==Applications==
 
 
Data mining is commonly used in business to determine which demographics are buying which products and to predict customer decisions, and in science to find patterns in experimental data.
 +
 +
==Examples==
 +
A simple example of data mining is analyzing a large population, such as University of Michigan students, to determine simple characteristics of the data, such as the proportion of the student body from each ethnic background.
 +
 +
Before the term "data mining" came into popular use, many businesses had already implemented its technology. They used powerful computers to comb through quantitative data from supermarket scanners, and analyzed the resulting data for market research purposes. This process has greatly increased the precision of analysis while decreasing the cost of research.<ref>Palace, Bill. "What Is Data Mining?" Data Mining. Anderson Graduate School of Management at UCLA, Mar. 1996. Web. 16 Dec. 2011 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.</ref>
  
 
==Ethical Implications==
 

Revision as of 18:23, 16 December 2011


This illustration shows the role data mining plays in processing information for business use.


Data mining is the analysis of data from various perspectives and its summarization into useful information. It combines aspects of artificial intelligence, machine learning, statistics and database systems. Data mining software is one of many analytical tools used to analyze data: it lets users view data from many different angles, categorize it, and summarize the relationships it contains. More technically, data mining is the process of finding correlations or patterns among the fields of large relational databases.[1]

Process

Data mining requires a dataset large enough to contain discoverable patterns. The data must first be aggregated and stored in a database. It is then cleaned to remove noisy or partially missing entries.
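
The cleaning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names, and the `drop_incomplete` helper are hypothetical, and "partially missing" is taken to mean that some field is `None`.

```python
# Hypothetical raw records; None marks a missing value.
raw_records = [
    {"id": 1, "age": 34, "spend": 120.0},
    {"id": 2, "age": None, "spend": 80.5},
    {"id": 3, "age": 29, "spend": None},
    {"id": 4, "age": 41, "spend": 64.0},
]


def drop_incomplete(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


clean_records = drop_incomplete(raw_records)
print([r["id"] for r in clean_records])
```

Real cleaning pipelines often repair or impute values rather than dropping whole records. Dropping is simply the easiest policy to show.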

Data mining consists of six sub-tasks:[2]

Anomaly Detection

Identifying unusual records that may be anomalies or errors.
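
As a rough sketch, one simple way to flag unusual records is a z-score rule: mark any value that lies more than a chosen number of standard deviations from the mean. The rule, the threshold of 2.0, and the sample readings are assumptions made for illustration. They are one of many possible approaches.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```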

Association Rule Learning

Searching for general relationships between variables.
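
A minimal sketch of the idea, assuming market-basket data: count how often pairs of items appear together (support) and how often one item implies the other (confidence), then keep the pairs that clear both thresholds. The transactions, thresholds, and the `pair_rules` helper are hypothetical. Practical systems use algorithms such as Apriori to handle larger item sets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs meeting both thresholds."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```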

Clustering

Detecting groups or structures within the data that are similar.
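
One common clustering technique is k-means, sketched below for illustration. The sample points and the choice of starting centroids are assumptions. The source does not name a specific algorithm.

```python
from math import dist


def k_means(points, initial_centroids, iterations=10):
    """Group points around centroids by repeated assign-and-average steps."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)), key=lambda i: dist(point, centroids[i])
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, [points[0], points[3]])
print(clusters)
```

The starting centroids are fixed here so that the result is reproducible. Implementations usually pick them at random.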

Classification

Applying known structures to new data.
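
As an illustration, a k-nearest-neighbours classifier applies labels learned from known examples to a new data point. The labelled examples, the label names, and the choice of k = 3 are assumptions made for this sketch.

```python
from collections import Counter
from math import dist


def knn_classify(labeled, point, k=3):
    """Label a point by majority vote among its k nearest labelled examples."""
    nearest = sorted(labeled, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled examples: (features, label).
labeled = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((8.0, 8.0), "high"),
    ((9.0, 9.5), "high"),
    ((8.5, 9.0), "high"),
]

print(knn_classify(labeled, (2.0, 2.0)))
print(knn_classify(labeled, (7.5, 8.5)))
```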

Regression

Finding a function to model the data with the least error.
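
A minimal sketch, assuming the function is a straight line and "error" means the sum of squared differences (ordinary least squares). The `fit_line` helper and the data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```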

Summarization

Providing context and reporting findings.
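
A simple form of summarization is a table of descriptive statistics. The sketch below assumes a list of numeric values. The `summarize` helper and the purchase amounts are hypothetical.

```python
from statistics import mean, median


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical purchase amounts.
purchases = [12.0, 8.0, 20.0, 10.0, 50.0]
summary = summarize(purchases)
print(summary)
```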

Applications

Data mining is commonly used in business to determine which demographics are buying which products and to predict customer decisions, and in science to find patterns in experimental data.

Examples

A simple example of data mining is analyzing a large population, such as University of Michigan students, to determine simple characteristics of the data, such as the proportion of the student body from each ethnic background.
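
Computing the proportion of a population in each category can be sketched as follows. The category labels are placeholders and the `proportions` helper is hypothetical.

```python
from collections import Counter


def proportions(labels):
    """Return the fraction of the population that falls in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Placeholder category labels for a hypothetical population of eight people.
population = [
    "group_a", "group_b", "group_a", "group_c",
    "group_a", "group_b", "group_a", "group_b",
]
shares = proportions(population)
print(shares)
```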

Before the term "data mining" came into popular use, many businesses had already implemented its technology. They used powerful computers to comb through quantitative data from supermarket scanners, and analyzed the resulting data for market research purposes. This process has greatly increased the precision of analysis while decreasing the cost of research.[3]

Ethical Implications

Data mining develops models from accumulated data. In an attempt to build an accurate statistical model, data miners sometimes pry into private information held in personal data records. While data mining is not inherently an ethical issue, many of its applications are ethically charged.

Mining social networking sites in particular can, and often does, accumulate a great deal of personal information about an individual. Facebook uses these techniques to sell advertisers very specific target audiences.[4] Because data mining also has useful applications in medicine, patient records could be accessed in the same way, which raises concerns about patient confidentiality and about breaches of privacy in ordinarily private areas of people's lives.

In systems that collect data from people for such applications, good ways to keep the process ethical are to keep the data anonymous, to tell participants exactly what will happen to their data and how it will be used, and to let them opt out.

References

  1. Palace, Bill. "What Is Data Mining?" Data Mining. Anderson Graduate School of Management at UCLA, Mar. 1996. Web. 16 Dec. 2011 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.
  2. http://www.kdnuggets.com/gpspubs/aimag-kdd-overview-1996-Fayyad.pdf
  3. Palace, Bill. "What Is Data Mining?" Data Mining. Anderson Graduate School of Management at UCLA, Mar. 1996. Web. 16 Dec. 2011 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.
  4. http://www.facebook.com/advertising/

See also