Bias in Information

From SI410

Nikita Badhwar
==Information Bias==

The term “information” in information bias refers to what is made accessible to the user when searching for knowledge. For example, when a user queries a search engine to find material about a topic, the engine provides results in a specific order. If the first few results do not provide what the user was looking for, the user will refine the query into something more specific and repeat this process until the results are satisfactory. Given this process, the first few links that appear for a query are very important. By filtering and ordering the results it returns, a search engine can introduce bias by placing certain information above other information. To uncover the bias that exists in information, we must understand how ethical issues are embedded in the technology behind it.
==Search Engines==

Continuing with the example of search engines, to understand the information bias involved we must first understand how a search engine works behind the scenes. Although a search engine can provide thousands of results in seconds, a great deal of work occurs in the background to produce them. There are three major steps: web crawling, indexing, and the ranking algorithm the search engine applies. In the first step, a web crawler traverses the World Wide Web to find documents to add to the search engine’s collection; every time a document is updated or a new document is found, the crawler adds a copy of it to the collection. In the second step, this collection, kept by the search engine in a data center, is indexed so that it can be organized and searched based on what a user is looking for. In the last step, the ranking algorithm decides how to order the documents, providing the user with a ranked set of results where, ideally, the first result is the most relevant to the user’s search. Before these three steps matter to the user, however, the user must write a query, typically a short phrase, for the engine to compute results for.
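The three steps described above can be sketched as a toy pipeline. The documents, the word-overlap scoring rule, and all names here are invented for illustration; real engines use far more complex signals.

```python
from collections import defaultdict

# Step 1: "crawling" -- a fixed collection standing in for documents
# a crawler would copy from the web into the engine's data center.
documents = {
    "doc1": "search engines rank web documents by relevance",
    "doc2": "a web crawler copies documents into a collection",
    "doc3": "bias can appear in how results are ordered",
}

# Step 2: indexing -- build an inverted index mapping each word to
# the set of documents that contain it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)

# Step 3: ranking -- score each document by how many query words it
# contains and return results in descending order of score.
def search(query):
    scores = defaultdict(int)
    for word in query.lower().split():
        for doc_id in index.get(word, set()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))

print(search("web documents"))  # -> ['doc1', 'doc2']
```

Even in this tiny sketch, the ranking step decides which documents a user sees first, which is exactly where the ordering choices discussed below come into play.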
Breaking down the seemingly hidden steps behind the technology allows us to assess where exactly bias can be introduced. In “Values in technology and disclosive computer ethics,” Brey discusses the idea that technology has “embedded values,” meaning that computers and their software are not “morally neutral.” Somewhere in the process of their design, computers can come to favor specific values, and therefore we must go a step further than merely studying the ethics of computer usage, to the ethics of computers themselves.

In “Algorithms of Oppression,” Safiya Noble discusses the negative biases against women of color that are embedded in software systems. In her example, searching the phrase “black girls” gave drastically different results from searching the phrase “white girls”: the former returned vulgar results that reflected common stereotypes, while the latter returned far less controversial results. Taking this example, we can relate the bias that occurred to the three types of bias Brey discusses: preexisting, technical, and emergent.
==Preexisting, Technical, and Emergent Bias==

The first, preexisting bias, occurs when values and attitudes exist prior to the development of the software. In our breakdown of software systems, we can see this in the order of documents returned after a search. If the system’s algorithm always favors certain documents over others, we may always receive first the documents that reflect the values of the algorithm’s creator. It is very possible that the creator of the algorithm held certain stereotypes that surfaced in what first appeared when searching “black girls”; through the creator’s personal beliefs, the information made most accessible was biased toward those beliefs as well.
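A minimal sketch of how a designer’s values can be baked into a ranking function before the system ever runs. The source weights and result records here are hypothetical:

```python
# A weight table fixed by the designer at development time: this is the
# "preexisting" value judgment, made before any query is ever issued.
PREFERRED_SOURCES = {"tabloid": 2.0, "encyclopedia": 0.5}

def rank(results):
    """Order results by relevance multiplied by the built-in source weight."""
    return sorted(
        results,
        key=lambda r: r["relevance"] * PREFERRED_SOURCES.get(r["source"], 1.0),
        reverse=True,
    )

results = [
    {"title": "stereotyped article", "source": "tabloid", "relevance": 0.4},
    {"title": "neutral article", "source": "encyclopedia", "relevance": 0.6},
]

# Although the neutral article is more relevant (0.6 vs 0.4), the
# baked-in weights push the stereotyped one to the top.
print([r["title"] for r in rank(results)])  # -> ['stereotyped article', 'neutral article']
```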
 
 +
The second bias, technical bias, occurs due to the limitations of the software. Due to the nature of search engines, and the way that humans use them, where often only the first results are even looked at, it is impossible to display certain results – or for humans to even see certain results. Taking it a step further, the documents that can be gathered also have certain limitations. Only the information that is available can be crawled upon and added to the collection. In many situations the information provided can lead to bias, due to the fact that there might be more information for specific things than others. There may have been a lot more data available for the second search that makes the search for “black girls” appear biased.
 +
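These two limits, an incomplete collection and a results cutoff that users rarely scroll past, can be sketched as follows; the document identifiers and page size are assumptions for illustration:

```python
crawled = ["d1", "d2", "d3", "d4", "d5"]  # what the crawler found
uncrawled = ["d6", "d7"]                  # exists, but was never indexed

RESULTS_PER_PAGE = 3  # users rarely look past the first page

def first_page(ranked_ids):
    # Documents outside the collection can never appear at all, and
    # anything ranked below the page cutoff is effectively invisible
    # to most users -- two purely technical sources of bias.
    visible = [d for d in ranked_ids if d in crawled]
    return visible[:RESULTS_PER_PAGE]

# "d6" is dropped (never crawled) and "d5" falls below the cutoff.
print(first_page(["d6", "d1", "d4", "d2", "d5"]))  # -> ['d1', 'd4', 'd2']
```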
 
 +
The last, emergent bias occurs when the system is being used in a way not intended by its designers. When a user enters a phrase, the wording of the phrase can be very important. Different words with the same meaning often have different connotations that can provide different results. For example the phrase "African American women" might lead to different results than "Black women" due to the types of articles found with each phrase. The first phrase might provide more scholarly articles with references to history whereas the second phrase might provide more articles with varying topics.
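This wording effect can be illustrated with a toy index, where two phrasings with the same intended meaning match different documents simply because the documents' authors used different vocabulary. The documents are invented:

```python
documents = {
    "history_paper": "a scholarly history of african american women",
    "lifestyle_blog": "a lifestyle blog about black women today",
}

def matches(query):
    """Return the documents containing every word of the query."""
    words = set(query.lower().split())
    return sorted(doc for doc, text in documents.items()
                  if words <= set(text.lower().split()))

# Same intended meaning, different vocabulary, different results.
print(matches("african american women"))  # -> ['history_paper']
print(matches("black women"))             # -> ['lifestyle_blog']
```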

Revision as of 20:36, 15 March 2019
