Difference between revisions of "Bias in Information"

From SI410
[[Category:2019New]]
 

Revision as of 19:18, 29 March 2019

Information is “the resolution of uncertainty” [reference]. Information is both sought and interpreted by an observer, so multiple observers searching for information on the same topic can arrive at many different answers. Both the information that is provided to the user and the observer’s interpretation of that information can cause discrepancies in the final results. Bias in information is prevalent because of how users look for answers: filtering results in a particular way, or allowing only certain information to be accessible to an observer, can drastically change the outcome. With the technology readily available today, the search for information most often occurs through search engines.

Gathering Results

The First 10 results

A search for information returns a list of results. This list consists of pages that are either retrieved as the most relevant to a specific query or sponsored to appear first. When researching a topic, a user whose first few results do not contain what they were looking for will rewrite the query into something more specific, repeating this process until the results are satisfactory. Because of this behavior, the first few links that appear for a query are very important.

Information Overload

The amount of information readily available to the public has increased to the point that a user can be presented with too much of it. This is referred to as "information overload". Information overload can be seen in the thousands of results returned by a search engine, but it also predates modern technology and can be seen in other settings, for example libraries and museums.

Search Engines

A search engine is a software system designed to carry out a web search for a query or phrase provided by a user. The results of a search can include many different types of media, including articles, documents, images, videos, and infographics. Search engines provide easy access to information that is otherwise found in specific locations such as libraries and museums.

How search engines work behind the scenes

Search engines.png

A search engine can return thousands of results in seconds because of the work that occurs in the background. That work has three major steps: web crawling, indexing, and ranking by the search engine’s algorithm. In the first step, a web crawler searches the World Wide Web for documents to add to the search engine’s own collection. Every time a document is updated or a new document is found, the crawler adds a copy of it to the collection. In the second step, this collection of documents, now kept by the search engine in a data center, is organized so that it can be searched based on what a user is looking for. In the last step, the algorithm decides how to order the documents to give the user a ranked set of results, where ideally the first result the user sees is the one most relevant to the search. Crawling and indexing happen ahead of time; the ranking step begins once a user has written a query for the search engine to compute results for. A query is typically a phrase, but it can also be another type of media, for example a picture.
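The three steps can be sketched in a few lines of Python. This is a toy illustration only, not how any real search engine is implemented: the `WEB` dictionary, the `*.example` page names, and the word-count scoring are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical in-memory "web": page name -> (page text, outgoing links).
WEB = {
    "a.example": ("search engines rank web pages", ["b.example", "c.example"]),
    "b.example": ("libraries and museums store information", ["c.example"]),
    "c.example": ("search bias in search results", []),
}

def crawl(start):
    """Step 1: follow links from a start page and copy each page found."""
    collection, frontier = {}, [start]
    while frontier:
        url = frontier.pop()
        if url in collection or url not in WEB:
            continue  # already copied, or the link leads nowhere
        text, links = WEB[url]
        collection[url] = text
        frontier.extend(links)
    return collection

def build_index(collection):
    """Step 2: map each word to the pages containing it, with counts."""
    index = defaultdict(dict)
    for url, text in collection.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def rank(index, query):
    """Step 3: score pages by how often the query words occur in them."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))

collection = crawl("a.example")
index = build_index(collection)
results = rank(index, "search bias")
```

Even in this small sketch, each step involves a design choice (which links to follow, which words to index, how to score) that shapes what the user eventually sees.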

Ethical concerns when accessing information

The search for information is an inevitable process that raises many ethical concerns. These concerns come from the bias involved in search engine design, the filtering of results, and the privacy of the user.

Privacy

Along with finding optimal results, a search engine also tracks certain information about a user behind the scenes. The time, date, and content of each query are stored, along with the IP address the query came from. By pooling queries from the same IP address, it is possible, although unlikely, to build a list of searches made by a specific user. The IP address shared with the search engine is usually not that of a personal computer but that of the local router. This reveals the approximate geographic location of the user and the types of searches that occur in specific locations. Location information can be used against users in some scenarios; for example, the use of Google is prohibited in China, where different search engines are provided instead. [reference]
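As a rough illustration of how stored queries could be pooled by IP address, consider the sketch below. The log format, the field order, and the sample entries are hypothetical; the addresses come from ranges reserved for documentation.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records: (timestamp, IP address, query text).
QUERY_LOG = [
    (datetime(2019, 3, 29, 19, 18), "203.0.113.7", "flu symptoms"),
    (datetime(2019, 3, 29, 19, 20), "198.51.100.4", "cheap flights"),
    (datetime(2019, 3, 29, 19, 25), "203.0.113.7", "pharmacy near me"),
]

def group_by_ip(log):
    """Pool queries by the IP address they came from, in time order."""
    history = defaultdict(list)
    for timestamp, ip, query in sorted(log):
        history[ip].append(query)
    return dict(history)

history = group_by_ip(QUERY_LOG)
```

Because the address usually belongs to a router, each pooled list may mix the searches of everyone behind that router, which is one reason identifying a single user this way is unlikely.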

Along with the ability to ban specific phrases in certain locations, a search engine also uses past searches and the documents users viewed as part of its algorithm. When a document is viewed frequently, it moves higher up the list of results because users appear to find it relevant.
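A minimal sketch of this kind of view-based reordering follows. It assumes, purely for illustration, that the engine keeps a simple count of views per page; the function name `rerank_by_clicks` and the sample data are hypothetical.

```python
def rerank_by_clicks(results, clicks):
    """Order results by view count, keeping the original order for ties."""
    # sorted() is stable, so pages with equal counts keep their positions.
    return sorted(results, key=lambda url: -clicks.get(url, 0))

results = ["a.example", "b.example", "c.example"]
clicks = {"c.example": 12, "a.example": 3}  # b.example has never been viewed
reranked = rerank_by_clicks(results, clicks)
```

A scheme like this can reinforce itself: pages near the top are seen and viewed more often, which in turn keeps them near the top.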

Bias

Because of the nature of a search engine and the processes it goes through to provide results, bias can be introduced at each step. In “Values in technology and disclosive computer ethics”, Brey discusses the idea that technology has “embedded values”, meaning that computers and their software are not “morally neutral”. Somewhere in the process of their design, computers can come to favor specific values.

Intentions and Consequences

Filtering Results

Showing results to an individual is handled by an algorithm. The most relevant documents are found by uniquely identifying documents and categorizing them by subject.