Algorithmic Audits

Auditing Algorithms: Adding Accountability to Automated Authority is a group of events designed to produce a white paper that will help define and develop the emerging research community for "algorithm auditing." Algorithmic auditing is a research design that has shown promise in diagnosing the unwanted consequences of algorithmic systems. [1] Since the Internet was established, it has run on algorithms that most people don't understand, and it has therefore been able to function relatively unchecked. Some regulations and laws have been rolled out to keep the Internet safe, but governments can do little to limit online liberties because of the anonymity and privacy the Internet affords. By looking at platforms as a whole and analyzing the results they produce, we can learn more about algorithms and how they affect our society.

Background

All kinds of systems have transformed into "smart" objects. At the core of these real-time digital services lie algorithms that provide essential functions like sorting, segmentation, personalization, recommendation, and information management. Because technology and algorithms are heavily integrated into everyday life, and because they are opaque to public scrutiny and understanding, ethical issues arise. Since the 2000s, the regulation of algorithms has become a concern for governments worldwide due to its data and privacy implications, and legislation has made progress in essential areas.

Additionally, computer science and similar degrees have seen an increase in undergraduates proportional to the tech industry's growth. Awareness of the infrastructure behind algorithms and digital platforms enables the public to identify ethical breaches in these systems' functionality. This awareness has sparked activism for ethics in technology; for example, Joy Buolamwini identified bias in facial recognition technologies.

Types of Audits[2]

Code Audit

These audits use the idea of "algorithm transparency," where researchers acquire a copy of the algorithm and vet it for unethical platform behavior. This rarely happens because the algorithm is considered valuable intellectual property, and tech companies are reluctant to release the code, even to researchers.
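
A minimal sketch of what such a check could look like if access were granted, written in Python; the score_applicant function and the sample applicants are invented stand-ins for illustration, not any real platform's code.

    # Hypothetical code audit: with direct access to the platform's
    # decision logic, test whether outcomes differ across groups.

    def score_applicant(income: float, zip_code: str) -> bool:
        """Invented stand-in for the proprietary rule under audit."""
        return income > 50_000 and not zip_code.startswith("48")

    applicants = [
        {"income": 60_000, "zip": "48201", "group": "A"},
        {"income": 75_000, "zip": "48226", "group": "A"},
        {"income": 60_000, "zip": "10001", "group": "B"},
        {"income": 45_000, "zip": "10001", "group": "B"},
    ]

    # Demographic parity check: approval rate per group.
    rates = {}
    for group in {a["group"] for a in applicants}:
        members = [a for a in applicants if a["group"] == group]
        approved = sum(score_applicant(m["income"], m["zip"]) for m in members)
        rates[group] = approved / len(members)

    print(rates)  # a large gap between groups flags the code for review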

Noninvasive User Audit

In a noninvasive user audit, researchers approach users and ask them to share their search queries, results, and other data. However, this method is historically unreliable, and its small, nonrandom samples make it difficult to infer causality.
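
As a sketch of the analysis step only: assuming volunteers donate their results as a CSV with columns group, rank, and result_url (a hypothetical format, not any platform's real export), the donated data could be compared across user groups like this.

    # Noninvasive-audit sketch: average rank at which a target domain
    # appears in donated search results, broken down by user group.

    import csv
    from collections import defaultdict

    def mean_rank_by_group(path: str, target_domain: str) -> dict:
        ranks = defaultdict(list)
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                if target_domain in row["result_url"]:
                    ranks[row["group"]].append(int(row["rank"]))
        return {g: sum(r) / len(r) for g, r in ranks.items()}

    # e.g. mean_rank_by_group("donated_results.csv", "example.com")
    # The small, self-selected sample limits any causal claim.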

Scraping Audit

In scraping audits, researchers issue repeated queries to a platform, then observe and record the results, employing programs that perform API data mining and webpage scraping. This is effective because it can gather a large quantity of testable data. However, it is not ideal: if the audit isn't strictly for research purposes, a CFAA violation resulting from this technique can carry a prison sentence of one to ten years.
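
A hedged sketch of the mechanics follows; the endpoint URL and JSON response shape are assumptions, and as the CFAA risk above makes clear, a real audit would need legal review before running anything like this.

    # Scraping-audit sketch: issue the same queries repeatedly and
    # record what the platform returns for later comparison.

    import json
    import time
    from urllib.parse import quote
    from urllib.request import urlopen

    ENDPOINT = "https://example.com/api/search?q="  # hypothetical API
    QUERIES = ["mortgage rates", "apartments for rent"]

    def run_audit(rounds: int = 3, delay: float = 5.0) -> list:
        records = []
        for _ in range(rounds):
            for q in QUERIES:
                with urlopen(ENDPOINT + quote(q)) as resp:
                    records.append({"query": q,
                                    "time": time.time(),
                                    "results": json.load(resp)})
                time.sleep(delay)  # rate-limit to avoid burdening the site
        return records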

Sock Puppet Audit

Sock puppet audits use computer programs to impersonate users by creating false accounts. This method is effective for investigating sensitive topics in difficult-to-access domains: sock puppets can probe features of systems that are not public and penetrate groups that are difficult to pinpoint and access. A large number of sock puppets is required to derive significant findings from the audit. However, this method is still susceptible to CFAA violations, so the audit's purpose must be strictly research.
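
As an illustrative sketch under invented assumptions (the endpoint, parameters, and session cookies are all made up): two personas that differ in a single attribute issue identical queries, so any difference in results can be attributed to that attribute.

    # Sock-puppet sketch: personas differing only in location run the
    # same query; diverging results point to location-based treatment.

    import requests  # third-party: pip install requests

    PERSONAS = [
        {"name": "puppet_a", "location": "Detroit", "cookie": "session=a"},
        {"name": "puppet_b", "location": "Palo Alto", "cookie": "session=b"},
    ]

    def query_as(persona: dict, query: str) -> list:
        resp = requests.get(
            "https://example.com/search",  # hypothetical endpoint
            params={"q": query, "loc": persona["location"]},
            headers={"Cookie": persona["cookie"]},
            timeout=10,
        )
        return resp.json().get("results", [])

    # Comparing query_as(p, "personal loans") across many personas is
    # what gives the audit statistical power; a single pair proves little.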

Collaborative or Crowdsourced Audit

This audit style is similar to the sock puppet audit but uses real people instead of computer programs to circumvent CFAA and terms-of-service violations. An example is Amazon's Mechanical Turk, which can supply a large enough group of testers to produce significant results through semi-automated crowdsourcing. The main setback is that this audit requires a large budget to pay the participants.
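
A sketch of the aggregation step, assuming each paid participant submits the top results they saw for one shared query; the worker IDs and URLs below are placeholders.

    # Crowdsourced-audit sketch: measure how much participants' result
    # lists overlap; low overlap suggests heavy personalization.

    from itertools import combinations

    submissions = {
        "worker_1": ["url_a", "url_b", "url_c"],
        "worker_2": ["url_a", "url_d", "url_c"],
        "worker_3": ["url_e", "url_b", "url_c"],
    }

    def jaccard(a: list, b: list) -> float:
        """Set overlap between two participants' result lists."""
        sa, sb = set(a), set(b)
        return len(sa & sb) / len(sa | sb)

    for x, y in combinations(submissions, 2):
        print(x, y, round(jaccard(submissions[x], submissions[y]), 2))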

Legislation

Since the internet became a public commodity, the value of information has increased significantly. The US justice system has seen numerous cases involving digital privacy, ownership, plagiarism, hacking, fraud, and more. In response, laws have been developed to categorize and respond to common disputes.[3]

Display of Information[4]

In 1984, the USCAB declared that the airline sorting algorithm created by SABRE must be disclosed to participating airlines under the Display of Information provision in the Code of Federal Regulations. This regulation resulted from a bias identified by airlines using SABRE's software, which involved "screen science": recommending flights based on privatized criteria and capital incentives. The airlines noticed that American Airlines received priority placement in the SABRE algorithm's search results, which led to an investigation by the USCAB and DOJ.
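
To make the "screen science" complaint concrete, here is a toy check in the spirit of what investigators would look for: whether one carrier's flights consistently receive better (lower) display ranks than everyone else's. The sample rows are invented.

    # Ranking-bias sketch: compare the mean display rank of one
    # carrier's flights against all others on the same screen.

    from statistics import mean

    display = [  # (rank, carrier) pairs from a hypothetical screen
        (1, "AA"), (2, "AA"), (3, "UA"), (4, "DL"), (5, "AA"), (6, "UA"),
    ]

    own = [rank for rank, carrier in display if carrier == "AA"]
    rest = [rank for rank, carrier in display if carrier != "AA"]
    print(f"AA mean rank: {mean(own):.2f}, others: {mean(rest):.2f}")
    # A persistent gap across many screens suggests biased ordering.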

The Communications Decency Act[5]

Section 230 of the Communications Decency Act states that "No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider." This section is one of the most potent active legislative provisions concerning the internet and digital communications, and also one of the act's defining limitations. It protects service providers from the legal implications of removing content deemed obscene or offensive, even constitutionally protected speech, as long as the removal is done in good faith.

US Computer Fraud and Abuse Act[6]

The United States CFAA attaches sentences to digital behaviors deemed illegal or unlawful and clearly states the repercussions for committing such offenses.

The Act states: “The CFAA prohibits intentionally accessing a computer without authorization or in excess of authorization but fails to define what ‘without authorization’ means. With harsh penalty schemes and malleable provisions, it has become a tool ripe for abuse and use against nearly every aspect of computer activity.”

Crime / Offense                                              Years in Prison
---------------------------------------------------------    ---------------
Obtaining National Security Information                      10
Accessing a Computer and Obtaining Information               1-5
Trespassing in a Government Computer                         1
Accessing a Computer to Defraud and Obtain Value             5
Intentionally Damaging by Knowing Transmission               1-10
Recklessly Damaging by Intentional Access                    1-5
Negligently Causing Damage and Loss by Intentional Access    1
Trafficking in Passwords                                     1
Extortion Involving Computers                                5
Attempt and Conspiracy to Commit such an Offense             10

Photo Sharing Law

The Photo Sharing Law, put in place by U.S. legislation, outlines the general regulation of photography in the U.S. It resulted from internal security concerns and the privacy of citizens. The law mainly outlines these regulations:

  1. Private property is protected and established by the owner of the property; it is illegal to refuse to abide by the owner's requests concerning photographing activity.
  2. Photographing private property from within the public domain is not illegal.
  3. Taking pictures of public places, objects, and structures is legal unless explicitly prohibited.
  4. Photographing from outer space requires a license issued in advance by the National Oceanic and Atmospheric Administration.
  5. Commercial photography will likely require a permit and proof of insurance.
  6. Photographing accident scenes and law enforcement activities is legal, as long as the filming doesn't hinder the operations of law enforcement, medical, emergency, or security personnel.
  7. Any filming with the intent of doing unlawful harm to a subject may be a violation of the law in itself.


Ethical Implications

Due to the scale of technology companies, regulation cannot keep pace; as a result, only general guidelines and laws govern digital interactions, and as the infrastructure continues to grow, new ways to leverage these systems keep emerging. Algorithm audits have therefore become the most efficient way to assess the ethical impact of digital platforms and providers on their consumers and users. These audits have formulated direct ways to check digital platforms and have created protections for auditors. However, these methods are difficult to employ and fall short in many areas of regulation. [7]

Research Protections

Because researchers attempting to audit a platform or provider can be targeted by these large tech companies, legislation has been put in place through the CFAA and the U.S. Department of Justice to protect researchers from lawsuits. This resulted from the 2018 court ruling in Sandvig et al. v. Sessions, in which four university researchers sought to scrape information to study housing discrimination.

Sandvig et al. v. Sessions[8]

"On Mar. 27, the United States District Court of D.C. ruled that such actions should not be viewed as criminal under the statute, though it declined to weigh in on whether the professors' conduct would be protected under the First Amendment. Without reaching this constitutional question, the Court concludes that the CFAA does not criminalize mere terms-of-service violations on consumer websites and, thus, that plaintiffs' proposed research plans are not criminal under the CFAA," U.S. District Judge John Bates wrote in his opinion.[9]

Privacy Policies & Anonymity

Attempting a code, scraping, or sock puppet audit requires researchers to engage with the digital platform directly. This can be dangerous because technology companies outline in their privacy policies restrictions on how users may engage with the platform. Researchers can be flagged for breaking these policies and banned, which inhibits their ability to audit large, developed tech companies and can lead to lawsuits and further complications when the research is published. These companies deal in data and can identify a user from their access point or device metadata, which can make many of the audit types hard even to attempt.
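
As an illustration of why anonymity is fragile, here is a sketch of how a platform might link an auditor's repeated visits by hashing stable request metadata into a fingerprint; the header names are standard HTTP, but the hashing scheme is an assumption for illustration.

    # Fingerprinting sketch: stable metadata hashes to the same value
    # on every visit, linking requests even across separate accounts.

    import hashlib

    def fingerprint(ip: str, headers: dict) -> str:
        material = "|".join([
            ip,
            headers.get("User-Agent", ""),
            headers.get("Accept-Language", ""),
        ])
        return hashlib.sha256(material.encode()).hexdigest()[:16]

    print(fingerprint("203.0.113.7",
                      {"User-Agent": "audit-bot/1.0",
                       "Accept-Language": "en-US"}))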

References

  1. https://auditingalgorithms.science/?page_id=89#:~:text=Auditing%20Algorithms%3A%20Adding%20Accountability%20to,in%20diagnosing%20the%20unwanted%20consequences
  2. Sandvig, C., Hamilton, K., Karahalios, K., & Langbort, C. (2014). Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms.
  3. UpCounsel (2020). Internet Law: Everything You Need to Know.
  4. U.S. Code of Federal Regulations.
  5. Ardia, D. S. (2009). Free speech savior or shield for scoundrels: an empirical study of intermediary immunity under Section 230 of the Communications Decency Act. Loy. L.A. L. Rev., 43, 373.
  6. Griffith, D. S. (1990). The computer fraud and abuse act of 1986: a measured response to a growing problem. Vand. L. Rev., 43, 453.
  7. O'Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown.
  8. Lee, A. (2019). Online Research and Competition under the CFAA. Available at SSRN 3259701.
  9. https://cases.justia.com/federal/district-courts/district-of-columbia/dcdce/1:2016cv01368/180080/24/0.pdf?ts=1522488415