Difference between revisions of "CAPTCHA"

From SI410

Revision as of 22:43, 11 February 2023

CAPTCHA is shorthand for an assortment of automated systems used to distinguish between humans and computers. The concept was first created in 2000 by students and researchers at Carnegie Mellon University.[1][2] The earliest version of the system is credited to Luis von Ahn, Manuel Blum, Nicholas Hopper, and John Langford, whose joint academic paper introducing CAPTCHA, entitled "CAPTCHA: Using Hard AI Problems for Security", appeared in early 2003.[3] The quartet also coined the term "CAPTCHA", an acronym for "Completely Automated Public Turing Test To Tell Computers and Humans Apart."[1] Since their introduction, CAPTCHAs of various forms have become a ubiquitous security measure across the internet. They are employed by Wikipedia,[4] Google,[5] and other companies.[6]
An example of a CAPTCHA implementation on Wikipedia's website. This particular CAPTCHA is used to ensure that only human users are able to create new Wikipedia accounts.[7] Image Credit: Wikipedia

CAPTCHAs are designed around the limitations of current artificial intelligence, which struggles with particular tasks that humans generally perform without difficulty. However, as artificial intelligence technology progresses, new programs are often able to reliably evade existing security measures. Computer scientists view this as a win-win: either a CAPTCHA version cannot be defeated and a site remains secure, or the program employed to evade the existing CAPTCHA has successfully solved an open problem in artificial intelligence and significantly advanced the field.[1][2] Regardless, CAPTCHAs themselves must continually evolve in order to remain a step ahead of malicious programs, which has led to the proliferation of new challenge systems, such as reCAPTCHA, hCAPTCHA, and ASIRRA.

CAPTCHA and its successors have been criticized for presenting a barrier to web accessibility and generating undue challenges for users with disabilities or atypical skill and knowledge sets. Such users, in addition to potentially lacking abilities presumed of the human user audience, often actively use technology to enable or supplement their online experiences (e.g., screen reading or speech recognition software). Because CAPTCHAs are designed to stop programs, these assistive tools are regularly unable to interpret or complete certain CAPTCHAs in order to verify personhood and access the site.

Background

For example, humans typically outperform computers on image recognition tasks, especially those that require manipulation or categorization.

Evolution of Programs

Turing Test

Alan Turing, in his 1950 paper “Computing Machinery and Intelligence”, first proposed the idea of an imitation test for distinguishing between computers and humans. Turing describes an old parlour game, the Imitation Game, and proposes that it be used as a metric of computer development and intelligence. Such imitation games are now commonly (and eponymously) known as Turing Tests.

Reverse Turing Test

Characteristics

cogsci human features/skills/abilities?

compsci computer features/skills/abilities? - optical character recognition—must explain

Dual Uses

Digitizing Information

reCAPTCHA (see below) served an innovative dual purpose beyond that of the typical CAPTCHA program. In addition to guarding websites against bots posing as humans, reCAPTCHA took advantage of the massive amounts of human labor devoted to solving CAPTCHAs to read and digitize scanned archival texts.[8][9]

About 200 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.[9]

reCAPTCHA contributed to interpreting and digitizing the New York Times’ historical archive and, after its Google acquisition, assisted with Google Books efforts.[9][10][11] The project team reports that reCAPTCHA-based text recognition achieved accuracy of over 99.1% in correctly identifying individual words. By comparison, the optical character recognition software available at the time accurately identified only 83.5% of words.[8]

In its first full year of deployment, human users of reCAPTCHA collectively solved over 1.2 billion CAPTCHA tasks, deciphering over 440 million suspicious words that had been unreadable to optical character recognition software. (The same words were presented to multiple users, whose responses were then cross-verified before a word was confirmed.) Consequently, the authors ultimately characterized reCAPTCHA as a successful use of broad human computational power.[8]
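
The cross-verification step described above can be pictured as a simple agreement check. The following is a minimal sketch only, not the scheme used by reCAPTCHA itself: the function name confirm_word and the thresholds min_votes and min_share are hypothetical, chosen purely for illustration. It accepts a transcription only when enough independent users agree on it.

```python
from collections import Counter


def confirm_word(responses, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if there is no consensus yet.

    min_votes and min_share are illustrative thresholds (assumptions),
    not values taken from the reCAPTCHA paper.
    """
    # Normalize so that differences in case or stray spaces still match.
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    if not cleaned:
        return None
    # Find the most frequent answer and how many users gave it.
    answer, votes = Counter(cleaned).most_common(1)[0]
    # Confirm only if the answer is both frequent and dominant.
    if votes >= min_votes and votes / len(cleaned) >= min_share:
        return answer
    return None


# Three of four users agree, so the word is confirmed.
print(confirm_word(["morning", "Morning ", "morning", "mourning"]))
# Two users disagree, so the word stays unconfirmed.
print(confirm_word(["upon", "up on"]))
```

Under these assumed thresholds, a word that draws conflicting answers simply remains unconfirmed until more users have seen it.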

reCAPTCHA digitization efforts may be deemed exploitative of human users who, in order to access sites and private accounts, are forced to dedicate their own unpaid time and brainpower (in however minuscule increments) to deciphering words for the archives of for-profit companies, even if the dissemination of that knowledge ultimately benefits them. Additionally, the prevalence of human computation and artificial intelligence in digitization can contribute to the decline of professional human transcribers, whose jobs are rendered largely obsolete because reCAPTCHA deployment can generate higher accuracy at lower cost.[8][10][11]

Training Artificial Intelligence

In a similar vein, image-based CAPTCHAs have been used to tag image databases and train artificial intelligence systems in image recognition.

Google self-driving car controversy?

Web Accessibility Concerns

social and cultural disparities (e.g., unfamiliarity with American traffic lights) legality? (e.g., internet inexperience, social access disparities, and/or different cultural backgrounds)

presumption of experience

age

disabilities

CAPTCHA Evasion

Artificial Intelligence Progression

Human Computation

Because humans can reliably and easily solve CAPTCHAs, mass-scale human labor can be used to bypass them. The solutions are handed immediately back to computer and bot accounts, in a delivery process that averages under twenty seconds. Middlemen recruit labor from low-income countries in Asia; in particular, Russia, India, China, Vietnam, and Bangladesh were indicated as probable sources.[12] They are able to take advantage of a sizeable, low-cost workforce to sell one thousand solved CAPTCHAs for nominal, single-digit sums in United States dollars. This more or less defeats the purpose of CAPTCHAs and transforms anti-spam measures into a purely economic barrier. Furthermore, the wages and labor conditions of such workers present a potential ethical concern.[12][13]

Timeline and Versions

original versions still in use btw

key features? text-id, item identification, item categorization, other models?

CAPTCHA

text


reCAPTCHA

reCAPTCHA is a notable CAPTCHA service and company originally affiliated with Carnegie Mellon University's School of Computer Science.[8][14] Luis von Ahn, who was a part of the original CAPTCHA team in 2000, is credited as the project’s executive producer.[1][14] The project is the original team’s officially recommended CAPTCHA iteration.[1]

von Ahn received a prestigious MacArthur Fellowship in 2006, primarily in recognition and support of his work with CAPTCHA.[15][16] He stated that he planned to use the award money to further his human computation ambitions.[16] von Ahn ultimately accomplished this with his novel reCAPTCHA text-identification project (see Digitizing Information, above).[8]

Google acquired the project in September 2009[17] and has spearheaded successive versions (reCAPTCHA v2, reCAPTCHA v3, and reCAPTCHA Enterprise).[18] The company discontinued support for reCAPTCHA v1 in 2018, but continues to support and offer successive reCAPTCHAs.[19]

reCAPTCHA v1

reCAPTCHA v2

reCAPTCHA v3

reCAPTCHA Enterprise


Other Versions

SQUIGL-PIX

ESP-PIX

NuCAPTCHA

hCAPTCHA

ASIRRA

text

Future Versions

enabled an open problem in artificial intelligence.

References

  1. Carnegie Mellon University. (2000–2010). The Official CAPTCHA Site. http://www.captcha.net/
  2. Robinson, S. (2002, December 10). Human or Computer? Take This Test. The New York Times, F1. https://www.nytimes.com/2002/12/10/science/human-or-computer-take-this-test.html
  3. von Ahn, L., Blum, M., Hopper, N.J., Langford, J. (2003). CAPTCHA: Using Hard AI Problems for Security. In: Biham, E. (eds) Advances in Cryptology — EUROCRYPT 2003. EUROCRYPT 2003. Lecture Notes in Computer Science, vol 2656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39200-9_18
  4. Wikipedia. Create Account. https://en.wikipedia.org/w/index.php?title=Special:CreateAccount
  5. Google. reCAPTCHA. https://www.google.com/recaptcha/about/
  6. Google. reCAPTCHA Help. https://support.google.com/recaptcha/?hl=en
  7. Wikipedia. https://en.wikipedia.org/w/index.php?title=Special:CreateAccount
  8. von Ahn, L., Maurer, B., McMillen, C., Abraham, D., & Blum, M. (2008, September 12). reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321, 1465–1468. https://doi.org/10.1126/science.1160379
  9. reCAPTCHA. (2009). What Is reCAPTCHA. https://web.archive.org/web/20100611210239/http://recaptcha.net/learnmore.html
  10. Gugliotta, G. (2011, March 28–29). Deciphering Old Texts, One Woozy, Curvy Word at a Time. The New York Times, D3. https://www.nytimes.com/2011/03/29/science/29recaptcha.html
  11. Stone, B. (2009, September 16). Google Buys Service That Uses Humans to Digitize Books. The New York Times. https://archive.nytimes.com/bits.blogs.nytimes.com/2009/09/16/google-buys-service-that-uses-humans-to-digitize-books/
  12. Motoyama, M., Levchenko, K., Kanich, C., McCoy, D., Voelker, G., & Savage, S. (2010). Re: CAPTCHAs – Understanding CAPTCHA-Solving Services in an Economic Context. Proceedings of the 19th USENIX Security Symposium, Washington, DC, USA. https://klevchen.ece.illinois.edu/pubs/mlkmvs-usesec10.pdf
  13. International Labour Organization. (2019, February 13). World Employment and Social Outlook: Trends 2019. United Nations. https://www.ilo.org/wcmsp5/groups/public/---dgreports/---dcomm/---publ/documents/publication/wcms_670542.pdf
  14. reCAPTCHA. (2009). About Us. https://web.archive.org/web/20100611210259/http://recaptcha.net/aboutus.html
  15. MacArthur Foundation. (2006, September 1). Luis von Ahn. https://www.macfound.org/fellows/class-of-2006/luis-von-ahn#searchresults
  16. Spice, B. (2006, September 18). Brilliant Young Scientist Luis von Ahn Earns $500,000 MacArthur Foundation "Genius Grant". Carnegie Mellon Today, Pittsburgh, PA. https://www.cmu.edu/cmnews/extra/060918_ahn.html
  17. von Ahn, L., & Cathcart, W. (2009, September 16). Teaching computers to read: Google acquires reCAPTCHA. Google. https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html
  18. Google. reCAPTCHA: About. https://www.google.com/recaptcha/about/
  19. Google. (2021, June 1). Choosing the type of reCAPTCHA. https://developers.google.com/recaptcha/docs/versions