CAPTCHA

From SI410
Revision as of 18:52, 26 January 2023 by Cplotner

CAPTCHA is an umbrella term for an assortment of automated systems used to distinguish humans from computers. The concept was first developed in 2000 by students and researchers at Carnegie Mellon University.[1][2] The earliest version of the system is credited to Luis von Ahn, Manuel Blum, Nicholas Hopper, and John Langford, whose joint academic paper introducing CAPTCHA, entitled "CAPTCHA: Using Hard AI Problems for Security," appeared in early 2003.[3] The four also coined the term "CAPTCHA", an acronym for "Completely Automated Public Turing Test To Tell Computers and Humans Apart."[1] Since their introduction, CAPTCHAs of various forms have become a ubiquitous security measure across the internet. They are employed by Google,[4][5] Wikipedia,[6] and other organizations.
An example of a CAPTCHA implementation on Wikipedia's website. This particular CAPTCHA is used to ensure that only human users are able to create new Wikipedia accounts.

CAPTCHAs are designed around the limitations of current artificial intelligence, which struggles with particular tasks that humans generally perform without difficulty. However, as artificial intelligence technology progresses, new programs are often able to reliably evade existing security measures. Computer scientists view this as a win-win: either a CAPTCHA version cannot be defeated and a site remains secure, or the program employed to evade the existing CAPTCHA has successfully solved an open problem in artificial intelligence and significantly advanced the field.[1][2] Regardless, CAPTCHAs themselves must continually evolve in order to remain a step ahead of malicious programs, which has led to the proliferation of new challenge systems, such as reCAPTCHA, hCAPTCHA, and ASIRRA.

CAPTCHA and its successors have been criticized for presenting a barrier to web accessibility and creating undue challenges for users with disabilities or atypical skill and knowledge sets. Such users may lack abilities that CAPTCHA designers presume every human user has, and they often rely on assistive technology (e.g., screen reading or speech recognition software) to enable or supplement their online experiences. Because these assistive tools are themselves programs, they are, by design, regularly unable to interpret or bypass certain CAPTCHAs, which prevents their users from verifying personhood and accessing the site.


Background

Humans typically outperform computers on certain tasks, such as image recognition, especially those that require manipulation or categorization.

Evolution of Programs

Turing Test

Reverse Turing Test

Dual Uses

Digitizing Information

reCAPTCHA served an innovative dual purpose beyond that of the typical CAPTCHA program. In addition to guarding websites against bots posing as humans, reCAPTCHA took advantage of the massive amount of human labor devoted to solving CAPTCHAs to read and digitize scanned archival texts.[7][8]

About 200 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.[8]
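
As a rough sanity check on the figures quoted above, the short Python sketch below simply multiplies them out. The function name daily_hours is illustrative and not taken from any source. Taking both quoted inputs at face value yields roughly 555,000 hours per day, so the quoted "more than 150,000 hours" can be read as a conservative lower bound.

```python
# Figures taken from the quoted passage; treat them as round estimates.
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_CAPTCHA = 10


def daily_hours(captchas_per_day: int, seconds_each: float) -> float:
    """Total human time spent solving CAPTCHAs per day, in hours."""
    return captchas_per_day * seconds_each / 3600


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```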

reCAPTCHA contributed to interpreting and digitizing the New York Times' historical archive and, after its acquisition by Google, assisted with Google Books efforts.[8][9][10] The project team reported that reCAPTCHA-based text recognition achieved an accuracy of over 99.1% in correctly identifying individual words. By comparison, the optical character recognition software available at the time accurately identified only 83.5% of words.[7]

In exactly one year of reCAPTCHA deployment, human users collectively solved over 1.2 billion CAPTCHA tasks, deciphering over 440 million suspicious words that had been unreadable to optical character recognition software. (The same words were presented to multiple users, whose responses were then cross-verified before a word was confirmed.) Consequently, the authors characterized reCAPTCHA as a successful use of broad human computational power.[7]
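
The cross-verification step described above can be illustrated with a minimal Python sketch. This is not the reCAPTCHA team's actual algorithm: the function name confirm_word and the thresholds min_votes and min_share are hypothetical, chosen only to show the general idea of accepting a transcription once enough independent users agree on it.

```python
from collections import Counter
from typing import Optional


def confirm_word(
    responses: list[str],
    min_votes: int = 3,      # hypothetical: minimum number of matching answers
    min_share: float = 0.6,  # hypothetical: minimum fraction of all answers
) -> Optional[str]:
    """Return the agreed transcription, or None if there is no consensus yet."""
    # Normalize so that stray whitespace and letter case do not split the vote.
    # (A real system might need to preserve case; this is a simplification.)
    normalized = [r.strip().lower() for r in responses if r.strip()]
    if not normalized:
        return None

    # Find the most common answer and how many users gave it.
    word, votes = Counter(normalized).most_common(1)[0]
    if votes >= min_votes and votes / len(normalized) >= min_share:
        return word
    return None


print(confirm_word(["morning", "Morning ", "morning", "mourning"]))  # morning
print(confirm_word(["upon", "up0n"]))  # None
```

In this sketch, a word that fails to reach consensus simply stays unconfirmed and would be shown to more users; both thresholds are tunable assumptions rather than documented values.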

Training Artificial Intelligence

In a similar vein, image-based CAPTCHAs have been used to tag image databases and train artificial intelligence systems in image recognition.

Google self-driving car controversy?

Web Accessibility Concerns

social and cultural disparities (e.g., unfamiliarity with American traffic lights) legality? (e.g., internet inexperience, social access disparities, and/or different cultural backgrounds)

presumption of experience

age

disabilities

Artificial Intelligence Progress & Computer Evasion

Timeline and Versions

The original versions are still in use.

CAPTCHA

text


reCAPTCHA

reCAPTCHA is a notable CAPTCHA service and company originally affiliated with Carnegie Mellon University's School of Computer Science.[7][11] Luis von Ahn, a member of the original CAPTCHA team in 2000, is credited as the project's executive producer.[1][11] The original team officially recommends reCAPTCHA as its preferred CAPTCHA implementation.[1]

macarthur grant??

Google acquired the project in September 2009[12] and has since led development of later versions (reCAPTCHA v2, reCAPTCHA v3, and reCAPTCHA Enterprise).[13] The company discontinued support for reCAPTCHA v1 in 2018 but continues to offer and support the later versions.[14]

reCAPTCHA v1

reCAPTCHA v2

reCAPTCHA v3

reCAPTCHA Enterprise


Other Versions

SQUIGL-PIX

ESP-PIX

NuCAPTCHA

hCAPTCHA

ASIRRA

text

Future Versions

enabled an open problem in artificial intelligence.

References

  1. Carnegie Mellon University. (2000–2010). The Official CAPTCHA Site. http://www.captcha.net/
  2. Robinson, S. (2002, December 10). Human or Computer? Take This Test. The New York Times, F1. https://www.nytimes.com/2002/12/10/science/human-or-computer-take-this-test.html
  3. von Ahn, L., Blum, M., Hopper, N.J., Langford, J. (2003). CAPTCHA: Using Hard AI Problems for Security. In: Biham, E. (eds) Advances in Cryptology — EUROCRYPT 2003. EUROCRYPT 2003. Lecture Notes in Computer Science, vol 2656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39200-9_18
  4. Google. reCAPTCHA. https://www.google.com/recaptcha/about/
  5. Google. reCAPTCHA Help. https://support.google.com/recaptcha/?hl=en
  6. Wikipedia. Create Account. https://en.wikipedia.org/w/index.php?title=Special:CreateAccount
  7. von Ahn, L., Maurer, B., McMillen, C., Abraham, D., & Blum, M. (2008, September 12). reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321, 1465–1468. https://doi.org/10.1126/science.1160379
  8. reCAPTCHA. (2009). What Is reCAPTCHA. https://web.archive.org/web/20100611210239/http://recaptcha.net/learnmore.html
  9. Gugliotta, G. (2011, March 28–29). Deciphering Old Texts, One Woozy, Curvy Word at a Time. The New York Times, D3. https://www.nytimes.com/2011/03/29/science/29recaptcha.html
  10. Stone, B. (2009, September 16). Google Buys Service That Uses Humans to Digitize Books. The New York Times. https://archive.nytimes.com/bits.blogs.nytimes.com/2009/09/16/google-buys-service-that-uses-humans-to-digitize-books/
  11. reCAPTCHA. (2009). About Us. https://web.archive.org/web/20100611210259/http://recaptcha.net/aboutus.html
  12. von Ahn, L., & Cathcart, W. (2009, September 16). Teaching computers to read: Google acquires reCAPTCHA. Google. https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html
  13. Google. reCAPTCHA: About. https://www.google.com/recaptcha/about/
  14. Google. (2021, June 1). Choosing the type of reCAPTCHA. https://developers.google.com/recaptcha/docs/versions