Wikipedia Bots
Wikipedia is a user-edited online encyclopedia. As of March 2019, Wikipedia contains over five million English articles,[1] as well as many more across 302 other languages.[2] Wikipedia depends on volunteers to grow and maintain the site, and much of that maintenance is carried out by internet bots, which operate under their own user accounts. In 2014, there were 274 active bots, and together they accounted for 15% of all edits on Wikipedia.[3] These bots demonstrate how autonomous agents can improve the info-sphere they inhabit. However, because they can also cause widespread damage if left unchecked, Wikipedia has policies in place to govern bots and safeguard the site.

Internet bots

An internet bot is a software application that runs automated tasks on the web. Bots allow repetitive tasks to be performed faster and at a larger scale than they could be done manually, and they are an integral part of the web. For example, all search engines depend on crawlers that jump from link to link to index sites. Bots can roughly be divided into two categories: good actors, like search engine crawlers, and malicious bots, which pose a large threat to cyber security. Malicious bots can work together in what is known as a botnet to perform large-scale attacks, such as a distributed denial of service (DDoS) attack; the Mirai botnet is an infamous example. Even well-intentioned bots can cause damage through unintended consequences. A crawler that neglects to obey the Robots Exclusion Protocol, for example, can overload a web server by making more requests than the server can handle. Two properties of bots that make such unintended consequences possible are that they are long-running and autonomous: a bot can run continuously for years and make decisions without human oversight. A sketch of a polite crawler check follows.
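To illustrate the Robots Exclusion Protocol mentioned above, here is a minimal sketch using Python's standard-library robotparser. The bot name and URLs are hypothetical placeholders, and a real crawler would also rate-limit its requests rather than relying on robots.txt alone.

<syntaxhighlight lang="python">
# Minimal sketch: consult a site's robots.txt before crawling.
# The bot name and URLs below are hypothetical examples.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.org/robots.txt")
rp.read()  # fetch and parse the site's robots.txt rules

# Only fetch the page if the rules allow our user agent to do so.
if rp.can_fetch("ExampleBot/1.0", "https://example.org/some/page.html"):
    print("Allowed to crawl this page")
else:
    print("robots.txt disallows this page; skipping")
</syntaxhighlight>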

History of Bots on Wikipedia

Policy

Due to the potential for misuse, bots must adhere to Wikipedia's bot policy. A bot must first be approved for use: its operator submits a request to the Bot Approvals Group and must demonstrate that the bot is both harmless and useful to the site.[4] A certain degree of transparency is also required of the bot. Such rules reflect a trade-off between the strictness of a site's policies and the freedom of its users.
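As one concrete illustration of the transparency expected of a bot, a script talking to the site is expected to identify itself and its operator. The read-only sketch below queries the public MediaWiki API; the bot name and contact details in the User-Agent are hypothetical, and actually editing pages would additionally require an approved, flagged bot account.

<syntaxhighlight lang="python">
# Read-only sketch: a bot identifying itself to the MediaWiki API.
# The User-Agent string and contact address are hypothetical.
import requests

API_URL = "https://en.wikipedia.org/w/api.php"
HEADERS = {"User-Agent": "ExampleBot/0.1 (https://example.org/bot; bot-owner@example.org)"}

params = {
    "action": "query",
    "list": "recentchanges",   # list the most recent edits
    "rcprop": "title|user",
    "rclimit": 5,
    "format": "json",
}
resp = requests.get(API_URL, params=params, headers=HEADERS, timeout=30)
for change in resp.json()["query"]["recentchanges"]:
    print(change["user"], "edited", change["title"])
</syntaxhighlight>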

Benefits to Info-sphere

Bots can improve the quality of Wikipedia articles by performing tasks like spell checking. They also help with two persistent problems the site faces: vandalism and copyright infringement. Wikipedia's policy of open editing allows for a range of disruptive actions, including inserting false or offensive content into pages and deleting existing high-quality content. ClueBot NG[5] is a highly active bot responsible for detecting vandalism; it uses machine learning to classify edits as vandalism. Such classification can produce false positives, and any machine learning algorithm is susceptible to bias, since it must be trained on human-labeled data that may itself be biased. Another issue bots can help address is that articles sometimes include text copied directly from copyrighted sources.
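The toy sketch below illustrates the general supervised-learning approach of training a classifier on labeled edits. It is not ClueBot NG's actual implementation, and the tiny training set is invented for demonstration; it does show why false positives arise, since the model only generalizes from the human-labeled examples it was given.

<syntaxhighlight lang="python">
# Toy sketch of edit classification with scikit-learn; not ClueBot NG's
# real pipeline. Training examples and labels here are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled edit texts: 1 = vandalism, 0 = constructive.
edits = [
    "asdfasdf this page is dumb lol",
    "YOUR ALL IDIOTS hahaha",
    "Added citation for the 2014 study on bot activity",
    "Fixed spelling of 'encyclopedia' in the lead section",
]
labels = [1, 1, 0, 0]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(edits, labels)

# Any such classifier can produce false positives, so borderline cases
# should be routed to human review rather than reverted automatically.
print(model.predict(["lol this article is dumb"]))        # likely flagged
print(model.predict(["Corrected the citation formatting"]))  # likely kept
</syntaxhighlight>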

Danger to Info-sphere

  1. https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
  2. https://en.wikipedia.org/wiki/List_of_Wikipedias
  3. https://dl.acm.org/citation.cfm?id=2641613
  4. https://en.wikipedia.org/wiki/Wikipedia:Bot_policy
  5. https://en.wikipedia.org/wiki/User:ClueBot_NG