Wikipedia Bots

From SI410
Revision as of 18:05, 15 March 2019 by Clyn (Talk | contribs)


Wikipedia is a user-edited online encyclopedia. As of March 2019, Wikipedia contains over five million English articles [1], as well as many more in 302 other languages [2]. Wikipedia depends on volunteers to grow and maintain the site. Maintenance of the site is facilitated by internet bots, which have their own user accounts. In 2014, there were 274 active bots, and they accounted for 15% of all edits on Wikipedia. Because a bot is interactive, autonomous, and adaptable, bots can be classified as artificial agents, and further classified as moral agents if they make morally quantifiable decisions.

Bot Policy

Due to the potential for misuse, bots must adhere to Wikipedia's bot policy. Bots must first be approved for use, which requires the owner to demonstrate both the bot's harmlessness and its usefulness to the site. [3] A degree of transparency is required of the bot, including disclosure of what tasks the bot completes and the capacity in which it functions. A bot operator must submit a request to the Bot Approvals Group. There is a trade-off between the strictness of a site's policies and user freedom. Wikipedia's policy can be contrasted with Twitter's bot policy, which is more encouraging of bots and does not require approval; this open policy has allowed a number of malware and spam bots to operate. [4]

Benefits to the Info-sphere

Bots can improve the quality of Wikipedia articles by performing tasks such as spell checking and linking to other articles. They can also help with two issues the website faces: vandalism and copyright infringement. Wikipedia's policy of open editing allows for a range of disruptive actions, including editing pages to insert false information, adding offensive content, and deleting existing high-quality content. ClueBot [5] is a highly active bot responsible for detecting vandalism, having made over 2 million edits [6]. ClueBot uses machine learning to classify edits as vandalism. This can produce false positives, and any machine learning algorithm is susceptible to bias, as the algorithm must be trained on human data which may itself include bias. Another issue that bots help assuage is that articles can include text copied directly from copyrighted sources. Bots can compare edits with copyrighted material and flag duplicates. Since bots can perform these checking tasks far more feasibly than humans, Wikipedia is able to balance a policy of openness, which enables its large scale, against the associated issues of user freedom.
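The duplicate-flagging described above can be sketched as a word n-gram overlap check. This is a hypothetical illustration, not the actual algorithm of ClueBot or any Wikipedia copyright bot; the function names, the 5-word shingle size, and the flagging threshold are all assumptions made for this sketch.

```python
def ngrams(text, n=5):
    # Lowercase word n-grams; 5-word shingles are a common unit in
    # text-reuse detection.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(edit_text, source_text, n=5):
    # Fraction of the edit's n-grams that also appear in the source text.
    edit_grams = ngrams(edit_text, n)
    if not edit_grams:
        return 0.0
    return len(edit_grams & ngrams(source_text, n)) / len(edit_grams)

def flag_for_review(edit_text, source_text, threshold=0.5):
    # Flag an edit when more than half its n-grams match a known
    # copyrighted source; a human would then review the flag.
    return overlap_ratio(edit_text, source_text) >= threshold
```

A bot using a check like this would still hand flagged edits to a human reviewer, since quotations and common phrases can produce false positives just as vandalism classifiers do.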

Dangers to the Info-sphere

Bots pose an array of dangers and ethical dilemmas to Wikipedia. First, an intentionally malicious bot could vandalize articles at a much faster rate than any human. Even a well-intentioned bot may produce false positives and make invalid reverts to articles. In 2010, 13% of Wikipedia editors were female. [7] Having an edit reverted can discourage future editing, which can exacerbate the gender imbalance, as women may experience the effects of criticism more strongly.

Another potential issue is the opaqueness of policy enforcement. ClueBot's vandalism detection uses a neural network, which compared with other algorithms is more of a black box: it is difficult to discern how the algorithm utilizes its features [8], which would be problematic if the algorithm contained hidden bias. One example of such bias would be a profiling algorithm weighted more heavily against anonymous users. [9] Additionally, enforcing an anti-vandalism policy requires moral judgment. For example, for a bot to remove profane language, it must be determined what language counts as profane; an anti-profanity policy may then restrict free speech.

Another category of bots, those which create new articles or add content to preexisting articles, raises the issue of truth and trust online. Many users trust Wikipedia as a reliable source of information. An early bot, Rambot, used public databases to create articles on U.S. cities; due to errors in the data, 2,000 articles were corrupted. [10] Misinformation violates the trust that users place in the website.
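The profiling concern above can be made concrete with a toy scoring sketch. This is a hypothetical linear model, not ClueBot's actual neural network; every feature, weight, and threshold here is invented purely to show how a hidden anonymity penalty makes the identical edit more likely to be reverted when made while logged out.

```python
REVERT_THRESHOLD = 0.4  # assumed cutoff: scores at or above this are reverted

def vandalism_score(edit, anon_penalty=0.3):
    # Toy linear score; higher means more vandalism-like.
    # All weights are invented for illustration only.
    score = 0.5 * edit["profanity_count"]
    score += 0.2 * (edit["chars_deleted"] / 100)
    if edit["anonymous"]:
        # A hidden weight like this profiles anonymous editors:
        # the same content scores higher solely because of who made it.
        score += anon_penalty
    return score

same_edit = {"profanity_count": 0, "chars_deleted": 120}
logged_in = vandalism_score({**same_edit, "anonymous": False})  # 0.24 -> kept
anonymous = vandalism_score({**same_edit, "anonymous": True})   # 0.54 -> reverted
```

In a real system such a weight would be buried among thousands of learned neural-network parameters rather than written explicitly, which is exactly why the black-box character of the algorithm matters.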
  1. https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
  2. https://en.wikipedia.org/wiki/List_of_Wikipedias
  3. https://en.wikipedia.org/wiki/Wikipedia:Bot_policy
  4. https://arxiv.org/pdf/1801.06863.pdf
  5. https://en.wikipedia.org/wiki/User:ClueBot_NG
  6. http://files.grouplens.org/papers/geiger13levee-preprint.pdf
  7. https://web.archive.org/web/20100414165445/http://wikipediasurvey.org/docs/Wikipedia_Overview_15March2010-FINAL.pdf
  8. https://www.researchgate.net/publication/5595919_Are_Artifi_cial_Neural_Networks_Black_Boxes
  9. https://link.springer.com/article/10.1007/s10676-015-9366-9
  10. https://www.researchgate.net/publication/249689493_Wisdom_of_the_Crowd_or_Technicity_of_Content_Wikipedia_as_a_Sociotechnical_System