Algorithms

An algorithm is a set of precise steps and distinct states used to express the detailed structure of a program or the order of events occurring in a system.[1] Algorithms are involved in many aspects of daily life as well as complex computer science concepts. They often use repetition of operations, which allows people and machines to complete tasks more efficiently. They can help turn systematic, and rather tedious, tasks into faster, more automated processes.

Processes like cooking a meal or reading a manual to assemble a new piece of furniture are examples of algorithm use in everyday life.[2] Algorithms are grounded in logic. The increase in their logical complexity, through advancements in technology and human effort, has provided the foundations for technological concepts such as artificial intelligence and machine learning.[3] The influence of algorithms is pervasive, and in computer science it has led to an increase in ethical concerns such as bias, privacy, and accountability.


History

The earliest known algorithms stem back to 1700 BC, when the Egyptians created them for quick multiplication and division.[4] From there, ancient Babylonians in 300 BC created algorithms to track farming and livestock using square roots. On a basic level, an algorithm is simply a system working through successive iterations of a process.[5] Following this, steady advancement gave birth to fundamental mathematical algorithm families such as algebra, shaping the field of mathematics with its all-purpose formulas. Muhammad ibn Mūsā al-Khwārizmī, often credited as "The Father of Algebra," also gave English the word "algorithm": around 850 AD he wrote the book known in English as Al-Khwarizmi on the Hindu Art of Reckoning, whose Latin translation is titled Algoritmi de Numero Indorum, and the English word "algorithm" was adopted from this title.

Many educational programs teach mathematical procedures, such as rudimentary multiplication and division methods,[6] that serve as connections between explicit algorithms and mathematics.

A myriad of other fundamental algorithms have been developed throughout history, ranging from pure mathematics to important computer science stalwarts, extending from ancient times up through the modern day. The computer revolution led to algorithms that can filter and personalize results based on the user.

The Algorithm Timeline outlines the advancements in the development of the algorithm as well as a number of the well known algorithms developed from the Medieval Period until modern day.

Computation

Another cornerstone for algorithms as they are known today relates to Alan Turing and his contributions to cognitive and computer science. Turing conceptualized cognition and designed ways to emulate human thought with machines. This work translated the human thought process into mathematical algorithms and led to the development of Turing Machines, theoretical devices that carry out such algorithms to perform particular functions. These in turn led to the development of the first computers and, subsequently, the entire field of computer science.[7] As their name suggests, computers utilize specific rules, or algorithms, to compute, and it is these machines (or sometimes people)[8] that are most often associated with the concept of algorithms as it is used today. With the advent of mechanical computers, the computer science field paved the way for algorithms to run the world as they do now, calculating and controlling an immense number of facets of daily life.

Advancements In Algorithms

In the years following Alan Turing’s contributions, computer algorithms have increased in magnitude and complexity. Advanced algorithms, such as artificial intelligence systems that utilize machine learning, have been developed.[9] This level of algorithmic improvement has provided the foundation for even more technological advancement, which has allowed society to evolve greatly.

[Figure: The machine learning process, illustrating how machine learning algorithms can provide more features and functionality to artificial intelligence.]

Classifications

There are many different classifications of algorithms, some of which tend to be better suited to particular families of computational problems than others. In many cases, the algorithm one chooses for a given problem involves tradeoffs between time complexity and memory usage.

Recursion

A recursive algorithm is one in which a function calls itself, forming a recursive relation. It is an approach in which one solves an issue by breaking a problem into smaller and smaller sub-problems until a sub-problem becomes so small that the answer is obvious; this is the base case. Using the solution to the base case, one can then find the solution of each sub-problem that invoked the call, working back outward. An example of such a recursive relationship is the factorial of a number n, which is n times the factorial of n - 1. For example, fact(3) = 3 * fact(2) and fact(2) = 2 * fact(1), where fact(1) is the base case since the factorial of 1 is obviously 1. Hence fact(2) = 2 * 1 = 2, so fact(3) = 3 * 2 = 6. A recursive function keeps calling itself until a base case is reached, at which point it stops and the results propagate back out through the recursive stack. A base case must be present, otherwise the recursive function will never stop calling itself. Because recursion incurs many function calls, it is one of the main sources of stack overflow, which occurs when the program must save more stack frames than there is space available before it ever reaches the base case. As a result, some recursive algorithms are written to be tail recursive, in which the recursive call happens at the end of the function, allowing only one stack frame to be kept. For more information on recursion read Recursion.
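
The following is a minimal sketch of the factorial example above, written in Python (the article does not specify a language, and the function name factorial is simply illustrative):

  def factorial(n):
      """Recursively compute n! for a non-negative integer n."""
      if n <= 1:                       # base case: the factorial of 0 or 1 is 1
          return 1
      return n * factorial(n - 1)      # recursive case: n! = n * (n - 1)!

  print(factorial(3))  # 6, since fact(3) = 3 * fact(2) = 3 * 2 * fact(1)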

Serial, Parallel or Distributed

A serial algorithm is one in which the calculation is done sequentially on a single core, following a defined order in solving a problem. Parallel algorithms utilize the fact that modern computers have more than one core, so that computations that are not interdependent can be carried out on separate cores. This is also referred to as a multi-threaded algorithm, where each thread is a series of executing commands and the threads are interleaved to ensure correct output without deadlock. A deadlock occurs when there is an interdependence between two or more threads such that none of the threads can continue until another one does. Parallel algorithms are important in that they allow more than one computation to proceed at once by leveraging a computer's available resources, which would not otherwise be possible with a serial algorithm. Finally, a distributed algorithm is similar to a parallel algorithm in that it allows multiple computations to run at once, except that instead of leveraging multiple cores in a single computer it leverages multiple computers that communicate over a network. Just as a parallel algorithm builds on a serial one with the added complexity of synchronizing threads to prevent deadlock and ensure correct outputs, a distributed algorithm builds on a parallel one with the added complexity of managing communication latency and defining ordering, since it is impossible to synchronize every computer's clock in a distributed system without significant compromises. A distributed algorithm also provides extra reliability, in that data is stored in more than one location so that a single failure does not result in loss of data, and by doing computation on multiple computers it can potentially achieve even faster computational speed.
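
As a rough illustration, the following Python sketch contrasts a serial loop with a parallel version that uses the standard library's ProcessPoolExecutor; the square function is a hypothetical stand-in for any computation whose calls are independent of one another:

  from concurrent.futures import ProcessPoolExecutor

  def square(x):
      # an independent computation with no shared state,
      # so the result cannot depend on scheduling order
      return x * x

  if __name__ == "__main__":
      data = range(8)

      # serial: one core, one defined order
      serial_results = [square(x) for x in data]

      # parallel: independent calls spread across worker processes
      with ProcessPoolExecutor() as pool:
          parallel_results = list(pool.map(square, data))

      assert serial_results == parallel_results

A distributed version would follow the same pattern, except that the workers would run on separate machines and communicate over a network rather than sharing one computer's cores.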

Deterministic vs Non-Deterministic

Deterministic algorithms are those that solve problems with exact precision and ordering, so a given input produces the same output every time. Non-deterministic algorithms either have data races or utilize some form of randomization, so the same input can have a myriad of outputs. Non-deterministic algorithms can be represented by flowcharts, programming languages, and machine language programs. However, they differ in that they are capable of using a multiple-valued choice function whose values are the positive integers less than or equal to its argument. In addition, all points of termination are labeled as successes or failures. The terminology "non-deterministic" does not imply randomness, but rather a kind of free will.[10]
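
A small Python sketch of the distinction (the function names are illustrative, and randomization is used here only as one possible source of non-determinism):

  import random

  def deterministic_max(values):
      # the same input always yields the same output
      return max(values)

  def randomized_pick(values):
      # the same input may yield a different output on each run
      return random.choice(values)

  data = [3, 1, 4, 1, 5]
  print(deterministic_max(data))  # always 5
  print(randomized_pick(data))    # any element of data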

Exact vs Approximation

Some algorithms are implemented to solve for the exact solution of a problem, whereas others compute an approximation or rely on a heuristic. Approximation matters when a heuristic can provide an answer that is good enough that the excessive computational time necessary to find the exact solution is not warranted, since one would get minimal gains while expending a great deal of resources. An example where an approximation is warranted is the traveling salesman problem, whose brute-force solution has computational complexity O(n!), so a heuristic is necessary because for even moderately large values of n an exact computation is not feasible.
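
The sketch below shows one common heuristic for the traveling salesman problem, the nearest-neighbor rule, written in Python; it is offered only as an example of trading exactness for speed, not as the best-known approximation:

  import math

  def tour_length(points, order):
      """Total length of a closed tour visiting the points in the given order."""
      return sum(
          math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
          for i in range(len(order))
      )

  def nearest_neighbor_tour(points):
      """Greedy heuristic: always visit the closest unvisited city next.

      Runs in O(n^2) time instead of the O(n!) needed to try every tour,
      at the cost of possibly returning a sub-optimal route.
      """
      unvisited = set(range(1, len(points)))
      tour = [0]
      while unvisited:
          last = tour[-1]
          nearest = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
          tour.append(nearest)
          unvisited.remove(nearest)
      return tour

  cities = [(0, 0), (2, 1), (1, 5), (4, 3)]
  order = nearest_neighbor_tour(cities)
  print(order, tour_length(cities, order))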

Design Paradigms

Design paradigm is another useful way to distinguish algorithms, since different paradigms are suited to solving different kinds of problems.

Brute Force

A brute force algorithm is the most "naive" approach one can take in attempting to solve a particular problem. A solution is reached by searching through every single possible outcome before arriving at an answer. In terms of complexity or Big-O notation, brute force algorithms typically represent the highest order complexity compared to other potential solutions for a given problem. While brute force algorithms may not be considered the most efficient option for solving computational problems, they do offer reliability as well as a guarantee that a solution to a given problem will eventually be found.
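
A minimal Python sketch of the brute-force approach, here checking every possible pair of numbers for one that sums to a target (the problem itself is just an illustrative choice):

  def pair_with_sum(values, target):
      """Try every possible pair until one sums to the target: O(n^2) work."""
      for i in range(len(values)):
          for j in range(i + 1, len(values)):
              if values[i] + values[j] == target:
                  return values[i], values[j]
      return None  # every outcome was searched and no solution exists

  print(pair_with_sum([8, 3, 11, 7, 2], 10))  # (8, 2)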

Divide and Conquer

A divide and conquer algorithm divides a problem into smaller sub-problems, conquers each sub-problem, and then merges the results to solve the original problem. An example of divide and conquer is merge sort, whereby a list is split into smaller lists that are sorted and then merged together to sort the original list.
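
A compact Python sketch of merge sort, included here only to make the divide-and-conquer structure concrete:

  def merge_sort(items):
      """Sort a list by splitting it in half, sorting each half, and merging."""
      if len(items) <= 1:               # a list of 0 or 1 items is already sorted
          return items
      mid = len(items) // 2
      left = merge_sort(items[:mid])    # divide and conquer each half
      right = merge_sort(items[mid:])
      return merge(left, right)         # combine the solved sub-problems

  def merge(left, right):
      """Merge two sorted lists into one sorted list."""
      merged, i, j = [], 0, 0
      while i < len(left) and j < len(right):
          if left[i] <= right[j]:
              merged.append(left[i])
              i += 1
          else:
              merged.append(right[j])
              j += 1
      merged.extend(left[i:])
      merged.extend(right[j:])
      return merged

  print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]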

Dynamic Programming

Dynamic programming takes advantage of overlapping sub-problems to solve a larger computational problem more efficiently, typically by storing the solution to each sub-problem so that it is computed only once.
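
A standard illustration is computing Fibonacci numbers: the naive recursion recomputes the same sub-problems exponentially many times, while caching each result reduces the work to roughly O(n). The Python sketch below uses the standard library's lru_cache for the memoization:

  from functools import lru_cache

  @lru_cache(maxsize=None)
  def fib(n):
      """n-th Fibonacci number; the cache stores each overlapping sub-problem
      so it is solved only once."""
      if n <= 1:
          return n
      return fib(n - 1) + fib(n - 2)

  print(fib(50))  # 12586269025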

Backtracking

A backtracking algorithm is similar to brute force, with the exception that as soon as it reaches a state from which a solution can never be reached, it prunes all subsequent states and backtracks to the closest state that still has the possibility of leading to a solution.
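
A small Python sketch of backtracking on the subset-sum problem (an illustrative choice): each branch that can no longer reach the target is pruned, and the search backtracks to the most recent viable state. The sketch assumes non-negative values so that overshooting the target is a safe pruning condition:

  def subset_with_sum(values, target, chosen=(), index=0):
      """Search for a subset of values summing to target, pruning dead branches."""
      if target == 0:
          return list(chosen)               # a solution has been found
      if index == len(values) or target < 0:
          return None                       # dead end: prune and backtrack
      # try including the current value, then try excluding it
      return (subset_with_sum(values, target - values[index],
                              chosen + (values[index],), index + 1)
              or subset_with_sum(values, target, chosen, index + 1))

  print(subset_with_sum([3, 9, 8, 4, 5, 7], 15))  # [3, 8, 4]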


Complexity and Big-O Notation

[Figure: Complexity graph.]

Since each computer has a different computational speed, it is illogical to measure the efficiency of an algorithm based on how fast it runs on a particular computer. Instead, using the fact that each operation must take some time to compute, the efficiency of an algorithm can be measured by how well it scales with more inputs; in other words, how much the computational time deteriorates as the input grows. Since this form of measurement only intends to capture how an algorithm grows, constants are left out, as they become negligible for large inputs anyway. Big-O notation specifically describes the worst-case scenario and measures the time or space an algorithm uses.[11] Big-O notation can be broken down into orders of growth such as O(1), O(N), and O(log N), with each notation representing a different order of growth. The last of these, logarithmic algorithms, are a bit more complex than the rest: a logarithmic algorithm such as binary search takes the median of a data set and compares it to a target value, then continues to halve the data depending on whether the median is higher or lower than the target.[11]
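
A concrete example of an O(log N) algorithm is binary search, sketched below in Python; each comparison lets the algorithm discard the half of the remaining data that cannot contain the target:

  def binary_search(sorted_values, target):
      """Return the index of target in a sorted list, or -1 if absent: O(log N)."""
      low, high = 0, len(sorted_values) - 1
      while low <= high:
          mid = (low + high) // 2
          if sorted_values[mid] == target:
              return mid
          if sorted_values[mid] < target:
              low = mid + 1      # the target can only be in the upper half
          else:
              high = mid - 1     # the target can only be in the lower half
      return -1

  print(binary_search([1, 3, 5, 7, 9, 11], 9))  # 4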

Ethical Dilemmas

With the relevance of algorithms as well as their sheer magnitude, ethical dilemmas were bound to arise. There is a vast list of potential ethical issues relating to algorithms and computer science, including issues of privacy, data gathering, and bias.

Bias

Algorithms are designed by humans who are biased by the society they live in. Therefore, humans inherently code their own biases into algorithms.

Joy Buolamwini, a graduate computer science student at MIT, experienced an example of this. The facial recognition software she was working on was unable to detect her face, because her skin tone had not been accounted for in the facial recognition algorithm. The software had been trained with machine learning on a dataset that was not diverse enough, and as a result, the algorithm did not recognize her face as it was supposed to.[12] Safiya Noble discusses instances of algorithmic search engines reinforcing racism in her book, "Algorithms of Oppression".[13] Bias like this occurs in countless algorithms, whether through insufficient machine learning datasets or developers' own oversights, among other reasons, and it has the potential to cause legitimate problems even outside the realm of ethics.

COMPAS is an algorithm written to determine whether a criminal is likely to re-offend using information such as age, gender, and previously committed crimes. Tests have found it to be more likely to incorrectly evaluate black people than white people because it learned from historical criminal data, which has been influenced by biased policing practices.[14]

Jerry Kaplan is a research affiliate at Stanford University’s Center on Democracy, Development and the Rule of Law at the Freeman Spogli Institute for International Studies, where he teaches “Social and Economic Impact of Artificial Intelligence.” According to Kaplan, algorithmic bias can even influence whether or not a person is sent to jail. A 2016 study conducted by ProPublica indicated that software designed to predict the likelihood an arrestee will re-offend incorrectly flagged black defendants twice as frequently as white defendants in a decision-support system widely used by judges. Ideally, predictive systems should be wholly impartial and therefore agnostic to skin color. However, surprisingly, the program cannot give black and white defendants who are otherwise identical the same risk score and simultaneously match the actual recidivism rates for these two groups. This is because black defendants are re-arrested at higher rates than their white counterparts (52% versus 39%), at least in part due to racial profiling, inequities in enforcement, and harsher treatment of black people within the justice system.[15]

Privacy And Data Gathering

The ethical issue of privacy is also highly relevant to the concept of algorithms. Information transparency[16] is an important point regarding these issues. In popular social media algorithms, user information is often probed without the knowledge of the individual, and this can lead to problems. It is often not transparent how these algorithms receive user data, which can result in incorrect information that affects both how a person is treated within social media and how outside agents view these individuals given false data. Algorithms can also infringe on a user's sense of privacy, as data can be collected that a person would prefer to keep private. Data brokers are in the business of collecting people's information and selling it to anyone for a profit, and like data brokers, many companies maintain their own collections of user data. In 2013, Yahoo was hacked, leading to the leak of data pertaining to approximately three billion users.[17] The leaked information contained usernames, passwords, and dates of birth. Privacy and data gathering are common ethical dilemmas relating to algorithms, and they are often not considered thoroughly enough by those who build and use them.

The Filter Bubble

[Figure: Personalized, online filter bubbles.]

Algorithms can be used to filter results in order to prioritize items that the user might be interested in. On some platforms, like Amazon, people find this filtering useful because of the shopping recommendations the algorithm can provide. In other scenarios, however, this algorithmic filtering can become a problem. For example, Facebook has an algorithm that re-orders the user's news feed. For a period of time, the company prioritized sponsored posts in its algorithm. This often prioritized news articles, but there was no guarantee that these articles came from a reliable source, only that they were sponsored. Facebook also uses its technology to gather information about its users, such as which political party they belong to. Combined with prioritizing news, this can create a Facebook feed filled with only one party's perspective. This phenomenon is called the filter bubble, which essentially creates a platform centered completely around its user's interests.

Many, like Eli Pariser, have questioned the ethical implications of the filter bubble. Pariser believes that filter bubbles are a problem because they prevent users from seeing perspectives that might challenge their own. Even worse, Pariser emphasizes that this filter bubble is invisible, meaning that the people in it do not realize that they are in it. [18] This creates a huge lack of awareness in the world, allowing people to stand by often uninformed opinions and creating separation, instead of collaboration, with users who have different beliefs.

Because of the issues Pariser outlined, Facebook decided to change their algorithm in order to prioritize posts from friends and family, in hopes of eliminating the effects of the potential filter bubble.

Agency And Accountability

Algorithms generally make "decisions" based on the steps they were designed to follow and the information they have. This can effectively turn algorithms into autonomous agents,[19] taking decision-making responsibilities out of the hands of real people. This can be useful from an efficiency standpoint, as these autonomous agents are capable of making decisions and taking actions at a far greater frequency and tempo than humans possibly could, and it is this efficiency that motivates algorithm use in the first place. From an ethical standpoint, however, this type of agency raises many complications, specifically regarding accountability. It is no secret that many aspects of life are run by algorithms. Even events like applying to jobs are often drastically affected by these processes. Information like age, race, and status, along with other qualifications, is fed to algorithms, which then take agency and decide who moves further along in the hiring process and who is left behind. Disregarding inherent biases in this specific scenario, this process still serves to reduce the input of real humans and decrease the number of decisions they have to make, and what is left over is the fact that autonomous agents are making systematic decisions that have an extraordinary impact on people's lives.

While the results of the previous example may only culminate in the occasional disregard of a qualified applicant or resentful feelings, the same principle can be much more consequential. Consider autonomous vehicles, or self-driving cars, for instance. These are highly advanced algorithms that are programmed to make split-second decisions with the greatest possible accuracy. In the case of the well-known "Trolley Problem",[20] these agents are forced to make a decision jeopardizing one party or another. This decision can easily conclude in the injury or even death of individuals, all at the discretion of a mere program.

In situations like this, where one must be held responsible in the eyes of the law, society, or an ethical viewpoint, the issue of accountability is raised. It is rarely feasible to simply attribute culpability to the program itself, as it cannot reasonably be held accountable in the same vein that a human might. Some may believe the responsibility falls to the developer of the program making the decisions,[21] while others may place the blame on the users or humans involved in the situation (e.g. the people who caused the autonomous vehicle to be placed in this situation, such as jaywalkers or similar offenders). Regardless, the ethical implications of such a circumstance are perplexing and wide-reaching, with scarcely an easy solution to be found.

These complications will continue to arise, especially as algorithms continue to make autonomous decisions at grander scales and rates. Determining responsibility for the decisions these agents make will continue to be a vexing process, and will no doubt shape in some form many of the advanced algorithms that will be developed in the coming years.

Intentions And Consequences

The ethical issues that are common in algorithm implementations can be both deliberate and unintentional. The bias in the facial recognition software that Joy Buolamwini used was an unintentional mistake that arose from a lack of machine learning data. Issues of this sort could lead to problems as facial recognition algorithms become more commonplace. It is important to consider possible faults like this when developing new algorithms, especially as algorithms become increasingly important in shaping human lives. Conversely, Yahoo’s algorithm intentionally procured and stored the data of its users, and while this is not necessarily malicious in its own right, it became an issue when that information was leaked to the public. Data gathering and privacy within algorithms are important ethical concepts to be considered by developers so that other ethical fallouts are avoided in the future.

Algorithms are present throughout human society, and they control the way the world is run. There are many ethical concerns to be considered, and with more advanced algorithms in development every day, it becomes increasingly necessary for these concerns to be monitored and handled appropriately.

References

  1. Cormen, Thomas H. et al. Introduction to Algorithms. MIT Press, 2009.
  2. “What Are Algorithms?” The Economist, The Economist Newspaper, 29 Aug. 2017, www.economist.com/the-economist-explains/2017/08/29/what-are-algorithms.
  3. McClelland, Calum. “The Difference Between Artificial Intelligence, Machine Learning, and Deep Learning.” Medium, IoT For All, 4 Dec. 2017, medium.com/iotforall/the-difference-between-artificial-intelligence-machine-learning-and-deep-learning-3aa67bff5991.
  4. Stepanov, A. and Rose, D. From Mathematics to Generic Programming: The First Algorithm. Chapter 2. 2014. http://www.informit.com/articles/article.aspx?p=2264460
  5. “A Brief History of Algorithms (and Why It's so Important in Automation, Machine Learning, and Everyday Life).” e27, e27.co/brief-history-algorithms-important-automation-machine-learning-everyday-life-20161207/.
  6. Algorithms for Multiplying and Dividing Whole Numbers. www.ms.uky.edu/~rwalker/ma201/3.4.pdf.
  7. "Alan Turing: Algorithms, Computation, Machines.” Brooklyn Institute for Social Research, thebrooklyninstitute.com/items/courses/alan-turing-algorithms-computation-machines/.
  8. “Human Computers.” NASA, NASA, crgis.ndc.nasa.gov/historic/Human_Computers.
  9. “The History of Artificial Intelligence.” Science in the News, 28 Aug. 2017, sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/.
  10. " FLOYD, ROBERT W. “Non-Deterministic Algorithms.” Carnegie Institute of Technology , Nov. 1996, p. 1–17."
  11. 11.0 11.1 Bell, Rob. “A Beginner's Guide to Big O Notation.” A Beginner's Guide to Big O Notation - Rob Bell, rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/.
  12. Buolamwini, Joy. “How I'm Fighting Bias in Algorithms – MIT Media Lab.” MIT Media Lab, www.media.mit.edu/posts/how-i-m-fighting-bias-in-algorithms/.
  13. Noble, Safiya. "Algorithms of Oppression." https://en.wikipedia.org/wiki/Algorithms_of_Oppression
  14. Larson, Mattu, Kirchner, and Angwin. ProPublica, 23 May 2016.
  15. Kaplan, J. (2018, December 17). Why your AI might be racist. Retrieved from https://www.washingtonpost.com/opinions/2018/12/17/why-your-ai-might-be-racist/?utm_term=.3017d6f2e958
  16. Turilli, Matteo, and Luciano Floridi. “The Ethics of Information Transparency.” Ethics and Information Technology, vol. 11, no. 2, 2009, pp. 105–112., doi:10.1007/s10676-009-9187-9.
  17. Griffin, Andrew. “Yahoo Admits It Accidentally Leaked the Personal Details of Half the People on Earth.” The Independent, Independent Digital News and Media, 4 Oct. 2017, www.independent.co.uk/life-style/gadgets-and-tech/news/yahoo-hack-details-personal-information-was-i-compromised-affected-leak-a7981671.html.
  18. Eli Pariser. “The Filter Bubble.” Penguin Books, 2012.
  19. “Autonomous Agent.” Autonomous Agent - an Overview | ScienceDirect Topics, www.sciencedirect.com/topics/computer-science/autonomous-agent.
  20. Roff, Heather M. “The Folly of Trolleys: Ethical Challenges and Autonomous Vehicles.” Brookings, Brookings, 17 Dec. 2018, www.brookings.edu/research/the-folly-of-trolleys-ethical-challenges-and-autonomous-vehicles/.
  21. Martin, Kirsten. “Ethical Implications and Accountability of Algorithms.” SpringerLink, Springer Netherlands, 7 June 2018, link.springer.com/article/10.1007/s10551-018-3921-3.