Algorithms


An algorithm is a set of precise steps and distinct states used to express the detailed structure of a program or the order of events that occurred in a system [1]. Algorithms are involved in many aspects of daily life as well as in complex computer science concepts. They often use repetition of operations to let people and machines execute tasks more efficiently, completing them faster and using fewer resources such as memory. On a basic level, an algorithm is a system working through different iterations of a process[2]. Algorithms can turn systematic and tedious tasks into fast, automated processes. Large companies particularly value robust algorithms because their infrastructure depends on efficiency to remain profitable at massive scale[3].

Processes like cooking a meal or reading a manual to assemble a new piece of furniture are examples of algorithms in everyday life[4]. Algorithms are grounded in logic. The increase in their logical complexity, driven by advances in technology and human effort, has provided the foundation for technological concepts such as artificial intelligence and machine learning[5]. Because the influence of algorithms is pervasive in computer science and beyond, ethical concerns arise in the areas of bias, privacy, and accountability.


History

The earliest known algorithms stem back to around 1700 BC, when the Egyptians created them for quick multiplication and division[6]. Later, around 300 BC, the ancient Babylonians created algorithms to track farming and livestock using square roots. Steady advancement since then gave birth to fundamental mathematical algorithm families such as algebra, shaping the field of mathematics with its all-purpose formulas. Muhammad ibn Mūsa al-Khwarizmī, often credited as "The Father of Algebra," also gave English the word "algorithm" around 850 AD: his book on the Hindu art of reckoning was rendered in Latin as Algoritmi de Numero Indorum, and the English word "algorithm" was adopted from this title.

A myriad of fundamental algorithms have been developed throughout history, ranging from pure mathematics to cornerstones of computer science, extending from ancient times to the modern day. The computer revolution led to algorithms that can filter and personalize results based on the user.

The Algorithm Timeline outlines the advancements in the development of the algorithm as well as a number of the well-known algorithms developed from the Medieval Period until modern day.

Computation

Another cornerstone for algorithms comes from Alan Turing and his contributions to cognitive and computer science. Turing formalized the concept of cognition and designed ways to emulate human thought with machines. This work turned the human thought process into mathematical algorithms and led to the development of Turing machines, theoretical devices that capitalize on such algorithms to perform unique functions and that paved the way for the development of computers. As their name suggests, computers utilize specific rules or algorithms to compute, and it is these machines (or sometimes people)[7] that most often relate to the concept of algorithms used today. With the advent of mechanical computers, the computer science field paved the way for algorithms to run the world as they do now, calculating and controlling an immense number of facets of daily life. To this day, Turing machines are a main area of study in the theory of computation.

Advancements In Algorithms

In the years following Alan Turing's contributions, computer algorithms increased in magnitude and complexity. Advanced algorithms, such as those underlying artificial intelligence, utilize machine learning capabilities.[8] This level of algorithmic improvement provided the foundation for further technological advancement.

The machine learning process describes how machine learning algorithms can provide more features and functionality to artificial intelligence.

Classifications

There are many different classifications of algorithms; some are better suited for particular families of computational problems than others. In many cases, the algorithm one chooses for a given problem involves tradeoffs between time complexity and memory usage.

Recursive Algorithms

A recursive algorithm is an algorithm that calls itself with decreasing values in order to reach a pre-defined base case. The base case determines the values that are sent back up the recursive stack to determine the final outcome of the algorithm. It follows the principle of solving subproblems to solve the larger problem: once the base case is reached, the algorithm works upwards, fitting each solution into the larger subproblem. The base case must be present, otherwise the recursive function never stops calling itself, creating an infinite loop. Since recursion involves numerous function calls, it is one of the main sources of stack overflow: with each recursive call, the program must save another stack frame even when little space is available. Further, some recursive functions require additional computation after the recursive call returns, adding to execution time and memory consumption. 'Tail recursive' functions are an efficient solution to this, wherein the recursive call happens at the very end of the function, allowing a single stack frame to be reused across the calls.

Due to the recurring stack frames that are created with each call, recursive algorithms generally require more memory and computation power. However, they are still viewed as simple and succinct ways to write elaborate algorithms.
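As a minimal sketch in Python (the function names here are purely illustrative), the two versions of a factorial computation below contrast plain recursion with a tail-recursive formulation:

    def factorial(n):
        # Plain recursion: work (the multiplication) remains after the
        # recursive call returns, so n stack frames accumulate.
        if n <= 1:          # base case stops the chain of calls
            return 1
        return n * factorial(n - 1)

    def factorial_tail(n, acc=1):
        # Tail-recursive form: the recursive call is the final operation,
        # so a language with tail-call optimization can reuse one frame.
        # (CPython does not perform this optimization, so this is conceptual.)
        if n <= 1:
            return acc
        return factorial_tail(n - 1, acc * n)

    print(factorial(5), factorial_tail(5))  # 120 120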

Serial, Parallel or Distributed

A serial algorithm performs its calculations sequentially on one core, following a defined order in solving a problem. Parallel algorithms exploit the fact that modern computers have more than one core, so computations that are not interdependent can be carried out on separate cores. This is often referred to as a multi-threaded algorithm, where each thread is a series of executing commands and the threads are interleaved to ensure correct output without deadlock. A deadlock occurs when there is interdependence between two or more threads such that none of the threads can continue until another does. Parallel algorithms are important because they allow more than one computation to run at the same time by leveraging a computer's available resources in a way that would not be possible with a serial algorithm. Finally, a distributed algorithm is similar to a parallel algorithm in that it allows multiple computations to run at once, except that instead of leveraging multiple cores in a single computer, it leverages multiple computers that communicate through a network. Just as parallel algorithms build on serial algorithms with the added complexity of synchronizing threads to prevent deadlock and ensure correct outputs, distributed algorithms build on parallel algorithms with the added complexity of managing communication latency and defining ordering, since it is impossible to synchronize every computer's clock in a distributed system without significant compromises. A distributed algorithm also provides extra reliability: data is stored in more than one location, so a single failure does not result in data loss, and by performing computation on multiple computers it can potentially achieve even faster computational speed.
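A minimal Python sketch of the serial/parallel distinction, assuming the standard-library concurrent.futures module and an illustrative prime-counting task with no shared state:

    from concurrent.futures import ProcessPoolExecutor

    def count_primes(limit):
        # An independent, CPU-bound task: no data is shared between calls,
        # so separate workers cannot deadlock on each other.
        count = 0
        for n in range(2, limit):
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                count += 1
        return count

    if __name__ == "__main__":
        limits = [10_000, 20_000, 30_000, 40_000]
        serial = [count_primes(x) for x in limits]     # one core, in order
        with ProcessPoolExecutor() as pool:            # several cores at once
            parallel = list(pool.map(count_primes, limits))
        print(serial == parallel)  # True: same outputs, different execution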

Deterministic vs Non-Deterministic

Deterministic algorithms solve problems with exact precision and ordering, so a given input produces the same output every time. Non-deterministic algorithms either have data races or utilize some form of randomization, so the same input can yield a myriad of outputs. Non-deterministic algorithms can be represented by flowcharts, programming languages, and machine language programs. They differ in that they may use a multiple-valued function whose values are the positive integers less than or equal to its argument, and all points of termination are labeled as successes or failures. The terminology "non-deterministic" does not imply randomness, but rather a kind of free will[9].
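A small illustrative Python contrast (the function names are assumptions of this sketch): the first function maps a given input to the same output every time, while the second uses randomization, so repeated runs on the same input can differ:

    import random

    def deterministic_min(values):
        # Deterministic: identical input always yields an identical output.
        return min(values)

    def randomized_pick(values):
        # Randomized: the same input may yield many different outputs.
        return random.choice(values)

    data = [7, 3, 9, 3, 1]
    print(deterministic_min(data))  # always 1
    print(randomized_pick(data))    # varies from run to run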

Exact vs Approximation

Some algorithms are implemented to solve for the exact solution of a problem, whereas others are implemented to find an approximation or heuristic. Approximation is important when a heuristic can provide an answer that is good enough, so the excessive computational time needed to find the exact solution is not warranted: one would gain little while expending a great deal of resources. An example where an approximation is warranted is the traveling salesman problem, whose exact solution by exhaustive search has computational complexity of O(n!); a heuristic is necessary because for even moderately large values of n the exact computation is infeasible.
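As an illustrative sketch, the nearest-neighbor heuristic below (not an exact solver) produces a reasonable tour in O(n^2) time rather than enumerating all O(n!) orderings; the coordinates are made up for the example:

    import math

    def nearest_neighbor_tour(points):
        # Heuristic for the traveling salesman problem: repeatedly visit the
        # closest unvisited city. Fast, but only approximately optimal.
        unvisited = list(range(1, len(points)))
        tour = [0]
        while unvisited:
            last = points[tour[-1]]
            nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour

    cities = [(0, 0), (2, 1), (1, 5), (5, 2), (4, 4)]
    print(nearest_neighbor_tour(cities))  # a tour order, e.g. [0, 1, 3, 4, 2]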

Brute Force or Exhaustive Search

A brute force algorithm is the most "naive" approach one can take in attempting to solve a particular problem. A solution is reached by searching through every single possible outcome before arriving at an answer. In terms of complexity or Big-O notation, brute force algorithms typically represent the highest order complexity compared to other potential solutions for a given problem. While brute force algorithms may not be considered the most efficient option for solving computational problems, they do offer reliability as well as a guarantee that a solution to a given problem will eventually be found.

An example of a brute force algorithm would be trying all combinations of a 4-digit passcode, in order to crack into a target's smartphone.
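A minimal Python sketch of this idea, using a hypothetical is_correct check standing in for the real device:

    def crack_passcode(is_correct):
        # Exhaustive search: try every 4-digit code from "0000" to "9999".
        # Worst case is 10,000 attempts, but a solution is guaranteed.
        for attempt in range(10_000):
            code = f"{attempt:04d}"
            if is_correct(code):
                return code
        return None

    secret = "4821"  # hypothetical target passcode for the example
    print(crack_passcode(lambda code: code == secret))  # "4821"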

Divide and Conquer

A divide and conquer algorithm divides a problem into smaller sub-problems, conquers each smaller problem, and then merges the results to solve the original problem. In terms of efficiency and Big-O notation, divide and conquer fares better than brute force but can still be less efficient than more specialized algorithms. An example of divide and conquer is merge sort, wherein a list is split into smaller sorted lists and then merged together to sort the original list.

Examples of Divide and Conquer algorithms would be the sorting algorithm Merge Sort, and the searching algorithm Binary Search[10].
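A minimal merge sort sketch in Python, following the split-then-merge structure described above:

    def merge_sort(items):
        # Divide: split the list until single-element (already sorted) lists remain.
        if len(items) <= 1:
            return items
        mid = len(items) // 2
        left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
        # Combine: merge the two sorted halves in linear time.
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]

    print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]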

Dynamic Programming

Dynamic programming takes advantage of overlapping subproblems to solve a larger computational problem more efficiently. The algorithm first solves the less complex subproblems and stores their solutions in memory. More complex problems then look these solutions up and incorporate them into their own solutions. This lookup enables each subproblem's solution to be computed once and used multiple times, and it can reduce the time complexity of a problem from exponential to polynomial.

An example of a common problem that can be solved by Dynamic Programming is the 0-1 Knapsack Problem.
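A minimal Python sketch of the 0-1 knapsack solved with dynamic programming; the item values, weights, and capacity are made up for the example:

    def knapsack(values, weights, capacity):
        # dp[w] holds the best value achievable with total weight <= w.
        # Each subproblem is solved once and reused, giving O(n * capacity)
        # time instead of examining all 2^n subsets of items.
        dp = [0] * (capacity + 1)
        for value, weight in zip(values, weights):
            # Iterate weights downward so each item is used at most once.
            for w in range(capacity, weight - 1, -1):
                dp[w] = max(dp[w], dp[w - weight] + value)
        return dp[capacity]

    print(knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50))  # 220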

Backtracking

A backtracking algorithm is similar to brute force, with the exception that as soon as it reaches a node from which a solution could never be reached, it prunes all subsequent nodes and backtracks to the closest node that still has the possibility of being right. Pruning in this context means discarding the failed branch as a potential solution branch in all further searches, reducing the scope of the possible solution set and eventually guiding the program to the right outcome.

Examples of problems that can be solved by algorithms that take advantage of backtracking are Sudoku and the N-Queens Problem.
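A minimal backtracking sketch in Python for the N-Queens problem: placements are extended row by row, and any placement that conflicts is pruned immediately rather than explored further:

    def solve_n_queens(n):
        solutions = []

        def place(cols):
            row = len(cols)
            if row == n:                       # every row has a queen
                solutions.append(cols[:])
                return
            for col in range(n):
                # Prune: skip columns that clash with an earlier queen
                # (same column or same diagonal).
                if all(col != c and abs(col - c) != row - r
                       for r, c in enumerate(cols)):
                    cols.append(col)
                    place(cols)                # explore this branch
                    cols.pop()                 # backtrack and try the next column

        place([])
        return solutions

    print(len(solve_n_queens(6)))  # 4 distinct solutions on a 6x6 board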

Greedy Algorithm

A greedy algorithm is an intuitive approach to algorithm design that does not always yield the optimal solution. This approach requires a collection of candidates, or options, from which the algorithm selects in order to satisfy a given predicate. Greedy algorithms can favor either the least element in the collection or the greatest in order to satisfy the predicate[11].

An example of a greedy algorithm may take the form of selecting coins to make change in a transaction. The collection includes the official coins of the U.S. currency (25 cents, 10 cents, 5 cents, and 1 cent) and the predicate would be to make change of 11 cents. The greedy algorithm will select the greatest of our collection to approach 11, but not past it. The algorithm will first select a dime, then a penny, then end when 11 cents has been made.
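A minimal Python sketch of that greedy procedure (coin denominations in cents):

    def make_change(amount, coins=(25, 10, 5, 1)):
        # Greedy choice: always take the largest coin that still fits.
        change = []
        for coin in coins:
            while amount >= coin:
                change.append(coin)
                amount -= coin
        return change

    print(make_change(11))  # [10, 1]: a dime, then a penny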

Complexity and Big-O Notation


Measuring the efficiency of an algorithm is standardized by checking how well it scales with more inputs. Computer scientists calculate how much computational time increases as the number of inputs increases. Since this form of measurement only intends to capture how an algorithm grows, constants are left out; with large inputs these constants are negligible anyway. Big-O notation specifically describes the worst-case scenario and measures the time or space the algorithm uses[12]. Big-O notation can be broken down into orders of growth such as O(1), O(N), and O(log N), with each notation representing a different order of growth. The last of these, logarithmic algorithms, are a bit more complex than the rest: a logarithmic algorithm takes the median of a data set, compares it to a target value, and continues to halve the data as long as the median is higher or lower than the target value[12]. An algorithm with a higher Big-O is less efficient at large scales; in general, an O(N) algorithm will run slower than an O(1) algorithm, and this difference becomes more and more apparent as the number of inputs grows.
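An illustrative Python comparison of the growth rates discussed above: a linear O(N) scan versus an O(log N) binary search that halves the remaining range on every comparison:

    def linear_search(sorted_values, target):
        # O(N): in the worst case every element is examined.
        for index, value in enumerate(sorted_values):
            if value == target:
                return index
        return -1

    def binary_search(sorted_values, target):
        # O(log N): each comparison halves the remaining search range.
        low, high = 0, len(sorted_values) - 1
        while low <= high:
            mid = (low + high) // 2
            if sorted_values[mid] == target:
                return mid
            if sorted_values[mid] < target:
                low = mid + 1
            else:
                high = mid - 1
        return -1

    data = list(range(0, 1_000_000, 2))
    print(linear_search(data, 999_998), binary_search(data, 999_998))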

Artificial Intelligence Algorithms

Clustering

Clustering is a machine learning technique in which, given a set of data points, an algorithm segregates the data into groups called clusters. Clustering algorithms classify the data based on various criteria, but the fundamental premise is that data points with similarities belong in the same group, which must be dissimilar to other groups. There are numerous clustering algorithms, including K-Means clustering, Mean-Shift clustering, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), EM (Expectation Maximization) clustering, and Agglomerative Hierarchical clustering. [13]

K-Means Clustering

K-Means clustering is the most widely known and used of the clustering algorithms. It involves pre-determining a target number k, which represents the number of centroids needed in the dataset; a centroid is the predicted center of a cluster. The algorithm then assigns each data point to its nearest centroid to form the clusters, while keeping the clusters as compact as possible. [14] K-Means clustering is considered a fast algorithm due to the minimal computation it requires, with a Big-O complexity of O(n). [13]
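A minimal sketch of K-Means on toy data, assuming the scikit-learn library is available (the points and the choice k = 2 are made up for the example):

    import numpy as np
    from sklearn.cluster import KMeans  # assumes scikit-learn is installed

    # Two visually obvious groups of 2-D points.
    points = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1],
                       [8, 8], [8.3, 7.9], [7.8, 8.2]])

    # k must be chosen in advance; here k = 2 matches the data.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
    labels = kmeans.fit_predict(points)

    print(labels)                   # e.g. [0 0 0 1 1 1]
    print(kmeans.cluster_centers_)  # one centroid per cluster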

Mean-Shift Clustering

Mean-Shift clustering, also known as mode-seeking, is an algorithm in which data points are grouped into clusters by iteratively shifting points toward the mode. The mode of a dataset is the most frequently occurring value, or in graphical terms, the point where the density of data points is highest. The algorithm moves, or "shifts," each point toward its nearest region of high density, with the direction determined by the density of nearby points, so that each iteration moves points closer together until cluster centers form. The key difference between Mean-Shift and K-Means clustering is that K-Means requires the number k to be set beforehand, whereas the Mean-Shift algorithm creates clusters on the go without pre-determining how many will be formed. [15] The Big-O complexity of such an algorithm is usually O(Tn^2), where T is the number of iterations. [16]
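A minimal Mean-Shift sketch, again assuming scikit-learn; note that no cluster count is supplied, only a bandwidth controlling how far points are shifted toward denser regions (the data and bandwidth are made up for the example):

    import numpy as np
    from sklearn.cluster import MeanShift  # assumes scikit-learn is installed

    points = np.array([[1, 1], [1.1, 0.9], [0.9, 1.2],
                       [6, 6], [6.2, 5.8], [5.9, 6.1]])

    mean_shift = MeanShift(bandwidth=1.5)   # no k required
    labels = mean_shift.fit_predict(points)

    print(labels)                      # two clusters discovered automatically
    print(mean_shift.cluster_centers_)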

DBSCAN Clustering

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is an algorithm that groups nearby data points together based on a measure of distance (often Euclidean distance) and a minimum number of points, and flags points in low-density areas as outliers. The algorithm requires two parameters: eps and minPoints. The appropriate settings vary from dataset to dataset and require a fundamental understanding of the context of the data. The eps value should be chosen based on the distances within the dataset; generally small eps values are desirable, but if the value is too small a portion of the data will go unclustered, and if it is too large, too many points may be grouped into the same cluster. The minPoints parameter is usually derived from the number of dimensions D in the data, following minPoints ≥ D + 1. [17] The average run-time complexity of DBSCAN is O(n log n), whereas its worst-case complexity can be O(n^2).
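A minimal DBSCAN sketch assuming scikit-learn, where min_samples plays the role of minPoints and the eps value is made up for the toy data:

    import numpy as np
    from sklearn.cluster import DBSCAN  # assumes scikit-learn is installed

    points = np.array([[1, 1], [1.1, 1.0], [0.9, 1.1],
                       [5, 5], [5.1, 4.9], [4.9, 5.1],
                       [20, 20]])  # an isolated, low-density outlier

    dbscan = DBSCAN(eps=0.5, min_samples=3)  # neighborhood radius and minPoints
    labels = dbscan.fit_predict(points)

    print(labels)  # e.g. [0 0 0 1 1 1 -1]; -1 marks the outlier as noise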

EM Clustering

Expectation Maximization (EM) clustering is similar to the K-Means clustering technique. The EM clustering method extends the basic principles of K-Means clustering in two primary ways: 1) it calculates which data points belong in which cluster through one or more probability distributions, instead of simply calculating and maximizing the difference in mean points, and 2) the overall purpose of the algorithm is to maximize the likelihood of each point's membership in a cluster in the data.

Essentially, the EM clustering method approximates the distribution of each point based on different probability distributions and at the end of it, each observation has a certain level of probability of belonging to a particular cluster. Ultimately, the resulting clusters are analyzed based on which clusters have the highest classification probabilities. [18]
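A minimal sketch of EM-style clustering using a Gaussian mixture model, assuming scikit-learn (the data and the choice of two components are made up for the example):

    import numpy as np
    from sklearn.mixture import GaussianMixture  # assumes scikit-learn is installed

    points = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
                       [7.0, 7.0], [7.2, 6.9], [6.8, 7.1]])

    # Each cluster is modeled as a Gaussian; EM alternates between estimating
    # membership probabilities and re-fitting the distributions.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(points)

    print(gmm.predict(points))        # hard assignment to the likeliest cluster
    print(gmm.predict_proba(points))  # soft probability of belonging to each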

Agglomerative Hierarchical Clustering

Also known as AGNES (Agglomerative Nesting), the Agglomerative Hierarchical Clustering Technique also creates clusters based on similarity. To start, this method takes each object and treats it as if it were a cluster. It then merges each of these clusters in pairs, until one huge cluster consisting of all the individual clusters has been formed. The result is represented in the form of a tree, which is called a dendrogram. The manner in which the algorithm works is called the "bottom-up" technique. Each data entry is considered an individual element or a leaf node. At each following stage, the element is joined with its closest or most-similar element to form a bigger element or parent node. The process is repeated over and over until the root node is formed, with all of the subsequent nodes under it.

The opposite of this technique is the "top-down" method, implemented in an algorithm called divisive clustering. This method starts at the root node, and at each iteration nodes are split or "divided" into two separate nodes based on the dissimilarity within the clusters. This is done until every node has been divided, leaving individual clusters, or leaf nodes. [19]
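A minimal bottom-up sketch assuming SciPy: linkage builds the dendrogram by repeatedly merging the closest clusters, and cutting the tree at a chosen distance yields flat cluster labels (the points and cut height are made up for the example):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster  # assumes SciPy is installed

    points = np.array([[1, 1], [1.2, 1.1], [5, 5], [5.1, 4.9], [9, 9]])

    # Bottom-up: every point starts as its own cluster, and the two closest
    # clusters are merged at each step until a single tree remains.
    tree = linkage(points, method="average")

    # Cutting the dendrogram at distance 2.0 produces flat cluster labels.
    print(fcluster(tree, t=2.0, criterion="distance"))  # e.g. [1 1 2 2 3]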

Deep Learning and Neural Networks

Neural networks are a collection of algorithms that utilize many of the concepts mentioned above while taking their capabilities a step further through deep learning. At a high level, the purpose of a neural network is to interpret raw input data through machine perception and return patterns in the data, sometimes in combination with techniques such as K-Means clustering or random forests. To do so, a neural network requires datasets to train on, and it models its interpretations on those training sets in a machine learning process. Where neural networks differ is their ability to be "stacked" to engage in deep learning. Each process is held in nodes that can be likened to neurons in a human brain. When data is encountered, many separate computations occur that can be weighted to produce the desired output. The number of "layers," or the depth, of a neural network increases its capabilities and complexity multiplicatively. [20]
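A minimal NumPy sketch of the stacking idea: two layers, each a weighted sum plus a non-linearity, with all weights chosen randomly here rather than learned (training would adjust them against the dataset):

    import numpy as np

    def relu(x):
        # A common non-linearity applied at each node.
        return np.maximum(0, x)

    rng = np.random.default_rng(0)
    w1, b1 = rng.normal(size=(4, 3)), np.zeros(3)  # input layer -> hidden layer
    w2, b2 = rng.normal(size=(3, 2)), np.zeros(2)  # hidden layer -> output layer

    x = rng.normal(size=(1, 4))    # one raw input example with 4 features
    hidden = relu(x @ w1 + b1)     # first "stacked" layer of weighted computations
    output = hidden @ w2 + b2      # second layer produces the final scores

    print(output.shape)  # (1, 2): one example, two output values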

Ethical Dilemmas

With the relevance of algorithms as well as their sheer magnitude, ethical dilemmas were bound to arise. There is a vast list of potential ethical issues relating to algorithms and computer science, including issues of privacy, data gathering, and bias.

Bias

Given that people are the creators of algorithms, code can inherit bias from its coder or its initial source data.

Joy Buolamwini and Facial Recognition

Joy Buolamwini, a graduate computer science student at MIT, experienced a case of this. The facial recognition software she was working on failed to detect her face because her skin tone had not been accounted for in the facial recognition algorithm: the software had been trained with machine learning on a dataset that was not diverse enough, and as a result the algorithm failed to recognize her.[21] Safiya Noble discusses instances of algorithmic search engines reinforcing racism in her book Algorithms of Oppression.[22] Bias like this occurs in countless algorithms, whether through insufficient machine learning data sets or the algorithm developers' own fault, among other reasons, and it has the potential to cause legitimate problems even outside the realm of ethics.

Bias in Criminalization

COMPAS is an algorithm written to determine whether a criminal is likely to re-offend, using information like age, gender, and previously committed crimes. Tests have found it more likely to incorrectly evaluate black defendants than white defendants because it learned from historical criminal data, which has been shaped by biased policing practices.[23]

Jerry Kaplan is a research affiliate at Stanford University's Center on Democracy, Development and the Rule of Law at the Freeman Spogli Institute for International Studies, where he teaches "Social and Economic Impact of Artificial Intelligence." According to Kaplan, algorithmic bias can even influence whether or not a person is sent to jail. A 2016 study conducted by ProPublica indicated that software designed to predict the likelihood an arrestee will re-offend incorrectly flagged black defendants twice as frequently as white defendants in a decision-support system widely used by judges. Ideally, predictive systems should be wholly impartial and therefore agnostic to skin color. However, the program cannot give black and white defendants who are otherwise identical the same risk score and simultaneously match the actual recidivism rates for these two groups. This is because black defendants are re-arrested at higher rates than their white counterparts (52% versus 39%), at least in part due to racial profiling, inequities in enforcement, and harsher treatment of black people within the justice system. [24]

Job Applicants

Many companies employ complex algorithms to review and sift the thousands of resumes they receive each year. Sometimes these algorithms display a bias, which can result in people of a specific racial background or gender being recommended over others. An example of this was an Amazon AI algorithm that preferred men over women when recommending people for an interview. The algorithm employed machine learning techniques and over time taught itself to prefer men over women due to a variety of factors [25]. A major problem facing machine learning algorithms is the unpredictability of their models and what they will begin to teach themselves. Amazon's engineers clearly did not intend for their algorithm to be biased toward men, yet an error resulted in exactly that. Nor was it a quick fix: the algorithm had been in place for years, had gradually picked up this bias, and the problem was only discovered after later analysis.

Privacy And Data Gathering

The ethical issue of privacy is also highly relevant to the concept of algorithms. Information transparency [26] is an important point regarding these issues. In popular social media algorithms, user information is often probed without the knowledge of the individual, and this can lead to problems. It is often not transparent how these algorithms obtain user data, which can result in incorrect information that affects both how a person is treated within social media and how outside agents view those individuals given false data. Algorithms can also infringe on a user's sense of privacy, as data can be collected that a person would prefer to keep private. Data brokers are in the business of collecting people's information and selling it to anyone for a profit, and like data brokers, many companies keep their own collections of data. In 2013, Yahoo was hacked, leading to the leak of data pertaining to approximately three billion users.[27] The information leaked contained usernames, passwords, and dates of birth. Privacy and data gathering are common ethical dilemmas relating to algorithms and are often not considered thoroughly enough by an algorithm's users.

The Filter Bubble

Personalized, Online Filter Bubbles

Algorithms can be used to filter results in order to prioritize items that the user might be interested in. On some platforms, like Amazon, people find this filtering useful because of the shopping recommendations the algorithm can provide. However, in other scenarios, this algorithmic filtering can become a problem. For example, Facebook has an algorithm that re-orders the user's news feed. For a period of time, the company prioritized sponsored posts in its algorithm. This often prioritized news articles, but there was no certainty about whether those articles came from a reliable source, only that they were sponsored. Facebook also uses its technology to gather information about its users, such as which political party they belong to. Combined with prioritized news, this can create a Facebook feed filled with only one party's perspective. This phenomenon is called the filter bubble, which essentially creates a platform centered entirely around its user's interests.

Many, like Eli Pariser, have questioned the ethical implications of the filter bubble. Pariser believes that filter bubbles are a problem because they prevent users from seeing perspectives that might challenge their own. Even worse, Pariser emphasizes that this filter bubble is invisible, meaning that the people in it do not realize that they are in it. [28] This creates a huge lack of awareness in the world, allowing people to stand by often uninformed opinions and creating separation, instead of collaboration, with users who have different beliefs. Because of the issues Pariser outlined, Facebook decided to change their algorithm in order to prioritize posts from friends and family, in hopes of eliminating the effects of the potential filter bubble.

Filter Bubble in Politics

Another issue that these filter bubbles create is echo chambers; Facebook, in particular, filters out [political] content that one might disagree with, or simply not enjoy [29]. The more a user "likes" a particular type of content, the more similar content will continue to appear, and perhaps content that is even more extreme. This was clearly seen in the 2016 election, when voters developed tunnel vision without realizing it. Rarely did their Facebook comfort zones expose them to opposing views, and as a result they eventually became victims of their own biases and the biases embedded within the algorithms.[30] Later studies produced visualizations showing how insular the country was on social media at the time of the election, and the large divide between two echo chambers with almost no ties to each other.[31]

From ResearchGate: an example of algorithmic echo chambers that contribute to the polarization of political beliefs.

Corrupt Personalization

Algorithms have the potential to become dangerous, and among their most serious repercussions is the threat to democracy posed by extensive personalization. Algorithms such as Facebook's are corrupt in their practice of "like recycling." In Christian Sandvig's article titled Corrupt Personalization, he notes that Facebook has defined a "like" in two ways that users do not realize: first, "anyone who clicks on a 'like' button is considered to have 'liked' all future content from that source," and second, "anyone who 'likes' a comment on a shared link is considered to 'like' wherever that link points to" [32]. As a result, posts that you "like" can wind up as ads on your friends' pages claiming that you like a certain item or thing. You are not able to see these posts and, because they do not appear on your news feed, you do not have the power to delete them. This is a threat to one's autonomy: even if a user wanted to delete such a post, they cannot. Furthermore, everyone is entitled to manage the public presentation of their own self-identity, and in this corrupt personalization, Facebook is giving users new aspects of their identity that may or may not be accurate.

Agency And Accountability

Algorithms make "decisions" based on the steps they were designed to follow and the input they received. This can often lead to algorithms as autonomous agents[33], taking decision making responsibilities out of the hands of real people. Useful in terms of efficiency, these autonomous agents are capable of making decisions in a greater frequency than humans. Efficiency is what materializes the baseline for algorithm use in the first place.

From an ethical standpoint, this type of agency raises many complications, specifically regarding accountability. It is no secret that many aspects of life are run by algorithms. Even events like applying for jobs are often drastically affected by these processes. Information like age, race, and status, along with other qualifications, is fed to algorithms, which then take agency and decide who moves further along in the hiring process and who is left behind. Disregarding inherent biases in this specific scenario, the process still serves to reduce the input of real humans and decrease the number of decisions they have to make, and what is left is the fact that autonomous agents are making systematic decisions that have an extraordinary impact on people's lives. While the results of the previous example may only culminate in the occasional disregard of a qualified applicant or resentful feelings, the same principle can be much more influential.

The Trolley Problem in Practice

Consider autonomous vehicles, or self-driving cars, for instance. These are highly advanced algorithms that are programmed to make split second decisions with the greatest possible accuracy. In the case of the well-known "Trolley Problem"[34], these agents are forced to make a decision jeopardizing one party or another. This decision can easily conclude in the injury or even death of individuals, all at the discretion of a mere program.

The issue of accountability is then raised in a situation such as this, because in the eyes of the law, society, and ethical observers, there must be someone held responsible. Attempting to prosecute a program would not be feasible in a legal setting, since a program has no physical presence the way a person does. However, there are those, such as Frances Grodzinsky and Kirsten Martin [35], who believe that the designers of an artificial agent should be responsible for the actions of the program. [36] Others contend that the blame should be attributed to the users or persons directly involved in the situation.

These complications will continue to arise, especially as algorithms continue to make autonomous decisions at grander scales and rates. Determining responsibility for the decisions these agents make will continue to be a vexing process, and will no doubt shape in some form many of the advanced algorithms that will be developed in the coming years.

Intentions and Consequences

The ethical consequences that are common in algorithm implementations can be either deliberate or unintentional. Instances where an algorithm's intent and outcome differ are noted below.

YouTube Radicalization

Scholar and technosociologist Zeynep Tufekci has claimed that "YouTube may be one of the most powerful radicalizing instruments of the 21st century."[37] Because YouTube's algorithms aim to maximize the amount of time viewers spend watching, they inevitably discovered that the best way to do this was to show videos that slowly "up the stakes" of the subject being watched: from jogging to ultramarathons, from vegetarianism to veganism, from Trump speeches to white supremacist rants.[37] Thus, while YouTube's intention is to keep viewers watching (and bring in advertising money), it has unintentionally created a site that shows viewers more and more extreme content, contributing to radicalization. Such activity circles back to and produces filter bubbles and echo chambers.

Facebook Advertising

A closer look at the algorithm Facebook uses to serve ads to its users shows prominent gender and racial bias. Using demographic and racial background as factors, Facebook decides which ads are served to which users. A team from Northeastern tested the algorithm's bias in action: by running identical ads with slight tweaks to budget, images, and headings, they found the ads reached vastly different audiences, with minorities receiving a higher percentage of low-cost housing ads and women receiving more ads for secretary and nursing jobs [38]. Although Facebook's intent may be to reach the people these ads are meant for, the companies signing up to advertise with Facebook have stated that they did not anticipate this type of filtering when paying for the service. And although Facebook may believe the arrangement is a win for everyone, since advertisers get more interactions with targeted audiences and people see ads that are more relatable to them, it is discriminatory to target people based on factors they cannot control. Facebook needs to adjust its targeted advertising practices by removing racial and gender factors from its algorithms in order to avoid perpetuating stereotypes and placing people in boxes by race and gender. This type of algorithm goes against many ethical principles, and it is important that powerful technology companies not set poor examples for others.


References

  1. Cormen, Thomas H. et al. (2009). Introduction to Algorithms. MIT Press.
  2. Lim, Brian (December 7, 2016). "A Brief History of Algorithms (and Why It's so Important in Automation, Machine Learning, and Everyday Life)". e27.
  3. Rastogi, Rajeev, and Kyuseok Shim (1999). "Scalable algorithms for mining large databases." Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM.
  4. T.C. (August 29, 2017). "What Are Algorithms?" The Economist. Retrieved April 28, 2019.
  5. McClelland, Calum. (December 4, 2017). "The Difference Between Artificial Intelligence, Machine Learning, and Deep Learning". Medium. Retrieved April 28, 2019.
  6. Stepanov, A. and Rose, D. (2014). From Mathematics to Generic Programming: The First Algorithm. "Chapter 2". Retrieved April 28, 2019.
  7. "Human Computers". NASA Cultural Resources. Retrieved April 28, 2019.
  8. Anyoha, Rockwell (August 28, 2017). "The History of Artificial Intelligence". Science in the News. Harvard. Retrieved April 28, 2019.
  9. " Floyd, Robert W. (November 1996) Non-Deterministic Algorithms. Carnegie Institute of Technology. pp. 1–17."
  10. "Divide and Conquer Algorithms". Geeks for Geeks. Retrieved April 28, 2019.
  11. "Greedy Algorithms". Brilliant Math & Science Wiki. Retrieved April 28, 2019.
  12. 12.0 12.1 Bell, Rob. "A Beginner's Guide to Big O Notation". Retrieved April 28, 2019.
  13. 13.0 13.1 Seif, George (February 5, 2018). "The 5 Clustering Algorithms Data Scientists Need To Know". Retrieved April 28, 2019.
  14. Garbade, Michael J. (September 12, 2018). "Understanding K-Means Clustering in Machine Learning". Towards Data Science. Retrieved April 28, 2019.
  15. "Meanshift Algorithm for the Rest of Us (Python)", May 14, 2016. Retrieved April 28, 2019.
  16. Thirumuruganathan, Saravanan (April 1, 2010). "Introduction To Mean Shift Algorithm". Retrieved April 28, 2019.
  17. Salton do Prado, Kelvin (April 1, 2017). "How DBSCAN works and why should we use it?". Towards Data Science. Retrieved April 28, 2019.
  18. "Expectation Maximization Clustering" RapidMiner Documentation. Retrieved April 28, 2019.
  19. "HIERARCHICAL CLUSTERING IN R: THE ESSENTIALS/Agglomerative Hierarchical Clustering". DataNovia. Retrieved April 28, 2019.
  20. A Beginner's Guide to Neural Networks and Deep Learning. (n.d.). Retrieved April 27, 2019, from https://skymind.ai/wiki/neural-network
  21. Buolamwini, Joy. "How I'm Fighting Bias in Algorithms". MIT Media Lab. Retrieved April 28, 2019.
  22. Noble, Safiya. Algorithms of Oppression.
  23. Larson, Mattu; Kirchner, Angwin (May 23, 2016). "How We Analyzed the COMPAS Recidivism Algorithm". Propublica. Retrieved April 28, 2019.
  24. Kaplan, J. (December 17, 2018). "Why your AI might be racist". Washington Post. Retrieved April 28, 2019.
  25. Dastin, Jeffrey. “Amazon Scraps Secret AI Recruiting Tool That Showed Bias against Women.” Reuters, Thomson Reuters, 9 Oct. 2018, af.reuters.com/.
  26. Turilli, Matteo, and Luciano Floridi (2009). "The Ethics of Information Transparency." Ethics and Information Technology. 11(2): 105-112. doi:10.1007/s10676-009-9187-9.
  27. Griffin, Andrew (October 4, 2017) "Yahoo Admits It Accidentally Leaked the Personal Details of Half the People on Earth." The Independent. Retrieved April 28, 2019.
  28. Pariser, Eli. (2012). The Filter Bubble. Penguin Books.
  29. Knight, Megan (November 30, 2018). "Explainer: How Facebook Has Become the World's Largest Echo Chamber". The Conversation. Retrieved April 28, 2019.
  30. El-Bermawy, Mostafa M. (June 3, 2017). "Your Filter Bubble Is Destroying Democracy". Wired. Retrieved April 28, 2019.
  31. MIS2: Misinformation and Misbehavior Mining on the Web - Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/Social-media-platforms-can-produce-echo-chambers-which-lead-to-polarization-and-can_fig4_322971747 [accessed 23 Apr, 2019]
  32. Sandvig, Christian. “Corrupt Personalization.” Social Media Collective, 27 June 2014, socialmediacollective.org/2014/06/26/corrupt-personalization/
  33. “Autonomous Agent.” Autonomous Agent - an Overview | ScienceDirect Topics, www.sciencedirect.com/topics/computer-science/autonomous-agent.
  34. Roff, Heather M. “The Folly of Trolleys: Ethical Challenges and Autonomous Vehicles.” Brookings, Brookings, 17 Dec. 2018, www.brookings.edu/research/the-folly-of-trolleys-ethical-challenges-and-autonomous-vehicles/.
  35. Martin, Kirsten. “Ethical Implications and Accountability of Algorithms.” SpringerLink, Springer Netherlands, 7 June 2018, link.springer.com/article/10.1007/s10551-018-3921-3.
  36. "The ethics of designing artificial agents" by Frances S. Grodzinsky et al, Springer, 2008.
  37. 37.0 37.1 Tufekci, Zeynep. "YouTube, the Great Radicalizer." The New York Times, 10 Mar. 2018, www.nytimes.com/2018/03/10/opinion/sunday/youtube-politics-radical.html.
  38. Hao, Karen. “Facebook's Ad-Serving Algorithm Discriminate by Gender and Race.” MIT Technology Review, 5 Apr. 2019, www.technologyreview.com/.