Predictive Analytics

From SI410
Predictive analytics and its wide range of uses [1]
Predictive analytics integrates techniques from computer science and statistics, such as regression analysis and machine learning algorithms, to forecast future events.[2] It applies discrimination and calibration processes to vast data sets, allowing models to predict a wide range of events, from potential risks and costs in finance to wildlife and even an individual’s future behavior.[3] With the combination of rapid advancements in this technology and the emergence of big data extrapolation, predictive analytics has flourished across industries.[4][5][6] Predictive analytics is used to generate healthcare treatment recommendations,[7] to assess candidates for hire,[8] and to help law enforcement anticipate potential crimes and criminals.[9] In finance, analysts use predictive models for tasks such as options pricing and projecting the trend lines of mature companies. As this technology improves and evolves, data analytics and artificial intelligence will continue to grow in capability and application potential. However, as predictive analytics becomes increasingly prevalent in decision-making processes that have direct and potentially life-changing impacts on people, ethical concerns regarding algorithmic bias, transparency, and data privacy have emerged.[10]


Diagram showing the workflow of a predictive model generating a patient risk assessment [11]


Healthcare

Healthcare analytics refers to the systematic use of health data and related business insights developed through the application of data analytics to drive fact-based decision-making for planning, management, measurement, and learning in healthcare.[12] In healthcare, predictive analytics can be used to identify high-risk patients and prescribe treatment, reducing unnecessary hospitalizations or readmissions. Researchers have also developed analytical models to predict future patient behavior based on past behavior. These models provide accurate predictions of no-show patients and assist clinics in developing operational mitigation strategies such as overbooking appointment slots. Such models can even be used to generate unique, optimal solutions for clinical planning and scheduling decisions to improve patient service at hospitals.[13]
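The overbooking strategy mentioned above can be illustrated with a minimal Python sketch. It assumes that a model has already produced a no-show probability for each booked patient; the function names, the example probabilities, and the rule of keeping expected attendance within capacity are all hypothetical choices made for illustration, not the method of the cited study.

```python
import math

def expected_attendance(no_show_probs):
    """Expected number of patients who show up, given each
    patient's predicted probability of not showing up."""
    return sum(1.0 - p for p in no_show_probs)

def extra_bookings(no_show_probs, capacity, extra_no_show_prob):
    """Number of extra appointments that keep expected attendance
    at or below capacity. Assumes 0 <= extra_no_show_prob < 1."""
    spare = capacity - expected_attendance(no_show_probs)
    if spare <= 0:
        return 0
    # Each extra patient is expected to fill (1 - p) of a slot.
    return math.floor(spare / (1.0 - extra_no_show_prob))

# Hypothetical predictions for six booked patients in a six-slot clinic.
predicted = [0.1, 0.2, 0.5, 0.4, 0.3, 0.5]
print(extra_bookings(predicted, capacity=6, extra_no_show_prob=0.2))  # 2
```

A real clinic would also weigh the cost of overcrowding against the cost of idle slots, which this sketch ignores.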

Predictive analytics has also been used to study Parkinson's disease.[14] Some also support its use in building models that predict which people are at higher risk of developing chronic diseases, so that such diseases can be identified earlier.[15]

Human Resources

In the field of human resources, predictive analytics can be used to forecast openings within companies and to predict which employees may be a liability.[16] The use of data analytics in human resources has recently surged in popularity, as companies use it to design, evaluate, and implement new management policies; as a result, the traditional methods of using experience, intuition, and guesswork to guide human resources strategy are falling by the wayside.[17]

A study conducted in 2019 on 4,800 individuals across companies in a variety of industries determined that roughly one-quarter to one-third of all companies used predictive analytics in human resources. The study also found that the financial services sector used predictive analytics in human resources the most, with 32% of companies applying analytics. In technology (software), oil and energy, and healthcare and pharmaceuticals, more than 25% of companies applied analytics to human resources.[18]

Law Enforcement

In law enforcement, a predictive modeling system known as "PredPol" has been used by the Los Angeles Police Department (LAPD). Officers at the department state that it is used as a supplementary tool rather than a replacement for their normal rotations. Additionally, to reduce potential biases, the PredPol system makes predictions based solely on reported crimes and not on the demographic or identifying information of the individuals involved in such crimes.[19] However, in 2020, the LAPD decided to stop using the controversial program.[20]


Finance

In finance, stock options are the right, but not the obligation, to buy or sell stocks at an agreed price on or before a particular date. A "call option" gives the holder the right to buy shares at an agreed-upon price on or before an expiration date, while a "put option" gives the holder the right to sell shares at an agreed-upon price on or before an expiration date.[22] For example, if you buy a call option with a price, or "strike price," of $120 that expires in 30 days, you may purchase the stock at $120 per share (strike price) at any time within the next 30 days (expiration date), no matter where the stock is trading. So, if the price rises to $135 within those 30 days, you can still buy the stock at $120, even though it is trading at $135. A put option is the inverse of this contract: if the market price falls below the strike price, you can still sell your stock at the higher strike price before the expiration date.
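The arithmetic in the example above can be written as a short Python sketch. The function names are hypothetical, and the sketch deliberately ignores the premium paid to buy the option, so it shows the value per share at exercise rather than the net profit.

```python
def call_payoff(stock_price, strike_price):
    """Value per share of exercising a call: buy at the strike,
    worth the market price. Never negative, because the holder
    can simply let the option expire."""
    return max(stock_price - strike_price, 0.0)

def put_payoff(stock_price, strike_price):
    """Value per share of exercising a put: sell at the strike
    when the market price is lower."""
    return max(strike_price - stock_price, 0.0)

# The call from the text: strike $120, stock trading at $135.
print(call_payoff(135.0, 120.0))  # 15.0
# A put with the same strike is worthless at $135 ...
print(put_payoff(135.0, 120.0))   # 0.0
# ... but has value if the stock falls to $100.
print(put_payoff(100.0, 120.0))   # 20.0
```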

Predictive analytics is used in the path integral approach to financial modeling and options pricing, in which algorithms generate Gaussian path integrals to represent the transition probability density used to predict positive and/or negative options pricing slopes. This approach can theoretically optimize returns on investments if trained on sufficient and accurate data. These methods build on random sampling procedures, such as Monte Carlo simulation, and are designed to mimic the randomness that is natural to the stock market.[23]
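As a rough illustration of the Monte Carlo idea mentioned above, the following Python sketch estimates the price of a European call option by simulating many random end-of-period stock prices. It is a plain Monte Carlo estimate and not the path integral method of the cited paper. It assumes that the stock follows geometric Brownian motion with constant volatility and a constant risk-free rate, and the function name and all parameter values are hypothetical.

```python
import math
import random

def monte_carlo_call_price(spot, strike, rate, volatility, years,
                           n_paths=100_000, seed=0):
    """Estimate a European call price by averaging discounted payoffs
    over simulated terminal prices (assumes geometric Brownian motion)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    drift = (rate - 0.5 * volatility ** 2) * years
    diffusion = volatility * math.sqrt(years)
    total_payoff = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # one standard normal draw per path
        terminal_price = spot * math.exp(drift + diffusion * z)
        total_payoff += max(terminal_price - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * years) * total_payoff / n_paths

# Hypothetical inputs: $120 stock, $120 strike, 30 days to expiry.
estimate = monte_carlo_call_price(120.0, 120.0, rate=0.02,
                                  volatility=0.25, years=30 / 365)
print(round(estimate, 2))
```

Increasing `n_paths` narrows the sampling error of the estimate at the cost of run time.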

Machine Learning

Machine learning is a prominent example of predictive analytics.
Machine Learning Diagram by Karen Hao
Machine learning is a branch of artificial intelligence that uses statistics and patterns found in data sets to increase future program accuracy.[24] A good example of machine learning is a recommendation page. At first, the recommendations may not be tailored to a person’s liking. As the person uses their computer more, the algorithm gets a better idea of their interests and recommends things they are more likely to be interested in. There are many types of machine learning, some of which rely on human supervision and some of which do not. This allows machine learning algorithms to be applied in a wide range of situations. Examples include digital assistants that learn the user’s voice, chatbots that interpret text and provide suitable responses, and self-driving cars.[25]
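The recommendation example can be sketched in a few lines of Python. This is a deliberately simple, count-based learner rather than any production recommendation algorithm, and the class name, catalog, and category labels are hypothetical.

```python
from collections import Counter

class SimpleRecommender:
    """Ranks items by how often the user has clicked their category."""

    def __init__(self, catalog):
        self.catalog = catalog   # maps item name -> category
        self.clicks = Counter()  # clicks observed per category

    def record_click(self, item):
        self.clicks[self.catalog[item]] += 1

    def recommend(self, k=2):
        # Most-clicked categories first; ties keep catalog order
        # because Python's sort is stable.
        ranked = sorted(self.catalog,
                        key=lambda item: -self.clicks[self.catalog[item]])
        return ranked[:k]

catalog = {"news_a": "news", "game_a": "games",
           "game_b": "games", "cook_a": "cooking"}
rec = SimpleRecommender(catalog)
print(rec.recommend())  # ['news_a', 'game_a'] (no data yet, catalog order)

for item in ["game_a", "game_b", "cook_a"]:
    rec.record_click(item)
print(rec.recommend())  # ['game_a', 'game_b'] (tailored to observed clicks)
```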

Ethical Challenges

Bias and Discrimination

Since the rise of the computer gaming industry spurred a resurgence of neural networks, experts have argued that deep learning is a highly effective way to train an artificial intelligence system. A neural net is designed to mimic the way a human brain thinks and makes decisions: thousands or millions of individual processing nodes are connected together, enabling an algorithm to train itself to perform a task when given a prepared training data set.[26] However, according to Barocas and Selbst, from Cornell University and UCLA respectively, “an algorithm is only as good as the data it works with.”[27] Zarsky, a professor and vice dean at Haifa University, argues that algorithms trained on biased data sets will not only inherit pre-existing biases but also generate novel patterns of unfair bias and discrimination, reinforcing these patterns in their decision-making processes.[28] An algorithm may even interpret inequalities in historical data as sensible patterns, which further reinforces existing societal biases.

A programmer's own biases can also make an algorithm biased. Algorithms execute the code the programmer writes, so when a programmer builds in biased assumptions about how to read data and what to prioritize, those assumptions shape how the program runs. Bias can also arise from the way the human brain works: people tend to fill in gaps when context is lacking, and when algorithms inherit this tendency, they can miss the key patterns they were designed to find.[29]

When applied in situations such as self-driving cars or in sensitive government areas such as risk assessment in the criminal justice system, these unintended consequences may cause far-reaching complications. For example, a woman named Elaine Herzberg was struck and killed by a self-driving car. The car misidentified her as a car until it was too late. The computer was not allowed to take evasive measures and handed control over to manual override. The driver was not paying attention, and Herzberg was hit.[30] In another example, more than 18,000 inmates in Broward County, Florida, were given risk assessment scores that were potentially racially biased, overestimating the risk posed by African Americans.[31] Detecting and addressing unfair bias and discrimination in algorithms for predictive analytics is particularly difficult because it often results from unintended consequences of the algorithm's use, not from the purposeful actions of an ill-intentioned programmer.[27]
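One common way to check a risk assessment tool for this kind of disparity is to compare error rates between groups. The Python sketch below computes the false positive rate for each group, meaning the share of people who did not go on to reoffend but were still labeled high risk. The records, field names, and group labels are invented for illustration and are not data from the Broward County analysis.

```python
def false_positive_rate(records, group):
    """Share of people in `group` who did not reoffend
    but were nevertheless labeled high risk."""
    negatives = [r for r in records
                 if r["group"] == group and not r["reoffended"]]
    if not negatives:
        return None  # rate is undefined without any non-reoffenders
    flagged = sum(1 for r in negatives if r["high_risk"])
    return flagged / len(negatives)

# Hypothetical records: four non-reoffenders in each group.
records = [
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": False, "reoffended": False},
    {"group": "A", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": True,  "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
]

print(false_positive_rate(records, "A"))  # 0.5
print(false_positive_rate(records, "B"))  # 0.25
```

In this made-up data, members of group A who did not reoffend were twice as likely to be flagged as members of group B, which is the kind of gap an audit would investigate further.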


Transparency

Some assert that transparency as an ethical issue is in opposition to other ethical interests such as privacy.[32] Algorithmic transparency means that an algorithm's details should be both accessible and comprehensible to the humans analyzing them; accessible information that cannot be deciphered is not useful.[33] Most modern, sophisticated artificial intelligence systems are trained via deep learning, using extensive neural networks reaching up to fifty layers deep.[26] Because each layer adds complexity, Sloan and Warner argue, the human comprehensibility of these networks suffers, and with it their transparency.[34]

Predictive Privacy

The term “predictive privacy” refers to the ethical challenges posed by the ability of algorithms to predict sensitive information about an individual using information derived from data sets of other individuals.[10] In 2019, the Electronic Privacy Information Center (EPIC) raised this ethical concern in its official complaint to the Federal Trade Commission (FTC) against HireVue, a recruiting-technology company, stating that “the company’s use of unproven artificial-intelligence systems that scan people’s faces and voices [constitutes] a wide-scale threat to American workers.”[8] Mühlhoff defines a violation of predictive privacy as occurring “if sensitive information about [a] person or group is predicted against their will or without their knowledge on the basis of data of many other individuals, provided that these predictions lead to decisions that affect anyone’s...freedom.”[10] Predictive privacy can be violated regardless of the prediction’s accuracy, especially when systems for data collection and processing are designed such that subjects cannot provide meaningful or informed consent.[35]

References


  1. "Predictive Analytics: What It Is & Why It's Important?". Edupristine, 2021,
  2. Theodoridis, Sergios. Machine Learning : A Bayesian and Optimization Perspective. Elsevier Science & Technology, 2015, doi:10.1016/C2013-0-19102-7.
  3. Nyce, Charles. "Predictive Analytics White Paper." The Digital Insurer, American Institute for CPCU, 2007,
  4. Shah, Nilay D, et al. “Big Data and Predictive Analytics: Recalibrating Expectations.” JAMA : The Journal of the American Medical Association, vol. 320, no. 1, American Medical Association, 2018, pp. 27–28, doi:10.1001/jama.2018.5602.
  5. Nathan (September 2, 2008), "Insurers Shift to Customer-focused Predictive Analytics Technologies", Insurance & Technology, archived from the original on July 22, 2012, retrieved July 2, 2012
  6. Fletcher, Heather (March 2, 2011), "The 7 Best Uses for Predictive Analytics in Multichannel Marketing", Target Marketing
  7. Cohen, I. G., et al. "The Legal And Ethical Concerns That Arise From Using Complex Predictive Analytics In Health Care." Health Affairs, vol. 33, no. 7, 2014, pp. 1139-47, doi:10.1377/hlthaff.2014.0048.
  8. Harwell, Drew. "Rights group files federal complaint against AI-hiring firm HireVue, citing ‘unfair and deceptive practices.’" Washington Post, 6 November 2019,
  9. Perry, Walter, et al. "Predictive Policing: The Role Of Crime Forecasting In Law Enforcement Operations." RAND Corporation, 2013, doi:10.7249/rr233.
  10. Mühlhoff, Rainer. "Predictive Privacy: Towards An Applied Ethics Of Data Analytics." SSRN, 2020, doi:10.2139/ssrn.3724185.
  11. Lynn, John. “Using NLP with Machine Learning for Predictive Analytics in Healthcare”. Healthcare IT Today. December 12, 2016
  12. Kankanhalli, Atreyi, et al. "Big data and analytics in healthcare: introduction to the special section." Information Systems Frontiers 18.2 (2016): 233-235.
  13. Harris, Shannon L., Jerrold H. May, and Luis G. Vargas. "Predictive analytics model for healthcare planning and scheduling." European Journal of Operational Research 253.1 (2016): 121-131.
  14. Dinov, Ivo D., et al. "Predictive Big Data Analytics: A Study of Parkinson’s Disease Using Large, Complex, Heterogeneous, Incongruent, Multi-Source and Incomplete Observations." PLoS One, vol. 11, no. 8, 5 August 2016, doi:10.1371/journal.pone.0157077.
  15. "Predictive analytics in healthcare." Foresee Medical, Accessed 28 March 2021.
  16. Mishra, Sujeet N., et al. "Human Resource Predictive Analytics (HRPA) for HR Management in Organizations." International Journal of Scientific & Technology Research, vol. 5, no. 5, May 2016,
  17. King, Kylie Goodell. "Data analytics in human resources: A case study and critical review." Human Resource Development Review 15.4 (2016): 487-495.
  18. Noack, Brent. "Big data analytics in human resource management: Automated decision-making processes, predictive hiring algorithms, and cutting-edge workplace surveillance technologies." Psychosociological Issues in Human Resource Management 7.2 (2019): 37-42.
  19. Eidam, Eyragon. "The Role of Data Analytics in Predictive Policing." Government Technology, September 2016, Accessed 28 March 2021.
  20. Miller, L. (2021). LAPD will end controversial program that aimed to predict where crimes would occur. Los Angeles Times. Retrieved 17 April 2021, from
  21. Melicher, Ronald and Welshans, Merle (1988). Finance: Introduction to Markets, Institutions & Management (7th ed.). Cincinnati OBN: Southwestern Publishing Company. p. 2. ISBN 0-538-06160-X.
  22. Stultz, Russell A. The Options Trading Primer : Using Rules-Based Option Trading to Earn a Steady Income. Business Expert Press, 2019.
  23. Linetsky, Vadim. “The Path Integral Approach to Financial Modeling and Options Pricing.” Computational Economics, vol. 11, no. 1, Society for Computational Economics, 1998, pp. 129–63.
  24. Hao, Karen. “What Is Machine Learning?” MIT Technology Review, MIT Technology Review, 5 Apr. 2021,
  25. IBM Cloud Education. “What Is Machine Learning?” IBM,
  26. Hardesty, Larry. "Explained: Neural Networks." MIT News, 2021,
  27. Barocas, Solon, and Andrew D. Selbst. "Big Data's Disparate Impact." SSRN, 2016, doi:10.2139/ssrn.2477899.
  28. Zarsky, Tal Z. "An Analytic Challenge: Discrimination Theory in the Age of Predictive Analytics." I/S: A Journal of Law and Policy, vol. 14.1, 2017, pp. 12-35,
  29. Boyd, D., & Crawford, K. (2012). “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon.” Information, Communication & Society, 15(5), 662-679.
  30. Smith A (2018) Franken-algorithms: the deadly consequences of unpredictable code. The Guardian.
  31. Angwin J, Larson J (2016) Machine bias.
  32. Canca, Cansu. "Anonymity in the Time of a Pandemic: Privacy vs. Transparency." Bill of Health, Harvard Law, Accessed 27 March 2021.
  33. Mittelstadt, Brent D., et al. "The Ethics Of Algorithms: Mapping The Debate." Big Data & Society, vol. 3, no. 2, 2016, pp. 1-21, SAGE Publications, doi:10.1177/2053951716679679.
  34. Sloan, Robert H., Richard Warner. "When Is an Algorithm Transparent?: Predictive Analytics, Privacy, and Public Policy." IEEE: Security & Privacy, SSRN, 2017,
  35. Schermer, Bart W. "The Limits Of Privacy In Automated Profiling And Data Mining." Computer Law & Security Review, vol. 27, no. 1, 2011, pp. 45-52, doi:10.1016/j.clsr.2010.11.009.