Predictive Analytics

From SI410
Revision as of 14:27, 18 April 2021 by Ghchan (Human Resources: Edited the Human Resources subsection for clarity and grammar)

Predictive analytics and its wide range of uses [1]
Predictive analytics integrates techniques from computer science and statistics, such as regression analysis and machine learning algorithms, to forecast future events [2]. By applying processes of discrimination and calibration to vast data sets, such models can predict everything from potential financial risks and costs to wildlife and human populations, and even an individual’s future behavior [3]. Rapid advancements in technology and the emergence of big data have driven the growth of predictive analytics across industries [4][5]. Predictive analytics is used to generate healthcare treatment recommendations [6], to assess candidates for hire [7], and even to help law enforcement anticipate potential crimes and criminals [8]. As technology improves and evolves, data analytics and artificial intelligence will continue to grow in capability and in their potential applications. However, as predictive analytics becomes increasingly prevalent in decision-making processes that have direct and potentially life-changing impacts on people, ethical concerns regarding algorithmic bias, transparency, and data privacy have emerged [9].


Uses

Diagram showing the workflow of a predictive model generating a patient risk assessment [10]

Healthcare

Healthcare analytics refers to the systematic use of health data and related business insights developed through the application of data analytics to drive fact-based decision making for planning, management, measurement, and learning in healthcare [11]. In healthcare, predictive analytics can be used to identify high-risk patients and prescribe treatment, reducing unnecessary hospitalizations or readmissions. Researchers have also developed an analytical model that predicts future patient behavior from past behavior. The model provides an accurate prediction of which patients will not show up for appointments and helps clinics develop operational mitigation strategies such as overbooking appointment slots. Such models can even be used to generate unique, optimal solutions for clinical planning and scheduling decisions to improve patient service at hospitals [12].
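As a rough illustration of how predicted no-show probabilities could feed an overbooking decision, the sketch below sums the predicted attendance for a clinic session and suggests how many extra appointments would fill the expected empty slots. It is a minimal sketch under stated assumptions, not the model from the cited study: the function names, the probabilities, and the capacity are hypothetical, and the calculation ignores the chance that overbooked patients also fail to show up.

```python
import math


def expected_attendance(no_show_probs):
    """Expected number of booked patients who actually show up."""
    return sum(1.0 - p for p in no_show_probs)


def suggested_overbooking(no_show_probs, capacity):
    """Whole number of extra bookings that covers the expected empty slots.

    Assumption: one booked patient per slot, and extra patients always attend.
    """
    for p in no_show_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    expected_empty = capacity - expected_attendance(no_show_probs)
    return max(0, math.floor(expected_empty))


# Hypothetical predicted no-show probabilities for five booked patients.
predicted = [0.25, 0.5, 0.75, 0.5, 0.0]
extra = suggested_overbooking(predicted, capacity=5)
print(f"Expected attendance: {expected_attendance(predicted)}")
print(f"Suggested extra bookings: {extra}")
```

A real scheduling model would also weigh the cost of patient waiting time and staff overtime against the cost of idle slots, which this sketch leaves out.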

Predictive analytics has also been used to study Parkinson's disease [13]. Additionally, some support its use for creating models that would be able to predict which people are at a higher risk of developing chronic diseases, so as to identify such diseases earlier on [14].

Human Resources

In the field of human resources, predictive analytics can be used to forecast openings within companies and to predict which employees may be a liability [15]. The use of data analytics in human resources has seen a recent surge in popularity [16]. Companies use data analytics to design, evaluate, and implement new management policies, which means that the traditional reliance on experience, intuition, and guesswork to guide human resources strategy is falling by the wayside [17].

A 2019 study of 4,800 individuals across companies in a variety of industries determined that roughly one-quarter to one-third of all companies used predictive analytics in human resources [18]. The study also found that the financial services sector used predictive analytics in human resources the most, with 32% of companies applying analytics [19]. In the technology (software), oil and energy, and healthcare and pharmaceuticals industries, more than 25% of companies applied analytics to human resources [20].

Law Enforcement

In law enforcement, a predictive modeling system known as PredPol (derived from the term "Predictive Policing") has been used by the police department of Santa Cruz, California. Officers at this department state that it is used as a supplementary tool rather than as a replacement for their normal rotations. Additionally, PredPol bases its predictions solely on reported crimes, not on demographic or identifying information about the individuals involved, in an attempt to reduce demographically based biases [21].

Sports

"Moneyball: The Art of Winning an Unfair Game," written by Michael Lewis and released in 2003, is probably the most famous and illuminating case of sports analytics for the general public. However, sports statisticians have been trying to perfect predictive analysis for decades; Bill James, for example, has been writing books on predictive baseball analytics since 1977. In recent times, this analysis has grown more complex with the advancement of technology and the introduction of machine learning and computer vision.

Baseball is probably the sport most closely associated with predictive analysis, since it is widely regarded as the easiest sport to predict because it involves few variables. At its essence, baseball comes down to the competition between the batter and the pitcher. Baseball statisticians have now started to use neural networks to predict the outcomes of these plate appearances. Joshua Silver of Baseball Prospectus trained a neural network he named "Singlearity PA" on nine years of MLB games to predict plate appearance outcomes more accurately [22]. Silver achieved significantly better prediction accuracy with the neural network than with log5, a previously popular prediction method.
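For context on the log5 baseline mentioned above, the sketch below shows the commonly used odds-ratio form of log5 for a batter-versus-pitcher matchup. It is a minimal illustration, not code from the Singlearity article: the function name and the sample rates are hypothetical, and each rate is assumed to be the probability of the same outcome (for example, a hit) for the batter, for the pitcher (as allowed to opposing batters), and for the league as a whole.

```python
def log5(batter_rate, pitcher_rate, league_rate):
    """Estimate the chance of an outcome in one plate appearance.

    Combines the batter's rate and the rate the pitcher allows, each
    measured relative to the league-average rate (odds-ratio form of log5).
    """
    for name, rate in (
        ("batter_rate", batter_rate),
        ("pitcher_rate", pitcher_rate),
        ("league_rate", league_rate),
    ):
        if not 0.0 < rate < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    favorable = batter_rate * pitcher_rate / league_rate
    unfavorable = (1 - batter_rate) * (1 - pitcher_rate) / (1 - league_rate)
    return favorable / (favorable + unfavorable)


# Hypothetical rates: a strong batter facing a pitcher who allows
# fewer hits than the league average.
estimate = log5(batter_rate=0.300, pitcher_rate=0.200, league_rate=0.250)
print(f"Estimated hit probability: {estimate:.3f}")
```

When the pitcher is exactly league average, the formula returns the batter's own rate, which is a quick sanity check on any implementation. A neural network such as the one described above can take many more inputs than these three rates, which is one plausible reason it can outperform this baseline.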

Predictive analysis using machine learning is not reserved exclusively for baseball. Researchers have attempted to apply the same methods to American football. In 1996, M.C. Purucker, a member of the University of Pittsburgh's Bioengineering department, created a neural network model to predict the outcomes of NFL games. Using this model, Purucker achieved 61% accuracy in predicting the outcomes [23]. In 2003, Joshua Kahn, who also relied on machine learning, improved upon Purucker's model, reaching 75% accuracy [24].

Computer vision is also employed in sports to build predictive features for fans. The ball-path trackers seen in golf and tennis broadcasts, for example, are often built on computer vision principles [25].

Ethical Challenges

Bias and Discrimination

Ever since the rise of the computer gaming industry spurred a resurgence of neural networks, experts have argued that deep learning is a highly effective way to train an artificial intelligence system [26]. In a neural net, thousands or millions of individual processing nodes are connected together in a design meant to mimic the way a human brain thinks and makes decisions, which enables an algorithm to train itself to perform a task given a prepared training data set [26]. However, according to Barocas and Selbst, from Cornell University and UCLA respectively, “an algorithm is only as good as the data it works with” [27]. Zarsky, a professor and vice dean at Haifa University, argues that algorithms trained on biased data sets will not only inherit the pre-existing biases in that data but also generate novel patterns of unfair bias and discrimination and reinforce these patterns in their decision-making processes [28]. An algorithm may interpret inequalities in historical data as sensible patterns, which reinforces existing societal biases [27]. Detecting and addressing unfair bias and discrimination in algorithms for predictive analytics is particularly difficult because it often arises as an unintended consequence of using the algorithm, not from the purposeful actions of an ill-intentioned programmer [27].
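One simple way to check a model's decisions for the kind of group-level disparity described above is to compare selection rates across groups. The sketch below computes each group's selection rate and the ratio of the lowest rate to the highest. It is an illustrative sketch, not a method taken from the cited authors: the function names and the sample records are hypothetical, and the 0.8 threshold in the comment refers to the "four-fifths" rule of thumb used in US employment guidelines, which is an outside reference rather than something stated in this article.

```python
from collections import defaultdict


def selection_rates(records):
    """Return {group: share of that group's records with a positive decision}.

    `records` is an iterable of (group_label, was_selected) pairs.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest group selection rate to the highest."""
    rates = selection_rates(records)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any positive decisions")
    return min(rates.values()) / highest


# Hypothetical model decisions: group A selected 6 of 10, group B 3 of 10.
decisions = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
ratio = disparate_impact_ratio(decisions)
# A ratio below 0.8 is often treated as a signal that warrants closer review.
print(f"Disparate impact ratio: {ratio:.2f}")
```

A low ratio does not prove discrimination on its own, and a high ratio does not rule it out; the measure only flags outcomes that deserve closer inspection of the training data and the features the model relies on.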

Transparency

Some assert that transparency as an ethical issue is in opposition to other ethical interests such as privacy [29]. Algorithmic transparency means that the details of an algorithm should be both accessible and comprehensible to the humans analyzing them; accessible information that is not decipherable is not useful [30]. Most modern, sophisticated artificial intelligence systems are trained via deep learning, using extensive neural networks reaching up to fifty layers deep [26]. Because each layer adds complexity, Sloan and Warner assert that deeper networks become harder for humans to comprehend, which in turn reduces their transparency [31].

Predictive Privacy

The term “predictive privacy” refers to the ethical challenges posed by the ability of algorithms to predict sensitive information about an individual using information derived from data sets of other individuals [9]. In 2019, the Electronic Privacy Information Center (EPIC) raised this ethical concern in its official complaint to the Federal Trade Commission (FTC) against HireVue, a recruiting-technology company, stating that “the company’s use of unproven artificial-intelligence systems that scan people’s faces and voices [constitutes] a wide-scale threat to American workers” [7]. Mühlhoff defines predictive privacy as violated “if sensitive information about [a] person or group is predicted against their will or without their knowledge on the basis of data of many other individuals, provided that these predictions lead to decisions that affect anyone’s...freedom” [9]. Predictive privacy can be violated regardless of the prediction’s accuracy, especially when systems for data collection and processing are designed such that subjects cannot provide meaningful or informed consent [32].

References

  1. "Predictive Analytics: What It Is & Why It's Important?". Edupristine, 2021, https://www.edupristine.com/blog/importance-of-predictive-analytics.
  2. Theodoridis, Sergios. Machine Learning: A Bayesian and Optimization Perspective. Elsevier Science & Technology, 2015, doi:10.1016/C2013-0-19102-7.
  3. Nyce, Charles. "Predictive Analytics White Paper." The Digital Insurer, American Institute for CPCU, 2007, www.the-digital-insurer.com/wp-content/uploads/2013/12/78-Predictive-Modeling-White-Paper.pdf.
  4. Nathan (September 2, 2008), "Insurers Shift to Customer-focused Predictive Analytics Technologies", Insurance & Technology, archived from the original on July 22, 2012, retrieved July 2, 2012
  5. Fletcher, Heather (March 2, 2011), "The 7 Best Uses for Predictive Analytics in Multichannel Marketing", Target Marketing
  6. Cohen, I. G., et al. "The Legal And Ethical Concerns That Arise From Using Complex Predictive Analytics In Health Care." Health Affairs, vol. 33, no. 7, 2014, pp. 1139-47, doi:10.1377/hlthaff.2014.0048.
  7. Harwell, Drew. "Rights group files federal complaint against AI-hiring firm HireVue, citing ‘unfair and deceptive practices.’" Washington Post, 6 November 2019, www.washingtonpost.com/technology/2019/11/06/prominent-rights-group-files-federal-complaint-against-ai-hiring-firm-hirevue-citing-unfair-deceptive-practices.
  8. Perry, Walter, et al. "Predictive Policing: The Role Of Crime Forecasting In Law Enforcement Operations." RAND Corporation, 2013, doi:10.7249/rr233.
  9. Mühlhoff, Rainer. "Predictive Privacy: Towards An Applied Ethics Of Data Analytics." SSRN, 2020, doi:10.2139/ssrn.3724185.
  10. Lynn, John. "Using NLP with Machine Learning for Predictive Analytics in Healthcare." Healthcare IT Today, December 12, 2016.
  11. Kankanhalli, Atreyi, et al. "Big data and analytics in healthcare: introduction to the special section." Information Systems Frontiers 18.2 (2016): 233-235.
  12. Harris, Shannon L., Jerrold H. May, and Luis G. Vargas. "Predictive analytics model for healthcare planning and scheduling." European Journal of Operational Research 253.1 (2016): 121-131.
  13. Dinov, Ivo D., et al. "Predictive Big Data Analytics: A Study of Parkinson’s Disease Using Large, Complex, Heterogeneous, Incongruent, Multi-Source and Incomplete Observations." PLoS One, vol. 11, no. 8, 5 August 2016, doi:10.1371/journal.pone.0157077.
  14. "Predictive analytics in healthcare." Foresee Medical, www.foreseemed.com/predictive-analytics-in-healthcare. Accessed 28 March 2021.
  15. Mishra, Sujeet N., et al. "Human Resource Predictive Analytics (HRPA) for HR Management in Organizations." International Journal of Scientific & Technology Research, vol. 5, no. 5, May 2016, www.ijstr.org/final-print/may2016/Human-Resource-Predictive-Analytics-hrpa-For-Hr-Management-In-Organizations.pdf.
  16. King, Kylie Goodell. "Data analytics in human resources: A case study and critical review." Human Resource Development Review 15.4 (2016): 487-495.
  17. Noack, Brent. "Big data analytics in human resource management: Automated decision-making processes, predictive hiring algorithms, and cutting-edge workplace surveillance technologies." Psychosociological Issues in Human Resource Management 7.2 (2019): 37-42.
  21. Eidam, Eyragon. "The Role of Data Analytics in Predictive Policing." Government Technology, September 2016, www.govtech.com/data/Role-of-Data-Analytics-in-Predictive-Policing.html. Accessed 28 March 2021.
  22. Silver, Joshua. "Singlearity: Using a Neural Network to Predict the Outcome of Plate Appearances." Baseball Prospectus, https://www.baseballprospectus.com/news/article/59993/singlearity-using-a-neural-network-to-predict-the-outcome-of-plate-appearances/
  23. Purucker, M. C. IEEE Xplore, 1996, https://ieeexplore.ieee.org/abstract/document/535226
  24. Kahn, Joshua. 2003, http://homepages.cae.wisc.edu/~ece539/project/f03/kahn.pdf
  25. "Use Cases of Computer Vision in the Sports Industry." codeburst, https://codeburst.io/use-cases-of-computer-vision-in-the-sports-industry-58af7e1a2acf
  26. Hardesty, Larry. "Explained: Neural Networks." MIT News, 2021, news.mit.edu/2017/explained-neural-networks-deep-learning-0414.
  27. Barocas, Solon, and Andrew D. Selbst. "Big Data's Disparate Impact." SSRN, 2016, doi:10.2139/ssrn.2477899.
  28. Zarsky, Tal Z. "An Analytic Challenge: Discrimination Theory in the Age of Predictive Analytics." I/S: A Journal of Law and Policy, vol. 14.1, 2017, pp. 12-35, kb.osu.edu/bitstream/handle/1811/86702/1/ISJLP_V14N1_011.pdf.
  29. Canca, Cansu. "Anonymity in the Time of a Pandemic: Privacy vs. Transparency." Bill of Health, Harvard Law, blog.petrieflom.law.harvard.edu/2020/03/30/anonymity-in-the-time-of-a-pandemic-privacy-vs-transparency. Accessed 27 March 2021.
  30. Mittelstadt, Brent D., et al. "The Ethics Of Algorithms: Mapping The Debate." Big Data & Society, vol. 3, no. 2, 2016, pp. 1-21, SAGE Publications, doi:10.1177/2053951716679679.
  31. Sloan, Robert H., and Richard Warner. "When Is an Algorithm Transparent?: Predictive Analytics, Privacy, and Public Policy." IEEE: Security & Privacy, SSRN, 2017, dx.doi.org/10.2139/ssrn.3051588.
  32. Schermer, Bart W. "The Limits Of Privacy In Automated Profiling And Data Mining." Computer Law & Security Review, vol. 27, no. 1, 2011, pp. 45-52, doi:10.1016/j.clsr.2010.11.009.