Recently, the scientific method has been fundamentally transformed by the adoption of advanced computational methods, including artificial intelligence (AI) systems, across many policy-relevant sectors. In medicine, AI is now used to detect and diagnose diseases from medical images, suggest treatments based on individual patient data, and predict from electronic health records which patients will require intensive care. In the environmental sciences, AI is being used to predict extreme weather events, model the potential impacts of climate change on cities, and facilitate the discovery of novel climate patterns. These novel and complex computational methods produce scientific evidence that guides climate mitigation and adaptation policymaking.
However exciting, innovative technology poses new social, political, and moral challenges. We often believe that decision makers have a moral duty to collect good evidence, to deliberate carefully about that evidence, and to make decisions that are adequately informed and justified by the available evidence.
But what does it mean to have good scientific evidence when that evidence is produced by an AI system?
And is it ethical to make high-stakes decisions about medical treatment or climate adaptation strategies based on scientific evidence produced by AI systems?
These are some of the questions I address in my research and publications:
"Confirming (climate) change: a dynamical account of model evaluation." Synthese (2022) (Thesis: If scientists can confirm the reliability of a climate model’s predictions over changes in space and time, then a model’s predictions are adequate for the purposes of public policy).
"Can machines learn how clouds work? The epistemic implications of machine learning methods in climate science." Philosophy of Science (2021) (Thesis: Machine learning methods are not reliable for climate prediction and decision-making because they do not represent causal processes that produce climate phenomena.)
"Against explainability requirements for ethical artificial intelligence in health care." AI & Ethics (2022) (Thesis: Explainability does not confirm the reliability of AI methods for medical decision making and, thus, is not necessary for the ethical use of AI in clinical settings.)
Future Research - Machine Evidence and Expertise
Scientists have recently developed AI systems that can both rival the performance of human experts and produce evidence that influences human decisions. For example, ShotSpotter is an AI system that locates the source of loud noises that the system identifies as gunshots. In other words, ShotSpotter effectively generates legal evidence. The potential for AI expert witnesses and AI-generated evidence is appealing due to the purported objectivity, absence of personal motivation, reliable memory, and impressive accuracy of expert systems. As a result, AI systems, including machine learning (ML) systems, are now used for automated decision making and decision support at every step of the criminal justice system, from predicting an individual’s propensity for future criminal activity to making parole recommendations.
In my most recent manuscript (available upon request), I explore the reliability and confirmational significance of recidivism risk prediction models. The use of algorithmic systems for criminal sentencing has rightfully elicited a great deal of public criticism and concern that such systems are racially biased and perpetuate structural and systemic injustice in the U.S. criminal justice system.
Rather than argue for or against a particular fairness measure or debate what the criteria and scope of algorithmic fairness ought to be, I recast algorithmic outputs, including predictions, as evidence. I then use a Bayesian account of evidence to assess predictions from the COMPAS recidivism risk tool as evidence for or against claims about the conditional probability that a black or a white defendant will be rearrested, given the defendant’s COMPAS risk score. I draw out the ethical implications of this confirmational analysis of COMPAS predictions and put forth a novel interpretation of the legal notion of “equal treatment” as requiring the use of equally confirmatory algorithmic evidence for punitive decisions. Thus, I argue that it is fair to use strongly confirmatory scores, like COMPAS “low risk” scores, as a mitigating factor in pre-sentencing and sentencing decisions for black defendants, and that it is unfair to use weakly confirmatory scores, like COMPAS “high risk” scores, as an aggravating factor in pre-sentencing and sentencing decisions for black defendants. On this view, fair data-driven decision-making centers on the fair use of algorithmic evidence for human decision-making.
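To make the idea of a "strongly" or "weakly" confirmatory score concrete, here is a minimal sketch of one standard Bayesian confirmation measure, the likelihood ratio P(E | H) / P(E | not H). It is an illustration only: the manuscript's own confirmation measure is not stated here, the names `GroupCounts`, `likelihood_ratio`, and `confirmation` are hypothetical, and the counts are made up rather than drawn from COMPAS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCounts:
    """Hypothetical outcome counts for one group of defendants (not real data)."""
    high_rearrested: int      # scored "high risk", later rearrested
    high_not_rearrested: int  # scored "high risk", not rearrested
    low_rearrested: int       # scored "low risk", later rearrested
    low_not_rearrested: int   # scored "low risk", not rearrested


def likelihood_ratio(e_and_h: int, h_total: int,
                     e_and_not_h: int, not_h_total: int) -> float:
    """Return P(E | H) / P(E | not H); values above 1 mean E confirms H."""
    return (e_and_h / h_total) / (e_and_not_h / not_h_total)


def confirmation(counts: GroupCounts) -> dict[str, float]:
    """Compare how strongly each score confirms the outcome it predicts."""
    rearrested = counts.high_rearrested + counts.low_rearrested
    not_rearrested = counts.high_not_rearrested + counts.low_not_rearrested
    return {
        # E = "high risk" score, H = "will be rearrested"
        "high_risk_for_rearrest": likelihood_ratio(
            counts.high_rearrested, rearrested,
            counts.high_not_rearrested, not_rearrested),
        # E = "low risk" score, H = "will not be rearrested"
        "low_risk_for_no_rearrest": likelihood_ratio(
            counts.low_not_rearrested, not_rearrested,
            counts.low_rearrested, rearrested),
    }


# Made-up counts for a single group, chosen only to exercise the code.
example = GroupCounts(high_rearrested=60, high_not_rearrested=40,
                      low_rearrested=20, low_not_rearrested=80)
result = confirmation(example)
print(result)
```

Under this assumed measure, running the same calculation separately for each group of defendants would show whether a given score is equally confirmatory across groups, which is the comparison the "equal treatment" interpretation above turns on.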
Looking forward, I plan to write a series of articles exploring some of the following questions:
Can we trust AI expert systems, or is trust a concept that applies only to human agents?
Are we, as non-experts, obligated to epistemically defer to AI expert systems?
What should we do when AI and human experts disagree?
How should human experts incorporate the testimony or evidence of AI experts in their own reasoning and decision making?
How can reliance on AI experts cause injustice in human-machine interactions and human decision-making?