If risk assessments of criminal offenders consistently show signs of bias, what does that mean for criminal justice reform?
“There remains evidence that the instruments overpredict the risk of recidivism for some racial and ethnic groups relative to white individuals (e.g., Black, Hispanic, and Asian males and females on the general tools), as was true of previous versions of PATTERN and has been documented in prior reports.”
This article addresses risk assessment instruments designed to implement the federal First Step Act. It is of immense importance to the entire criminal justice system. It’s offered here as a service to my correctional readers; I’m unaware of any effort to publicize the results of this study.
The latest report on the federal risk assessment instrument is summarized below. This is a multi-year effort of considerable complexity that keeps running into problems in producing a race- and heritage-neutral instrument. It is being managed by the National Institute of Justice (NIJ) of the US Department of Justice, and NIJ is using credible, experienced researchers.
The study uses rearrests and incarcerations as its measures of recidivism.
Background-First Step Act
On December 21, 2018, President Trump signed into law the First Step Act (FSA) of 2018. The act was the culmination of a bipartisan effort to improve criminal justice outcomes, as well as to reduce the size of the federal prison population while also creating mechanisms to maintain public safety.
The First Step Act requires the Attorney General to develop a risk and needs assessment system to be used by the BOP (federal Bureau of Prisons) to assess the recidivism risk and criminogenic needs of all federal prisoners and to place prisoners in recidivism-reducing programs and productive activities to address their needs and reduce this risk.
Risk Assessment Instruments
The research is important; everyone in state and federal corrections uses risk assessment instruments. Pretrial services, parole and probation, and correctional agencies all use them. Law enforcement uses its own methodologies to establish lists of potential high-risk violent offenders in communities.
We were told that risk assessment instruments would resolve the issue of excessive incarceration or community supervision without risking public safety. Considering that the great majority of state inmates have histories of violence or lengthy backgrounds of arrests and incarcerations per the Bureau of Justice Statistics, many doubted the ability of a risk instrument to have a meaningful impact on incarceration levels. Even state parole and probation caseloads have higher levels of felons and violent offenders than ever before.
Practitioners have always used criminal history, age, and the severity of the current crime, along with other factors, as predictors of future criminality. If you have a 32-year-old offender with no (or a limited) criminal history who engaged in a non-stranger violent crime (e.g., a fight with a friend involving a beer bottle), the risk of recidivism will be low.
Compare that example to a 19-year-old engaged in multiple acts of stranger-to-stranger violence (e.g., armed robberies) who has a history of similar acts; he will score much higher as to risk.
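To make that intuition concrete, below is a minimal, hypothetical points-based sketch in Python. The weights, cutoffs, and labels are invented for illustration; they are not drawn from PATTERN or any operational instrument.

```python
# A hypothetical, simplified points-based risk sketch -- not PATTERN or any
# operational tool -- showing how age, criminal history, and offense type
# are commonly combined into a risk label.

def toy_risk_score(age: int, prior_arrests: int, stranger_violence: bool) -> str:
    """Return a rough risk label from three commonly used predictors."""
    points = 0
    points += 3 if age < 25 else (1 if age < 35 else 0)  # younger offenders score higher
    points += min(prior_arrests, 5)                       # capped criminal-history points
    points += 3 if stranger_violence else 0               # stranger violence weighs heavily
    if points >= 7:
        return "high"
    if points >= 4:
        return "medium"
    return "low"

# The two examples from the text above:
print(toy_risk_score(age=32, prior_arrests=1, stranger_violence=False))  # low
print(toy_risk_score(age=19, prior_arrests=4, stranger_violence=True))   # high
```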
The problem is that risk assessment instruments tend to show bias regarding race or heritage.
The Importance Of Risk Assessment Instruments
Risk assessment is at the heart and soul of decision-making within the criminal justice system. We thought we could devise a neutral process to decide who needs prison and who does not, who on parole and probation needs closer (or less) supervision, or who could be safely released on a pretrial basis.
It’s not an exaggeration to suggest that risk assessment decisions would guide the entire justice system.
It’s not an exaggeration to suggest that the criminal justice reform movement is based on an effective, neutral risk assessment instrument.
State and local justice agencies were told to employ risk assessments and to validate them based on local conditions. If one reads the summary below, one will see how complex this is and that local or state agencies do not have the resources to complete such a complex evaluation fairly.
What’s below is an overview of the latest study of the federal risk assessment instrument, which continues to show bias. The instrument works as a predictor of risk, but even the simplest past efforts based on age, criminal history, the severity of the crime, and related factors proved effective at predicting risk to public safety.
What’s below is a lightly edited (for readability) version of the latest report. The name of the instrument is PATTERN.
An interesting overview of the latest study is available in Forbes.
Background-Do Correctional Rehabilitation Programs Work?
Based on extensive literature reviews, the short answer is either no or that they don’t show large declines. That doesn’t mean that we should not offer rehabilitative services to offenders on humanitarian grounds or to keep prisons safer. It simply indicates that their ability to reduce recidivism is limited. Most offenders exposed to programs still failed, as measured by rearrests and incarcerations.
Most programs don’t work, and when they do, the effect is often a ten-to-twenty percent (or smaller) reduction in recidivism. Programs may show statistically significant reductions, but statistical significance (i.e., the measured effect is unlikely to be due to chance) doesn’t necessarily mean that the program or intervention worked well.
Per a literature review, the most successful program is cognitive-behavioral therapy, which averaged a twenty percent reduction.
A substance abuse evaluation by the US Sentencing Commission stated that participants who completed the program were 27 percent less likely to recidivate.
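To put relative reductions like these in perspective, here is a small arithmetic sketch; the 50 percent baseline recidivism rate is an assumption chosen for illustration, not a figure from the studies cited.

```python
# The 50% baseline is an assumption for illustration only, not a figure from
# the studies cited above.
baseline = 0.50  # assumed recidivism rate without the program
for relative_reduction in (0.10, 0.20, 0.27):
    treated = baseline * (1 - relative_reduction)
    drop_pp = (baseline - treated) * 100
    print(f"{relative_reduction:.0%} relative reduction: "
          f"{baseline:.1%} -> {treated:.1%} (a {drop_pp:.1f} percentage-point drop)")
```

Even a 27 percent relative reduction, under this assumed baseline, still leaves more than a third of participants recidivating.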
The bottom line is that most offenders participating in programs saw no decline in recidivism (see Do Offender Recidivism Programs Work?). However, cognitive-behavioral therapy and federal substance abuse programs seemed to work better than other interventions.
Advocates will continue to insist that they do work despite the available evidence. The recidivism rate of those released from prison is massive.
PATTERN Risk Analysis Study Conclusions
This study represents the fourth review and revalidation report on PATTERN. As mandated by the federal First Step Act, the current study evaluated PATTERN for its predictive accuracy, dynamic validity, and racial/ethnic neutrality on a subsequent cohort of FY 2019 FBOP releasees.
The current study findings continue to demonstrate that PATTERN is a strong predictor of general and violent recidivism at one-, two-, and three-year follow-up periods.
Comparisons of recidivism rates also continue to indicate that such risk level designations provide meaningful distinctions of recidivism risk. In addition, the results continue to suggest that individuals can change their risk scores and levels during confinement.
Such changes in risk were not exclusively driven by changes in age. Those who reduced their RLC (risk level category) from first to last assessment were generally shown to have the lowest recidivism rates, followed by those who maintained the same risk level and those with a higher risk level, respectively.
While the study findings indicate that PATTERN is predictively accurate across the five racial and ethnic groups analyzed, there remains evidence that the instruments overpredict the risk of recidivism for some racial and ethnic groups relative to white individuals (e.g., Black, Hispanic, and Asian males and females on the general tools), as was true of previous versions of PATTERN and has been documented in prior reports.
Consistent with the previous review and revalidation reports, these findings also continue to document meaningful differences across the RLC-by-race distributions. Differences in group risk level distributions can be referred to as “differential impact.”
These distributions do not consider the accuracy or parity of recidivism predictions, only group differences in risk categories. Differential impact is distinguished from differential prediction, in which members of different groups with the same risk scores have different rates of recidivism.
Differential prediction (discussed further below) indicates that a tool predicts differently for different racial and ethnic groups, a form of racial bias. Differential impact and differential prediction can both be affected by biased data, and thus might be mitigated, for example, through the selection of prediction or outcome items, where alternative data sources are available.
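A toy sketch of the distinction, using invented data rather than anything from the report: differential impact compares how risk categories are distributed across groups, while differential prediction asks whether the same risk category corresponds to different observed recidivism rates across groups.

```python
import pandas as pd

# Hypothetical scored cohort: group label, assigned risk category, recidivism flag.
df = pd.DataFrame({
    "group":    ["A"] * 6 + ["B"] * 6,
    "category": ["low", "low", "medium", "medium", "high", "high",
                 "low", "medium", "medium", "high", "high", "high"],
    "recid":    [0, 0, 0, 1, 1, 1,
                 0, 0, 0, 0, 1, 1],
})

# Differential impact: do the groups land in the risk categories at different rates?
impact = pd.crosstab(df["group"], df["category"], normalize="index")
print(impact)

# Differential prediction: within the SAME category, do observed recidivism
# rates differ by group?
prediction = df.groupby(["category", "group"])["recid"].mean().unstack()
print(prediction)
```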
As with all administrative data sources, it is possible that there are biases in PATTERN data sources and its measures, and the National Institute of Justice (NIJ) will continue to work to explore and mitigate these biases with its partners.
Even if these sources of data bias could be identified and corrected, however, there may still be some group-based differences that are not attributable to data bias. If so, groups may experience different risk scores and categories that would not necessarily indicate bias. Further, it is often difficult (or impossible) to discern whether some observed group-level differences in data are genuine or reflect some sort of systemic bias.
It would be possible for some group-level differences to exist, and for a tool to achieve parity in prediction across groups, while still demonstrating differential impact. NIJ recognizes concerns about potential bias in risk factors themselves (such as criminal history) and in the recidivism outcome measure, which captures arrest rather than conviction.
NIJ is committed to the ongoing efforts to explore additional data points, including the possibility of a conviction-based recidivism outcome measure. As noted above, FBOP is actively pursuing a way to capture reconviction information more accurately and comprehensively through a new partnership with the Data Science Discovery Program at the University of California, Berkeley.
Although racial and ethnic neutrality can be examined through numerous metrics, the racial and ethnic fairness analyses presented here have prioritized the differential prediction findings, reflecting the current emphasis in the field.
While an effective tool might still fairly reflect group-based differences in risk categorization, an unbiased tool should predict similarly across racial and ethnic groups. To address these questions, the study examined AUCs and predictive values by race, and employed regression analyses to test for differential prediction.
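As a hedged illustration of what analyses of that kind might look like, the sketch below computes an AUC within each group and fits a logistic regression with a score-by-group interaction on simulated data. The variable names, group labels, and data are invented; this is not the study’s actual methodology or code.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "group": rng.choice(["white", "black", "hispanic"], size=n),
    "score": rng.integers(0, 30, size=n),  # stand-in for a PATTERN-style raw score
})
# Simulated outcome: recidivism probability rises with the score.
df["recid"] = (rng.random(n) < (0.10 + 0.02 * df["score"])).astype(int)

# Predictive accuracy within each group (AUC by race/ethnicity).
for group, sub in df.groupby("group"):
    print(group, round(roc_auc_score(sub["recid"], sub["score"]), 3))

# Differential prediction test: a significant score-by-group interaction would
# indicate the score-recidivism relationship differs across groups.
model = smf.logit("recid ~ score * C(group)", data=df).fit(disp=0)
print(model.summary())
```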
The findings indicate that different risk scores correspond to different recidivism likelihoods across racial and ethnic groups — evidence of differential prediction. This disparity remains NIJ’s leading concern related to PATTERN, and one which it is committed to addressing.
Overall, the differential prediction results are consistent with previous years and thus mirror the concerns raised in the USDOJ (2021b and 2023) reports.
Those reports discussed the inherent impossibility of satisfying all notions of racial and ethnic fairness, since different definitions are interrelated and conflicting. In addition, while the focus here is on differential prediction results adapted from the Standards for Educational and Psychological Testing, those standards do not impose strict requirements on absolute parity across groups.
Furthermore, PATTERN addresses five distinct racial and ethnic groups, which poses unique challenges compared with the examinations found in the criminal justice literature, which have typically considered just two racial or ethnic groups.
Nevertheless, the differential prediction results raise a clear concern related to PATTERN’s racial and ethnic neutrality. As prior reports noted, “there are no simple solutions to this complex problem,” and “deliberate study and engagement with stakeholders and experts are warranted to identify an optimal path forward.”
NIJ and its consultants continue to investigate potential solutions to the differential prediction issues identified. In 2023, Dr. Greg Ridgeway (University of Pennsylvania) joined the review and revalidation team as a statistical consultant, and NIJ continues to explore methodological ways to fulfill the FSA’s mandate “to ensure that any disparities identified … are reduced to the greatest extent possible.”
Finally, the current study cohort included individuals released through September 30, 2019. Next year’s review and revalidation report will cover October 1, 2019, through September 30, 2020, which is significant for two reasons.
First, to date, the review and revalidation analyses have been retrospective, conducted on pre-deployment cohorts as individuals were released from custody and into the community with a three-year recidivism follow-up time.
PATTERN, and the FSA’s additional earned-time rules, went into place in January 2020, meaning the upcoming cohort will be the first in which individuals were scored on PATTERN and may have earned additional time credits prior to release based on their scores.
Second, early 2020 also marked the onset of the COVID-19 pandemic, which may have affected early-release decisions apart from the new FSA rules. In addition, the combination of COVID-19 precautions and the civil unrest that marked the summer of 2020 may have had an impact on policing and arrest patterns, which could directly or indirectly affect the recidivism outcomes for the revalidation analyses.
Thus, the upcoming review and revalidation will be of particular interest as the first partially prospective PATTERN cohort, but its findings will also be tempered with caution due to the potential confounding effects during the follow-up period.
Comments
2024-10-14T11:44-0400 | Comment by: Laurence
A good idea on the face of it, but this system also has potential for abuse in case of errors, which happen all the time in law enforcement.