
Predicting risk in criminal procedure: actuarial tools, algorithms, AI and judicial decision-making

ABSTRACT

Risk assessments are conducted at a number of decision points in criminal procedure, including in bail, sentencing and parole, as well as in determining extended supervision and continuing detention orders for high-risk offenders. Such risk assessments have traditionally been the function of the human discretion and intuition of judicial officers, based on clinical assessments, framed by legislation and common-law principles, and encapsulating the concept of individualised justice. Yet the progressive technologisation of criminal procedure is witnessing the incursion of statistical, data-driven evaluations of risk. Human judicial evaluative functions are increasingly complemented by a range of actuarial, algorithmic, machine learning and Artificial Intelligence (AI) tools that purport to provide accurate predictive capabilities and objective, consistent risk assessments. But ethical concerns have been raised globally regarding algorithms as proprietary products with in-built statistical bias, and regarding the diminution of judicial human evaluation in favour of the machine. This article focuses on risk assessment and what happens when decision-making is delegated to a predictive tool. Specifically, it scrutinises the inscrutable proprietary nature of such risk tools and how that may render the calculation of the risk score opaque and unknowable to both the offender and the court.


Introduction

In the criminal jurisdiction, gauging unacceptable risk, high risk and risk to community safety, as well as forecasting the likelihood of reoffending and prospects of rehabilitation, are necessary judicial tasks in bail, sentencing and parole procedures that determine the liberty of an accused or offender. In addition, risk assessments are made for the extended supervision and continuing detention of high-risk sex, terrorism and violent offenders to ensure the safety and protection of the community and to promote offenders’ rehabilitation (for example, Crimes (High Risk Offenders) Act 2006 (NSW) s 3(1)). Such risk assessments have traditionally been the function of the human discretion and intuition of judicial officers, based on clinical assessments, framed by legislation and common-law principles, and encapsulating the concept of individualised justice (Markarian v The Queen [2005] HCA 25; Muldrock v The Queen [2011] HCA 39; Bugmy v The Queen [2013] HCA 37). Yet the progressive technologisation of criminal procedure is witnessing the incursion of statistical, data-driven evaluations of risk. Human judicial evaluative functions are increasingly complemented by a range of actuarial, algorithmic, machine learning and Artificial Intelligence (AI) tools that purport to provide accurate predictive capabilities and objective, consistent assessments of risk (Barabas, Dinakar, Ito, Virza, & Zittrain, 2017). But ethical and human rights concerns have been raised globally regarding algorithms as proprietary products, potentially with in-built statistical bias, and regarding the diminution of judicial human evaluation in favour of the machine (Angwin, Larson, Mattu, & Kirchner, 2016; Barabas et al., 2017; Dawson et al., 2019; European Commission, 2019; Wexler, 2017). A recent report from England and Wales found ‘a lack of explicit standards, best practice, and openness or transparency about the use of algorithmic systems in criminal justice’ (Law Society, 2019, p. 4). It is in this context that the need for ethical and human rights-based AI frameworks in Australia has recently been recognised (AHRC, 2018; Dawson et al., 2019).

Actuarial models, algorithms and AI frame a growing number of technologies used in criminal justice, for instance, in automated decisions, predictive policing and facial recognition (Law Society, 2019). This article will focus on risk assessment and what happens when decision-making is delegated to a predictive tool. Specifically, this article will scrutinise the inscrutable proprietary nature of such risk tools and how the calculation of the risk score may be rendered opaque and unknowable to both the offender and the court. This is a significant area to explore, given that the decision points in criminal procedure where algorithmic instruments may be applied represent the ultimate high-stakes determination of liberty versus detention in a situation of extreme power imbalance (Barabas et al., 2017; Eckhouse, Lum, Conti-Cook, & Ciccolini, 2019; Law Society, 2019).

This article commences with an analysis of risk in the context of an increasingly risk-averse society and criminal justice system, and the tensions between a general right to be at liberty and community safety. In the next section, the article examines criticisms levelled at human judicial discretion, particularly in the context of sentencing. Flowing from this discussion, the article provides an overview of the development of predictive tools in risk assessments in criminal procedure to question whether machine and data-driven assessments offer more accuracy and objectivity than human judges. Algorithmic instruments and issues concerning embedded bias and proprietary interests are then examined through the lens of the now infamous United States decision of State of Wisconsin v Loomis 881 N.W.2d 749 (Wis. 2016). Finally, the need to revisit the concept of procedural justice is examined in the context of a progressively technologised criminal justice system. The application of statistical, data-based instruments in deterministic criminal procedures may transgress the presumption of innocence in bail applications, breach individualised justice in sentencing and parole, further constrain or displace judicial discretion and diminish procedural justice.

Risk, unacceptable risk and protecting the community

The assessment of risk has become a critical element of the criminal justice system from law enforcement through to pre- and post-trial procedures including bail and sentencing as well as in decisions regarding an offender's release back into the community or continuing detention or extended supervision (Carlson, 2017; Harcourt, 2005). Recent case law demonstrates judicial engagement with risk-related terminology including risk management, risk profile, risk factors, risk associated behaviour and risk of recidivism. Case law also reveals how courts gauge concepts such as unacceptable risk, high risk and risk to community safety. What is ‘risk’ and why has it become a focal point in criminal justice?

Scholars have identified the increasing emphasis on ‘dangerousness, risk, pre-emption and uncertainty’ so that precaution and risk prevention have become dominant forces in the administration of justice, perhaps privileging ‘generalised fears’ and potential harms over ‘real threats’ (Brown et al., 2015, pp. 43–44). In our risk society (Beck, 1992), risk is a ‘core organising concept’ (McSherry, 2004, p. 1), and there is a clear political context that frames debates regarding law and order (Brown & Quilter, 2014) as well as moral panics (Lee, 2007), leading to the increased significance of risk assessment and the tightening of legislative, policing and procedural measures. The concept of risk has justified the extension of criminal responsibility to behaviours associated with consorting, planning acts of terrorism and status offences such as being a member of an outlaw motorcycle gang, and has fuelled new forms of post-sentence supervision and preventive detention (Brown et al., 2015). Thus, ‘risk-based responsibility’ focuses criminal law's attention on the task of prevention, which has, in turn, led to the criminalisation of pre-inchoate or pre-crime behaviours (Ashworth, 2018, p. 355; Lacey, 2016; Mythen, Walklate, & Khan, 2013; Wilson & McCulloch, 2015). The spread of criminal law's dominion over possible, remote, future harms and inchoate or preparatory criminality may be analysed as ‘risk-neutralisation’ or preventive justice on the one hand and, on the other, as diminishing the rule of law and the presumption of innocence (Brown et al., 2015; Husak, 2008; O’Malley, 2013, p. 276; Zedner, 2007). This expansion of criminal law over pre-crime activities can also be indexed to the increasing scientisation of policing, that is, technological solutions to risk management (Ericson & Haggerty, 1997; Myhill & Johnson, 2016).

Risk assessments are undertaken at various criminal procedure decision points including at bail, sentencing and parole. Pending trial, an accused person may be held on remand and, using NSW as an example, courts consider the composite phrase ‘unacceptable risk’, that is, an unacceptable risk that the accused person, if released from custody, will fail to appear at proceedings, commit a serious offence, endanger the safety of victims, individuals or the community, or interfere with witnesses or evidence (Bail Act 2013 (NSW)). Predicting such risk is ‘context specific’ to bail (Lynn v State of New South Wales [2016] NSWCA 57, Beazley P at [74]), complex and controversial, given that ‘it is trite to observe that no grant of bail is risk free’ (R v Elzamtar [2017] NSWSC 275, Harrison J at [23]). That is, the bail decision does not rest on the elimination of risk, nor probabilities of the risk, rather:

What must be established is that there is a sufficient likelihood of the occurrence of the risk which, having regard to all relevant circumstances, makes it unacceptable. Hence the possibility an offender may commit like offences has been viewed as sufficient to satisfy a court that there is an unacceptable risk. (Haidy v DPP [2004] VSC 247, Redlich J at [16])

Whether a risk is unacceptable is contentious (R v Agang; R v Bajwa; R v Ghanem [2017] NSWSC 138), as noted by Harrison J at [17]:

I accept that the existence of a risk and the assessment of whether or not it is unacceptable are matters about which minds may differ. … Inevitably and too often one is required to make the determination based only on contested inferences from the past and frail predictions about the future.

This observation highlights the complexity of pre-trial ‘frail predictions’, signals possible inconsistencies between judges and may offer an insight as to why statistical risk assessment tools are being adopted in many jurisdictions (Koepke & Robinson, 2018).

Risk assessments and ‘predictions of future criminality’ (ALRC, 2005, p. 51) in sentencing can arise in a number of other circumstances (Hannah-Moffat, 2013). Certainly, one purpose of sentencing is to protect the community from the offender, and incapacitation through incarceration eliminates the risk of re-offending while the offender is detained. Risk is also significant when making an Intensive Correction Order (ICO), a form of custodial sentence served in the community. The paramount consideration in making such an order is community safety and, in this evaluation, the court is to address the offender's risk of reoffending (see, for example, Crimes (Sentencing Procedure) Act 1999 (NSW) ss 7, 66). In the general sentencing context, the Veen series of cases provides a compelling examination of the risk of recidivism, public protection, preventive detention as well as indefinite detention (McSherry, 2004; Veen (No 1) (1979) 143 CLR 458; Veen (No 2) (1988) 164 CLR 465).

Risk assessment may be a critical function within the prison system too, for example, in the security classification of inmates, identifying inmates’ risks and criminogenic needs and in offender management (Law Society, 2019; Moore, 2015). Risk assessments are also fundamental at the other end of the criminal process – when an offender may be eligible to be released from a custodial situation on parole back into the community. In effect, parole means that the offender may serve the remainder of their sentence in the community, subject to supervision and strict conditions. In parole determinations, risk to community safety is central, as well as risk of reoffending, risks to the offender and risks to other persons (see, for example, Crimes (Administration of Sentences) Act 1999 (NSW) Part 6 ss 128, 130; Crimes (Administration of Sentences) Regulation 2014 (NSW) Parts 14, 14A).

Regarding high-risk offenders and the imposition of extended supervision orders (a mode of supervising an offender in the community at the expiration of their custodial sentence) or continuing detention orders (a form of preventive, protective and indeterminate detention beyond the custodial sentence period), courts must consider the composite phrase ‘unacceptable risk’ (see, for example, Crimes (High Risk Offenders) Act 2006 (NSW)). This is done in the context of ‘making the community secure from harm as opposed to guaranteeing [emphasis added] its safety and protection … were it otherwise, every risk would be unacceptable’ (Lynn v State of New South Wales [2016] NSWCA 57, Beazley P at [61]). This requires a consideration of, firstly, ‘the probability that the risk will manifest’ and, secondly, the seriousness of the potential harm (State of NSW v Ceissman [2018] NSWSC 508, Rothman J at [26]). The assessment of the unacceptability of any risk ‘involves at least notionally the arithmetical product of the consequences of the risk should it eventuate on the one hand and the likelihood that it will eventuate on the other hand’ (State of New South Wales v Pacey (Final) [2015] NSWSC 1983, Harrison J at [43]). Given the close affiliation between calculations of dangerousness (Hobbs & Trotter, 2018), risk assessments and arithmetic predictions, the rise of statistical, actuarial or algorithmic evaluation is not surprising.
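Harrison J's ‘arithmetical product’ can be rendered in a deliberately simplified form. As an illustrative gloss only (the legal test remains evaluative rather than strictly numerical), the assessment resembles an expected-harm calculation:

\[ \text{unacceptability} \;\approx\; P(\text{risk eventuates}) \times C(\text{seriousness of the harm}) \]

On this reading, a low-probability but catastrophic harm can weigh as heavily as a likely but minor one, and it is precisely this probability-times-consequence style of reasoning that actuarial instruments purport to mechanise.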

Objecting to the subjectivity of judges

At the same time as the emergence of an increasingly risk-averse society, there has been a trend towards criticising the human discretion that operates at all levels of the criminal justice system and is, perhaps, made most public in sentencing judgements. The public and media not infrequently express frustration with sentencing decisions that seem too lenient, unfair or inconsistent with other decisions (Zdenkowski, 2000). Scholars, too, critique the levels of unpredictability and numerical inconsistency in sentencing decisions (Krasnostein & Freiberg, 2013; Stobbs, Hunter, & Bagaric, 2017). However, many decisions made throughout criminal justice are premised upon the concept of individualised justice and the exercise of human discretion: from police officers’ discretion to arrest, to judicial discretion in granting or denying release on bail, to sentencing and to the post-sentence orders that may be made in relation to high-risk offenders. The principle of individualised justice is responsive to the individual offender, the facts and the offence (Anthony, Bartels, & Hopkins, 2015) and embraces the notion that ‘there is no greater inequality than the equal treatment of unequals’ (Dennis v United States 339 US 162, 184 (Frankfurter J) (1950)). It ensures that the punishment fits the crime and the offender's moral culpability (R v Fernando (1992) 76 A Crim R 58; Bugmy v The Queen (2013) 249 CLR 571; Munda v Western Australia (2013) 249 CLR 600; Elias v The Queen (2013) 248 CLR 483). This principle is particularly relevant in sentencing, where the judiciary recognises that ‘the outcome of discretionary decision-making can never be uniform’ (Wong v The Queen [2001] HCA 54, Gleeson CJ at [6]), and there is no singular ‘correct’ sentence (Martin, 2017, p. 19), as every offender and offence is different (Anthony et al., 2015). There is, however, a proviso that ‘like cases should be treated in a like manner’ (Wong v The Queen [2001] HCA 54, Gleeson CJ at [6]).

Of course, the exercise of discretion in criminal justice is not completely unfettered, with legislation, legal precedent and guidelines serving to delimit discretion (Martin, 2017). For instance, in sentencing, judges may be constrained by guideline judgements, mandatory minimum sentences, maximum penalties, the principle that imprisonment ought to be a last resort, prescribed sentencing ‘discounts’, aggravating and mitigating factors and non-parole periods. Within this process, numerical consistency is less important than the consistent application of legal principles (Hili v The Queen (2010) 242 CLR 520).

Ultimately in sentencing, judges are required to assess the appropriate sentence given the particular offence committed by the particular offender, while balancing objective and subjective factors with the purposes of sentencing, which include punishment, deterrence, community protection, rehabilitation, accountability, denunciation and recognition of the harm inflicted on the victim and the community. This is a complex exercise in proportionality and in giving weight to the multiple, competing objectives of sentencing, a process referred to as ‘instinctive’ or ‘intuitive’ synthesis. Instinctive synthesis has been described as a global value judgement, recognised as not necessarily logical and as a process that may produce ‘outcomes upon which reasonable minds will differ’ (Hudson v The Queen (2010) 30 VR 610; 205 A Crim R 199; [2010] VSCA 332 at [27]). However well the process of instinctive synthesis is articulated in sentencing judgements, it seems to many to be an arbitrary, unpredictable and perhaps overly subjective process (Stobbs et al., 2017). Instinctive synthesis has been critiqued by scholars and judges as a vague and inherently opaque process that contradicts the requirement for judicial decisions to be open to scrutiny (Brown et al., 2015). The term conveys a sense of ‘mystical “instinct”’, suggesting that logic and rationality are not at the epicentre of sentencing decisions and, as such, it impedes transparency and accountability in judicial decision-making processes (Markarian v The Queen [2005] HCA 25, Kirby J at [129]–[130]). Would the procedure be more transparent and fair if instinctive synthesis and related risk assessments by humans were instead solved or structured by an algorithm, an AI judge (Sourdin, 2018)? Stobbs et al. (2017, p. 262) suggest that sentencing is amenable to being automated because it is premised upon established principles, weightings and key factors. AI is well suited to the complexities of calibrating multiple variables, and may even present a means to improve and refine the sentencing system by incorporating algorithmic risk assessments and by removing the ‘subconscious bias’ of humans. Likewise, vendors of predictive risk assessment tools promote them as the solution to human bias and an improvement on human judgement (Eckhouse et al., 2019). Non-human automated decision-making processes purport to make such assessments more consistent, timely, cost-effective and accurate (Hannah-Moffat, 2013; Hogan-Doran, 2017).

The growth of the algorithm

The push-back against judicial subjectivity and discretion needs to be examined in the context of preventive justice (Ashworth & Zedner, 2014) as well as the rise of ‘actuarial justice’ and the embedding of scientific discourse in criminal justice (Brown et al., 2015; Ericson & Haggerty, 1997; McSherry, 2004; O’Malley, 2013, p. 276). While courts in some jurisdictions have been concerned with questions of offenders’ current and future dangerousness, there have been definite shifts towards risk-based models and ‘statistical or actuarial risk prediction’ (Koepke & Robinson, 2018; McSherry, 2004, p. 2). At the outset, key terms need to be defined, as there are various statistical, data-driven predictive tools used in criminal procedure risk assessments that are not strictly AI; rather, they are ‘actuarial’ or ‘algorithmic’ instruments. However, similar ethical concerns and challenges arise.

Barabas et al. (2017) state that actuarial decision-making practices in risk assessment have been used since the 1920s. According to Harcourt (2005, p. 10), the term ‘actuarial’ refers to the use of:

statistical methods—rather than clinical methods—on large datasets of criminal offending rates … to determine the different levels of offending associated with a group … and, based on those correlations, to predict first the past, present or future criminal behavior of a particular individual and to administer second a criminal justice outcome for that particular individual.

Actuarial can also refer to the fact that the score is determined by an algorithm (Smid, 2014). Algorithms can be understood in various ways, for example, as a defined computational procedure or set of instructions that takes a set of values as input and produces a set of values as output (Law Society, 2019). In relation to criminal justice, an algorithm can be understood as a rule that uses numerical inputs to produce a prediction relevant to the procedural decision point (Christin, Rosenblat, & Boyd, 2015). Algorithms operate in different ways, and more advanced forms include machine learning, whereby the machine ‘learns’, improves its tasks over time and may modify an algorithm as it synthesises new data (Law Society, 2019, p. 10). Regarding AI, it is suggested that there is ‘no universally accepted definition’; it is an expression that encapsulates autonomous computerised processing of data that resembles or replicates human processing and intelligence (AHRC, 2018, p. 26). AI-informed decision-making and prediction occur when algorithms are applied to datasets, with tasks ranging from simple, narrow automated systems to more sophisticated ‘neural nets and deep learning’ (Dawson et al., 2019, p. 14).
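To make the distinction between a static algorithm and a machine learning system concrete, the following minimal Python sketch contrasts the two. Everything in it is invented for illustration: the scores, threshold and update logic belong to no actual risk instrument.

```python
# A static rule versus a naive "learning" rule. All numbers and logic
# are invented for illustration; no real tool is represented.

def fixed_rule(score: float) -> bool:
    """Static algorithm: a fixed numerical input-to-output rule."""
    return score >= 5.0

class LearningRule:
    """Toy machine-learning analogue: the decision threshold is
    re-estimated from outcome data as new cases are observed."""

    def __init__(self) -> None:
        self.reoffender_scores: list[float] = []

    def observe(self, score: float, reoffended: bool) -> None:
        # The rule "synthesises new data": recorded outcomes shift it.
        if reoffended:
            self.reoffender_scores.append(score)

    def predict_high_risk(self, score: float) -> bool:
        if not self.reoffender_scores:
            return fixed_rule(score)  # no data yet: fall back to the static rule
        threshold = sum(self.reoffender_scores) / len(self.reoffender_scores)
        return score >= threshold

rule = LearningRule()
rule.observe(score=3.0, reoffended=True)
rule.observe(score=5.0, reoffended=True)
print(fixed_rule(4.5), rule.predict_high_risk(4.5))  # False True: the threshold moved to 4.0
```

The point of the contrast is the final line: the static rule gives the same answer forever, while the ‘learning’ rule's threshold drifts as outcome data accumulate, which is part of what makes later auditing of such tools difficult.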

Prior to the uptake of actuarial or algorithmic techniques, risk was assessed in a clinical but human manner, for example, by psychiatrists or psychologists, based on professional, subjective evaluation (Carlson, 2017), basically ‘unstructured clinical judgements’ (Hsu, Caputi, & Byrne, 2009; Jones & Milton, 2016, p. 1; Moore, 2015). Now this wholly human approach may be considered overly subjective and lacking in reliability and consistency (Jones & Milton, 2016). The various generations of risk assessment have developed through the actuarial or statistical approach and have aimed for a greater level of objectivity (Hsu et al., 2009). They were initially based on ‘static factors (the need-to-know aspects of the offenders such as age at first offense and crime(s) committed)’, later combined with ‘dynamic factors (the possibility of change in the offenders’ lives)’ (Hsu et al., 2009, p. 729) and, more recently, with other specific offender factors to enable treatment and intervention (Moore, 2015).

Risk assessment tools, in essence, use data regarding groups of people, a range of factors and weightings, and human-inputted rules to predict an individual's future behaviour (Koepke & Robinson, 2018). Such instruments provide statistical predictions, typically comprising risk factors as predictors of violence or reoffending, so that an individual is evaluated against these risk factors and ‘scored’ – the higher the score, the higher the risk (Carlson, 2017; Grann & Långström, 2007). These predictive tools have permeated law enforcement and the criminal justice system (Harcourt, 2005) to become the norm in practice (Smid, 2014), ‘used as an objective, neutral mechanism of fair treatment’ (Eckhouse et al., 2019, p. 3; Hannah-Moffat, 2013). The instruments exude the veneer of objectivity and are used to score the risks of ‘flight, rearrest, parole violation … based on data from other people with characteristics similar’ (Eckhouse et al., 2019, p. 3). In summary, it is clear that subjective risk assessments have increasingly given way to mechanical and actuarial tools that produce statistical models based on extensive criminal offending datasets, perceived to be more objective than human observation (Carlson, 2017).
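A minimal sketch of the mechanics just described (weighted factors, a summed score, a band), in Python, with wholly hypothetical factor names, weights and cut-offs; no published instrument's items or norms are reproduced.

```python
# Hypothetical weighted risk factors, summed score and banding.
# Factor names, weights and cut-offs are invented for illustration.

WEIGHTS = {
    "prior_offences": 1.0,          # one point per recorded prior
    "age_under_25": 2.0,            # hypothetical static factor
    "supervision_breaches": 1.5,    # hypothetical dynamic factor
}

def score(individual: dict) -> float:
    """Weighted sum of the individual's values on each risk factor."""
    return sum(w * individual.get(factor, 0) for factor, w in WEIGHTS.items())

def band(s: float) -> str:
    """The higher the score, the higher the assessed risk."""
    return "high" if s >= 6 else "moderate" if s >= 3 else "low"

person = {"prior_offences": 3, "age_under_25": 1, "supervision_breaches": 0}
s = score(person)
print(s, band(s))  # 5.0 moderate
```

Even in this toy form, the questions raised later in this article are visible: who chose the weights, on what data, and could the person scored ever interrogate them?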

Forms of predictive risk assessment are now applied at various decision points in criminal justice and ‘dominate the field of crime and punishment’, such that there is an ‘actuarial turn’ in criminal law (Harcourt, 2005, p. 15). Law enforcement agencies have adopted predictive algorithmic instruments for a number of risk assessment applications. For instance, in England and Wales, the Harm Assessment Risk Tool (HART), described as a ‘random forest algorithm’, is deployed in determining whether individuals should be arrested, charged or diverted (Law Society, 2019, p. 46). Suspects are risk-scored using a variety of other algorithmic tools and matrices that seek to identify propensities for particular forms of offending and future behaviours, while predictive systems are also used to identify risks of victimisation (Law Society, 2019). In the sentencing context, pre-sentence reports routinely inform courts of the offender's risk score (Law Society, 2019).

Predictive tools in action

Recent NSW case law evidences a range of predictive, diagnostic tools including the Level of Service Inventory-Revised (LSI-R), the Risk Assessment Report, the STABLE-2007 tool, the STATIC Risk Factors Actuarial Assessment-Sex Offending (STATIC-99R) and the Risk of Sexual Violence Protocol (RSVP). How can such emergent actuarial/algorithmic tools assist judicial officers in making decisions that impact a defendant's or offender's legal status and liberty? Can criminal procedure decisions be partially or fully automated or, at least, structured by an algorithm?

In State of New South Wales v Barrie (Final) [2018] NSWSC 1005, the court discussed the risk assessment of a high-risk sexual offender in relation to a continuing detention order following the expiry of his custodial sentence. Regarding the likelihood of the offender committing further serious crimes, N Adams J at [69] observed that the assessing psychologist had acknowledged that it was ‘not scientifically possible to accurately predict whether or not a specific offender will or will not actually reoffend’. Indeed, the unpredictability of human behaviour means that those who need to assess risk increasingly make use of various approaches (Jones & Milton, 2016). In Barrie, the psychologist assessed his ‘risk of sexual reoffending using the STATIC-99R, which is an actuarial test applied in predicting the sexual recidivism for individuals charged with or convicted of sexual offences’ (State of New South Wales v Barrie (Final) [2018] NSWSC 1005, N Adams J at [69]), a tool described in another case as having ‘moderate predictive accuracy’ (State of New South Wales v Graham James Kay [2018] NSWSC 1235, Wilson J at [40]). Using the STATIC-99R tool, the offender in Barrie was assessed as a ‘7’, placing him in the high-risk category. The assessing psychologist in this case acknowledged the ‘limitations of this test and stated that a more comprehensive evaluation was obtained by reference to the RSVP (Risk of Sexual Violence Protocol)’ (State of New South Wales v Barrie (Final) [2018] NSWSC 1005, N Adams J at [69]). The RSVP tool is a:

structured professional judgment instrument developed to assist in the identification and management of sexual violence using a range of factors identified by the literature related to sexual offending. It includes 22 static and dynamic factors grouped into five domains. Those five domains are: a history of sexual violence; psychological adjustment; mental disorder; social adjustment; and manageability. She concluded that the defendant presented with risk factors in all but one of those domains (he [did] not have a mental disorder). She state[d] that this suggests that the STATIC-99R assessment of high risk is an accurate reflection of his risk. (State of New South Wales v Barrie (Final) [2018] NSWSC 1005, N Adams J at [69])

Here the psychologist used the RSVP tool, described as the most common Structured Professional Judgement instrument for high-risk sexual offenders, as a means to structure the human risk assessor's process of evaluation rather than replace that person (Jones & Milton, 2016).

In State of New South Wales v Graham James Kay [2018] NSWSC 1235, while evidence was provided from three actuarial tools, the inability to scientifically predict whether the offender would reoffend was again noted, given that the actuarial instruments were based on historical factors. One actuarial assessment instrument used was the Level of Service Inventory-Revised (LSI-R), which identifies an offender's risk of reoffending and their criminogenic needs (Watkins, 2011). The LSI-R is described as a third-generation risk assessment tool that balances static factors with dynamic factors and has the capacity to identify need patterns and profiles of offender groups. It is ‘regarded as a good predictor of general reoffending, but also a modest predictor of violence. Its capacity to predict sexual reoffending is mixed’ (State of New South Wales v Graham James Kay [2018] NSWSC 1235, Wilson J at [39]). Such judicial commentary provides insights into the limitations of these risk assessment tools. Indeed, the LSI-R instrument has been criticised for being framed by specific geographic and temporal inputs (Koepke & Robinson, 2018) and on the basis that offenders do not have homogenous criminogenic needs or profiles (Hsu et al., 2009).

Clearly, much reliance is being placed on actuarial predictive tools in the risk assessment of offenders at many levels of the criminal justice system (Carlson, 2017), and multiple forms of scoring risk are evident in the case law. Nevertheless, there are recognised shortcomings. The assessing psychologist in State of NSW v Dillon (Final) recognised a limitation of instruments including the STATIC-99R: the recidivism estimates and rankings are based on groups of individuals and are not necessarily directly reflective of the particular offender. He told the court:

When comparing group data to individual cases it is important to note that factors and circumstances unique to an individual may not have been captured within the normative group and caution must be exercised when making such a comparison. (State of New South Wales v Dillon (Final) [2018] NSWSC 1626, at [102])

Moreover, the assessing psychologist explained that the tool may predict forms of sexual reoffending that would not actually fall within the ambit of ‘serious’ sexual offences. Such actuarial instruments produce scores that ‘do not differentiate between the severity of offences that might be committed’; for example, they do not distinguish between grievous bodily harm and assault occasioning actual bodily harm (State of New South Wales v King (Final) [2019] NSWSC 151, at [142]). STATIC instruments are also not sensitive to any changing circumstances that may positively or negatively affect the offender's actual risks of reoffending (State of New South Wales v Cook (Final) [2019] NSWSC 51). Nevertheless, findings from tools such as the STATIC-99R are often combined with those from other tools such as the STABLE-2007 to present a composite Risk-Needs-Responsivity (RNR) assessment that focuses on risk management in the structuring of interventions, treatment and support (Eckhouse et al., 2019; Hannah-Moffat, 2013; Moore, 2015; Smid, 2014). The statistical likelihood of reoffending may also be scored through other instruments including the Violence Risk Appraisal Guide-Revised (VRAG-R) and the Violence Risk Scale (Grann & Långström, 2007).
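The group-to-individual problem that the assessing psychologists describe is visible in the basic mechanics of an actuarial lookup, sketched below with an entirely fabricated normative table. It mimics the form, not the content, of instruments such as the STATIC-99R; no real norms are reproduced.

```python
# Fabricated normative table: total score -> proportion of a
# hypothetical development sample with that score who reoffended.

NORMATIVE_RATES = {4: 0.10, 5: 0.14, 6: 0.19, 7: 0.26}

def reported_risk(score: int) -> float:
    """The individual is simply assigned the group's observed rate for
    their score; circumstances unique to the individual never enter."""
    return NORMATIVE_RATES.get(score, float("nan"))

print(reported_risk(7))  # 0.26 -- a statement about the group, not the person
```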

On the face of it, these actuarial or algorithmic instruments appear to offer evidence-based and objective forms of risk assessment. However, a number of critical concerns have been expressed. Jones and Milton (2016, p. 1) note the advantages of statistical methods, especially in dealing with large caseloads, but importantly acknowledge that algorithms provide ‘very little information about the actual individual being assessed’, thus recognising the conflict with the principle of individualised justice. Other studies have found that the margins of error of these actuarial or mathematical methods are large and that the application of group data to an individual cannot be meaningfully undertaken with precision (Hart, Michie, & Cooke, 2007). Several scholars identify the opacity of the algorithm as particularly problematic: the actual algorithm, its inputs or its processes may be protected trade secrets, so that individuals impacted by the algorithmic assessment cannot critique or understand the determination (Carlson, 2017; Hogan-Doran, 2017). For example, is it possible to question the exact weighting applied to various risk factors to understand whether a weighting is excessive or disproportionate to other factors? How can individuals respond to the case brought against them, challenge the accuracy of the algorithm and defend themselves against an adverse determination? At the end of the day, are there decisions or assessments that critically impact individual human lives and liberty that should not be delegated to algorithms? Automated systems and algorithmic assessments have become ‘a largely uncontested aspect’ and de rigueur in criminal procedure (Carlson, 2017, p. 313, quoting Harcourt, 2005), yet they give rise to questions as to exactly who – or what – is now the primary decision-maker (Amoore & Raley, 2017; Hogan-Doran, 2017).

Algorithms and embedded bias

The United States case of State of Wisconsin v Loomis 881 N.W.2d 749 (Wis. 2016) illustrates the challenges in utilising algorithmic risk assessment tools, such as the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), in sentencing procedure (Gordon, 2017; Harvard, 2017). COMPAS, designed by Northpointe (now Equivant), a private for-profit company, is used to assess an offender's criminogenic needs and predict the likelihood of their reoffending (Carlson, 2017; Eckhouse et al., 2019). It is used widely in the United States by various justice agencies ‘to inform decisions regarding the placement, supervision and case management of offenders’ (Northpointe, 2012). Pre-sentencing, Mr Loomis was assessed by COMPAS as presenting a high risk of reoffending and was ultimately sentenced to six years’ imprisonment and five years of extended supervision. He appealed against the severity of the sentence, asserting that the use of the COMPAS tool breached his right to be sentenced according to accurate information, breached his right to an individualised sentence and improperly relied on gendered assessments (Carlson, 2017). On appeal, the Wisconsin Supreme Court held that the defendant's right to due process was not violated by the use of a risk assessment scoring system.

Loomis raises the issue of profiling an individual against a predictive algorithm and concerns regarding the biases embedded within it. There is unease that algorithmic tools may disproportionately score some offenders, particularly those from marginalised communities, as having a higher risk of reoffending, reinforcing dominant social hegemonies, prejudice and inequality (Angwin et al., 2016; Martin, 2017). Broad (2018) examines how the over-policing of black populations in the United States fashions the algorithms used in criminal justice. Eckhouse et al. (2019) examine the layers of embedded bias in statistical risk assessments – how the process of using data from an already racially biased criminal justice system perpetuates bias in the resultant risk scores. In relation to the COMPAS tool used in Loomis, that instrument was found by ProPublica to be ‘nearly twice as likely to inaccurately predict that a Black defendant was at high risk for rearrest as a White defendant’ (Eckhouse et al., 2019, p. 190, quoting Angwin et al., 2016); that is, it can falsely label and misclassify Black defendants as criminals (Carlson, 2017). These conclusions have been disputed (Northpointe, 2016; ProPublica, 2016; Tashea, 2017). Indeed, it has been suggested that bias cannot easily be extracted, as ‘machine learning algorithms need assumptions and bias in order to function’ (Amoore & Woznicki, 2018, np). Perhaps it simply must be recognised that algorithms act ‘in a way that is racist and prejudicial’, and that those are the ‘new ethico-political relations’ in which we now exist (Amoore & Woznicki, 2018, np). Moreover, in Loomis, the risk factor in question was gender, not race. While gendered conceptions of risk present significant issues worthy of critique (Gelsthorpe & Hedderman, 2012; Hannah-Moffat & O’Malley, 2007), Mr Loomis was unsuccessful in arguing that due process was violated by the use of a risk assessment tool in which gender was a factor. Ultimately, the use of COMPAS in Loomis was seen as acceptable given that it was not the sole determinative means of assessing the risk of reoffending. However, the judgement did acknowledge the dangers of placing complete reliance on algorithmic instruments in decisions concerning liberty (Eckhouse et al., 2019).
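The ProPublica finding quoted above is, at bottom, a claim about error rates conditioned on group membership. A sketch of that computation on synthetic records follows; the numbers are invented to mirror the shape of the reported disparity, not its actual figures.

```python
# False positive rate per group: of those who did NOT reoffend, what
# fraction were nonetheless scored high risk? Records are synthetic.

def false_positive_rate(records: list[tuple[bool, bool]]) -> float:
    """records: (scored_high_risk, reoffended) pairs."""
    high_risk_flags = [high for high, reoffended in records if not reoffended]
    return sum(high_risk_flags) / len(high_risk_flags)

group_a = [(True, False)] * 40 + [(False, False)] * 60   # 100 non-reoffenders
group_b = [(True, False)] * 20 + [(False, False)] * 80   # 100 non-reoffenders

print(false_positive_rate(group_a))  # 0.4 -- flagged high risk twice as often
print(false_positive_rate(group_b))  # 0.2
```

A tool can satisfy one fairness metric (for example, equal accuracy across groups) while failing this one, which is partly why the Northpointe and ProPublica analyses reached opposing conclusions.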

Algorithms as proprietary products

Loomis also reveals the impenetrability of algorithms that are subject to proprietary interests or trade secrets (Gordon, 2017). While the offender in Loomis could verify certain inputs to the risk assessment, the concluding score that recommended a significant prison sentence could not be challenged, as the internal structure and formulae of the COMPAS algorithm were based on proprietary information. The court accepted that Northpointe treated COMPAS as a trade secret and that it did not need to disclose how risk scores or factors were determined and weighed (Carlson, 2017). Therefore, Mr Loomis was not entitled to access the formulae or factors that were determinative in his high-risk score and his significant punishment. As the Honourable Justice Martin (2017, p. 22) argues, ‘this is entirely inconsistent with the common law requirement that a decision maker must expose his or her reasoning’.

Profiling an individual's behaviour and characteristics against meta-data is also problematic, as the process is largely invisible and its validity is difficult to challenge. Risk assessment tools are constantly evolving and must be monitored and ‘re-normed for accuracy due to changing populations and subpopulations’ (Loomis, p. 769). According to Hogan-Doran (2017, p. 13, quoting Goodfellow, Bengio, & Courville, 2017), the internal structure of algorithms, especially those based on deep learning, is virtually impossible to decipher given the cascade of layers of processing units. This leads to the ‘black box’ problem, where the inputs and outputs may be clear but the process remains opaque.

Eckhouse et al. (2019, p. 16) critique the Loomis decision as failing to address the trade-secret protection of the COMPAS instrument, thereby preventing ‘judges, defendants, and researchers from vetting the algorithms and evaluating the fairness’. Carlson (2017, p. 329) argues that the use of risk assessment tools such as COMPAS in criminal justice should be subject to the ‘same transparency requirements as other government agencies’ instead of protecting the commercial interests of the private vendor. Interestingly, the Supreme Court in Loomis held that future risk assessments using COMPAS must disclose the proprietary nature of the instrument and the fact that its scores are based on group data from a national sample rather than on the particular individual. Whether that warning will successfully encourage a degree of scepticism in judges regarding the use of risk assessment instruments is debatable, given the broad endorsement of these tools throughout the criminal justice system and the efficiency pressures placed on the judiciary (Harvard, 2017). Clearly, the algorithmic instruments used to predict risks of reoffending are largely sealed, secret and autonomous. In this way, it is argued that the proprietorial nature of algorithms created by private organisations challenges the fundamental principles of procedural justice, particularly open justice and individualised justice.

Technologised procedural justice

As a range of technologies intrude upon and transform legal procedure and practice, there is an imperative to scrutinise the ethics of using algorithmic tools in the context of the fundamental principles of procedural justice. Is it possible to reconcile new technologies with traditional common-law principles, professional ethics and human rights (AHRC, 2018; Law Society, 2019; Martin, 2017)? When risk assessment tools are analysed through the lens of procedural justice principles, it quickly becomes evident how algorithmic instruments and automated systems that make decisions about people's lives and liberty may compromise procedural justice.

Procedural justice, a slippery term that encapsulates fairness and due process, derives from natural justice, and its elements include open justice, equality before the law, the presumption of innocence and the right to hear and answer a case brought by the state (Bronitt & McSherry, 2017; McKay, 2018; Mulcahy, 2013). According to open justice, criminal proceedings should be subject to public oversight as a means to counteract abuses of power and to promote transparency, accountability and, ultimately, the rule of law (Resnik, 2015). Open justice may be undermined when defendants, courts and society are denied oversight of the algorithmic tools that are used in determining a defendant's legal status and liberty. Such tools need to be ‘testable and contestable’ (Hildebrandt, 2018, p. 34). Equality of arms is a key principle of procedural justice, meaning that the defendant should not be at a disadvantage compared with the prosecuting state; that is, there should be a level playing field (Roberts & Zuckerman, 2010). Of course, in criminal procedure, that principle represents the ideal rather than the reality; nevertheless, it is further challenged in situations where the prosecution uses inscrutable algorithmic tools and undisclosed input data against a defendant. The presumption of innocence is the golden thread that runs through the criminal justice system to ensure that the legal onus of proof remains on the prosecution (Woolmington v DPP [1935] AC 462). If there is no way of proving or disproving an algorithm's formulae or methodology, the burden of proving a case against a defendant beyond reasonable doubt seems compromised. Additionally, both the presumption of innocence and the principle of individualised justice (Martin, 2017) are potentially undermined when an individual defendant is assessed against aggregate group data. Finally, the hearing rule, audi alteram partem, requires that the defendant be enabled to hear and comprehend the case being brought against them (Butt & Hamer, 2011), yet the use of secret proprietary information against a citizen is at odds with this right.

These issues can also be assessed through the lens of human rights principles (AHRC, 2018; Aletras, Tsarapatsanis, Preoţiuc-Pietro, & Lampos, 2016; Pasquale & Cashwell, 2018; ICCPR). For instance, there is the potential for algorithmic instruments to violate human rights, specifically equality before the courts and the right to a fair and public hearing before a competent, independent and impartial tribunal. Other human rights measures that may be challenged by AI include the presumption of innocence, general procedural fairness requirements including the presentation of an understandable case against the defendant, and protection from discrimination. A report in England and Wales highlights these challenges to procedural justice, as well as embedded bias, lack of scrutiny and the disregard of individual contextual factors (Law Society, 2019). Procedural safeguards therefore need to be considered before, rather than as an after-thought to, any cautious implementation of algorithmic assessments that ‘score’ a person and determine their liberty or legal status. Procedural justice demands that persons affected by an algorithmic determination be enabled to unpack and contest the decision (Keats Citron & Pasquale, 2014). However, the suppression of algorithms’ operation and structure means that courts, the judiciary, the legal profession and defendants are denied the ability to comprehend and contest the decision-making process. Criminal justice does appear to be a realm where care must be taken to keep a ‘human in the loop’ as the ‘proper figure’ and critical safeguard (Amoore & Raley, 2017, p. 7) of procedural values.

Impacts on legitimacy and decision-makers

The diminution of procedural justice values can be seen as ultimately undermining the legitimacy of the criminal process. A 2019 Data61 CSIRO Discussion Paper identified a range of core AI principles including ‘Generates net-benefits’, ‘Do no harm’, ‘Regulatory and legal compliance’, ‘Privacy protection’, ‘Fairness’, ‘Transparency & Explainability’, ‘Contestability’ and ‘Accountability’ (Dawson et al., 2019, p. 6). What is missing is the principle of legitimacy that underpins society's voluntary compliance with, and trust in, the law. Legitimacy is premised on a ‘moral authority, which in turn depends on law's ability to justify its requirements’ (Stern, 2018, p. 4, quoting Raz, 1979; see also Sheppard, 2018; McKay, 2019). It is associated with the rule of law, which, amongst other things, requires legal decision-makers to have authority to determine outcomes and for the decision-making process to be examinable and contestable (Oswald, 2018). In addition, the rule of law provides for legal certainty, as well as checks and balances that enable citizens to enforce fundamental rights (Hildebrandt, 2018). To fulfil legitimacy, the law must be ‘accessible and so far as possible intelligible, clear and predictable’ (Gordon, 2017, p. 2, quoting Lord Bingham, 2006). However, it is clear that when responsibility for a decision is delegated to an algorithmic instrument, that delegation can render decision-making opaque, inscrutable and incontestable (Hildebrandt, 2018). On this basis, this article has sought to demonstrate some of the ways in which the legitimacy of criminal procedure is challenged when decision-making authority is ‘ceded to the algorithm’ (Stern, 2018, p. 3).

Can the mere existence of a risk assessment score, even as a complementary methodology, sway a human decision-maker? Eckhouse et al. (2019) suggest yes: a predictive risk score can influence a judge's decision by focusing attention on potential recidivism over and above other relevant factors. For example, Carlson (2017) discusses a case where the COMPAS score was so high that the sentencing judge overturned the plea deal and sentenced the offender to two years, while acknowledging that, without the risk assessment, he would have imposed only a one-year sentence. In addition, risk-averse judges may rely on risk scores as a means to deflect blame onto the algorithm; in effect, the risk assessment tool acts to ‘de-responsibilize decision-makers’ (Carlson, 2017; Eckhouse et al., 2019; Harcourt, 2010, p. 2). Other studies suggest that it would be unusual for a judge to defy the algorithmic conclusion (Christin et al., 2015; Harvard, 2017). Indeed, given the proprietorial claims made by vendors of algorithmic tools, it would be impossible for a judge to meaningfully critique or challenge the risk assessment. Ultimately, judges can be ‘questioned and rebuked for discriminatory behaviour’ (Pasquale & Cashwell, 2018, p. 66) ‘whereas an algorithm subtly premised on biased data … could remain virtually immune from criticism’ (Stern, 2018, p. 6). While earlier in this article I discussed criticisms of judicial decision-makers and their perceived human ‘subconscious bias’ (Stobbs et al., 2017, p. 262), embedding algorithmic unconscious processes into decision-making does not address transparency concerns; it serves only to replace human intuition with mechanical inscrutability and incontestability (Hildebrandt, 2018; Oswald, 2018; Stern, 2018). Extrapolating from Ericson and Haggerty’s (1997) conception of the scientisation of policing discussed earlier, the uptake of algorithmic instruments in criminal procedure is, perhaps, a step towards the scientisation of judicial functions.

Conclusion

With the potential for algorithmic instruments to ‘deeply change the nature of the evolution of the law’ (Law Society, 2019, p. 4), there is a recognised need for responsible, accountable and ethical algorithmic design (Kroll, 2015). There are also suggestions that, instead of placing reliance on the private commercial sector, governments ought to develop their own actuarial and algorithmic instruments (Carlson, 2017). But in countries such as Australia, where the separation of powers between executive and judicial functions is valued, that could lead to an unacceptable blurring of that separation. Certainly, various scholars support the idea of a regulatory body to oversee and audit algorithms and thereby ensure transparency, accountability and procedural justice (Balkin, 2017; Hogan-Doran, 2017; Pasquale, 2017). For example, in England and Wales, a National Register of Algorithmic Systems has been recommended (Law Society, 2019). Scholars also argue that where private, commercial organisations are involved in essential public functions, their products should be subject to public, democratic disclosure and freedom of information requirements (Carlson, 2017; Keats Citron & Pasquale, 2014).

Criminal justice is a human institution – it is focused on human behaviours and human harms and has, traditionally, resolved human transgression in a communal fashion. While traditional, non-technologised procedure cannot be valorised, the legitimacy of the criminal justice system, and particularly of the coercive power of the state to punish, imprison and supervise offenders, has been premised on open justice, a system that aspires to accountability, impartiality and transparency. The incursion of secret algorithms devised by the private for-profit sector into the public duties of judicial officers challenges the presumed independence of judicial functions. While algorithmic instruments may be useful and complementary predictive tools, they have no role as a sole or final arbiter. To invoke proprietorial protections and financial interests is to prohibit defendants, courts and the community from scrutinising the validity and reliability of predictive formulae used in deterministic criminal procedures. This introduces into decision-making an element that is even more opaque than any exercise of judicial discretion. At least an imperfect decision by a judge may be tested on appeal, whereas an imperfect algorithm may be forever concealed.

Acknowledgement

The author thanks the two reviewers for their constructive feedback on the original version of this article. In addition, this paper benefited from feedback and discussions at the ‘Artificial Intelligence and the Law’ Conference, Geneva Law School, January 2019. The author gratefully acknowledges the support of the Universities of Geneva and Sydney, Renmin University and Harvard University which made those discussions possible.

Disclosure statement

No potential conflict of interest was reported by the author.

Case Law

  • Bugmy v The Queen (2013) 249 CLR 571; [2013] HCA 37
  • Dennis v United States 339 US 162 (1950)
  • Elias v The Queen (2013) 248 CLR 483
  • Haidy v DPP [2004] VSC 247
  • Hili v The Queen (2010) 242 CLR 520
  • Hudson v The Queen (2010) 30 VR 610; 205 A Crim R 199; [2010] VSCA 332
  • Lynn v State of New South Wales [2016] NSWCA 57
  • Markarian v The Queen [2005] HCA 25
  • Muldrock v The Queen [2011] HCA 39
  • Munda v Western Australia (2013) 249 CLR 600
  • R v Agang; R v Bajwa; R v Ghanem [2017] NSWSC 138
  • R v Elzamtar [2017] NSWSC 275
  • R v Fernando (1992) 76 A Crim R 58
  • State of New South Wales v Barrie (Final) [2018] NSWSC 1005
  • State of New South Wales v Ceissman [2018] NSWSC 508
  • State of New South Wales v Cook (Final) [2019] NSWSC 51
  • State of New South Wales v Dillon (Final) [2018] NSWSC 1626
  • State of New South Wales v Graham James Kay [2018] NSWSC 1235
  • State of New South Wales v King (Final) [2019] NSWSC 151
  • State of New South Wales v Pacey (Final) [2015] NSWSC 1983
  • State of Wisconsin v Loomis 881 N.W.2d 749 (Wis. 2016)
  • Veen v The Queen (No 1) (1979) 143 CLR 458
  • Veen v The Queen (No 2) (1988) 164 CLR 465
  • Wong v The Queen [2001] HCA 54
  • Woolmington v DPP [1935] AC 462

Legislation and Conventions

  • Bail Act 2013 (NSW)
  • Crimes (Administration of Sentences) Act 1999 (NSW)
  • Crimes (Administration of Sentences) Regulation 2014 (NSW)
  • Crimes (High Risk Offenders) Act 2006 (NSW)
  • International Covenant on Civil and Political Rights (ICCPR) opened for signature 16 December 1966, 999 UNTS 171 (entered into force 23 March 1976)
  • Terrorism (High Risk Offenders) Act 2017 (NSW)

References
