Plea Discounts or Trial Penalties? Making Sense of the Trial-Plea Sentence Disparities

Pages 1226-1249 | Received 09 Apr 2018, Accepted 19 Nov 2018, Published online: 10 Jan 2019
 

Abstract

There is a consensus that defendants who plead guilty generally receive less harsh sentences than similarly-situated defendants convicted at trial. However, there is less consensus on how to characterize this disparity in the sentence. Some researchers refer to the disparities as “trial penalties,” whereas others refer to them as “plea discounts.” We contend that the two terms have different theoretical backgrounds and underlying assumptions. As a result, the theories require different modeling strategies, and can lead to different predictions on the relationship between the disparity and some key case characteristics.

We start by differentiating the two perspectives theoretically. We then present an empirical analysis of defendants in New York State to substantiate our theoretical arguments. We demonstrate that the estimates of the trial-plea disparities depend on which mode of disposition is assumed to be the default, as estimates of the trial penalties differed considerably from the estimates of the plea discounts.

Acknowledgement

These data are provided by the New York State Division of Criminal Justice Services (DCJS). The opinions, findings, and conclusions expressed in this publication are those of the authors and not those of DCJS. Neither New York State nor DCJS assumes liability for its contents or use thereof.

The authors thank Jeffery Ulmer, the three anonymous reviewers, and Cassia Spohn for their comments and insights, and thank Jason Walker for his help in copyediting.

Disclosure Statement

No potential conflict of interest was reported by the authors.

Notes on Contributors

Shi Yan is an assistant professor in the School of Criminology and Criminal Justice, Watts College of Public Service and Community Solutions, Arizona State University. His research interests include sentencing, plea bargaining, and criminal careers.

Shawn D. Bushway is a professor of public administration and policy in the Rockefeller College of Public Affairs and Policy at the University at Albany, SUNY. His current research interests include the relationship between work and crime, plea bargaining, and the process of desistance/dynamic change in offending behavior.

Notes

1 In this paper, we refer to the differences in sentence as the trial-plea disparities. We only use the terms “trial penalties” and “plea discounts” when we mention the specific theories and modeling strategies.

2 In reality, prosecutors and defense attorneys are likely to make decisions on a case-by-case basis. There are cases in which the parties seek trials from the beginning, and cases in which both parties aim at pleading out (Heumann, 1978; Ulmer, 1997). A more accurate interpretation of “default” refers to the overall pattern or proportion, that is, the “standard practice” by which most cases are disposed of.

3 For more technical details of the shadow theory, see Bushway et al. (2014).

4 By definition, “trial penalties” and “plea discounts” each apply only to defendants who did not take the default path. If trials are the default, then defendants who plead receive plea discounts, and those convicted at trial just receive the default sentence. Conversely, if defendants who plead receive the default sentence, then those convicted at trial receive trial penalties. Under this view of the default sentence, we cannot observe both trial penalties and plea discounts for the same defendant. Instead, we have to estimate the trial penalties and plea discounts on two different groups of defendants.

5 Of course, plea bargaining dynamics may change with the practice of local prosecutor offices, and inexperienced or uncooperative defense attorneys (and sometimes prosecutors) may consistently fail to predict the outcome of cases they handle. We were unable to test the extent of these issues, since our dataset does not contain information on the stability of policy or the length of working relationships among courtroom actors. Nevertheless, multiple previous studies suggested that local courtroom communities tended to be reasonably stable, and courtroom actors were adaptive to local policy changes (Heumann, 1978; Ulmer, 1997). Therefore, we do not believe the issues would substantively affect the findings and arguments of the paper.

6 The reason is that criminal statutes often specify a narrow range of sentences for a given charge. Even when the law prescribes a wide range of sentence options (such as the Penal Law of New York State; for details, see Yan, Bushway, & Redlich, 2018), courtroom actors often share local norms and “going rates” (Eisenstein & Jacob, 1977; Nardulli et al., 1988; Ulmer, 1997). The practice of “judge shopping” also suggests that the parties are aware of the potential sentences. In contrast, the standard of “beyond a reasonable doubt” can be more ambiguous. Psychological studies suggest that a variety of cognitive factors may distort the assessment of guilt, particularly in jury trials (Gould & Leo, 2010). This is less the case in assessing the sentence.

7 Specifically, depending on the mode of conviction, a variable can have different correlations with the sentence. For example, it is quite common for prosecutors to withhold prior convictions from the sentence recommendation in exchange for guilty pleas (Heumann, 1978). Therefore, for defendants who plead guilty, each prior conviction may be correlated with a smaller sentence enhancement (i.e., have a smaller regression coefficient) than for defendants convicted at trial. The same may be true for the type and severity of the initial charge (if charge bargaining is prevalent) and even for extralegal defendant characteristics. If the sentence at trial were modeled together with the sentence at plea, important differences in the sentence-generating processes would be masked. This problem can be partially resolved by adding an interaction term between trial/plea and each regressor to the model. However, doing so will likely result in a model with too many regressors, and a table that is too complicated to interpret.

8 Another piece of evidence supporting this argument is that among comments on individual cases, the term “trial penalty” was most likely to appear when defendants charged with less serious crimes received harsh penalties at trial (Alschuler, 1968; Kim, 2015; McCoy, 2005). In contrast, few researchers have called it a “trial penalty” when defendants charged with serious crimes received harsh sentences at trial. The only exception to the latter may be the circumstances where defendants charged with murder received the death penalty at trial after they declined a plea offer. Yet those circumstances are beyond the scope of the current paper.

9 For example, Abrams (2011) used the length of tenure of judges as an instrument to estimate the effect of pleading on the sentence, but acknowledged that the instrument was weak (p. 217). Researchers were able to measure some of the psychological factors mentioned here using experimental designs (see Redlich, Bibas, Edkins, & Madon, 2017). However, this approach has known external validity issues, as the vast majority of studies used vignettes in which the sentence was hypothetical and manipulated. The link between the unobserved defendant characteristics and the “benefits” of pleading guilty in real-life cases remains largely unanswered in the literature.

10 Ulmer (1997, p. 91) indicated that in one of the counties studied, “[t]he supervisor of the sexual assault team indicated that his ADAs engaged in plea negotiations (with frequent charge reductions) more often because evidence in such cases was typically weak (e.g., reluctance of victims to testify in court, indefinite physical evidence, lack of witnesses). On the other hand, the supervisor of the drug team said that his ADAs seldom engaged in plea negotiations…”

11 This does not necessarily imply that any incident that falls under one crime type is more serious and blameworthy than every incident under another type. For example, although some homicides (such as euthanasia) are more justifiable than some robberies, the average homicide is more serious than the average robbery. Therefore, we believe we can reasonably establish some hierarchy of the severity of crimes.

12 The sample did not suffer from major missing data problems. Less than 0.1% of the sample had missing arraignment charges; for these defendants, we used their arrest charges (which had no missing values) as their arraignment charges. Approximately 0.8% of the sample had race coded as “unknown” or “missing”; we combined these defendants with defendants who were Asian (1.5%) or of other race (0.6%) into one category, “other or missing race” (see Table 2 below). A total of 23 defendants had missing gender, and we dropped them from the analysis. There were no missing values for the county or year of disposition.

13 Sentence exposure refers to the maximum sentence a defendant can receive for the top charge. This approach allowed us to take into account the variation in severity within a crime type. For example, robbery 1st degree had a statutory maximum of 25 years, whereas robbery 3rd degree had 7 years. There were 14 different statutory classes of felonies in New York, and we used the statutory maximum sentence for each class. For details on the statutory exposure for each crime type, see McKinney's Consolidated Laws of New York Annotated, 2012 edition. We also tested the average sentence received by defendants convicted at trial as an alternative measure of crime severity at the crime level, and the results were similar.

14 The courtroom workgroup theories were proposed at the individual courtroom level (sometimes even specific to individual judges), and ideally, researchers need to estimate county-specific models. However, the analysis we conducted required a large number of trial cases, and none of the counties had a sufficiently large trial sample. To partially address this issue, we conducted a robustness check using only cases processed in New York City. See the Results section for details.

15 In the plea discount model, some crime types were even predicted to have a negative average trial sentence—which can be conceptually perceived as some alternative sanctions. According to Equation 3, this will result in plea discount estimates greater than one. If the trial sentence is lower than the plea sentence, but still positive, the resulting plea discount will be negative.
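
The arithmetic in note 15 can be illustrated with a minimal sketch. Equation 3 is not reproduced in this section, so the formula below is an assumption: it takes the plea discount to be the proportional reduction relative to the trial sentence, (trial − plea) / trial, which is consistent with the behavior the note describes. The function name `plea_discount` and the sentence values are hypothetical and are not drawn from the New York data.

```python
def plea_discount(trial_sentence: float, plea_sentence: float) -> float:
    """Proportional plea discount relative to the trial sentence.

    ASSUMPTION: this form is inferred from the description in note 15;
    it is not copied from the paper's Equation 3.
    """
    if trial_sentence == 0:
        raise ValueError("a trial sentence of zero leaves the ratio undefined")
    return (trial_sentence - plea_sentence) / trial_sentence


# Hypothetical sentences in years, chosen only to show the three regimes.
typical = plea_discount(trial_sentence=10.0, plea_sentence=6.0)
negative_trial = plea_discount(trial_sentence=-2.0, plea_sentence=4.0)
lower_trial = plea_discount(trial_sentence=4.0, plea_sentence=6.0)

print(typical)         # between 0 and 1: the plea sentence is shorter
print(negative_trial)  # greater than 1: predicted trial sentence is negative
print(lower_trial)     # negative: trial sentence is positive but below the plea sentence
```

Under this assumed form, a discount above one or below zero signals that the predicted trial sentence for a crime type is implausible rather than that the discount itself is large or small.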

16 This can be particularly likely if defendants were detained before they pleaded guilty. Unfortunately, the dataset does not contain information on pretrial detention, and we could not further validate this explanation.

17 With a sample size around 40, it was very difficult to find significant relationships. Yet two problems could arise if we attempted to increase the sample size here (i.e., having more crime types in the analysis). One was that the dataset had a limited number of categories (approximately 50) to begin with. The other and the more problematic one was that if we increased the number of crime types, more crime types may have had unreasonable plea discount and/or trial penalty estimates because of the lack of trial conviction cases.

18 Even though the slope estimate here was much larger than that in the plea discount model, it is noteworthy that plea discounts varied below one in theory (and between -0.5 and 1 in the analytic sample), whereas trial penalties did not have a theoretical upper limit. Therefore, we caution readers against concluding that there was stronger support for the trial penalty perspective only because the coefficient was larger.

19 All robustness check results are available upon request from the corresponding author.
