
Re-Examining the Functional Form of the Certainty Effect in Deterrence Theory

Pages 712-741 | Published online: 14 Jun 2011
 

Abstract

In this paper we explore the functional form of the risk-certainty effect for deterrence. Using a sample of serious youth offenders, we first estimate a simple linear model of the relationship between the perceived certainty of punishment and self-reported offending. Consistent with previous literature, we find evidence of a moderate deterrent effect. We then examine whether, as a linear model implies, the effect of perceived risk is truly constant across the risk continuum. Estimating a nonparametric regression model that makes no a priori assumption about functional form but instead allows the data to determine it, we find marked departures from linearity. Our examination shows evidence of both a tipping effect, whereby perceived risk deters only once it reaches a certain threshold (between an estimated risk of .3 and .4), and a substantially accelerated deterrent effect for individuals at the high end of the risk continuum. Perceived sanction threats did, however, have a non-trivial deterrent effect within the mid-range of risk. The implications of our findings for both theory and additional research are discussed.

Acknowledgments

The project described was supported by funds from the following: Office of Juvenile Justice and Delinquency Prevention, National Institute of Justice, John D. and Catherine T. MacArthur Foundation, William T. Grant Foundation, Robert Wood Johnson Foundation, William Penn Foundation, Centers for Disease Control and Prevention, National Institute on Drug Abuse (R01DA019697), Pennsylvania Commission on Crime and Delinquency, and the Arizona Governor’s Justice Commission. We are grateful for their support. The content of this paper, however, is solely the responsibility of the authors and does not necessarily represent the official views of these agencies.

Notes

1. And these data probably underestimate differences in clearance rates by crime type. Estimated clearance rates are artificially inflated because the denominator consists only of crimes known to the police, and the police cannot possibly know of all crimes. However, the proportion of crimes of physical victimization known to the police likely exceeds the corresponding proportion for crimes that entail no physical victimization. The former are more serious, and by definition there is at least one witness (the victim) to the crime.

2. This representation makes two simplifying assumptions. First, it assumes the benefits are not contingent on avoiding detection. This assumption can be relaxed as follows: (1 − p)U(Benefits) − pU(Costs) > 0. In this latter case, the utility of the benefits is offset against the utility of the costs only if the offender avoids detection. Second, the model assumes all costs from offending are contingent on apprehension by authorities. This need not be true. For example, Grasmick and Bursik (1990) observed that individuals can feel ashamed or embarrassed about committing a crime even if they are not officially sanctioned. Williams and Hawkins (1986) termed this “stigma from the act.” While potentially important in other contexts, these two assumptions are inconsequential for present purposes.
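The baseline decision rule, U(Benefits) − p·U(Costs) > 0, can be illustrated numerically. This is a minimal sketch; the utility values below are hypothetical and chosen only to show where the inequality flips sign as perceived risk p rises:

```python
def net_utility(p, u_benefits, u_costs):
    """Expected net utility of offending: U(Benefits) - p * U(Costs).
    Offending is chosen only when this quantity is positive."""
    return u_benefits - p * u_costs

# With hypothetical utilities of 10 (benefits) and 40 (costs),
# the inequality flips once p exceeds 10/40 = 0.25.
assert net_utility(0.10, 10, 40) > 0   # low perceived risk: offending "pays"
assert net_utility(0.40, 10, 40) < 0   # higher perceived risk: deterred
```

Under this model the deterrent effect of p is linear by construction, which is exactly the assumption the paper's nonparametric analysis relaxes.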

3. This area of research overlaps with and is sometimes called Behavioral Economics.

4. Consider a further example, adapted from Gonzalez and Wu (1999). Suppose a researcher must decide whether to improve a manuscript submission with additional analyses that will take one week to complete. The contemplated analyses are expected to improve the chances of publication by 10 percentage points. Would the researcher be any more willing to do the additional work if they believed the probability of success without the extra analyses was .90 than if they believed that probability was .40? Or what about 0 vs. .40? In the former instance, the result is a certain publication. In the latter, the chances improve from no chance to at least some chance.

5. According to Prospect Theory, decisions under uncertainty entail two phases (Kahneman & Tversky, 1979). In the first, or editing, phase, the various behavioral prospects confronting an individual are given a preliminary examination and the available options are filtered and simplified. In the second, evaluation, phase, the decision-maker selects the option he or she values most. One consequence of the simplification of options that occurs during the editing phase is that the person may “discard events of extremely low probability and…treat events of extremely high probability as if they were certain” (Kahneman & Tversky, 1979, p. 282). In other words, Prospect Theory anticipates that people have limited ability to discriminate at both the low and high ends of probability judgments, with the result that “highly unlikely events are either ignored or overweighted, and the difference between high probability and certainty is either neglected or exaggerated” (Kahneman & Tversky, 1979, pp. 282-283).
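The distorted treatment of extreme probabilities described in this note is often modeled with an inverse S-shaped probability weighting function. The one-parameter form below, w(p) = p^γ / (p^γ + (1 − p)^γ)^(1/γ), is the specification studied by Tversky and Kahneman (1992) and examined by Gonzalez and Wu; it is not used in this paper and appears here only as an illustration (γ = 0.61 is Tversky and Kahneman's 1992 estimate):

```python
def w(p, gamma=0.61):
    """One-parameter probability weighting function (inverse S-shape):
    overweights small probabilities and underweights large ones."""
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)

# Small probabilities are overweighted, large ones underweighted,
# while the endpoints 0 and 1 are treated accurately:
assert w(0.01) > 0.01
assert w(0.90) < 0.90
assert w(0.0) == 0.0 and w(1.0) == 1.0
```

The sharp curvature of w(·) near 0 and 1 is one formalization of why the gap between a high probability and certainty can loom larger than the same gap in the middle of the scale.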

6. The 17 self-report items include: (1) destroyed or damaged property, (2) set fire to house, building, etc., (3) rape, (4) murder, (5) shot at someone, (6) beaten up someone badly, (7) been in a fight, (8) beaten up, threatened or attacked someone as part of a gang, (9) entered/broken in a building to steal, (10) used checks/credit cards illegally, (11) stolen a car/motorcycle to keep or sell, (12) prostitution, (13) stolen something from a store, (14) stolen a car/motorcycle to keep or sell, (15) carjacked someone, (16) taken something from another by force (w/weapon), and (17) taken something from another by force (w/out weapon).

7. An alternative to defining SRO with a binary indicator would have been to treat offending as a count variable, using the total SRO frequency during each period. However, this strategy would have introduced several key problems. First, in each observation period after the first, the median frequency was zero; overall, the upper quartile of the SRO distribution is 2 and the 90th percentile is 8, suggesting that much of the offending activity is driven by a decreasing number of offenders in the upper half to third of the distribution. Second, there are some extreme (and perhaps unrealistic) outliers, as is evident from the large maxima and standard deviations of the reported frequency distributions. In combination, these factors produced such extreme skew that the mean was unlikely to be an appropriate summary measure for the subsequent analyses we employ. All of these issues are avoided by simply treating the SRO measure as binary.
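The skew problem described in this note can be sketched with hypothetical counts; the values below are invented to mimic the pattern reported (median zero, a few extreme outliers), not the study's actual data:

```python
from statistics import mean, median

# Hypothetical SRO frequency counts exhibiting the skew pattern described:
# a median of zero with a few extreme outliers driving the mean upward.
counts = [0] * 60 + [1] * 15 + [2] * 10 + [5] * 8 + [8] * 5 + [200, 350]

assert median(counts) == 0        # the typical respondent reports no offending
skewed_mean = mean(counts)        # pulled far above the typical value (6.65 here)
binary = [1 if c > 0 else 0 for c in counts]   # robust participation indicator
```

The binary indicator discards the outlier-driven tail while preserving the participation information the analysis actually needs.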

8. Since the same individual can contribute multiple observations, we have cluster-corrected the standard errors throughout the analyses.

9. We use the term marginal effect to literally refer to the slope of (or change in) the functional form of the relationship between expected SRO and perceived risk. We do not mean to imply that a reader should take this “effect” as having any type of causal interpretation in our specification.

10. For example, including a squared term would imply that the shape of the function is parabolic.

11. GAM is a preferable nonparametric estimation method because it provides a significance test of whether the smooth fit differs from a flat line.

12. Whereas a standard parametric linear model assumes the expected value of outcome Y given predictor X has the linear form E[Y|X] = β0 + β1X, the GAM generalizes the linear model by modeling E[Y|X] = s0 + s1(X), where s1(·) is any smooth function, here estimated via smoothing splines. Since our outcome is binary, the present analysis uses a version of the estimator incorporating a logit link function.
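The contrast between a fixed linear form and a data-driven smooth can be sketched with a toy locally weighted estimator. This is a stand-in for the smoothing-spline GAM actually used; the Gaussian kernel, bandwidth, and synthetic "tipping" process below are illustrative assumptions, not the paper's estimator or data:

```python
import math
import random

def local_mean(x0, xs, ys, bandwidth=0.1):
    """Kernel-weighted mean of a binary outcome near x0 (Nadaraya-Watson).
    Unlike a linear model, the fitted curve can bend, flatten, or tip."""
    weights = [math.exp(-((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

random.seed(0)
xs = [random.random() for _ in range(5000)]
# Synthetic tipping process: offending probability is flat (0.5) below
# risk = 0.35 and declines steadily above that threshold.
ys = [1 if random.random() < (0.5 if x < 0.35
                              else max(0.0, 0.5 - 0.7 * (x - 0.35))) else 0
      for x in xs]

# The smoother recovers the flat-then-declining shape a straight line misses:
low, mid, high = (local_mean(x0, xs, ys) for x0 in (0.1, 0.3, 0.8))
```

Because nothing in `local_mean` constrains the fitted values to lie on a line, a threshold in the underlying process shows up in the estimate rather than being averaged away.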

13. As is typical of many nonparametric estimators, over-fitting or under-fitting may be an issue. The R package GAM has a built-in measure that uses cross-validation to determine the optimal fit.

14. Some conceptual issues arise when considering individuals who report zero risk. For instance, some may report zero risk simply because they intend not to offend; these individuals might be thought of as “out-of-market” offenders, at least for the current period. Alternatively, other zero-risk reports may reflect an “experiential effect”: some very high-rate offenders (accurately) perceive that there is virtually no chance they will get caught, but this is not enough to actually deter them (i.e. for all intents and purposes, sanction threat perceptions do not matter to them). These individuals are in fact not deterrable, and thus will not contribute to any potential tipping effect. The inclusion of either set of individuals may therefore distort the true “offender” functional form. To address this directly, we exclude the n = 516 individuals who report zero risk from the GAM estimate, resulting in an estimation sample of N = 8,932. Note, however, that (1) this exclusion is done merely so as not to distort the estimated functional form, and, most importantly, (2) the conclusions would remain fundamentally unchanged if these individuals were included, given that we use a locally weighted estimator. Specifically, all subsequent results, including the location of the tipping point and the differential marginal effects in different regions of the risk continuum, are independent of this omission.

15. Two technical points regarding estimation should be mentioned. First, a standard linear model is as suitable as a GAM in this case, since all we need to test is whether the slope differs from flat (i.e. a null hypothesis of no association); we use the former strictly for parsimony. Second, the standard errors are again cluster-corrected at the individual level. Given the large amount of data, we should have ample statistical power to estimate local linear models and still detect statistically significant local effects; thus, any test in which we retain the null hypothesis is unlikely to reflect a Type II error.

16. Note that the zero-risk individuals are again removed in accordance with the above results, although this “correction” is once again ultimately trivial: we are searching for the point at which the slope turns negative, and the inclusion of these zero-risk individuals would clearly not influence that, as is visible from the figure.

17. We might be able to narrow the local region containing the tipping point even further (e.g. restrict the range to, say, .3–.35); however, insufficient statistical power precludes this.

18. For readers unfamiliar with logit marginal effects, this quantity is computed as ∂P(SRO|risk)/∂risk, or in other words, the change in the probability of offending with respect to a change in risk. It may be thought of as measuring the slope of the function, and it is used here because it is a more contextually intuitive alternative to reporting the results of the logit model as odds ratios.
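For a logit model with index β0 + β1·risk, this derivative has the closed form β1·p·(1 − p), where p is the fitted probability. A small sketch with hypothetical coefficients (β0 = 1, β1 = −3 are purely illustrative, not the paper's estimates):

```python
import math

def logit_p(risk, b0=1.0, b1=-3.0):
    """P(SRO = 1 | risk) under a logit model with hypothetical coefficients."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * risk)))

def marginal_effect(risk, b0=1.0, b1=-3.0):
    """dP/drisk = b1 * p * (1 - p): the slope of the probability curve."""
    p = logit_p(risk, b0, b1)
    return b1 * p * (1 - p)

# The marginal effect varies across the risk continuum even though b1 is
# constant, which is why it conveys more than a single odds ratio would.
me_mid = marginal_effect(0.33)   # near the middle of the probability curve
me_hi = marginal_effect(0.90)    # in the tail, where the curve flattens
```

As a sanity check, the closed form agrees with a numerical derivative of `logit_p`, and the effect is largest where p is near .5.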
