When I was a PhD student in epidemiology, doing a case-control study of pesticides and non-Hodgkin’s lymphoma, the two best-thumbed books on my shelf were those by Breslow and Day (1980) and Schlesselman (1982), the former addressing case-control studies from a statistical perspective and the latter from an epidemiological one.
This new text by Borgan et al. is the logical successor of Breslow and Day (in fact, Breslow is one of the co-editors), and it is fascinating to see the progress that has been made since 1980. The fundamentals of the analysis of case-control studies have changed little during that time. There have been major changes in available computing power, making many of the traditional computational methods redundant (e.g., the formulas for matched analyses, although these were just reformulations of the well-known Mantel–Haenszel (1959) method). On the other hand, it is striking how some methodological errors transcend the generations. For example, I am aware of courses that still teach that a pair-matched design requires a pair-matched analysis. In fact, Breslow and Day state, with reference to matched case-control studies, that “if matching is performed only on age and sex then a stratified analysis rather than one which retains individual matching may be more appropriate. Individual matching in the analysis is only necessary if matching in the design was genuinely at the individual level,” and the same point is made in Borgan et al. Some methodological errors have to be rectified anew with each new generation (Pearce 2016; Mansournia, Jewell, and Greenland 2018).
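The remark that the matched-pair formulas are reformulations of the Mantel–Haenszel method can be illustrated with a minimal sketch. This is not taken from either book: the function name and the pair counts are hypothetical, chosen only for illustration. Each matched pair is treated as its own stratum, and the Mantel–Haenszel summary odds ratio then reduces to the familiar ratio of discordant pairs.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    Each stratum is a tuple (a, b, c, d):
      a = exposed cases,    b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical 1:1 matched data, one stratum per pair.
case_only_exposed = 30     # case exposed, control unexposed
control_only_exposed = 10  # case unexposed, control exposed
both_exposed = 15          # concordant pairs contribute nothing
neither_exposed = 45       # concordant pairs contribute nothing

pairs = (
    [(1, 0, 0, 1)] * case_only_exposed
    + [(0, 1, 1, 0)] * control_only_exposed
    + [(1, 0, 1, 0)] * both_exposed
    + [(0, 1, 0, 1)] * neither_exposed
)

mh_or = mantel_haenszel_or(pairs)
discordant_ratio = case_only_exposed / control_only_exposed
print(mh_or, discordant_ratio)
```

The same function accepts coarser strata (for example, one 2×2 table per age-sex group), which is the stratified analysis that the quoted passage recommends when matching was not genuinely at the individual level.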
What is particularly striking, and useful, in Borgan et al. is the wealth of new topics, particularly those covered in the sections on case-control studies that use full-cohort information, case-control studies for time-to-event data, and case-control studies in genetic epidemiology. These particularly relate to developments in the two main methods of sampling controls in nested case-control studies: density sampling and case-cohort sampling. The former had been “discovered” prior to 1980 (Sheehe 1962; Miettinen 1976), but was not referenced by Breslow and Day, who repeated the well-known (and generally incorrect) conclusion of Cornfield (1951) that the odds ratio in a case-control study only approximately estimates the relative risk, and only if the disease is rare. The case-cohort approach was not widely known in 1980 (though a key paper was published by Thomas (1972)). Both approaches have become mainstream in the last 40 years, with many related developments. These particularly include methods for using the whole cohort for analysis of the subsample that comprises the case-control study. For example, Lumley discusses calibration of weights (“generalized raking”), and Keogh discusses the use of multiple imputation to exploit the data that are available for the whole cohort in order to better analyze the nested case-control data. The latter approach essentially involves analyzing the whole set of cohort data and using multiple imputation for those variables that were only collected in the case-control subset. There are also excellent chapters on the self-controlled case series method, and various methods for case-control studies of genetic epidemiology.
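Why the rare-disease assumption depends on how controls are sampled can be seen in a small expected-value calculation. The sketch below is illustrative only and is not drawn from the book. It assumes a closed cohort, constant hazard rates in each exposure group, and no loss to follow-up, and every name and number in it is hypothetical. Under those assumptions, sampling controls from the non-cases at the end of follow-up gives the risk odds ratio, sampling them from the baseline cohort (case-cohort) gives the risk ratio, and sampling them in proportion to person-time at risk (density sampling) gives the rate ratio, with no rare-disease assumption needed in the last two cases.

```python
import math


def expected_odds_ratios(rate_exposed, rate_unexposed,
                         n_exposed, n_unexposed, follow_up):
    """Expected case-control odds ratios under three control-sampling schemes.

    Assumes a closed cohort with constant hazards and no censoring.
    """
    # Cumulative risks over the follow-up period.
    risk1 = 1 - math.exp(-rate_exposed * follow_up)
    risk0 = 1 - math.exp(-rate_unexposed * follow_up)

    # Expected numbers of cases.
    cases1 = n_exposed * risk1
    cases0 = n_unexposed * risk0

    # Person-time at risk: integral of N * exp(-rate * t) over follow-up.
    person_time1 = n_exposed * risk1 / rate_exposed
    person_time0 = n_unexposed * risk0 / rate_unexposed

    case_exposure_odds = cases1 / cases0
    return {
        "risk_ratio": risk1 / risk0,
        "rate_ratio": rate_exposed / rate_unexposed,
        # Controls drawn from those still disease-free at the end.
        "or_cumulative": case_exposure_odds
        / ((n_exposed * (1 - risk1)) / (n_unexposed * (1 - risk0))),
        # Controls drawn from the whole cohort at baseline.
        "or_case_cohort": case_exposure_odds / (n_exposed / n_unexposed),
        # Controls drawn in proportion to person-time at risk.
        "or_density": case_exposure_odds / (person_time1 / person_time0),
    }


# Hypothetical common disease: the three odds ratios clearly differ.
common = expected_odds_ratios(0.10, 0.05, 10_000, 20_000, 10.0)
# Hypothetical rare disease: all three are close to one another.
rare = expected_odds_ratios(0.001, 0.0005, 10_000, 20_000, 10.0)

for label, result in (("common", common), ("rare", rare)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

With a common disease, only the cumulative-sampling odds ratio departs from the parameter of interest; the density-sampling and case-cohort odds ratios match the rate ratio and risk ratio respectively whatever the disease frequency.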
So far, so good. So what is missing from this “definitive” new text? Unfortunately it is missing most of the developments in epidemiological methods that have paralleled these developments in statistical methods. For a start, it is extraordinary to see a modern text on case-control studies that does not use (at least sometimes) directed acyclic graphs (DAGs) to address issues of case-control design and analysis, especially when precursors of such graphs could be found in discussions of confounding in Breslow and Day (1980, sec. 3.4) and Schlesselman (1982, sec. 2.10). There is an excellent short chapter by Didelez and Evans on causal inference from case-control studies, which introduces the reader to DAGs. However, I could not find them used anywhere else in the text, not even in the chapter on matched case-control studies, which involve issues that are difficult to explain mathematically yet are easily explained with DAGs (Mansournia, Hernan, and Greenland 2013). Apart from the coverage of some of these topics in the chapter by Didelez and Evans, there is little or nothing on hierarchical regression, multilevel analysis, shrinkage (although this is covered briefly in the excellent chapter by Graham and McNeney), causal diagrams, marginal structural models, Mendelian randomization (the two-sample approach usually uses case-control studies for the gene-outcome association estimation (Lawlor 2016)), instrumental variables, mediation analysis, triangulation, and a host of other important topics (Vandenbroucke and Pearce 2015) including prevalent/incident exposures in case-control studies, and the never-ending significance-testing controversy (Wasserstein, Schirm, and Lazar 2019). One could argue that it is not necessary to cover all developments in epidemiological methods for a textbook that is specifically focused on statistical data analysis.
Nonetheless, most of the methods just listed are indeed statistical, and their omission shows how isolated different sections of the “epidemiological-statistical complex” remain. I myself have been quite critical of some of these new methods, at least when they have been applied inappropriately (Broadbent, Vandenbroucke, and Pearce 2016; Vandenbroucke, Broadbent, and Pearce 2016), but I still recognize their power and validity when used appropriately.
So what of these developments in epidemiological theory that have occurred since the publication of Schlesselman in 1982? For these, you will have to look elsewhere (e.g., Rothman, Greenland, and Lash 2008). Provided that it is read and used together with such a comprehensive epidemiological text, this new Handbook of Statistical Methods for Case-Control Studies is a valuable and important book, which will be useful for seminars and courses on the developments in statistical theory that have occurred since the publication of Breslow and Day in 1980.
London School of Hygiene and Tropical Medicine
United Kingdom
[email protected]
References
- Breslow, N. E., and Day, N. E. (1980), Statistical Methods in Cancer Research, Volume 1: The Analysis of Case-Control Studies, Lyon: IARC.
- Broadbent, A., Vandenbroucke, J. P., and Pearce, N. (2016), “Formalism or Pluralism? A Reply to Commentaries on ‘Causality and Causal Inference in Epidemiology’,” International Journal of Epidemiology, 45, 1841–1851.
- Cornfield, J. (1951), “A Method of Estimating Comparative Rates From Clinical Data: Applications to Cancer of the Lung, Breast and Cervix,” Journal of the National Cancer Institute, 11, 1269–1275. DOI: https://doi.org/10.1093/jnci/11.6.1269.
- Lawlor, D. A. (2016), “Commentary: Two-Sample Mendelian Randomization: Opportunities and Challenges,” International Journal of Epidemiology, 45, 908–915. DOI: https://doi.org/10.1093/ije/dyw127.
- Mansournia, M. A., Hernan, M. A., and Greenland, S. (2013), “Matched Designs and Causal Diagrams,” International Journal of Epidemiology, 42, 860–869. DOI: https://doi.org/10.1093/ije/dyt083.
- Mansournia, M. A., Jewell, N. P., and Greenland, S. (2018), “Case-Control Matching: Effects, Misconceptions, and Recommendations,” European Journal of Epidemiology, 33, 5–14. DOI: https://doi.org/10.1007/s10654-017-0325-0.
- Mantel, N., and Haenszel, W. (1959), “Statistical Aspects of the Analysis of Data From Retrospective Studies of Disease,” Journal of the National Cancer Institute, 22, 719–748. DOI: https://doi.org/10.1093/jnci/22.4.719.
- Miettinen, O. (1976), “Estimability and Estimation in Case-Control Studies,” American Journal of Epidemiology, 103, 226–235. DOI: https://doi.org/10.1093/oxfordjournals.aje.a112220.
- Pearce, N. (2016), “Analysis of Matched Case-Control Studies,” British Medical Journal, 352, i969.
- Rothman, K. J., Greenland, S., and Lash, T. L. (2008), Modern Epidemiology (3rd ed.), Philadelphia, PA: Lippincott Williams & Wilkins.
- Schlesselman, J. J. (1982), Case-Control Studies: Design, Conduct, Analysis, New York: Oxford University Press.
- Sheehe, P. R. (1962), “Dynamic Risk Analysis in Retrospective Matched-Pair Studies of Disease,” Biometrics, 18, 323–341. DOI: https://doi.org/10.2307/2527475.
- Thomas, D. B. (1972), “The Relationship of Oral Contraceptives to Cervical Carcinogenesis,” Obstetrics and Gynecology, 40, 508–518.
- Vandenbroucke, J., Broadbent, A., and Pearce, N. (2016), “Causality and Causal Inference in Epidemiology—The Need for a Pluralistic Approach,” International Journal of Epidemiology, 45, 1776–1786. DOI: https://doi.org/10.1093/ije/dyv341.
- Vandenbroucke, J., and Pearce, N. (2015), “Point: Incident Exposures, Prevalent Exposures, and Causal Inference: Does Limiting Studies to Persons Who Are Followed From First Exposure Onward Damage Epidemiology?,” American Journal of Epidemiology, 182, 826–833. DOI: https://doi.org/10.1093/aje/kwv225.
- Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. (2019), “Moving to a World Beyond p < 0.05,” The American Statistician, 73, 1–19.