Editorial

How to maximize power for differential expression analysis in discovery omics through experimental design

Pages 299-301 | Received 11 Sep 2023, Accepted 11 Nov 2023, Published online: 24 Nov 2023

1. Introduction

Unfortunately, ‘experimental design’ can mean different things to different researchers involved in the same discovery-omics study. For some, it can mean instrument configuration [Citation1], experimental protocol used, or other elements of the experiment from sampling to data analysis. For statisticians, it has a clear and historic meaning dating back to Fisher who introduced the concept of experimental design almost a century ago [Citation2]. He considered it to include three elements of an experiment: replication, randomization, and blocking. This use of experimental design is an often underappreciated, yet critically important element of discovery-omics experiments, particularly within a translational medicine context. Oberg, Vitek, and others have addressed the importance of proper experimental design for proteomics [Citation3,Citation4], but the call for proper experimental design in omics goes back at least to the early days of microarrays [Citation5].

To statisticians, broadly speaking, factors are those variables that are under the control of the experimenter, and that are varied to study one or more response variables. In translational omic studies, these responses are often patient outcomes, while the treatments are the levels of the factors. For example, when comparing the molecular response to knocking out a gene, ‘transgenic’ and ‘wild-type’ would likely be the treatment levels of the ‘knockout’ factor. The three components of experimental design, replication, randomization, and blocking, occur within the context of the experimental factors and their treatment levels.

2. Replication

Variation is inherent to the physical world. Variation is frequently considered as either technical or biological, but this is just a useful simplification. Technical variation can come from different sources, including sample handling and the analytic instruments, often called pre-analytic and analytic variation, respectively. In bottom-up proteomics, there is, e.g., analytic variation due to proteolytic digestion or MS detection. Likewise, biological variation can occur over space within a single individual (e.g. across the surface of an organ), or over time (e.g. days since transplant), as well as the obvious difference between individuals. Variation is thus driven by all causes of the response, both measured and unmeasured, plus measurement errors from instruments. Replication is the process by which we explore variation, and the analysis of this variation is the core of the statistics of differential expression analysis. Replication allows an experiment to increase the precision of its estimates and hence power, since the random variation of a statistic (sampling error) will tend to decrease with the square root of the number of independent replicates. With greater replication, chance imbalances between factor levels will also tend to even out, preventing bias.
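The square-root relationship between replication and precision can be illustrated with a small simulation (values are hypothetical, not from any study): quadrupling the number of independent replicates roughly halves the standard error of the group mean.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0            # assumed biological standard deviation
n_experiments = 20000  # repeat the "experiment" many times

def se_of_mean(n):
    """Empirical standard error of the sample mean with n replicates."""
    means = rng.normal(0.0, sigma, size=(n_experiments, n)).mean(axis=1)
    return means.std()

se_4, se_16 = se_of_mean(4), se_of_mean(16)
print(f"SE with n=4:  {se_4:.3f}")   # close to sigma / sqrt(4)  = 0.5
print(f"SE with n=16: {se_16:.3f}")  # close to sigma / sqrt(16) = 0.25
print(f"ratio: {se_4 / se_16:.2f}")  # close to sqrt(16/4) = 2
```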

Not all replicates are worth the same, however. Biological replicates will typically yield more information than technical replicates since they are independent samples, whereas technical replicates are correlated and hence partially redundant. The relative increase in power gained from additional replicates depends in a complex way on the specific choices made in the omics implementation. For example, Dowell et al. recently explored the interplay of data-dependent and data-independent acquisition in bottom-up proteomics, showing the need to pair analytic techniques with the appropriate statistical methods [Citation6].

An experimental design must, therefore, consider a priori the number of technical and biological replicates; whether the number of technical replicates will differ between biological replicates, and to what extent; and if technical replicates are to be averaged or treated as nested observations. Each choice will imply a different experimental design with implications for statistical inference and power. For example, the use of technical replicates nested within biological samples is analogous to a cluster randomized design, which in the absence of balance or in the presence of many factors requires use of multilevel models.
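Why biological replicates usually buy more power than technical replicates can be seen from the variance of a group mean under a simple nested model: with n biological replicates each measured m times, the variance is σ_b²/n + σ_t²/(n·m), so technical replicates only shrink the technical term. A sketch with illustrative variance values (assumed, not from the editorial):

```python
def var_of_group_mean(n, m, sigma_b2=1.0, sigma_t2=0.25):
    """Variance of a group mean under a nested model:
    n biological replicates, each with m technical replicates,
    biological variance sigma_b2 and technical variance sigma_t2."""
    return sigma_b2 / n + sigma_t2 / (n * m)

# Same budget of 12 measurements, allocated three different ways:
for n, m in [(3, 4), (6, 2), (12, 1)]:
    print(f"n={n:2d} biological x m={m} technical -> "
          f"variance {var_of_group_mean(n, m):.4f}")
```

For a fixed measurement budget, shifting the allocation toward biological replicates always lowers the variance here, because the biological term dominates whenever σ_b² exceeds σ_t².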

Omics analyses are still relatively costly. Thus, it is commendable to (i) establish robust standard operating procedures, (ii) determine the technical variation for a given sample type, and (iii) if acceptable (e.g. <20% CV), carry this estimate into the actual study, so as to (iv) include more biological replicates at the expense of technical replicates and generate more valuable data. Particularly for clinical specimens, it is imperative to assess the impact of sampling protocols on measurements. Mertins et al. demonstrated that ischemia during collection of cancer tissues can substantially affect a subset of the phosphoproteome, including critical cancer pathways related to stress response, transcriptional regulation, and cell death [Citation7]. In clinical settings, unavoidable delays between blood collection and plasma generation can range from minutes to hours. While the impact on peptide-centric targeted proteomics seems to be minor [Citation8], intact-protein assays such as ELISA, SomaLogic, or Olink could be significantly impacted by protein degradation/modification during hours of room temperature blood storage.
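Determining technical variation and checking it against an acceptance threshold, as in steps (ii)-(iii) above, amounts to a simple coefficient-of-variation calculation. A minimal sketch with made-up peak areas from hypothetical technical replicate injections of the same QC sample:

```python
import statistics

# Hypothetical peak areas from five technical replicate injections
# of one QC sample (illustrative numbers, not real data):
areas = [1.02e6, 0.97e6, 1.05e6, 0.99e6, 1.01e6]

mean = statistics.mean(areas)
cv_percent = 100 * statistics.stdev(areas) / mean
print(f"technical CV: {cv_percent:.1f}%")

# If the CV clears the acceptance threshold (e.g. <20%), the budget
# saved on technical replicates can go toward biological replicates.
acceptable = cv_percent < 20
print("acceptable:", acceptable)
```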

While controls are an essential component of any meaningful experiment, they need to be carefully planned to avoid excessive use. Because of the high specificity of omics (quantitative data typically linked to identified biomolecules) at well-controlled quality (false-discovery rates), classical negative controls are often dispensable.

3. Randomization

All measurement systems experience errors that frequently change over time. MS can experience drifts in signal intensity and mass deviation, while in LC, analyte peak shapes and retention times can change. Moreover, some peptides within a sample are less stable and may degrade or become modified over time. All these factors will ultimately affect quantitative studies: if signal intensity decreases over the course of a label-free quantitation study, where control samples are measured first followed by the case samples, the measured intensities of the control samples will be biased upwards relative to the case samples. Randomization, where samples are not measured in blocks (first controls then cases) but rather in randomized order, is the strategy to avoid such systematic biases: intensity drifts will result in increased noise (and a corresponding loss of power), but they will not lead to biased results.
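The effect of run order on a drifting instrument can be demonstrated with a toy model (drift rate and group sizes are assumed for illustration): with no true difference between groups, a blocked run order converts linear drift into a spurious group effect, whereas randomizing the run order turns the same drift into noise that averages out.

```python
import random

n_per_group = 10
drift_per_run = -0.05   # hypothetical linear intensity drift per LC run
samples = ["control"] * n_per_group + ["case"] * n_per_group

def observed_difference(order):
    """mean(case) - mean(control) when the true difference is zero and
    the only signal is run-order drift."""
    case = [i * drift_per_run for i, s in enumerate(order) if s == "case"]
    ctrl = [i * drift_per_run for i, s in enumerate(order) if s == "control"]
    return sum(case) / len(case) - sum(ctrl) / len(ctrl)

# Blocked order (all controls first): drift masquerades as an effect.
print("blocked bias:", observed_difference(samples))

# Randomized order: the bias averages out across randomizations.
random.seed(1)
biases = []
for _ in range(2000):
    order = samples[:]
    random.shuffle(order)
    biases.append(observed_difference(order))
print("mean randomized bias:", round(sum(biases) / len(biases), 3))
```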

In classical experiments, to avoid systematic errors, research subjects or patients are randomly assigned to treatments. This prevents differences in subjects from creating a bias and the illusion of a difference when none is present. Unfortunately, in many translational studies it is impossible to randomly assign subjects to treatments; such studies are instead observational, examining subjects who present with a given condition. For example, we take samples from a biobank that were collected prior to learning individual outcomes and examine those samples for evidence that can be correlated with outcome. Even so, randomization remains important.

4. Blocking

Many discovery-omics technologies can only measure a limited number of samples concurrently or within a batch. This limit might be the number of stable-isotope labeling variants or the maximum number of LC runs before necessary maintenance. As Churchill explained regarding early microarrays, the number of samples in discovery studies can greatly exceed this limit, provided the samples are properly arranged, or blocked, into groups [Citation5]. Notably, different blocks, e.g., samples analyzed before and after routine system maintenance, will often show a systematic drift in signal intensity, leading to systematic errors and consequently to potentially false conclusions. However, complete randomization, in which samples [replicates] are assigned fully at random to treatment levels without considering batch number, technician, reagent lot, or other nuisance factors, can lead to chance imbalances between treatment groups, even though it is otherwise unbiased. Thus, in a design with two treatment levels and a batch size limited to ten samples, there is only a ~25% chance of achieving balance (5:5 treated:control) with complete randomization in a given batch. Any blocking in the design must be incorporated into the statistical analysis. For many blocked designs, this means extending simple univariate tests, such as t-tests of the log abundances, to linear regression models that include both treatment and blocks (batch, etc.) as covariates. For nested designs or experiments with missing data, which can induce a correlation between treatment and blocks, more complex models are required. In practice, many omics technologies can rely on standardized blocked designs, and as long as the researcher adapts their experiment to match the blocking, the design will be sound. Huang et al. recently proposed a set of blocked designs for TMT-based proteomics [Citation9], Clough et al. demonstrated the utility of blocking within label-free bottom-up proteomics [Citation10], and Churchill proposed designs for two-color microarrays 21 years earlier [Citation5]. The key point when considering a blocked experimental design is that if the study is not using a known blocking strategy, care should be taken to consult with someone knowledgeable in blocking before beginning data collection.
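The ~25% figure follows from counting balanced assignments: with two equally likely treatment levels and a batch of ten, the chance of an exact 5:5 split is C(10,5)/2^10. Blocked (stratified) randomization avoids the issue by fixing each batch's composition and randomizing only the order within the batch. A quick check:

```python
import math
import random

# Probability that complete randomization yields a balanced 5:5 split
# in a batch of 10 with two equally likely treatment levels:
p_balanced = math.comb(10, 5) / 2**10
print(f"P(5:5 balance) = {p_balanced:.3f}")   # ~0.246, i.e. ~25%

# Blocked randomization guarantees balance: fix the 5:5 composition
# of the batch and randomize only the within-batch order.
random.seed(0)
batch = ["treated"] * 5 + ["control"] * 5
random.shuffle(batch)
print(batch.count("treated"), batch.count("control"))  # always 5 5
```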

5. Conclusion

Algorithms generate numbers. It is rare that an improperly designed experiment fails in such a way that a statistical analysis cannot be run, or that the analysis will signal the experimenter that something has gone wrong. Rather, poorly designed experiments generate meaningless lists of ‘differentially expressed’ molecular entities that do not reflect reality.

We remind researchers that it is essential to use proper statistical experimental design when conducting omics experiments. Otherwise, any effort to follow up on these mis-identified differentially expressed features will result in failure and frustration. It is the researcher’s responsibility to ensure that they conduct well-designed studies. This is best done by consulting with both statisticians and those performing the omic analysis, prior to collecting any samples.

Declaration of interests

RP Zahedi is CEO of MRM Proteomics Inc. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

Reviewer disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Additional information

Funding

This paper was funded by the University of Manitoba and Shared Health.

References

  • Eriksson J, Fenyö D. Modeling experimental design for proteomics. Methods Mol Biol. 2010;673:223–230. doi: 10.1007/978-1-60761-842-3_14
  • Fisher RA. The design of experiments. Edinburgh: Oliver and Boyd; 1935.
  • Oberg AL, Vitek O. Statistical design of quantitative mass spectrometry-based proteomic experiments. J Proteome Res. 2009;8(5):2144–2156. doi: 10.1021/pr8010099
  • Podwojski K, Stephan C, Eisenacher M. Important issues in planning a proteomics experiment: statistical considerations of quantitative proteomic data. Methods Mol Biol. 2012;893:3–21. doi: 10.1007/978-1-61779-885-6_1
  • Churchill GA. Fundamentals of experimental design for cDNA microarrays. Nat Genet. 2002;32(Suppl):490–495. doi: 10.1038/ng1031
  • Dowell JA, Wright LJ, Armstrong EA, et al. Benchmarking quantitative performance in label-free proteomics. ACS Omega. 2021;6(4):2494–2504. doi: 10.1021/acsomega.0c04030
  • Mertins P, Yang F, Liu T, et al. Ischemia in tumors induces early and sustained phosphorylation changes in stress kinase pathways but does not affect global protein levels. Mol Cell Proteomics. 2014;13(7):1690–1704. doi: 10.1074/mcp.M113.036392
  • Gaither C, Popp R, Zahedi RP, et al. Multiple reaction monitoring-mass spectrometry enables robust quantitation of plasma proteins regardless of whole blood processing delays that may occur in the clinic. Mol Cell Proteomics. 2022;21(5):100212. doi: 10.1016/j.mcpro.2022.100212
  • Huang T, Staniak M, da Veiga Leprevost F, et al. Statistical detection of differentially abundant proteins in experiments with repeated measures designs and isobaric labeling. J Proteome Res. 2023;22(8):2641–2659. doi: 10.1021/acs.jproteome.3c00155
  • Clough T, Thaminy S, Ragg S, et al. Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs. BMC Bioinf. 2012;13(Suppl16):S6. doi: 10.1186/1471-2105-13-S16-S6
