Review

Statistical normalization methods in microbiome data with application to microbiome cancer research

Article: 2244139 | Received 20 Feb 2023, Accepted 31 Jul 2023, Published online: 25 Aug 2023

References

  • Hong M, Tao S, Zhang L, Diao L-T, Huang X, Huang S, Xie S-J, Xiao Z-D, Zhang H, RNA sequencing: new technologies and applications in cancer research. J Hematol Oncol 2020, 13 (1), 1–41. doi:10.1186/s13045-020-01005-x
  • Ahn H, Min K, Lee E, Kim H, Kim S, Kim Y, Kim G, Cho B, Jeong C, Kim Y, Whole-transcriptome sequencing reveals characteristics of cancer microbiome in Korean patients with GI tract cancer: Fusobacterium nucleatum as a therapeutic target. Microorganisms 2022, 10 (10), 1896. doi:10.3390/microorganisms10101896
  • Newsome RC, Yang Y, Jobin C, The microbiome, gastrointestinal cancer, and immunotherapy. J Gastroen Hepatol 2022, 37 (2), 263–272. doi:10.1111/jgh.15742
  • Ajayi TA, Cantrell S, Spann A, Garman KS, Leong JM, Barrett’s esophagus and esophageal cancer: links to microbes and the microbiome. PLoS Pathog 2018, 14 (12), e1007384. doi:10.1371/journal.ppat.1007384
  • Stewart OA, Wu F, Chen Y, The role of gastric microbiota in gastric cancer. Gut Microbes 2020, 11 (5), 1220–1230. doi:10.1080/19490976.2020.1762520
  • Janney A, Powrie F, Mann EH, Host–microbiota maladaptation in colorectal cancer. Nature 2020, 585 (7826), 509–517. doi:10.1038/s41586-020-2729-3
  • Yang D, Wang X, Zhou X, Zhao J, Yang H, Wang S, Morse MA, Wu J, Yuan Y, Li S, Blood microbiota diversity determines response of advanced colorectal cancer to chemotherapy combined with adoptive T cell immunotherapy. Oncoimmunology 2021, 10 (1), 1976953. doi:10.1080/2162402X.2021.1976953
  • Yang Y, Gharaibeh RZ, Newsome RC, Jobin C, Amending microbiota by targeting intestinal inflammation with TNF blockade attenuates development of colorectal cancer. Nature Cancer 2020, 1 (7), 723–734. doi:10.1038/s43018-020-0078-7
  • Riquelme E, Zhang Y, Zhang L, Montiel M, Zoltan M, Dong W, Quesada P, Sahin I, Chandra V, San Lucas A, et al., Tumor microbiome diversity and composition influence pancreatic cancer outcomes. Cell 2019, 178 (4), 795–806.e12. doi:10.1016/j.cell.2019.07.008
  • Geller LT, Barzily-Rokni M, Danino T, Jonas OH, Shental N, Nejman D, Gavert N, Zwang Y, Cooper ZA, Shee K, Potential role of intratumor bacteria in mediating tumor resistance to the chemotherapeutic drug gemcitabine. Sci 2017, 357 (6356), 1156–1160. doi:10.1126/science.aah5043
  • Chakladar J, Kuo SZ, Castaneda G, Li WT, Gnanasekar A, Yu MA, Chang EY, Wang XQ, Ongkeko WM, The pancreatic microbiome is associated with carcinogenesis and worse prognosis in males and smokers. Cancers 2020, 12 (9), 2672. doi:10.3390/cancers12092672
  • Ponziani FR, Bhoori S, Castelli C, Putignani L, Rivoltini L, Del Chierico F, Sanguinetti M, Morelli D, Paroni Sterbini F, Petito V, Hepatocellular carcinoma is associated with gut microbiota profile and inflammation in nonalcoholic fatty liver disease. Hepatology 2019, 69 (1), 107–120. doi:10.1002/hep.30036
  • Ren Z, Li A, Jiang J, Zhou L, Yu Z, Lu H, Xie H, Chen X, Shao L, Zhang R, Gut microbiome analysis as a tool towards targeted non-invasive biomarkers for early hepatocellular carcinoma. Gut 2019, 68 (6), 1014–1023. doi:10.1136/gutjnl-2017-315084
  • Behary J, Amorim N, Jiang X-T, Raposo A, Gong L, McGovern E, Ibrahim R, Chu F, Stephens C, Jebeili H, et al., Gut microbiota impact on the peripheral immune response in non-alcoholic fatty liver disease related hepatocellular carcinoma. Nat Commun 2021, 12 (1), 187. doi:10.1038/s41467-020-20422-7
  • Zheng Y, Fang Z, Xue Y, Zhang J, Zhu J, Gao R, Yao S, Ye Y, Wang S, Lin C, Specific gut microbiome signature predicts the early-stage lung cancer. Gut Microbes 2020, 11 (4), 1030–1042. doi:10.1080/19490976.2020.1737487
  • Zhu J, Liao M, Yao Z, Liang W, Li Q, Liu J, Yang H, Ji Y, Wei W, Tan A, Breast cancer in postmenopausal women is associated with an altered gut metagenome. Microbiome 2018, 6 (1), 1–13. doi:10.1186/s40168-018-0515-3
  • Liss MA, White JR, Goros M, Gelfond J, Leach R, Johnson-Pais T, Lai Z, Rourke E, Basler J, Ankerst D, et al., Metabolic biosynthesis pathways identified from fecal microbiome associated with prostate cancer. Eur Urol 2018, 74 (5), 575–582. doi:10.1016/j.eururo.2018.06.033
  • Gopalakrishnan V, Spencer CN, Nezi L, Reuben A, Andrews M, Karpinets T, Prieto P, Vicente D, Hoffman K, Wei SC, Gut microbiome modulates response to anti–PD-1 immunotherapy in melanoma patients. Sci 2018, 359 (6371), 97–103. doi:10.1126/science.aan4236
  • Matson V, Fessler J, Bao R, Chongsuwat T, Zha Y, Alegre M-L, Luke JJ, Gajewski TF, The commensal microbiome is associated with anti–PD-1 efficacy in metastatic melanoma patients. Sci 2018, 359 (6371), 104–108. doi:10.1126/science.aao3290
  • Routy B, Le Chatelier E, Derosa L, Duong CP, Alou MT, Daillère R, Fluckiger A, Messaoudene M, Rauber C, Roberti MP, Gut microbiome influences efficacy of PD-1–based immunotherapy against epithelial tumors. Sci 2018, 359 (6371), 91–97. doi:10.1126/science.aan3706
  • Xia Y, Sun J, Chen DG, Bioinformatic analysis of microbiome data. Stat Anal Of Microbiome Data With R. Singapore: Springer; 2018. doi:10.1007/978-981-13-1534-3_1
  • Xia Y, Sun J, Statistical data analysis of microbiomes and metabolomics. American Chemical Society, 2022.
  • Handelsman J, Rondon MR, Brady SF, Clardy J, Goodman RM, Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chem Biol 1998, 5 (10), R245–R249. doi:10.1016/S1074-5521(98)90108-9
  • Raes J, Foerstner KU, Bork P, Get the most out of your metagenome: computational analysis of environmental sequence data. Curr Opin Microbiol 2007, 10 (5), 490–498. doi:10.1016/j.mib.2007.09.001
  • Hallam SJ, Putnam N, Preston CM, Detter JC, Rokhsar D, Richardson PM, DeLong EF, Reverse methanogenesis: testing the hypothesis with environmental genomics. Sci 2004, 305 (5689), 1457–1462. doi:10.1126/science.1100025
  • Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, Solovyev VV, Rubin EM, Rokhsar DS, Banfield JF, Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 2004, 428 (6978), 37–43. doi:10.1038/nature02340
  • Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu D, Paulsen I, Nelson KE, Nelson W, Environmental genome shotgun sequencing of the Sargasso Sea. Sci 2004, 304 (5667), 66–74. doi:10.1126/science.1093857
  • Sharpton TJ, An introduction to the analysis of shotgun metagenomic data. Front Plant Sci 2014, 5, 209. doi:10.3389/fpls.2014.00209
  • Schloss PD, Handelsman J, Metagenomics for studying unculturable microorganisms: cutting the Gordian knot. Genome Biol 2005, 6 (8), 1–4. doi:10.1186/gb-2005-6-8-229
  • Albertsen M, Hugenholtz P, Skarshewski A, Nielsen KL, Tyson GW, Nielsen PH, Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol 2013, 31 (6), 533–538. doi:10.1038/nbt.2579
  • Kuczynski J, Lauber CL, Walters WA, Parfrey LW, Clemente JC, Gevers D, Knight R, Experimental and analytical tools for studying the human microbiome. Nat Rev Genet 2012, 13 (1), 47–58. doi:10.1038/nrg3129
  • Boulund F, Sjögren A, Kristiansson E, Tentacle: distributed quantification of genes in metagenomes. GigaScience 2015, 4 (1), s13742-015-0078-1. doi:10.1186/s13742-015-0078-1
  • Xia Y, Sun J, An integrated analysis of microbiomes and metabolomics. American Chemical Society, 2022.
  • Xia Y, Sun J, Hypothesis testing and statistical analysis of microbiome. Genes & Dis 2017, 4 (3), 138–148. doi:10.1016/j.gendis.2017.06.001
  • Xia Y, Sun J, Chen D-G, Statistical analysis of microbiome data with R. Springer Singapore: 2018; Vol. 847. doi:10.1007/978-981-13-1534-3
  • Xia Y, Chapter eleven - Correlation and association analyses in microbiome study integrating multiomics in health and disease. In: Progress in molecular biology and translational science, Sun J, editor. Academic Press: 2020; Vol. 171, pp. 309–491. doi:10.1016/bs.pmbts.2020.04.003
  • Proctor L, Priorities for the next 10 years of human microbiome research. Nature 2019, 569 (7758), 623–625. doi:10.1038/d41586-019-01654-0
  • Xia Y, Sun J, Bioinformatic and statistical analysis of microbiome data: from raw sequences to advanced modeling with QIIME 2 and R. Springer International Publishing: 2023. doi:10.1007/978-3-031-21391-5
  • Xia Y, Correlation and association analyses in microbiome study integrating multiomics in health and disease. Prog Mol Biol Transl Sci 2020, 171, 309–491.
  • McMurdie PJ, Holmes S, McHardy AC, Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol 2014, 10 (4), e1003531. doi:10.1371/journal.pcbi.1003531
  • White JR, Nagarajan N, Pop M, Ouzounis CA, Statistical methods for detecting differentially abundant features in clinical metagenomic samples. PLoS Comput Biol 2009, 5 (4), e1000352. doi:10.1371/journal.pcbi.1000352
  • Paulson JN, Stine OC, Bravo HC, Pop M, Differential abundance analysis for microbial marker-gene surveys. Nat Methods 2013, 10 (12), 1200–1202. doi:10.1038/nmeth.2658
  • Beszteri B, Temperton B, Frickenhaus S, Giovannoni SJ, Average genome size: a potential source of bias in comparative metagenomics. ISME J 2010, 4 (8), 1075–1077. doi:10.1038/ismej.2010.29
  • Frank JA, Sørensen SJ, Quantitative metagenomic analyses based on average genome size normalization. Appl Environ Microb 2011, 77 (7), 2513–2521. doi:10.1128/AEM.02167-10
  • Sanders HL, Marine benthic diversity: a comparative study. Am Nat 1968, 102 (925), 243–282. doi:10.1086/282541
  • Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, Lozupone C, Zaneveld JR, Vázquez-Baeza Y, Birmingham A, et al., Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome 2017, 5 (1), 27. doi:10.1186/s40168-017-0237-y
  • Colwell RK, Chao A, Gotelli NJ, Lin S-Y, Mao CX, Chazdon RL, Longino JT, Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages. J Plant Ecol 2012, 5 (1), 3–21. doi:10.1093/jpe/rtr044
  • McMurdie PJ, Holmes S, Watson M, Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 2013, 8 (4), e61217. doi:10.1371/journal.pone.0061217
  • Hughes JB, Hellmann JJ, The application of rarefaction techniques to molecular inventories of microbial diversity. Methods Enzymol 2005, 397, 292–308.
  • Bullard JH, Purdom E, Hansen KD, Dudoit S, Evaluation of statistical methods for normalization and differential expression in mRNA-seq experiments. BMC Bioinform 2010, 11 (1), 94. doi:10.1186/1471-2105-11-94
  • Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J, et al., A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 2012, 14 (6), 671–683. doi:10.1093/bib/bbs046
  • Mitra S, Klar B, Huson DH, Visual and statistical comparison of metagenomes. Bioinformatics 2009, 25 (15), 1849–1855. doi:10.1093/bioinformatics/btp341
  • Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y, RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 2008, 18 (9), 1509–1517. doi:10.1101/gr.079558.108
  • McKnight DT, Huerlimann R, Bower DS, Schwarzkopf L, Alford RA, Zenger KR, Methods for normalizing microbiome data: an ecological perspective. Methods Ecol Evol 2019, 10 (3), 389–400. doi:10.1111/2041-210X.13115
  • McCafferty J, Mühlbauer M, Gharaibeh RZ, Arthur JC, Perez-Chanona E, Sha W, Jobin C, Fodor AA, Stochastic changes over time and not founder effects drive cage effects in microbial community assembly in a mouse model. ISME J 2013, 7 (11), 2116–2125. doi:10.1038/ismej.2013.106
  • Bolstad BM, Irizarry RA, Åstrand M, Speed TP, A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19 (2), 185–193. doi:10.1093/bioinformatics/19.2.185
  • Irizarry RA, Hobbs B, Collin F, Beazer‐Barclay YD, Antonellis KJ, Scherf U, Speed TP, Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003, 4 (2), 249–264. doi:10.1093/biostatistics/4.2.249
  • Hansen KD, Irizarry RA, Wu Z, Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 2012, 13 (2), 204–216. doi:10.1093/biostatistics/kxr054
  • Hicks SC, Okrah K, Paulson JN, Quackenbush J, Irizarry RA, Bravo HC, Smooth quantile normalization. Biostatistics 2017, 19 (2), 185–198. doi:10.1093/biostatistics/kxx028
  • Fortin J-P, Labbe A, Lemire M, Zanke BW, Hudson TJ, Fertig EJ, Greenwood CM, Hansen KD, Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol 2014, 15 (11), 503. doi:10.1186/s13059-014-0503-2
  • Robinson MD, Oshlack A, A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 2010, 11 (3), R25. doi:10.1186/gb-2010-11-3-r25
  • Love MI, Huber W, Anders S, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014, 15 (12), 550. doi:10.1186/s13059-014-0550-8
  • Jonsson V, Osterlund T, Nerman O, Kristiansson E, Variability in metagenomic count data and its influence on the identification of differentially abundant genes. J Comput Biol 2017, 24 (4), 311–326. doi:10.1089/cmb.2016.0180
  • Sohn MB, Du R, An L, A robust approach for identifying differentially abundant features in metagenomic samples. Bioinformatics 2015, 31 (14), 2269–2275. doi:10.1093/bioinformatics/btv165
  • Smyth GK, Limma: linear models for microarray data. In Bioinformatics and computational biology solutions using r and bioconductor, Springer: New York, 2005; pp. 397–420. doi:10.1007/0-387-29362-0_23
  • Chen L, Reeve J, Zhang L, Huang S, Wang X, Chen J, GMPR: a robust normalization method for zero-inflated count data with application to microbiome sequencing data. PeerJ 2018, 6, e4600. doi:10.7717/peerj.4600
  • Kumar MS, Slud EV, Okrah K, Hicks SC, Hannenhalli S, Corrada Bravo H, Analysis and correction of compositional bias in sparse sequencing count data. BMC Genom 2018, 19 (1), 799. doi:10.1186/s12864-018-5160-5
  • Aitchison J, A concise guide to compositional data analysis. In: 2nd Compositional Data Analysis Workshop. Girona, Spain; 2003.
  • Aitchison J, The statistical analysis of compositional data. Chapman and Hall Ltd. Reprinted in 2003 with additional material by The Blackburn Press.: London, 1986.
  • Lin H, Peddada SD, Analysis of compositions of microbiomes with bias correction. Nat Commun 2020, 11 (1), 3514. doi:10.1038/s41467-020-17041-7
  • Ma Y, Luo Y, Jiang H, Valencia A, A novel normalization and differential abundance test framework for microbiome data. Bioinformatics 2020, 36 (13), 3959–3965. doi:10.1093/bioinformatics/btaa255
  • Mulenga M, Kareem SA, Sabri AQM, Seera M, Govind S, Samudi C, Mohamad SB, Feature extension of gut microbiome data for deep neural network-based colorectal cancer classification. IEEE Access 2021, 9, 23565–23578. doi:10.1109/ACCESS.2021.3050838
  • Singh D, Singh B, Investigating the impact of data normalization on classification performance. Appl Soft Comput 2020, 97, 105524. doi:10.1016/j.asoc.2019.105524
  • Gotelli N, Colwell R, Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecol Lett 2001, 4 (4), 379–391. doi:10.1046/j.1461-0248.2001.00230.x
  • Mao CX, Colwell RK, Estimation of species richness: mixture models, the role of rare species, and inferential challenges. Ecology 2005, 86 (5), 1143–1153. doi:10.1890/04-1078
  • Brewer A, Williamson M, A new relationship for rarefaction. Biodivers Conserv 1994, 3 (4), 373–379. doi:10.1007/BF00056509
  • Horner-Devine MC, Lage M, Hughes JB, Bohannan BJM, A taxa–area relationship for bacteria. Nature 2004, 432 (7018), 750–753. doi:10.1038/nature03073
  • Jernvall J, Wright PC, Diversity components of impending primate extinctions. Proc Natl Acad Sci U S A 1998, 95 (19), 11279–11283. doi:10.1073/pnas.95.19.11279
  • Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, et al., Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 2009, 75 (23), 7537–7541. doi:10.1128/AEM.01541-09
  • Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Peña AG, Goodrich JK, Gordon JI, et al., QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010, 7 (5), 335–336. doi:10.1038/nmeth.f.303
  • Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, Minchin PR, O’Hara RB, Simpson GL, Solymos P, Stevens MHH, Szoecs E, et al., Vegan: community ecology package. Ordination methods, diversity analysis and other functions for community and vegetation ecologists. http://CRAN.R-project.org/package=vegan. 2019.
  • Lahti L, Shetty S, et al., Tools for microbiome analysis in R. Version 1.9.95. 2017.
  • Lauber CL, Hamady M, Knight R, Fierer N, Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Appl Environ Microb 2009, 75 (15), 5111–5120. doi:10.1128/AEM.00335-09
  • Lauber CL, Zhou N, Gordon JI, Knight R, Fierer N, Effect of storage conditions on the assessment of bacterial community structure in soil and human-associated samples. FEMS Microbiol Lett 2010, 307 (1), 80–86. doi:10.1111/j.1574-6968.2010.01965.x
  • Aguirre de Cárcer D, Denman SE, McSweeney C, Morrison M, Evaluation of subsampling-based normalization strategies for tagged high-throughput sequencing data sets from gut microbiomes. Appl Environ Microb 2011, 77 (24), 8795–8798. doi:10.1128/AEM.05491-11
  • Kuczynski J, Liu Z, Lozupone C, McDonald D, Fierer N, Knight R, Microbial community resemblance methods differ in their ability to detect biologically relevant patterns. Nat Methods 2010, 7 (10), 813–819. doi:10.1038/nmeth.1499
  • Hamady M, Knight R, Microbial community profiling for human microbiome projects: tools, techniques, and challenges. Genome Res 2009, 19 (7), 1141–1152. doi:10.1101/gr.085464.108
  • Kuczynski J, Costello EK, Nemergut DR, Zaneveld J, Lauber CL, Knights D, Koren O, Fierer N, Kelley ST, Ley RE, et al., Direct sequencing of the human microbiome readily reveals community differences. Genome Biol 2010, 11 (5), 210. doi:10.1186/gb-2010-11-5-210
  • Hillmann B, Al-Ghalith GA, Shields-Cutler RR, Zhu Q, Gohl DM, Beckman KB, Knight R, Knights D, Rawls JF, Evaluating the information content of shallow shotgun metagenomics. mSystems 2018, 3 (6), e00069–18. doi:10.1128/mSystems.00069-18
  • Papadimitriou K, Anastasiou R, Georgalaki M, Bounenni R, Paximadaki A, Charmpi C, Alexandraki V, Kazou M, Tsakalidou E, Comparison of the microbiome of artisanal homemade and industrial feta cheese through amplicon sequencing and shotgun metagenomics. Microorganisms 2022, 10 (5), 1073. doi:10.3390/microorganisms10051073
  • Pereira MB, Wallroth M, Jonsson V, Kristiansson E, Comparison of normalization methods for the analysis of metagenomic gene abundance data. BMC Genom 2018, 19 (1), 274. doi:10.1186/s12864-018-4637-6
  • Xuan C, Shamonki JM, Chung A, DiNome ML, Chung M, Sieling PA, Lee DJ, Takabe K, Microbial dysbiosis is associated with human breast cancer. PLoS One 2014, 9 (1), e83744. doi:10.1371/journal.pone.0083744
  • Sze MA, Baxter NT, Ruffin MT, Rogers MA, Schloss PD, Normalization of the microbiota in patients after treatment for colonic lesions. Microbiome 2017, 5 (1), 1–10. doi:10.1186/s40168-017-0366-3
  • Baxter NT, Ruffin MT, Rogers MA, Schloss PD, Microbiota-based model improves the sensitivity of fecal immunochemical test for detecting colonic lesions. Genome Med 2016, 8 (1), 1–10. doi:10.1186/s13073-016-0290-3
  • Dai Z, Coker OO, Nakatsu G, Wu WKK, Zhao L, Chen Z, Chan FKL, Kristiansen K, Sung JJY, Wong SH, et al., Multi-cohort analysis of colorectal cancer metagenome identified altered bacteria across populations and universal bacterial markers. Microbiome 2018, 6 (1), 70. doi:10.1186/s40168-018-0451-2
  • Bergemann TL, Wilson J, Proportion statistics to detect differentially expressed genes: a comparison with log-ratio statistics. BMC Bioinform 2011, 12 (1), 228. doi:10.1186/1471-2105-12-228
  • Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J, et al., A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 2013, 14 (6), 671–683. doi:10.1093/bib/bbs046
  • Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD, Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis 2015, 26 (1), 27663. doi:10.3402/mehd.v26.27663
  • Tsilimigras MC, Fodor AA, Compositional data analysis of the microbiome: fundamentals, tools, and challenges. Ann Epidemiol 2016, 26 (5), 330–335. doi:10.1016/j.annepidem.2016.03.002
  • Morton JT, Sanders J, Quinn RA, McDonald D, Gonzalez A, Vázquez-Baeza Y, Navas-Molina JA, Song SJ, Metcalf JL, Hyde ER, et al., Balance trees reveal microbial niche differentiation. mSystems 2017, 2 (1). doi:10.1128/mSystems.00162-16
  • Jackson DA, Compositional data in community ecology: the paradigm or peril of proportions? Ecology 1997, 78 (3), 929–940. doi:10.1890/0012-9658(1997)078[0929:CDICET]2.0.CO;2
  • Rezasoltani S, Aghdaei HA, Jasemi S, Gazouli M, Dovrolis N, Sadeghi A, Schlüter H, Zali MR, Sechi LA, Feizabadi MM, Oral microbiota as novel biomarkers for colorectal cancer screening. Cancers 2023, 15 (1), 192. doi:10.3390/cancers15010192
  • Poore GD, Kopylova E, Zhu Q, Carpenter C, Fraraccio S, Wandro S, Kosciolek T, Janssen S, Metcalf J, Song SJ, Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 2020, 579 (7800), 567–574. doi:10.1038/s41586-020-2095-1
  • Hasan R, Bose S, Roy R, Paul D, Rawat S, Nilwe P, Chauhan NK, Choudhury S, Tumor tissue-specific bacterial biomarker panel for colorectal cancer: Bacteroides massiliensis, Alistipes species, Alistipes onderdonkii, Bifidobacterium pseudocatenulatum, Corynebacterium appendicis. Arch Microbiol 2022, 204 (6), 1–10. doi:10.1007/s00203-022-02954-2
  • Simpson RC, Shanahan ER, Batten M, Reijers ILM, Read M, Silva IP, Versluis JM, Ribeiro R, Angelatos AS, Tan J, et al., Diet-driven microbial ecology underpins associations between cancer immunotherapy outcomes and the gut microbiome. Nat Med 2022, 2344–2352. doi:10.1038/s41591-022-01965-2
  • Huber W, Von Heydebreck A, Sültmann H, Poustka A, Vingron M, Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002, 18 (suppl_1), S96–S104. doi:10.1093/bioinformatics/18.suppl_1.S96
  • Parsons HM, Ludwig C, Günther UL, Viant MR, Improved classification accuracy in 1- and 2-dimensional NMR metabolomics data using the variance stabilising generalised logarithm transformation. BMC Bioinform 2007, 8 (1), 234. doi:10.1186/1471-2105-8-234
  • van den Berg RA, Hoefsloot HC, Westerhuis JA, Smilde AK, van der Werf MJ, Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genom 2006, 7 (1), 142. doi:10.1186/1471-2164-7-142
  • Kvalheim OM, Brakstad F, Liang Y, Preprocessing of analytical profiles in the presence of homoscedastic or heteroscedastic noise. Anal Chem 1994, 66 (1), 43–51. doi:10.1021/ac00073a010
  • Feng C, Wang H, Lu N, Chen T, He H, Lu Y, Tu XM, Log-transformation and its implications for data analysis. Shanghai Arch Psychiatry 2014, 26 (2), 105–109. doi:10.3969/j.issn.1002-0829.2014.02.009
  • Feng C, Wang H, Lu N, Tu XM, Log transformation: application and interpretation in biomedical research. Stat Med 2013, 32 (2), 230–239. doi:10.1002/sim.5486
  • Xia Y, Sun J, Pretreating and normalizing metabolomics data for statistical analysis. Genes & Dis 2023. doi:10.1016/j.gendis.2023.04.018
  • De Livera AM, Dias DA, De Souza D, Rupasinghe T, Pyke J, Tull D, Roessner U, McConville M, Speed TP, Normalizing and integrating metabolomics data. Anal Chem 2012, 84 (24), 10768–10776. doi:10.1021/ac302748b
  • Durbin BP, Hardin JS, Hawkins DM, Rocke DM, A variance-stabilizing transformation for gene-expression microarray data. Bioinformatics 2002, 18 (suppl_1), S105–S110. doi:10.1093/bioinformatics/18.suppl_1.S105
  • Oresta B, Braga D, Lazzeri M, Frego N, Saita A, Faccani C, Fasulo V, Colombo P, Guazzoni G, Hurle R, et al., The microbiome of catheter collected urine in males with bladder cancer according to disease stage. J Urol 2021, 205 (1), 86–93. doi:10.1097/JU.0000000000001336
  • Choi H, Kim S, Fermin D, Tsou C-C, Nesvizhskii AI, QPROT: statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics. J Proteomics 2015, 129, 121–126. doi:10.1016/j.jprot.2015.07.036
  • Li P, Piao Y, Shon HS, Ryu KH, Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinform 2015, 16 (1), 347. doi:10.1186/s12859-015-0778-7
  • Abrams ZB, Johnson TS, Huang K, Payne PRO, Coombes K, A protocol to evaluate RNA sequencing normalization methods. BMC Bioinform 2019, 20 (24), 679. doi:10.1186/s12859-019-3247-x
  • Smyth GK, Limma: linear models for microarray data. In Bioinformatics and computational biology solutions using R and bioconductor, Springer: 2005; pp. 397–420. doi:10.1007/0-387-29362-0_23
  • Wang Q, Ye J, Fang D, Lv L, Wu W, Shi D, Li Y, Yang L, Bian X, Wu J, et al., Multi-omic profiling reveals associations between the gut mucosal microbiome, the metabolome, and host DNA methylation associated gene expression in patients with colorectal cancer. BMC Microbiol 2020, 20 (S1), 83. doi:10.1186/s12866-020-01762-2
  • Robinson MD, McCarthy DJ, Smyth GK, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010, 26 (1), 139–140. doi:10.1093/bioinformatics/btp616
  • Boulund F, Pereira MB, Jonsson V, Kristiansson E, Computational and statistical considerations in the analysis of metagenomic data. In: Metagenomics: perspectives, methods and applications, Nagarajan M, editor. Academic Press: Cambridge, 2018; pp. 81–102. doi:10.1016/B978-0-08-102268-9.00004-5
  • Sakurai T, De Velasco MA, Sakai K, Nagai T, Nishiyama H, Hashimoto K, Uemura H, Kawakami H, Nakagawa K, Ogata H, et al., Integrative analysis of gut microbiome and host transcriptomes reveals associations between treatment outcomes and immunotherapy-induced colitis. Mol Oncol 2022, 16 (7), 1493–1507. doi:10.1002/1878-0261.13062
  • Lin Y, Lau HC-H, Liu Y, Kang X, Wang Y, Ting NL-N, Kwong TN-Y, Han J, Liu W, Liu C, et al., Altered mycobiota signatures and enriched pathogenic Aspergillus rambellii are associated with colorectal cancer based on multicohort fecal metagenomic analyses. Gastroenterology 2022, 163 (4), 908–921. doi:10.1053/j.gastro.2022.06.038
  • Jin C, Lagoudas GK, Zhao C, Bullman S, Bhutkar A, Hu B, Ameh S, Sandel D, Liang XS, Mazzilli S, et al., Commensal microbiota promote lung cancer development via γδ T cells. Cell 2019, 176 (5), 998–1013.e16. doi:10.1016/j.cell.2018.12.040
  • Parhi L, Alon-Maimon T, Sol A, Nejman D, Shhadeh A, Fainsod-Levi T, Yajuk O, Isaacson B, Abed J, Maalouf N, Breast cancer colonization by Fusobacterium nucleatum accelerates tumor growth and metastatic progression. Nat Commun 2020, 11 (1), 1–12. doi:10.1038/s41467-020-16967-2
  • Anders S, Huber W, Differential expression analysis for sequence count data. Genome Biol 2010, 11 (10), R106. doi:10.1186/gb-2010-11-10-r106
  • Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B, Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 2008, 5 (7), 621–628. doi:10.1038/nmeth.1226
  • Thompson KJ, Ingle JN, Tang X, Chia N, Jeraldo PR, Walther-Antonio MR, Kandimalla KK, Johnson S, Yao JZ, Harrington SC, et al., A comprehensive analysis of breast cancer microbiota and host gene expression. PLoS One 2017, 12 (11), e0188873. doi:10.1371/journal.pone.0188873
  • Lopes-Ramos CM, Kuijjer ML, Ogino S, Fuchs CS, DeMeo DL, Glass K, Quackenbush J, Gene regulatory network analysis identifies sex-linked differences in colon cancer drug metabolism. Cancer Res 2018, 78 (19), 5538–5547. doi:10.1158/0008-5472.CAN-18-0454
  • Kadota K, Nishiyama T, Shimizu K, A normalization strategy for comparing tag count data. Algorithms Mol Biol 2012, 7 (1), 5–5. doi:10.1186/1748-7188-7-5
  • Fu L, Luo K, Lv J, Wang X, Qin S, Zhang Z, Sun S, Wang X, Yun B, He Y, Integrating expression data-based deep neural network models with biological networks to identify regulatory modules for lung adenocarcinoma. Biology 2022, 11 (9), 1291. doi:10.3390/biology11091291
  • Anders S, McCarthy DJ, Chen Y, Okoniewski M, Smyth GK, Huber W, Robinson MD, Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nat Protoc 2013, 8 (9), 1765. doi:10.1038/nprot.2013.099
  • Anders S, McCarthy DJ, Chen Y, Okoniewski M, Smyth GK, Huber W, Robinson MD, Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nat Protoc 2013, 8 (9), 1765–1786. doi:10.1038/nprot.2013.099
  • Maza E, In papyro comparison of TMM (edgeR), RLE (DESeq2), and MRN normalization methods for a simple two-conditions-without-replicates RNA-Seq experimental design. Front Genet 2016, 7 (164). doi:10.3389/fgene.2016.00164
  • Wu Z, Liu W, Jin X, Ji H, Wang H, Glusman G, Robinson M, Liu L, Ruan J, Gao S, NormExpression: an R package to normalize gene expression data using evaluated methods. Front Genet 2019, 10 (400). doi:10.3389/fgene.2019.00400
  • Klann E, Williamson JM, Tagliamonte MS, Ukhanova M, Asirvatham JR, Chim H, Yaghjyan L, Mai V, Microbiota composition in bilateral healthy breast tissue and breast tumors. Cancer Cause Control 2020, 31 (11), 1027–1038. doi:10.1007/s10552-020-01338-5
  • Law CW, Chen Y, Shi W, Smyth GK, Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 2014, 15 (2), R29. doi:10.1186/gb-2014-15-2-r29
  • Schmidt BL, Kuczynski J, Bhattacharya A, Huey B, Corby PM, Queiroz EL, Nightingale K, Kerr AR, DeLacure MD, Veeramachaneni R, Changes in abundance of oral microbiota associated with oral cancer. PLoS One 2014, 9 (6), e98741. doi:10.1371/journal.pone.0098741
  • Behary J, Amorim N, Jiang X-T, Raposo A, Gong L, McGovern E, Ibrahim R, Chu F, Stephens C, Jebeili H, Gut microbiota impact on the peripheral immune response in non-alcoholic fatty liver disease related hepatocellular carcinoma. Nat Commun 2021, 12 (1), 1–14. doi:10.1038/s41467-020-20422-7
  • Smyth GK, Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 2004, 3 (1), 1–25. doi:10.2202/1544-6115.1027
  • Lee C, Lee S, Park T, A comparison study of statistical methods for the analysis metagenome data. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); Kansas City, United States. IEEE. 2017; pp 1777–1781.
  • Costea PI, Zeller G, Sunagawa S, Bork P, A fair comparison. Nat Methods 2014, 11 (4), 359–359. doi:10.1038/nmeth.2897
  • Paulson JN, Bravo HC, Pop M, A fair comparison: reply. Nat Methods 2014, 11 (4), 359–360. doi:10.1038/nmeth.2898
  • Paulson JN, Olson ND, Braccia DJ, Wagner J, Talukder H, Pop M, Bravo HC, metagenomeSeq: statistical analysis for sparse high-throughput sequencing. Bioconductor Package, http://www.cbcb.umd.edu/software/metagenomeSeq. Version 1.28.2. 2013.
  • Norouzi-Beirami MH, Marashi S-A, Banaei-Moghaddam AM, Kavousi K, Beyond taxonomic analysis of microbiomes: a functional approach for revisiting microbiome changes in colorectal cancer. Front Microbiol 2020, 10, 3117. doi:10.3389/fmicb.2019.03117
  • Wang Q, Ye J, Fang D, Lv L, Wu W, Shi D, Li Y, Yang L, Bian X, Wu J, Multi-omic profiling reveals associations between the gut mucosal microbiome, the metabolome, and host DNA methylation associated gene expression in patients with colorectal cancer. BMC Microbiol 2020, 20 (S1), 1–13. doi:10.1186/s12866-020-01762-2
  • Alshawaqfeh M, Rababah S, Hayajneh A, Gharaibeh A, Serpedin E, MetaAnalyst: a user-friendly tool for metagenomic biomarker detection and phenotype classification. BMC Med Res Methodol 2022, 22 (1), 1–14. doi:10.1186/s12874-022-01812-5
  • Hughes JB, Hellmann JJ, Ricketts TH, Bohannan BJ, Counting the uncountable: statistical approaches to estimating microbial diversity. Appl Environ Microbiol 2001, 67 (10), 4399–4406. doi:10.1128/AEM.67.10.4399-4406.2001
  • Ai D, Pan H, Li X, Gao Y, Liu G, Xia LC, Identifying gut microbiota associated with colorectal cancer using a zero-inflated lognormal model. Front Microbiol 2019, 10, 826. doi:10.3389/fmicb.2019.00826
  • Abbas-Aghababazadeh F, Li Q, Fridley BL, Lin H, Comparison of normalization approaches for gene expression studies completed with high-throughput sequencing. PLoS One 2018, 13 (10), e0206312. doi:10.1371/journal.pone.0206312
  • Lee W-H, Chen K-P, Wang K, Huang H-C, Juan H-F, Characterizing the cancer-associated microbiome with small RNA sequencing data. Biochem Bioph Res Co 2020, 522 (3), 776–782. doi:10.1016/j.bbrc.2019.11.166
  • Kharofa J, Apewokin S, Alenghat T, Ollberding NJ, Metagenomic analysis of the fecal microbiome in colorectal cancer patients compared to healthy controls as a function of age. Cancer Med 2022, 12 (3), 2945–2957. doi:10.1002/cam4.5197
  • Swift D, Cresswell K, Johnson R, Stilianoudakis S, Wei X, A review of normalization and differential abundance methods for microbiome counts data. Wiley Interdiscip Rev Comput Stat 2022, e1586. doi:10.1002/wics.1586
  • Yang L, Chen J, A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions. Microbiome 2022, 10 (1), 130. doi:10.1186/s40168-022-01320-0
  • Lin H, Peddada SD, Analysis of microbial compositions: a review of normalization and differential abundance analysis. NPJ Biofilms Microbiomes 2020, 6 (1), 60. doi:10.1038/s41522-020-00160-w
  • Muthiah S, Bravo HC, Wrench: wrench normalization for sparse count data. R package version 1.16.0. https://github.com/HCBravoLab/Wrench.
  • Pawlowsky-Glahn V, Egozcue JJ, Tolosana-Delgado R, Modeling and analysis of compositional data. John Wiley & Sons: London UK, 2015. doi:10.1002/9781119003144
  • Aitchison J, The statistical analysis of compositional data. J R Stat Soc Series B Stat Methodol 1982, 44 (2), 139–160. doi:10.1111/j.2517-6161.1982.tb01195.x
  • Egozcue JJ, Isometric logratio transformations for compositional data analysis. Math Geol 2003, 35 (3), 279–300. doi:10.1023/A:1023818214614
  • Fernandes AD, Macklaim JM, Linn TG, Reid G, Gloor GB, ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-seq. PLoS One 2013, 8 (7), e67019. doi:10.1371/journal.pone.0067019
  • Fernandes AD, Reid JN, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB, Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome 2014, 2 (1), 15. doi:10.1186/2049-2618-2-15
  • Morton JT, Sanders J, Quinn RA, McDonald D, Gonzalez A, Vázquez-Baeza Y, Navas-Molina JA, Song SJ, Metcalf JL, Hyde ER, et al., Balance trees reveal microbial niche differentiation. mSystems 2017, 2 (1), e00162–16. doi:10.1128/mSystems.00162-16
  • Silverman JD, Washburne AD, Mukherjee S, David LA, A phylogenetic transform enhances analysis of compositional microbiota data. Elife 2017, 6, e21887. doi:10.7554/eLife.21887
  • van den Boogaart KG, Tolosana-Delgado R, Analyzing compositional data with R. Springer-Verlag: Berlin Heidelberg, 2013. doi:10.1007/978-3-642-36809-7
  • Quinn TP, Crowley TM, Richardson MF, Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods. BMC Bioinform 2018, 19 (1), 274. doi:10.1186/s12859-018-2261-8
  • Urbaniak C, Angelini M, Gloor GB, Reid G, Human milk microbiota profiles in relation to birthing method, gestation and infant gender. Microbiome 2016, 4 (1), 1. doi:10.1186/s40168-015-0145-y
  • Quinn TP, Erb I, Richardson MF, Crowley TM, Wren J, Understanding sequencing data as compositions: an outlook and review. Bioinformatics 2018, 34 (16), 2870–2878. doi:10.1093/bioinformatics/bty175
  • Seyednasrollah F, Laiho A, Elo LL, Comparison of software packages for detecting differential expression in RNA-seq studies. Brief Bioinform 2013, 16 (1), 59–70. doi:10.1093/bib/bbt086
  • Tarazona S, Furió-Tarí P, Turrà D, Pietro AD, Nueda MJ, Ferrer A, Conesa A, Data quality aware analysis of differential expression in RNA-seq with NOISeq R/Bioc package. Nucleic Acids Res 2015, 43 (21), e140–e140. doi:10.1093/nar/gkv711
  • Williams CR, Baccarella A, Parrish JZ, Kim CC, Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq. BMC Bioinform 2017, 18 (1), 38. doi:10.1186/s12859-016-1457-z
  • Morton JT, Marotz C, Washburne A, Silverman J, Zaramela LS, Edlund A, Zengler K, Knight R, Establishing microbial composition measurement standards with reference frames. Nat Commun 2019, 10 (1), 2719. doi:10.1038/s41467-019-10656-5
  • Peng Z, Cheng S, Kou Y, Wang Z, Jin R, Hu H, Zhang X, Gong J-F, Li J, Lu M, The gut microbiome is associated with clinical response to anti–PD-1/PD-L1 immunotherapy in gastrointestinal cancer. Cancer Immunol Res 2020, 8 (10), 1251–1261. doi:10.1158/2326-6066.CIR-19-1014
  • Thomas C, Aitchison J, Log-ratios and geochemical discrimination of Scottish Dalradian limestones: a case study. Geol Soc London, Spec Publ 2006, 264 (1), 25–41. doi:10.1144/GSL.SP.2006.264.01.03
  • Mandal S, Treuren W, White RA, Eggesbo M, Knight R, Peddada SD, Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis 2015, 26, 27663. doi:10.3402/mehd.v26.27663
  • Xia Y, Sun J, Chen D-G, Compositional analysis of microbiome data. In Statistical analysis of microbiome data with R, Springer: Singapore, 2018; pp. 331–393. doi:10.1007/978-981-13-1534-3_10
  • Brill B, Amir A, Heller R, Testing for differential abundance in compositional counts data, with application to microbiome studies. arXiv preprint 2019, arXiv:1904.08937.
  • Wallen ZD, Comparison study of differential abundance testing methods using two large Parkinson disease gut microbiome datasets derived from 16S amplicon sequencing. BMC Bioinform 2021, 22 (1), 1–29. doi:10.1186/s12859-021-04193-6
  • Bai J, Jhaney I, Daniel G, Bruner DW, Pilot study of vaginal microbiome using QIIME 2™ in women with gynecologic cancer before and after radiation therapy. Oncol Nurs Forum 2019, 46 (2), E48–E59. doi:10.1188/19.ONF.E48-E59
  • Cheung MK, Yue GGL, Tsui KY, Gomes AJ, Kwan HS, Chiu PWY, San Lau CB, Discovery of an interplay between the gut microbiota and esophageal squamous cell carcinoma in mice. Am J Cancer Res 2020, 10 (8), 2409.
  • Debelius JW, Huang T, Cai Y, Ploner A, Barrett D, Zhou X, Xiao X, Li Y, Liao J, Zheng Y, et al., Subspecies niche specialization in the oral microbiome is associated with nasopharyngeal carcinoma risk. mSystems 2020, 5 (4), e00065–20. doi:10.1128/mSystems.00065-20
  • Xia Y, Sun J, Chen D-G, Modeling Zero-Inflated Microbiome Data. In: Statistical analysis of microbiome data with R, Xia Y, Sun J, Chen D-G, editors. Springer: Singapore, 2018; pp. 453–496. doi:10.1007/978-981-13-1534-3_12
  • Wang S, Robust differential abundance test in compositional data. arXiv preprint 2021, arXiv:2101.08765.
  • Kaul A, Mandal S, Davidov O, Peddada SD, Analysis of microbiome data in the presence of excess zeros. Front Microbiol 2017, 8 (2114). doi:10.3389/fmicb.2017.02114
  • Dai W, Li C, Li T, Hu J, Zhang H, Super-taxon in human microbiome are identified to be associated with colorectal cancer. BMC Bioinform 2022, 23 (1), 243. doi:10.1186/s12859-022-04786-9
  • Ridout M, Hinde J, Demétrio CG, A score test for testing a zero‐inflated Poisson regression model against zero‐inflated negative binomial alternatives. Biometrics 2001, 57 (1), 219–223. doi:10.1111/j.0006-341X.2001.00219.x
  • Jung BC, Jhun M, Lee JW, Bootstrap tests for overdispersion in a zero‐inflated Poisson regression model. Biometrics 2005, 61 (2), 626–628. doi:10.1111/j.1541-0420.2005.00368.x
  • Sayyari E, Kawas B, Mirarab S, TADA: phylogenetic augmentation of microbiome samples enhances phenotype classification. Bioinformatics 2019, 35 (14), i31–i40. doi:10.1093/bioinformatics/btz394
  • Mansour RF, Alfar NM, Abdel‐Khalek S, Abdelhaq M, Saeed RA, Alsaqour R, Optimal deep learning based fusion model for biomedical image classification. Expert Syst 2022, 39 (3), e12764. doi:10.1111/exsy.12764
  • Zhang X, Yi N, Valencia A, Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data. Bioinformatics 2020, 36 (8), 2345–2351. doi:10.1093/bioinformatics/btz973
  • Lozupone C, Knight R, UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microb 2005, 71 (12), 8228–8235. doi:10.1128/AEM.71.12.8228-8235.2005
  • Lozupone C, Lladser ME, Knights D, Stombaugh J, Knight R, UniFrac: an effective distance metric for microbial community comparison. ISME J 2011, 5 (2), 169–172. doi:10.1038/ismej.2010.133
  • Di Bella JM, Bao Y, Gloor GB, Burton JP, Reid G, High throughput sequencing methods and analysis for microbiome research. J Microbiol Methods 2013, 95 (3), 401–414. doi:10.1016/j.mimet.2013.08.011
  • Segata N, Boernigen D, Tickle TL, Morgan XC, Garrett WS, Huttenhower C, Computational meta’omics for microbial community studies. Mol Syst Biol 2013, 9 (1), 666. doi:10.1038/msb.2013.22
  • Navas-Molina JA, Peralta-Sánchez JM, González A, McMurdie PJ, Vázquez-Baeza Y, Xu Z, Ursell LK, Lauber C, Zhou H, Song SJ, Advancing our understanding of the human microbiome using QIIME. In Methods Enzymol, Elsevier: 2013; Vol. 531, pp. 371–444.
  • Hughes JB, Hellmann JJ, The application of rarefaction techniques to molecular inventories of microbial diversity. In Methods in enzymology, Academic Press: 2005; Vol. 397, pp. 292–308. doi:10.1016/S0076-6879(05)97017-1
  • Koren O, Knights D, Gonzalez A, Waldron L, Segata N, Knight R, Huttenhower C, Ley RE, Eisen JA, A guide to enterotypes across the human body: meta-analysis of microbial community structures in human microbiome datasets. PLoS Comput Biol 2013, 9 (1), e1002863. doi:10.1371/journal.pcbi.1002863
  • Xia Y, Morrison-Beedy D, Ma J, Feng C, Cross W, Tu X, Modeling count outcomes from HIV risk reduction interventions: a comparison of competing statistical models for count responses. AIDS Res Treat 2012, 2012, 593569–593569. doi:10.1155/2012/593569
  • Feng C, Wang H, Han Y, Xia Y, Lu N, Tu XM, Some theoretical comparisons of negative binomial and zero-inflated Poisson distributions. Commun Stat Theory Methods 2015, 44 (15), 3266–3277. doi:10.1080/03610926.2013.823203
  • Xia Y, Sun J, Chen D-G, Modeling over-dispersed microbiome data. In Statistical analysis of microbiome data with R, Springer: Singapore, 2018; pp. 395–451. doi:10.1007/978-981-13-1534-3_11
  • Xia Y, Sun J, Chen D-G, Modeling Zero-Inflated Microbiome Data. In Statistical analysis of microbiome data with R, Springer: Singapore, 2018; pp. 453–496. doi:10.1007/978-981-13-1534-3_12
  • Jonsson V, Österlund T, Nerman O, Kristiansson E, Variability in metagenomic count data and its influence on the identification of differentially abundant genes. J Comput Biol 2017, 24 (4), 311–326. doi:10.1089/cmb.2016.0180
  • Wang Y, Lê Cao K-A, Managing batch effects in microbiome data. Brief Bioinform 2019, 21 (6), 1954–1970. doi:10.1093/bib/bbz105
  • Ma S, Shungin D, Mallick H, Schirmer M, Nguyen LH, Kolde R, Franzosa E, Vlamakis H, Xavier R, Huttenhower C, Population structure discovery in meta-analyzed microbial communities and inflammatory bowel disease using MMUPHin. Genome Biol 2022, 23 (1), 1–31. doi:10.1186/s13059-022-02753-4
  • Ling W, Lu J, Zhao N, Lulla A, Plantinga AM, Fu W, Zhang A, Liu H, Song H, Li Z, et al., Batch effects removal for microbiome data via conditional quantile regression. Nat Commun 2022, 13 (1), 5418. doi:10.1038/s41467-022-33071-9
  • Dai Z, Wong SH, Yu J, Wei Y, Birol I, Batch effects correction for microbiome data with Dirichlet-multinomial regression. Bioinformatics 2019, 35 (5), 807–814. doi:10.1093/bioinformatics/bty729
  • Anscombe FJ, The transformation of Poisson, binomial and negative-binomial data. Biometrika 1948, 35 (3/4), 246–254. doi:10.1093/biomet/35.3-4.246
  • de Cárcer DA, Denman SE, McSweeney C, Morrison M, Evaluation of subsampling-based normalization strategies for tagged high-throughput sequencing data sets from gut microbiomes. Appl Environ Microb 2011, 77 (24), 8795–8798. doi:10.1128/AEM.05491-11
  • Xia Y, Sun J, Pretreating and normalizing metabolomics data for statistical analysis. Genes & Dis in press, 2023. doi:10.1016/j.gendis.2023.04.018
  • Boulund F, Pereira MB, Jonsson V, Kristiansson E, Chapter 4 - computational and statistical considerations in the analysis of metagenomic data. In: Metagenomics, Nagarajan M, editor. Academic Press: 2018; pp. 81–102. doi:10.1016/B978-0-08-102268-9.00004-5
  • Paulson JN, Normalization and differential abundance analysis of metagenomic biomarker-gene surveys. University of Maryland, College Park, 2015.
  • Jonsson V, Österlund T, Nerman O, Kristiansson E, Statistical evaluation of methods for identification of differentially abundant genes in comparative metagenomics. BMC Genom 2016, 17 (1), 78. doi:10.1186/s12864-016-2386-y
  • Parks DH, Tyson GW, Hugenholtz P, Beiko RG, STAMP: statistical analysis of taxonomic and functional profiles. Bioinformatics 2014, 30 (21), 3123–3124. doi:10.1093/bioinformatics/btu494
  • Xia Y, Sun J, Chen D-G, What Are Microbiome Data? In Statistical analysis of microbiome data with R, Springer: Singapore, 2018; pp. 29–41. doi:10.1007/978-981-13-1534-3_2
  • Glusman G, Caballero J, Robinson M, Kutlu B, Hood L, Jordan IK, Optimal scaling of digital transcriptomes. PLoS One 2013, 8 (11), e77885. doi:10.1371/journal.pone.0077885
  • Egozcue J, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C, Isometric logratio transformations for compositional data analysis. Math Geol 2003, 35 (3), 279–300. doi:10.1023/A:1023818214614
  • Greenacre M, Measuring Subcompositional Incoherence. Math Geosci 2011, 43 (6), 681–693. doi:10.1007/s11004-011-9338-5
  • Martín-Fernández J-A, Hron K, Templ M, Filzmoser P, Palarea-Albaladejo J, Bayesian-multiplicative treatment of count zeros in compositional data sets. Stat Modelling 2015, 15 (2), 134–158. doi:10.1177/1471082X14535524
  • Mosimann JE, On the compound multinomial distribution, the multivariate β-distribution, and correlations among proportions. Biometrika 1962, 49 (1/2), 65–82. doi:10.1093/biomet/49.1-2.65
  • Lovell D, Pawlowsky-Glahn V, Egozcue JJ, Marguerat S, Bähler J, Dunbrack Jr RL, Proportionality: a valid alternative to correlation for relative data. PLoS Comput Biol 2015, 11 (3), e1004075. doi:10.1371/journal.pcbi.1004075
  • Kristiansson E, Hugenholtz P, Dalevi D, ShotgunFunctionalizeR: an R-package for functional comparison of metagenomes. Bioinformatics 2009, 25 (20), 2737–2738. doi:10.1093/bioinformatics/btp508
  • Hanley JA, McNeil BJ, The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143 (1), 29–36. doi:10.1148/radiology.143.1.7063747
  • Bradley AP, The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn 1997, 30 (7), 1145–1159. doi:10.1016/S0031-3203(96)00142-2
  • Pearson K, I. Mathematical contributions to the theory of evolution.—VII. On the correlation of characters not quantitatively measurable. Philos Trans of the R Soc of London Ser A Containing Pap of a Math or Phys Charact 1900, 195 (262–273), 1–47.
  • Spearman C, The Proof and Measurement of Association between Two Things. Am J Psychol 1904, 15 (1), 72–101. doi:10.2307/1412159
  • Matthews BW, Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim et Biophysica Acta (BBA)-Protein Struct 1975, 405 (2), 442–451. doi:10.1016/0005-2795(75)90109-9
  • Müller R, Büttner P, A critical discussion of intraclass correlation coefficients. Stat Med 1994, 13 (23–24), 2465–2476. doi:10.1002/sim.4780132310
  • Bray JR, Curtis JT, An ordination of the upland forest communities of southern Wisconsin. Ecol Monogr 1957, 27 (4), 326–349. doi:10.2307/1942268
  • Witten DM, Classification and clustering of sequencing data using a Poisson model. Ann Appl Stat 2011, 5 (4), 2493–2518. doi:10.1214/11-AOAS493
  • Lozupone CA, Hamady M, Kelley ST, Knight R, Quantitative and qualitative β diversity measures lead to different insights into factors that structure microbial communities. Appl Environ Microb 2007, 73 (5), 1576–1585. doi:10.1128/AEM.01996-06
  • Sneath PH, The application of computers to taxonomy. Microbiology 1957, 17 (1), 201–226. doi:10.1099/00221287-17-1-201
  • McQuitty LL, Hierarchical linkage analysis for the isolation of types. Educ Psychol Meas 1960, 20 (1), 55–67. doi:10.1177/001316446002000106
  • Sokal R, Sneath PHA, Principles of numerical taxonomy. WH Freeman: San Francisco, 1963.
  • Sokal RR, A statistical method for evaluating systematic relationships. Univ Kansas Sci Bull. 1958, 38, 1409–1438.
  • Ward JH Jr, Hierarchical grouping to optimize an objective function. J Am Stat Assoc 1963, 58 (301), 236–244. doi:10.1080/01621459.1963.10500845
  • Pearson K, LIII. On lines and planes of closest fit to systems of points in space. London, Edinburgh Dublin Phil Mag J Sci 1901, 2 (11), 559–572. doi:10.1080/14786440109462720
  • Hotelling H, Analysis of a complex of statistical variables into principal components. J Educ Psychol 1933, 24 (6), 417. doi:10.1037/h0071325
  • Hotelling H, Relations Between Two Sets of Variates. Biometrika 1936, 28 (3/4), 321–377. doi:10.1093/biomet/28.3-4.321
  • Gower JC, Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 1966, 53 (3–4), 325–338. doi:10.1093/biomet/53.3-4.325
  • Shepard RN, The analysis of proximities: Multidimensional scaling with an unknown distance function. I. Psychometrika 1962, 27 (2), 125–140. doi:10.1007/BF02289630
  • Shepard RN, Metric structures in ordinal data. J Math Psychol 1966, 3 (2), 287–315. doi:10.1016/0022-2496(66)90017-4
  • Kruskal JB, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 1964, 29 (1), 1–27. doi:10.1007/BF02289565
  • Kruskal JB, Nonmetric multidimensional scaling: A numerical method. Psychometrika 1964, 29 (2), 115–129. doi:10.1007/BF02289694
  • Pollard K, Gilbert H, Ge Y, Taylor S, Dudoit S, Multtest: Resampling-Based Multiple Hypothesis Testing. R package version 2.57.0. 2023. http://CRAN.R-project.org/package=multtest
  • Robinson MD, Smyth GK, Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics 2008, 9 (2), 321–332. doi:10.1093/biostatistics/kxm030
  • Anderson MJ, Walsh DC, PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: what null hypothesis are you testing? Ecol Monogr 2013, 83 (4), 557–574. doi:10.1890/12-2010.1
  • Lin Y, Golovnina K, Chen Z-X, Lee HN, Negron YLS, Sultana H, Oliver B, Harbison ST, Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster. BMC Genom 2016, 17 (1), 28. doi:10.1186/s12864-015-2353-z
  • Wang Z, Gerstein M, Snyder M, RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009, 10 (1), 57–63. doi:10.1038/nrg2484
  • Song H, Ling W, Zhao N, Plantinga AM, Broedlow CA, Klatt NR, Hensley-McBain T, Wu MC, Accommodating multiple potential normalizations in microbiome associations studies. BMC Bioinform 2023, 24 (1), 22. doi:10.1186/s12859-023-05147-w