Research Paper

Multimodal metagenomic analysis reveals microbial single nucleotide variants as superior biomarkers for early detection of colorectal cancer

Article: 2245562 | Received 24 Jul 2023, Accepted 03 Aug 2023, Published online: 27 Aug 2023
 

ABSTRACT

Microbial signatures show remarkable potential for predicting colorectal cancer (CRC). This study aimed to evaluate the diagnostic power of multimodal microbial signatures, including multi-kingdom species, genes, and single-nucleotide variants (SNVs), for detecting precancerous adenomas. We performed cross-cohort analyses on whole metagenome sequencing data from 750 samples via xMarkerFinder to identify adenoma-associated multimodal microbial signatures. Our data revealed that fungal species outperformed species from other kingdoms, with an area under the ROC curve (AUC) of 0.71 in distinguishing adenomas from controls. The microbial SNVs, including dark SNVs with synonymous mutations, displayed the strongest diagnostic capability, with an AUC of 0.89, sensitivity of 0.79, specificity of 0.85, and Matthews correlation coefficient (MCC) of 0.74. SNV biomarkers also performed well in three independent validation cohorts (AUCs = 0.83, 0.82, 0.76; sensitivity = 1.0, 0.72, 0.93; specificity = 0.67, 0.81, 0.67; MCCs = 0.69, 0.83, 0.72), with high disease specificity for adenoma. In further support of these results, functional analyses revealed more frequent inter-kingdom associations between bacteria and fungi, as well as abnormalities in quorum sensing and in purine and butanoate metabolism in adenoma, which were validated in a newly recruited cohort via qRT-PCR. These data therefore extend our understanding of adenoma-associated multimodal alterations in the gut microbiome and provide a rationale for using microbial SNVs in the early detection of CRC.
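For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how sensitivity, specificity, MCC, and AUC are conventionally computed for a binary classifier. It is not the authors' code (their scripts are in the repositories listed under Code availability); the function names `binary_metrics` and `roc_auc`, and all counts and scores in the example, are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and MCC from confusion-matrix counts."""
    # Sensitivity: fraction of true cases (e.g. adenoma) that are detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of controls correctly called negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


def roc_auc(scores_pos: list, scores_neg: list) -> float:
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical confusion-matrix counts, NOT taken from the study.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)

# Hypothetical classifier scores for cases and controls, NOT from the study.
auc = roc_auc(scores_pos=[0.9, 0.8, 0.4], scores_neg=[0.7, 0.3, 0.2])

print(metrics, auc)
```

The pairwise AUC shown here is quadratic in the number of samples and is meant only to make the definition concrete; a rank-based implementation would be preferable for large cohorts.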

This article is part of the following collections:
Gut Microbiota in Cancer Development and Treatment

Acknowledgments

The authors would like to thank all the researchers who generously shared the sequencing data included in this study. We acknowledge funding from the National Natural Science Foundation of China (82170542 to RZ, 92251307 to RZ, 82000536 to NJ, 91942312 to ZL, 81630017 to ZL, 32200529 to DW), the National Key Research and Development Program of China (2021YFF0703700/2021YFF0703702 to RZ), and the Guangdong Province “Pearl River Talent Plan” Innovation and Entrepreneurship Team Project (2019ZT08Y464 to LZ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

All processed data for this work are available at NODE under project ID OEP003766. In particular, comprehensive multi-strain microbial SNV profiles are provided for future analysis. Raw data from our in-house metagenomic sequencing cohort are available from the Sequence Read Archive (SRA) under study ID SRP308947. Other metagenomic sequencing data used in this manuscript are available from SRA under study IDs ERP008729, ERP005534, DRA006684, DRA008156, SRP136711, SRP108915, SRP327788, and SRP129027.

Code availability

The code and scripts for the bioinformatics analysis in this paper are available at https://github.com/tjcadd2020/Adenoma. xMarkerFinder, the core workflow used in this paper, is provided at https://github.com/tjcadd2020/xMarkerFinder.

Abbreviations

AUC: area under the ROC curve; BMI: body mass index; CD: Crohn’s disease; CDS: coding sequence; CRC: colorectal cancer; CTC: circulating tumor cell; ctDNA: circulating tumor DNA; IBD: inflammatory bowel disease; IGR: intergenic region; KO: KEGG orthology; MCC: Matthews correlation coefficient; MIDAS: Metagenomic Intra-Species Diversity Analysis System; MMUPHin: Meta-analysis Methods with a Uniform Pipeline for Heterogeneity in Microbiome Studies; NAFLD: nonalcoholic fatty liver disease; PCoA: principal coordinate analysis; PERMANOVA: permutational multivariate analysis of variance; qRT-PCR: quantitative real-time PCR; RF: random forest; SNV: single-nucleotide variant; UC: ulcerative colitis; WMS: whole metagenome sequencing.

Author contributions

NJ, RZ, ZL, and LZ conceived and designed the study. WG, SG, DW, and NJ performed the public data collection. WG and NJ conducted the microbiome analysis. WG performed the bioinformatics analysis and model construction. XG, RS, and ZF recruited the participants, collected the fecal samples, and performed the experimental validation. WG and NJ drafted the manuscript. WG, XG, LZ, SG, RS, ZF, DW, ZL, RZ, and NJ reviewed and edited the manuscript. All authors read and approved the final manuscript.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/19490976.2023.2245562

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [82170542, 92251307, 82000536, 91942312, 81630017, 32200529]; the National Key Research and Development Program of China [2021YFF0703700/2021YFF0703702]; and the Guangdong Province “Pearl River Talent Plan” Innovation and Entrepreneurship Team Project [2019ZT08Y464].