Research Paper

Using fecal immunochemical cartridges for gut microbiome analysis within a colorectal cancer screening program

Article: 2176119 | Received 21 Jul 2022, Accepted 31 Jan 2023, Published online: 16 Feb 2023

ABSTRACT

The colorectal cancer (CRC) screening program B-PREDICT is an invited two-stage screening project using a fecal immunochemical test (FIT) for initial screening followed by a colonoscopy for those with a positive FIT. Since the gut microbiome likely plays a role in the etiology of CRC, microbiome-based biomarkers in combination with FIT could be a promising tool for optimizing CRC screening. Therefore, we evaluated the usability of FIT cartridges for microbiome analysis and compared them to Stool Collection and Preservation Tubes. Corresponding FIT cartridges as well as Stool Collection and Preservation Tubes were collected from participants of the B-PREDICT screening program to perform 16S rRNA gene sequencing. We calculated intraclass correlation coefficients (ICCs) based on center log ratio transformed abundances and used ALDEx2 to test for significantly differentially abundant taxa between the two sample types. Additionally, FIT and Stool Collection and Preservation Tube triplicate samples were obtained from volunteers to estimate variance components of microbial abundances. FIT and Preservation Tube samples produce highly similar microbiome profiles which cluster according to subject. Significant differences between the two sample types can be found for abundances of some bacterial taxa (e.g. 33 genera) but are minor compared to the differences between the subjects. Analysis of triplicate samples revealed slightly worse repeatability of results for FIT than for Preservation Tube samples. Our findings indicate that FIT cartridges are appropriate for gut microbiome analysis nested within CRC screening programs.

This article is part of the following collections:
Gut Microbiota in Cancer Development and Treatment

Introduction

Colorectal cancer (CRC) is the third leading cancer-related cause of death worldwide and represents a major public health issue.Citation1 In Austria, the CRC incidence rate is in the lower third within the European Union, with about 4,500 new cases diagnosed each year.Citation2 Moreover, recent data indicate that the incidence of CRC is increasing, especially among younger adults.Citation3 Therefore, CRC has become an important and challenging global public health problem, in which the detection of cancer in early stages is of high importance. The natural history of sporadic CRC usually involves slow progression from precancerous polyps to cancer, which offers opportunities for screening and early detection.Citation4 Early detection of CRC is important since stage at diagnosis remains the most important prognostic factor.Citation5 As CRC is one of the most preventable cancers, population-wide screening programs are recommended in many countries. Screening programs have the potential to detect early precancerous lesions and allow endoscopic removal of adenomas, thereby contributing to the reduction of CRC incidence and mortality.Citation6–8

In the ongoing “Colorectal Cancer Study of Austria” (CORSA), participants have been recruited since 2003 in cooperation with the province‐wide screening project “Burgenland Prevention Trial of Colorectal Disease with Immunological Testing” (B‐PREDICT).Citation9 B-PREDICT, conducted in the Austrian federal state Burgenland, is an invited two-stage screening project for individuals aged between 40 and 80 using a fecal immunochemical test (FIT) for initial screening. Participants with a positive test are offered a diagnostic colonoscopy. During their clinical appointment, these participants are asked to take part in CORSA, sign a written informed consent, complete questionnaires and provide an EDTA blood sample and a stool sample for the CORSA biobank.

The economic burden of CRC in Austria was estimated at €157 million per year. These costs account only for general healthcare costs as well as nursing expenses. Informal costs, i.e. costs of unpaid patient care provided by friends and relatives and lost earnings due to illness and premature death, are not included, but are known to account for a major proportion of CRC-related costs (about 60%).Citation10 These figures highlight that non-healthcare costs contribute even more to the socioeconomic burden of CRC and that healthcare costs, cost-effectiveness, and success of cancer treatment are interrelated.

Nowadays, the preferred approach to testing for occult blood in feces in CRC screening programs is the FIT, despite its relatively low specificity and sensitivity. Commonly used FITs have shown low sensitivity for precancerous lesions (12.3–32.4%)Citation11 and early-stage cancer (40%).Citation12 Additionally, FITs may show false-negative results due to smoking status or advanced age, both of which are well-known risk factors for CRC, causing some cases to be missed. Taken together, there is an urgent demand for novel noninvasive biomarkers in addition to FIT, to identify those individuals who are more likely to benefit from screening colonoscopy and those who need an earlier or more frequent colonoscopy. The combination of conventional screening methods such as FIT with microbiome-based methods could be a promising tool for early detection of CRC. There is some evidence of carcinogenic mechanisms induced by bacteriaCitation13,Citation14 and therefore it has been hypothesized that the gut microbiome could play an important role in the development and progression of CRC. Specific changes in the microbiome occur during different stages of colorectal neoplasia, from adenomas to early-stage cancer to metastatic disease, supporting an etiologic and diagnostic role for the microbiome.Citation15,Citation16

An important issue in microbiome studies is the sample collection methodology. Although recent studies have demonstrated that gut-based microbial DNA isolated from FIT cartridges can replace naïve stool samples for microbiome analysis, there is little consensus on standard fecal sample collection methods.Citation17–19 The standardization of sample collection methodology, and particularly the feasibility of FIT samples for microbiome analyses within CRC screening programs, is currently intensively discussed in research networks and consortia focusing on gut microbiome-based biomarkers.

Therefore, we evaluated the microbial reliability, inter- as well as intra-variability and usability of stool samples collected in FIT cartridges and Stool Collection and Preservation Tubes from participants of the screening program B-PREDICT as well as additional volunteer samples.

Methods

Questionnaires

CORSA participants and volunteers completed a basic CORSA questionnaire assessing data on body mass index (BMI), smoking history, alcohol consumption, education level, family status, profession, basic dietary habits, use of antibiotics, and diabetes.

Fecal sample collection

Participants were instructed to collect stool samples from the same bowel movement at most three days prior to bowel cleanse and colonoscopy and to store them at room temperature until their clinical appointment. In the hospital, all samples were frozen and stored at −80°C until DNA extraction. Two sample collection methods were used: OC-Sensor FIT cartridges (Eiken Chemical Co., Ltd., Tokyo, Japan) and Stool Collection and Preservation Tubes (Norgen Biotek Corp., Ontario, Canada), henceforth referred to as FIT and Norgen, respectively. Each patient provided one FIT as well as one Norgen sample from the same bowel movement.

In addition to participants recruited within the B-PREDICT screening, five volunteers provided FIT cartridges as well as Norgen samples in triplicate. Volunteer samples were collected from the same bowel movement, stored at room temperature for three days, and frozen at −80°C until DNA extraction.

DNA isolation

DNA isolation was performed from FIT cartridge buffers and matching Norgen samples with the bead-based QIAamp PowerFecal Pro DNA Kit (Qiagen, Hilden, Germany) in combination with a Precellys® 24 homogenizer (VWR International GmbH, Vienna, Austria). For each sample, 500 µL of buffer-stool solution was used as starting material for DNA isolation. The quality and quantity of the DNA were assessed prior to 16S rRNA sequencing using a NanoDrop™ ND-1000 spectrophotometer (VWR International GmbH, Vienna, Austria) and fluorometrically with the Qubit™ dsDNA HS Assay Kit (ThermoFisher Scientific, Vienna, Austria).

16S rRNA gene sequencing

For the analysis of the bacterial microbiota, the variable V3-V4 region of the eubacterial 16S rRNA gene was amplified. The 16S small subunit ribosomal gene is a highly conserved housekeeping gene, which can be used to characterize microbial communities within samples. Sample library preparation was performed according to the Illumina protocol (Illumina, San Diego, USA) followed by sequence analysis on the Illumina MiSeq platform. The gene‐specific sequences used in the given protocol were selected from Klindworth et al.Citation20 as the most promising bacterial primer pair. Illumina adapter overhang nucleotide sequences were added to the gene‐specific sequences. The full-length primer sequences targeting this region, given in standard IUPAC nucleotide nomenclature, are:

16S Amplicon PCR Forward Primer = 5′ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG

16S Amplicon PCR Reverse Primer = 5′ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC

For sequencing, the MiSeq Reagent Kit v3 (Illumina, San Diego, USA), enabling a read length of up to 2 × 300 bp, was utilized.
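The structure of the two primers quoted above (an Illumina adapter overhang followed by a gene-specific part written with IUPAC degenerate bases) can be illustrated with a short Python sketch. This is not part of the study's pipeline. The helper names `gene_specific_part` and `primer_to_regex` are hypothetical, and the sketch assumes that both overhangs end with the motif `AGATGTGTATAAGAGACAG`, which holds for the two sequences as printed here.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

FORWARD = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
REVERSE = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"

# Assumption: the adapter overhang of both primers ends with this motif.
OVERHANG_END = "AGATGTGTATAAGAGACAG"


def gene_specific_part(primer):
    """Return the part of the primer that follows the adapter overhang."""
    start = primer.index(OVERHANG_END) + len(OVERHANG_END)
    return primer[start:]


def primer_to_regex(sequence):
    """Translate an IUPAC sequence into a regular expression pattern."""
    parts = []
    for base in sequence:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


forward_pattern = primer_to_regex(gene_specific_part(FORWARD))
reverse_pattern = primer_to_regex(gene_specific_part(REVERSE))
print(forward_pattern)
print(reverse_pattern)
```

Each degenerate position becomes a character class, so one pattern matches every concrete sequence the primer can bind to.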

Read pre-processing and taxonomic classification

Primers were trimmed and spacer sequences removed from the raw sequencing reads using cutadapt version 3.4. All trimmed reads were brought to a consistent length using cutadapt, with reads shorter than 280bp/278bp (forward/reverse) being discarded.Citation21 The pre-processed reads were then analyzed using FIGARO version 1.1.1Citation22 with default options to predict the optimal quality trimming parameters for DADA2 version 1.18.0.Citation23 Using the determined cutoff values of 275/169 base pairs and maximum expected errors of 2/1 (forward/reverse), the reads were quality filtered, trimmed to a uniform length, denoised, and merged into amplicon sequence variants (ASVs). Using DECIPHER version 2.18.1 and IdTaxaCitation24 as well as DADA2, the ASVs were taxonomically classified using the SILVA v138Citation25 database. For the taxonomic classification at the species level, only exact matches were used.
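The truncation length and maximum expected error thresholds mentioned above can be made concrete with a small sketch. The expected number of errors in a read is the sum of the per-base error probabilities implied by its Phred quality scores, EE = Σ 10^(−Q/10). The Python below is only an illustration of that rule. It is not DADA2 itself, and the function names are hypothetical.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities implied by Phred quality scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def truncate_and_filter(phred_scores, trunc_len, max_ee):
    """Truncate a read to `trunc_len` bases and apply the expected-error filter.

    Returns the truncated quality scores, or None if the read is shorter than
    `trunc_len` or exceeds `max_ee` expected errors after truncation.
    """
    if len(phred_scores) < trunc_len:
        return None
    truncated = phred_scores[:trunc_len]
    if expected_errors(truncated) > max_ee:
        return None
    return truncated


# Thresholds as reported in the text: 275/169 bp and 2/1 expected errors.
FORWARD_TRUNC, FORWARD_MAX_EE = 275, 2
REVERSE_TRUNC, REVERSE_MAX_EE = 169, 1

# Synthetic quality strings for illustration only (not study data).
good_forward = [30] * 280   # Q30 throughout
poor_forward = [20] * 280   # Q20 throughout

print(truncate_and_filter(good_forward, FORWARD_TRUNC, FORWARD_MAX_EE) is not None)
print(truncate_and_filter(poor_forward, FORWARD_TRUNC, FORWARD_MAX_EE) is not None)
```

A read with uniform Q30 accumulates about 0.275 expected errors over 275 bases and passes, whereas uniform Q20 accumulates about 2.75 and is discarded.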

Statistical analysis

We performed our analyses based on ASVs, representing the highest possible resolution, as well as on various taxonomic ranks, representing different levels of aggregation. ASVs present in less than 5% of the analyzed samples were excluded. Microbial abundances were transformed using the center log ratio (CLR) transformation due to the compositional nature of microbiome datasets.Citation26,Citation27 The resulting values are scale invariant and therefore count normalization is unnecessary. Since this transformation cannot be calculated for count matrices containing 0-values, all 0s were imputed using the R package zCompositions applying multiplicative simple replacement.Citation28
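The three steps described above (prevalence filtering, zero replacement, and CLR transformation) can be sketched in a few lines of Python. This is a simplified illustration and not the authors' R code. The value of `delta` is an arbitrary assumption, whereas the zCompositions package derives its replacement values differently. All function names are hypothetical.

```python
import math


def prevalence_filter(counts, min_prevalence=0.05):
    """Keep ASVs detected (count > 0) in at least `min_prevalence` of samples.

    `counts` is a list of samples, each a list of ASV counts.
    Returns the retained ASV indices and the filtered count table.
    """
    n_samples = len(counts)
    n_asvs = len(counts[0])
    kept = [
        j for j in range(n_asvs)
        if sum(1 for sample in counts if sample[j] > 0) / n_samples >= min_prevalence
    ]
    return kept, [[sample[j] for j in kept] for sample in counts]


def multiplicative_replacement(sample, delta):
    """Replace zeros by `delta` and shrink the non-zero proportions accordingly.

    The result still sums to 1 and the ratios between non-zero parts are kept.
    """
    total = sum(sample)
    proportions = [count / total for count in sample]
    n_zero = sum(1 for p in proportions if p == 0)
    return [delta if p == 0 else p * (1 - n_zero * delta) for p in proportions]


def clr(composition):
    """Center log ratio: log of each part minus the mean of all logs."""
    logs = [math.log(value) for value in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Toy count table (3 samples x 3 ASVs), for illustration only.
toy_counts = [[5, 0, 1], [3, 0, 0], [2, 0, 4]]
kept_asvs, filtered = prevalence_filter(toy_counts)
clr_table = [clr(multiplicative_replacement(s, delta=0.01)) for s in filtered]
print(kept_asvs)
print(clr_table)
```

Because the CLR subtracts the mean log abundance within each sample, multiplying all counts of a sample by a constant leaves the result unchanged, which is the scale invariance referred to in the text.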

Differences in sample characteristics between FIT and Norgen samples were visualized using violin plots, i.e. density plots displayed vertically like boxplots.Citation29 Intraclass correlation coefficients (ICCs)Citation30 were calculated for all ASV abundances between FIT and Norgen samples of the patient cohort (ICC(3,1)) and of the volunteers (ICC(3,k)). Consistency was chosen as the relationship considered to be important, since absolute deviations would not decrease the usability of FIT-originated data for risk prediction. However, ICCs for absolute agreement (ICC(2,1)) were calculated utilizing the triplicate samples (FIT as well as Norgen tubes) available from volunteers. ICCs were calculated for a range of alpha and beta diversities (i.e. by calculating the respective diversity measure and comparing the first components of the Principal Coordinates Analysis (PCoA)) as well as for the ASV CLR abundances. Additionally, all the abundance-based ICCs were calculated for each taxonomic rank (species to phylum) in the same way as for the ASVs.
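The difference between consistency and absolute agreement can be illustrated with the standard two-way ANOVA formulas for ICCs. The sketch below is a plain Python illustration under that assumption and is not the authors' R code. It uses the Shrout and Fleiss labels ICC(3,1), ICC(3,k) and ICC(2,1), the function name `icc_two_way` is hypothetical, and the toy numbers are invented.

```python
def icc_two_way(ratings):
    """Compute ICC(3,1), ICC(3,k) and ICC(2,1) from a subjects x raters table.

    `ratings` is a list of n subjects, each a list of k measurements
    (for example the FIT and the Norgen value of one ASV).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return {
        # consistency, single measurement
        "ICC(3,1)": (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error),
        # consistency, mean of k measurements
        "ICC(3,k)": (ms_rows - ms_error) / ms_rows,
        # absolute agreement, single measurement
        "ICC(2,1)": (ms_rows - ms_error)
        / (ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n),
    }


# Invented example: the second method reads exactly 2 units higher for everyone.
toy = [[1, 3], [2, 4], [3, 5], [4, 6]]
result = icc_two_way(toy)
print(result)
```

In this toy example the constant offset between the two columns leaves the consistency ICC at 1, while the absolute-agreement ICC drops, which mirrors the reasoning given in the text for preferring consistency.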

Calculating the Euclidean distance between two samples using the CLR values results in the Aitchison distance, which was used to perform a hierarchical clustering of all samples with Ward’s clustering criterion.Citation31 ALDEx2 was used to identify significantly differently abundant ASVs and taxa between FIT and Norgen samples.Citation32 P-values were corrected for multiple testing using the Benjamini–Hochberg method considering all tests performed at that specific taxonomic rank as the total number of hypotheses. Effect sizes calculated by ALDEx2 were converted to standardized effect sizes (Cohen’s d).Citation33,Citation34
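As a minimal sketch of the three building blocks named above, the following Python computes the Aitchison distance as the Euclidean distance between CLR vectors, performs a naive agglomerative clustering that always merges the pair of clusters with the smallest increase in within-cluster sum of squares (Ward's criterion), and applies the Benjamini–Hochberg adjustment. It is an illustration only. The study used R, the function names are hypothetical, and the compositions and p-values are invented.

```python
import math


def clr(composition):
    """Center log ratio transform of a strictly positive composition."""
    logs = [math.log(value) for value in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the CLR vectors of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


def ward_merge_order(points):
    """Naive Ward clustering; returns the merges as (members_a, members_b)."""
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (members_a, centroid_a), (members_b, centroid_b) = clusters[a], clusters[b]
                n_a, n_b = len(members_a), len(members_b)
                squared = sum((u - v) ** 2 for u, v in zip(centroid_a, centroid_b))
                cost = n_a * n_b / (n_a + n_b) * squared  # increase in within-cluster SS
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (members_a, centroid_a), (members_b, centroid_b) = clusters[a], clusters[b]
        n_a, n_b = len(members_a), len(members_b)
        centroid = [(n_a * u + n_b * v) / (n_a + n_b) for u, v in zip(centroid_a, centroid_b)]
        merges.append((list(members_a), list(members_b)))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append((members_a + members_b, centroid))
    return merges


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented compositions: two subjects, each with a FIT and a Norgen sample.
samples = [[10, 20, 70], [12, 18, 70], [70, 20, 10], [65, 25, 10]]
merges = ward_merge_order([clr(s) for s in samples])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(merges)
print(adjusted)
```

In the toy data the two samples of each subject are merged first, which is the pattern of subject-specific clustering that the analysis looks for.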

The volunteer samples, which consist of triplicates of each sample type, were used to calculate a linear model to identify the proportions of the sum of squares explained by the subject and the sample type for each ASV identified in at least three samples. The results are presented together with the sum of squared errors in a ternary plot.Citation35 Additionally, separate linear models were fitted to samples of each type to identify the sample-type-specific proportion of variance explained by the subject for each ASV present in at least two samples.
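For a balanced design such as the one described above (every subject contributing the same number of replicates of each sample type), the proportions of the sum of squares attributable to subject and sample type can be computed directly from group means. The Python below is a simplified sketch of an additive model without interaction and is not the authors' R code. The function name and the toy values are hypothetical.

```python
def ss_proportions(values):
    """Split the total sum of squares into subject, sample type and residual.

    `values` maps (subject, sample_type) to a list of replicate CLR abundances.
    Assumes a balanced design, so the two main effects are orthogonal.
    """
    all_values = [v for replicates in values.values() for v in replicates]
    grand = sum(all_values) / len(all_values)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("all values are identical; proportions are undefined")

    def main_effect_ss(key_index):
        # Pool all replicates sharing the same level of one factor.
        groups = {}
        for key, replicates in values.items():
            groups.setdefault(key[key_index], []).extend(replicates)
        return sum(
            len(group) * (sum(group) / len(group) - grand) ** 2
            for group in groups.values()
        )

    ss_subject = main_effect_ss(0)
    ss_type = main_effect_ss(1)
    ss_residual = ss_total - ss_subject - ss_type
    return {
        "subject": ss_subject / ss_total,
        "sample_type": ss_type / ss_total,
        "residual": ss_residual / ss_total,
    }


# Invented CLR values for one ASV: two subjects, two sample types, triplicates.
toy_asv = {
    ("A", "FIT"): [1.0, 1.0, 1.0],
    ("A", "Norgen"): [2.0, 2.0, 2.0],
    ("B", "FIT"): [3.0, 3.0, 3.0],
    ("B", "Norgen"): [4.0, 4.0, 4.0],
}
proportions = ss_proportions(toy_asv)
print(proportions)
```

The three proportions sum to one, so each ASV can be placed as a single point in a ternary plot.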

All analyses were performed with the statistical programming language R, version 4.1.1,Citation36 and the R packages ggplot2,Citation37 ggpubr,Citation38 and ggraphCitation39 were used for visualizations. To produce the PCA, the base R package “stats” was used. Results with a p-value smaller than 0.05 were deemed statistically significant.

Figure 1 gives a schematic flow chart of the experimental and analytical workflow of the presented study.

Figure 1. Schematic workflow of the presented study.

Ethical aspects

Written consent was obtained from all study participants, and all studies were approved by the corresponding Institutional Review Board. Compliance with the 1964 Declaration of Helsinki, the Austrian Drug Law (Arzneimittelgesetz, AMG) and the requirements of Good Clinical Practice of the European Community (CPMP/ICH/135/95) was ensured. The CORSA study was approved by the institutional review boards (EK 33/2010 and EK 1160/2016).

Results

Study participants

Eighty-one participants recruited within B-PREDICT provided a FIT tube and a stool nucleic acid collection and preservation tube (Norgen). The median age of patients was 63.4 years and the median BMI was 27.6. Additionally, five volunteers were recruited with a median age of 30.8 years and a median BMI of 23.1 (Table 1).

Table 1. Characteristics of study participants.

Norgen and FIT samples produce similar numbers of reads and sequences

The denoised reads contained 6,097 ASVs. Taxonomic classification of all ASVs yielded 241 species, 240 genera, 80 families, 47 orders, 21 classes, and 14 phyla, with varying proportions of reads classified at each taxonomic rank (Fig. S1). The median richness was 263 ASVs for FIT samples and 265 ASVs for Norgen samples (Figure 2D) and the median number of reads after filtering was 60,283 for FIT and 59,266 for Norgen (Figure 2E).

Figure 2. Figure based on patient samples. ASVs are colored according to their phylum and sized according to their abundance. A: Scatterplot of ASVs with mean center log ratio (CLR) transformed abundances of FIT and Norgen samples. B: Scatterplot showing the relationship between each ASV's ICC estimate (with confidence interval) and the sum of the logarithms of its abundances. ICCs above 0.9 (dashed line) indicate excellent reliability. A boxplot of the ICCs is provided additionally. C: ICC estimates and confidence intervals for all calculated metrics computed on FIT and Norgen samples. Table 2 gives all calculated beta diversities. The ICCs of the Shannon, Simpson and Inverse Simpson indices are all above 0.75, with the Shannon index providing the highest reliability between FIT and Norgen. The Bray-Curtis dissimilarity and the Jaccard index results are almost in perfect agreement. Unweighted UniFrac also displays an excellent ICC, while the weighted version results in only good reliability. D, E, F: Violin plots comparing Norgen and FIT samples based on the richness of the samples (D), number of reads per sample (E) and the prevalence of ASVs identified in at least 5% of the samples (F).

Table 2. Beta diversities figures.

Of the identified ASVs, 1,029 (16.9%) were detected in more than 5% of the samples after filtering. The median prevalence of these ASVs (i.e. percentage of samples in which an ASV was detected) was 11.5% in FIT samples and 12.5% in Norgen samples (Figure 2F).

Average CLR abundances similar between FIT and Norgen

The average CLR abundances of ASVs display high similarity between the FIT and the Norgen samples (Figure 2A). Among the 10 ASVs with the largest differences between sample types, seven are more abundant in FIT samples. Of these, the largest difference is observed for an ASV belonging to the genus Escherichia-Shigella of the phylum Proteobacteria, and the rest belong to the genera Enterococcus, Lactococcus, Streptococcus, and Leuconostoc. Of the three ASVs with higher abundance in Norgen samples, one belongs to the genus Oscillibacter and two could not be classified at the genus rank. Complete results, including the comparisons at each taxonomic rank, are available in Table S1 and Fig. S1.

ICCs positively associated with abundance of ASVs

The ASV-specific ICCs between the FIT and Norgen samples of the patients (Figure 2B) display a positive association with the summed log abundances of the ASV. Low summed log abundances are in many cases accompanied by low ICCs and large confidence intervals. This indicates that the estimates lack precision for many of the rarer ASVs. Overall, the ICCs’ first quartile is 0.759, the median is 0.892, and the third quartile is 0.951. A common interpretation is that an ICC higher than 0.75 indicates good reliability and an ICC higher than 0.9 indicates excellent reliability.Citation30 Fig. S1 provides visualizations of this analysis for taxonomic ranks from species to phylum and Table S2 contains complete ICC estimates and confidence intervals for all bacterial taxa and ASVs. These results confirm an association between abundances and reliability. Additionally, these results indicate higher reliability for higher ranks. This is probably due to the fact that higher ranks result in deeper aggregation and higher proportions of classified reads.

ICCs were also estimated based on the volunteer samples, which consisted of triplicates for each sample type. The “between FIT and Norgen” ICCs were therefore calculated based on the means of the respective samples. Additionally, this allowed for the estimation of the ICCs within the FIT and within the Norgen samples (Fig. S2). However, these were calculated as being obtained from three separate random raters (i.e. triplicate samples), resulting in lower and less stable estimates than the “between FIT and Norgen” ICCs, making a direct comparison of these results impossible. Nevertheless, this analysis shows that even separate stool samples of the same sample type and from the same subject contain noteworthy heterogeneity.

The ICC estimates for the alpha and beta diversities and their confidence intervals can be seen in Figure 2C. In addition, Table 2 gives all beta diversities. The Shannon, Simpson, and Inverse Simpson indices all display ICCs above 0.75, with Shannon providing the highest reliability between FIT and Norgen. In the case of the beta diversities, the Bray-Curtis dissimilarity and the Jaccard index result in almost perfect agreement. Unweighted UniFrac also displays an excellent ICC, while the weighted version results in only good reliability.

Samples form subject-specific clusters

The inter-subject distances (i.e. all possible distances between two samples from different subjects) displayed a median of 82.4, a maximum of 113.0 and a minimum of 50.1, which is higher than the maximum of all intra-subject distances, namely 41.2. The intra-subject distances consist of the distances between the FIT and the Norgen samples of each patient (1 distance per patient; median = 26.5) and each volunteer (9 distances per volunteer; median = 26.4) as well as the distances between the FIT triplicates (3 distances per volunteer; median = 25.4) and between the Norgen triplicates (3 distances per volunteer; median = 23.5) of each volunteer (Figure 3A). The intra-volunteer distances were significantly different (Kruskal-Wallis test: p < 0.001) and of the subsequent pairwise tests only the comparison between “FIT to Norgen” distances and “Norgen to Norgen” distances reached statistical significance (Wilcoxon test: p < 0.001). Based on these distances, a hierarchical clustering was performed on all samples. All samples clustered together according to the subject who provided them before being joined with samples of other subjects (Figure 3B).

Figure 3. A: [gray box] Boxplot of distances between samples of different subjects (regardless of sample type), [green box] between FIT and Norgen samples of the same patient and [red box] between FIT and Norgen samples, [blue box] between FIT samples and [yellow box] between Norgen samples of the same volunteer. B: Result of a hierarchical clustering algorithm based on the distances between all samples. Samples originating from the same subject are connected with colored bars.

PCA of volunteer samples reveals no separability of FIT and Norgen samples

The first four principal components of the ASVs’ CLR abundances in the volunteer samples are shown in Figure 4B and reveal no sample type-specific clusters. Only the samples obtained from volunteer no. 4 display some slight separability between FIT and Norgen samples. However, all other samples cluster randomly around a subject-specific center, regardless of the sample type.

Figure 4. A: Taxonomic tree displaying significant differences between FIT and Norgen samples based on the ALDEx2 analysis. Taxa are labeled with an ID and the first letters of their name. Full taxa names are given in Table 3. B: Scatterplots of the first four principal components extracted from the volunteer samples. Each of the five volunteers is represented by a number.

Table 3. List of full names for taxa displayed in Figure 4A. Effect sizes were calculated with ALDEx2 and are only shown for significant (after p-value correction) differences. Negative values indicate higher abundances in FIT samples, while positive values correspond to higher abundances in Norgen samples.

Differential abundance detected at various taxonomic ranks

Bacterial abundances of the patients’ FIT and Norgen samples were compared at all taxonomic ranks (species to phylum) and the significant results are presented in Figure 4A as a taxonomic tree. Some branches of the tree display consistent differences between the sample types. For example, all significantly differentially abundant taxa belonging to the phylum Actinobacteriota or the class Bacilli are more abundant in FIT samples, while all the significant taxa belonging to the order Bacteroidales are more abundant in the Norgen samples. However, there are also inconsistent branches, like the family Lachnospiraceae, which contains both genera more abundant in FIT and genera more abundant in Norgen samples. Complete results of the ALDEx2 analysis are available in Table S3.

Sample type explains only small proportion of sum of squares

Linear models were fitted on the CLR abundances of the volunteer samples for all ASVs detected in at least 3 of the 30 samples. The resulting proportions of sum of squares explained by subject and sample type as well as the residual proportions are shown in a ternary plot in Figure 5A and the corresponding boxplots in Figure 5B. This shows that most of the variance in the ASVs’ CLR abundances can be explained by the subject, compared to only small amounts explained by the sample type. Some ASVs display a high proportion of residual variance, which overall constitutes a much bigger issue for the repeatability of results. This is also evident from the results of separate models for FIT and Norgen using ASVs detected in at least three samples of the respective type (Figure 5C). This model specification shows that the amounts of variance explained by subject are slightly lower (i.e. residual variance is higher) for FIT than for Norgen. For both sample types, there is a peak at proportions near 1, which is slightly less pronounced for FIT and corresponds to a lower mean of 0.930 for FIT, compared to 0.936 for Norgen.

Figure 5. Analyses of triplicate samples of the five volunteers (3x FIT, 3x Norgen). A: Ternary plot displaying the variance components of each ASV as a single point based on a linear model with subject and sample type as explanatory variables. Only ASVs detected in at least three volunteer samples were considered. ASVs are colored according to the results of the corresponding significance tests on the patient samples. B: Boxplots of the variance components used in A. C: Violin plots comparing the proportions explained by subject in separate models for FIT and Norgen. Only ASVs detected in at least three samples of the respective sample type were used.

Discussion

Several CRC screening programs such as B-PREDICT implemented a two-stage screening, using FIT for the initial screening. The combination of conventional screening methods such as FIT with microbiome-based methods could be a promising tool for optimizing early detection of CRC. To investigate the usability of FIT samples for gut microbiome analysis, we compared FIT samples as well as stool samples collected in conventional Preservation Tubes (Norgen) from participants of the CRC screening program B-PREDICT and additional volunteers.

Our findings are mostly in accordance with previously published studies. Multiple prior studies concluded that microbial composition and diversity were largely explained by between-participant differences and only marginally by the collection methods.Citation40,Citation41 Furthermore, different studies demonstrated that FIT tubes used for fecal occult blood sample collection have the potential to be used for sample collection for microbiome studies.Citation42,Citation43 Among others,Citation44,Citation45 Gudra et al. evaluated fecal sample stability in the commonly used OC-Sensor (Eiken Chemical, Tokyo, Japan), the same FIT tube applied in the B-PREDICT study, under various storage conditions employing two different sequencing platforms. They did not find a significant difference between immediately frozen samples and samples stored for 2 days at 4°C and for 2 days at 20°C.Citation46 Masi and colleagues expanded upon these findings by investigating the performance of FIT samples in the English Bowel Cancer Screening Programme for understanding the role of the gut microbiome in colorectal neoplasia. In concordance with other studiesCitation16,Citation47 exploring the potential of FITs for microbiome sequencing, they concluded that fecal microbiome diversity and taxonomic profiles were consistent across test conditions.Citation48

Sinha et al. demonstrated in their study comprising 20 volunteers that the Fecal Occult Blood Test (FOBT) is a reasonable sample collection method with optimal stability and reproducibility for 16S rRNA microbiome profiling.Citation17 Furthermore, a recent study by Zouiouich and colleagues, investigating the impact of sample collection and storage method on the accuracy and stability of 16S rRNA sequencing, showed that stability ICCs were high for FIT tubes that were collected in the course of a colorectal cancer screening setting. The authors concluded that commonly used stool collection cards and different types of FIT tubes are acceptable tools for microbiome measurements and are useful for developing microbiome-focused cohorts nested within screening programs.Citation19 In addition, a further study comparing microbiome stability and accuracy across different fecal sample collection methods commonly used in ongoing CRC screening programs concluded that the interindividual variability was much higher than the variability introduced by the collection method. However, the authors found that different types of FIT tubes did not seem to perform equally in terms of relative abundance of phyla and genera, which supports observations from previous studies.Citation16 Furthermore, a recent study using FIT as well as fresh frozen fecal samples of 30 volunteers of an Estonian screening program concluded that the variation between individuals was greater than the differences introduced by the collection strategy and that the vast majority of the genera were stable for up to 7 days.Citation49 Moreover, a study by Grobbee and colleagues showed that fecal microbial content can be measured in FIT samples and remains stable for over six days. Results of their qPCR measurements of positive FIT samples illustrated that the total bacterial load was higher in colorectal cancer patients and patients diagnosed with high-grade dysplasia.Citation50

Our results indicate that the microbial communities obtained from Norgen samples and FIT tubes are highly similar, mainly differing in two specific attributes. First, Norgen samples display a lower residual variance, i.e. higher repeatability. We have shown that the median FIT to FIT distance is 8.0% higher than the median Norgen to Norgen distance, representing the increase in unaccounted variation across the complete microbiome profile. Second, there are differences in abundances of several taxa due to sample type. Although the overall effect of the sample type on the microbiome profile is only slight, significant differences between FIT and Norgen were detected for some taxa within B-PREDICT participants. These results are supported by the analysis of triplicate volunteer samples. However, it is also evident that even for the ASVs affected by the sample type, the resulting microbial abundance is much more strongly influenced by the subject. Subject-specific agreement of ASV abundances is only slightly affected by sample type and clearly more negatively affected by residual variance, which probably arises due to issues like zero-inflationCitation51 and false-positive detection, which impact low-abundance taxa more strongly and are inherent to microbiome analysis. Generally, taxa with low abundances are associated with lower agreement and lower ICCs. Therefore, increasing the taxonomic rank on which an analysis is performed (e.g. from genus to family) leads to results indicating higher reliability. In contrast to the majority of already published data, we could show that FIT samples, a broadly used pre-screening test in CRC screening programs, hold the potential to be applied as an additional diagnostic strategy to detect shifts in microbiome profiles and thereby may guide individual patient surveillance. Overall, we could show that FIT samples can be used for profiling the microbiota in a CRC screening setting.
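
To make the agreement measure discussed above more concrete, the following minimal sketch computes a center log ratio (CLR) transform and a per-taxon ICC for paired FIT and Norgen samples. It is an illustration only and not the analysis code of this study, which was written in R. The count values are invented toy data, the function names `clr` and `icc_oneway` are hypothetical, the pseudocount of 0.5 is an assumed stand-in for a proper zero-replacement step, and the one-way random-effects ICC(1,1) is an assumed choice of ICC form.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Center log ratio transform of a samples x taxa count matrix.

    The pseudocount is an assumption made here to avoid log(0).
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    # Subtract each sample's mean log abundance (log of its geometric mean).
    return logx - logx.mean(axis=1, keepdims=True)


def icc_oneway(values):
    """One-way random-effects ICC(1,1) for a subjects x replicates matrix."""
    y = np.asarray(values, dtype=float)
    n, k = y.shape
    subject_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((y - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Invented toy counts: 3 subjects x 4 taxa, one row per subject.
fit_counts = np.array([[120, 30, 5, 845],
                       [400, 10, 90, 500],
                       [50, 300, 20, 630]])
norgen_counts = np.array([[110, 35, 8, 847],
                          [380, 15, 80, 525],
                          [60, 280, 25, 635]])

clr_fit = clr(fit_counts)
clr_norgen = clr(norgen_counts)

# One ICC per taxon, treating the two sample types as replicate measurements.
icc_per_taxon = [
    icc_oneway(np.column_stack([clr_fit[:, j], clr_norgen[:, j]]))
    for j in range(fit_counts.shape[1])
]
print(icc_per_taxon)
```

In such a setup, an ICC close to 1 for a taxon means that most of the variance in its CLR abundance is attributable to the subject rather than to the sample type or residual noise, which is the pattern described above for the more abundant taxa.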

A limitation of our study is that no homogenization of sample material during sampling was performed, thereby inevitably introducing variation into samples from the same subject. To assess a baseline of this variation, triplicate samples were obtained from volunteers and incorporated into the analysis. However, FIT samples analyzed in the present study were obtained in the course of the regular B-PREDICT process, representing a usual sampling procedure within a CRC screening program. A further limitation of the presented study is the application of 16S rRNA sequencing, which depends on a single gene, the 16S small subunit ribosomal RNA gene, and is known to be limited by the short read lengths obtained as well as by the restriction to the two hypervariable regions V3 and V4.Citation52 However, as the main objective of the present study was to evaluate the usability of FIT cartridges for microbiome analysis in a colorectal cancer screening setting, we selected 16S rRNA sequencing, which has proven to be a reliable and efficient option for taxonomic classification. Furthermore, 16S rRNA sequencing has enhanced microbiome studies by improving accuracy and making tests cost-effective, holding the potential to be applied as a routine diagnostic method to detect shifts in microbiome profiles.Citation53

Our findings, taken together with previous studies, demonstrate the potential of FIT, as obtained through a national CRC screening program, to provide a convenient, representative, and cost-effective means of studying fecal microbiota in a large population.

Besides the validation of our results in larger international study cohorts, our next research steps will include an association study aiming to link microbiome profiles to clinical outcomes and patient histories. Due to the medical trend moving toward personalized medicine, there is a huge demand for novel noninvasive biomarkers to stratify patients according to their risk of developing cancer and to tailor individual surveillance. Results from our ongoing work will contribute to the improvement of targeted and cost-effective medicine by combining conventional CRC screening methods such as FIT with innovative microbiome-based methods, and to the identification of better biomarkers for patient risk stratification, needed to guide clinical follow-up, surveillance and targeted screening. Furthermore, as sequencing technologies are becoming cheaper, clinics will integrate genetic analysis into their routine. Microbiome analysis is expected to play a main role in optimizing future clinical routine.

Conclusions

In conclusion, the present study supports previous findings indicating that microbial data obtained from different collection methods are relatively stable and that FIT cartridges may be an appropriate method to collect fecal samples for gut microbiome profiling in CRC screening studies, with the aim of optimizing current CRC screening. However, validation in larger studies as well as association studies linking microbiome profiles and clinical outcomes are warranted.

Abbreviations

ASV: Amplicon sequence variant; B-PREDICT: Burgenland Prevention Trial of Colorectal Cancer Disease with Immunological Testing; BMI: Body mass index; CRC: Colorectal cancer; CORSA: Colorectal Cancer Study of Austria; CLR: Center log ratio; FIT: Fecal immunochemical test; FOBT: Fecal occult blood test; ICC: Intraclass correlation coefficient

Declarations

Ethical Approval and Consent to participate

Written consent was obtained from all study participants and all studies were approved by the corresponding Institutional Review Board. Compliance with the 1964 Declaration of Helsinki, the Austrian Drug Law (Arzneimittelgesetz, AMG) and the requirements of Good Clinical Practice of the European Community (CPMP/ICH/135/95) was ensured. The CORSA study was approved by the institutional review boards (EK 33/2010 and EK 1160/2016).

Availability of data and materials

Raw sequencing data and patient metadata are available at the NCBI Sequence Read Archive (BioProject PRJNA801143).

R scripts used in the analysis can be found here: https://github.com/martin-borkovec/corsa-microbiome.

Authors’ contributions

SB, MB and AG designed and coordinated the study; AG supervised this study. AG together with BS and NG received funding to conduct the study. FB, PG, TG, MH, GL and RL collected samples and coordinated sample recruitment; SB carried out DNA isolation and sample preparation. SB, MB, AB, AF, CJ and AG participated in data analysis and result interpretation; MB prepared figures and tables; SB, MB and AG drafted the manuscript. All authors have read and approved the manuscript.

Preprint

The present manuscript has been uploaded as Author’s Original Manuscript for preprint at https://doi.org/10.21203/rs.3.rs-1294888/v1.

Acknowledgments

We kindly thank all individuals who agreed to participate in CORSA. Furthermore, we thank all cooperating physicians and students.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/19490976.2023.2176119

Additional information

Funding

This study was funded by the “Österreichische Forschungsförderungsgesellschaft” FFG BRIDGE (grant 880626, to Andrea Gsur) and was supported by COST Action CA17118.

References

  • Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality Worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–16. doi:10.3322/caac.21660.
  • Keum N, Giovannucci E. Global burden of colorectal cancer: emerging trends, risk factors and prevention strategies. Nat Rev Gastroenterol Hepatol. 2019;16(12):713–732. doi:10.1038/s41575-019-0189-8.
  • Vuik FE, Nieuwenburg SA, Bardou M, Lansdorp-Vogelaar I, Dinis-Ribeiro M, Bento MJ, Zadnik V, Pellisé M, Esteban L, Kaminski MF, et al. Increasing incidence of colorectal cancer in young adults in Europe over the last 25 years. Gut. 2019;68(10):1820–1826. doi:10.1136/gutjnl-2018-317592.
  • Fearon ER. Molecular genetics of colorectal cancer. Annu Rev Pathol. 2011;6(1):479–507. doi:10.1146/annurev-pathol-011110-130235.
  • Nikolouzakis TK, Vassilopoulou L, Fragkiadaki P, Mariolis Sapsakos T, Papadakis GZ, Spandidos DA, Tsatsakis MA, Tsiaoussis J. Improving diagnosis, prognosis and prediction by using biomarkers in CRC patients (Review). Oncol Rep. 2018;39(6):2455–2472. doi:10.3892/or.2018.6330.
  • Hewitson P, Glasziou P, Watson E, Towler B, Irwig L. Cochrane systematic review of colorectal cancer screening using the fecal occult blood test (hemoccult): an update. Am J Gastroenterol. 2008;103(6):1541–1549. doi:10.1111/j.1572-0241.2008.01875.x.
  • Brenner H, Stock C, Hoffmeister M. Effect of screening sigmoidoscopy and screening colonoscopy on colorectal cancer incidence and mortality: systematic review and meta-analysis of randomised controlled trials and observational studies. Bmj. 2014;348(apr09 1):g2467. doi:10.1136/bmj.g2467.
  • Miller EA, Pinsky PF, Schoen RE, Prorok PC, Church TR. Effect of flexible sigmoidoscopy screening on colorectal cancer incidence and mortality: long-term follow-up of the randomised US PLCO cancer screening trial. Lancet Gastroenterol Hepatol. 2019;4(2):101–110. doi:10.1016/S2468-1253(18)30358-3.
  • Gsur A, Baierl A, Brezina S. Colorectal Cancer Study of Austria (CORSA): a population-based multicenter study. Biology (Basel). 2021;10:8.
  • Jahn B, Sroczynski G, Bundo M, Mühlberger N, Puntscher S, Todorovic J, Rochau U, Oberaigner W, Koffijberg H, Fischer T, et al. Effectiveness, benefit harm and cost effectiveness of colorectal cancer screening in Austria. BMC Gastroenterol. 2019;19(1):209. doi:10.1186/s12876-019-1121-y.
  • Chang LC, Shun CT, Hsu WF, Tu CH, Tsai PY, Lin BR, Liang J-T, Wu M-S, Chiu H-M. Fecal immunochemical test detects sessile serrated adenomas and polyps with a low level of sensitivity. Clinical Gastroenterology and Hepatology: the Official Clinical Practice Journal of the American Gastroenterological Association. 2017;15(6):872–9.e1. doi:10.1016/j.cgh.2016.07.029.
  • Niedermaier T, Balavarca Y, Brenner H. Stage-specific sensitivity of fecal immunochemical tests for detecting colorectal cancer: systematic review and meta-analysis. Am J Gastroenterol. 2020;115(1):56–69. doi:10.14309/ajg.0000000000000465.
  • Cougnoux A, Dalmasso G, Martinez R, Buc E, Delmas J, Gibold L, Sauvanet P, Darcha C, Déchelotte P, Bonnet M, et al. Bacterial genotoxin colibactin promotes colon tumour growth by inducing a senescence-associated secretory phenotype. Gut. 2014;63(12):1932–1942. doi:10.1136/gutjnl-2013-305257.
  • Wu S, Rhee KJ, Albesiano E, Rabizadeh S, Wu X, Yen HR, Huso DL, Brancati FL, Wick E, McAllister F, et al. A human colonic commensal promotes colon tumorigenesis via activation of T helper type 17 T cell responses. Nat Med. 2009;15(9):1016–1022. doi:10.1038/nm.2015.
  • Thomas AM, Manghi P, Asnicar F, Pasolli E, Armanini F, Zolfo M, Beghini F, Manara S, Karcher N, Pozzi C, et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nat Med. 2019;25(4):667–678. doi:10.1038/s41591-019-0405-7.
  • Baxter NT, Ruffin MT 4th, Rogers MA, Schloss PD. Microbiota-based model improves the sensitivity of fecal immunochemical test for detecting colonic lesions. Genome Med. 2016;8(1):37. doi:10.1186/s13073-016-0290-3.
  • Sinha R, Chen J, Amir A, Vogtmann E, Shi J, Inman KS, Flores R, Sampson J, Knight R, Chia N, et al. Collecting fecal samples for microbiome analyses in epidemiology studies. Cancer Epidemiol Biomarkers Prev. 2016;25(2):407–416. doi:10.1158/1055-9965.EPI-15-0951.
  • Wu Z, Hullings AG, Ghanbari R, Etemadi A, Wan Y, Zhu B, Poustchi H, Fahraji BB, Sakhvidi MJZ, Shi J, et al. Comparison of fecal and oral collection methods for studies of the human microbiota in two Iranian cohorts. BMC Microbiol. 2021;21(1):324. doi:10.1186/s12866-021-02387-9.
  • Zouiouich S, Mariadassou M, Rué O, Vogtmann E, Huybrechts I, Severi G, Boutron-Ruault M-C, Senore C, Naccarati A, Mengozzi G, et al. Comparison of fecal sample collection methods for microbial analysis embedded within colorectal cancer screening programs. Cancer Epidemiol Biomarkers Prev. 2022 Feb;31(2):305–314. doi:10.1158/1055-9965.EPI-21-0188.
  • Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, Glöckner FO. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41(1):e1. doi:10.1093/nar/gks808.
  • Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17:10–12.
  • Weinstein MM, Prem A, Jin M, Tang S, Bhasin JM. FIGARO: an efficient and objective tool for optimizing microbiome rRNA gene trimming parameters. bioRxiv. 2019;610394. doi:10.1101/610394.
  • Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13(7):581–583. doi:10.1038/nmeth.3869.
  • Murali A, Bhargava A, Wright ES. IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences. Microbiome. 2018;6(1):140. doi:10.1186/s40168-018-0521-5.
  • Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2012;41(D1):D590–D6. doi:10.1093/nar/gks1219.
  • Gloor GB, Reid G. Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data. Canadian Journal of Microbiology. 2016;62(8):692–703. doi:10.1139/cjm-2015-0821.
  • Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this is not optional. Frontiers in Microbiology. 2017;8:2224. doi:10.3389/fmicb.2017.02224.
  • Palarea-Albaladejo J, Martín-Fernández JA. zCompositions — r package for multivariate imputation of left-censored data under a compositional approach. Chemometrics and Intelligent Laboratory Sys. 2015;143:85–96. doi:10.1016/j.chemolab.2015.02.019.
  • Hintze JL, Nelson RD. Violin plots: a box plot-density trace synergism. Am Stat. 1998;52:181–184.
  • Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163. doi:10.1016/j.jcm.2016.02.012.
  • Ward JH. Hierarchical grouping to optimize an objective function. J Am Stat Assoc. 1963;58(301):236–244. doi:10.1080/01621459.1963.10500845.
  • Fernandes AD, Reid JNS, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome. 2014;2(1):15. doi:10.1186/2049-2618-2-15.
  • Gloor GB. Gloor lab musings [Internet] 2021. [cited 23 Dec 2021]. Available from: https://gloorlab.blogspot.com/2021/03/measuring-effect-size-in-aldex2.html.
  • Cohen J. Statistical power analysis for the behavioral sciences. Academic Press; 2013.
  • Hamilton NE, Ferry M. ggtern: ternary diagrams using ggplot2. J Statistical Software, Code Snippets. 2018;87:1–17.
  • R Core Team. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing; 2021.
  • Wickham H. ggplot2: elegant graphics for data analysis. Springer-Verlag New York; 2016.
  • Kassambara A. ggpubr: ‘ggplot2’ based publication ready plots. 2020.
  • Pedersen TL. ggraph: an implementation of grammar of graphics for graphs and networks. 2021.
  • Vogtmann E, Chen J, Kibriya MG, Chen Y, Islam T, Eunes M, Ahmed A, Naher J, Rahman A, Amir A, et al. Comparison of fecal collection methods for microbiota studies in Bangladesh. Appl Environ Microbiol. 2017;83(10):10. doi:10.1128/AEM.00361-17.
  • Wang Z, Zolnik CP, Qiu Y, Usyk M, Wang T, Strickler HD, Isasi CR, Kaplan RC, Kurland IJ, Qi Q, et al. Comparison of fecal collection methods for microbiome and metabolomics studies. Front Cell Infect Microbiol. 2018;8:301. doi:10.3389/fcimb.2018.00301.
  • Byrd DA, Chen J, Vogtmann E, Hullings A, Song SJ, Amir A, Kibriya MG, Ahsan H, Chen Y, Nelson H, et al. Reproducibility, stability, and accuracy of microbial profiles by fecal sample collection method in three distinct populations. PLoS One. 2019;14(11):e0224757. doi:10.1371/journal.pone.0224757.
  • Rounge TB, Meisal R, Nordby JI, Ambur OH, de Lange T, Hoff G. Evaluating gut microbiota profiles from archived fecal samples. BMC Gastroenterol. 2018;18(1):171. doi:10.1186/s12876-018-0896-6.
  • Carroll IM, Ringel-Kulka T, Siddle JP, Klaenhammer TR, Ringel Y, Neufeld J. Characterization of the fecal microbiota using high-throughput sequencing reveals a stable microbial community during storage. PLoS One. 2012;7(10):e46953. doi:10.1371/journal.pone.0046953.
  • Cardona S, Eck A, Cassellas M, Gallart M, Alastrue C, Dore J, Azpiroz F, Roca J, Guarner F, Manichanh C, et al. Storage conditions of intestinal microbiota matter in metagenomic analysis. BMC Microbiol. 2012;12(1):158. doi:10.1186/1471-2180-12-158.
  • Gudra D, Shoaie S, Fridmanis D, Klovins J, Wefer H, Silamikelis I, Peculis R, Kalnina I, Elbere I, Radovica-Spalvina I, et al. A widely used sampling device in colorectal cancer screening programmes allows for large-scale microbiome studies. Gut. 2019;68(9):1723–1725. doi:10.1136/gutjnl-2018-316225.
  • Vogtmann E, Chen J, Amir A, Shi J, Abnet CC, Nelson H, Knight R, Chia N, Sinha R. Comparison of collection methods for fecal samples in microbiome studies. Am J Epidemiol. 2017;185(2):115–123. doi:10.1093/aje/kww177.
  • Masi AC, Koo S, Lamb CA, Hull MA, Sharp L, Nelson A, Hampton JS, Rees CJ, Stewart CJ. Using faecal immunochemical test (FIT) undertaken in a national screening programme for large-scale gut microbiota analysis. Gut. 2021;70(2):429–431. doi:10.1136/gutjnl-2020-321594.
  • Krigul KL, Aasmets O, Lüll K, Org T, Org E. Using fecal immunochemical tubes for the analysis of the gut microbiome has the potential to improve colorectal cancer screening. Sci Rep. 2021;11(1):19603. doi:10.1038/s41598-021-99046-w.
  • Grobbee EJ, Lam SY, Fuhler GM, Blakaj B, Konstantinov SR, Bruno MJ, Peppelenbosch MP, Kuipers EJ, Spaander MC. First steps towards combining faecal immunochemical testing with the gut microbiome in colorectal cancer screening. United Eur Gastroenterology J. 2020;8(3):293–302. doi:10.1177/2050640619890732.
  • Kaul A, Mandal S, Davidov O, Peddada SD. Analysis of microbiome data in the presence of excess zeros. Front Microbiol. 2017;8:2114. doi:10.3389/fmicb.2017.02114.
  • Poretsky R, Rodriguez RL, Luo C, Tsementzi D, Konstantinidis KT, Rodriguez-Valera F. Strengths and limitations of 16S rRNA gene amplicon sequencing in revealing temporal microbial community dynamics. PLoS One. 2014;9(4):e93827. doi:10.1371/journal.pone.0093827.
  • Gantuya B, El Serag HB, Saruuljavkhlan B, Azzaya D, Matsumoto T, Uchida T, Oyuntsetseg K, Oyunbileg N, Davaadorj D, Yamaoka Y, et al. Advantage of 16S rRNA amplicon sequencing in Helicobacter pylori diagnosis. Helicobacter. 2021;26(3):e12790. doi:10.1111/hel.12790.