Research Paper

Ordering taxa in image convolution networks improves microbiome-based machine learning accuracy

Article: 2224474 | Received 15 Dec 2022, Accepted 08 Jun 2023, Published online: 21 Jun 2023

Figures & data

Table 1. Different approaches to the microbiome ML limitations discussed in the introduction.

Table 2. Table of datasets.

Figure 1. iMic’s and gMic’s architectures and AUCs.

(a) gMic+v architecture: We position all observed taxa at the leaves of the taxonomy tree (cladogram) and set the value of each leaf to its preprocessed frequency. Each internal node is the average of its direct descendants. These values are the input to a GCN layer with the adjacency matrix of the cladogram. The GCN layer is followed by two fully connected layers with a binary output. (b) iMic's architecture: The values in the cladogram are as in gMic+v. The cladogram is then used to populate a 2-dimensional matrix. Each row in the image represents a taxonomic level. The order within each row is based on a recursive hierarchical clustering of the sample values that preserves the structure of the tree. The image is the input of a CNN followed by two fully connected layers with a binary output. (c) Comparison of model performance: The average AUC is measured on the external test set for nine different phenotypes. Each subplot is a phenotype. The stars represent the significance of the p-value (after Benjamini-Hochberg correction) on the external test set. If the significance differed between the 10 CVs and the external test set, the corrected p-value of the 10 CVs is reported in brackets; * p < 0.05, ** p < 0.01, *** p < 0.001. For the parallel results on the 10 CVs, see Supp. Mat. Fig. S2. The rightmost set of plots is the baseline. The green bars are the current best baseline. The light blue bar to the right is the best baseline obtained using MIPMLP. The central pink bars are the iMic AUC using either a one- or two-dimensional CNN. The leftmost bars are for gMic (either gMic or gMic+v). We also added the iMic results to allow for a comparison.
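As a rough illustration of the image construction described in panel (b), the following Python sketch (not the authors' implementation; the semicolon-separated lineage input format, the function names, and the simple depth-first column ordering that stands in for the paper's dendrogram-based reordering are all assumptions) builds a level-by-taxon matrix in which each internal node is the mean of its direct descendants and every subtree occupies a contiguous block of columns.

```python
# Minimal sketch of turning taxonomy-keyed abundances into a 2D "microbiome image":
# rows are taxonomic levels, each internal node is the mean of its children,
# and columns follow a depth-first order that keeps subtrees contiguous.
import numpy as np

def build_image(abundances: dict[str, float], n_levels: int = 7) -> np.ndarray:
    # Build the cladogram: a node key is a lineage prefix, children are longer prefixes.
    children: dict[tuple, set] = {(): set()}
    leaf_value: dict[tuple, float] = {}
    for lineage, value in abundances.items():
        taxa = tuple(lineage.split(";"))
        leaf_value[taxa] = value
        for depth in range(len(taxa)):
            parent, child = taxa[:depth], taxa[:depth + 1]
            children.setdefault(parent, set()).add(child)
            children.setdefault(child, set())

    # Post-order pass: a node's value is its own abundance if it is a leaf,
    # otherwise the mean of its direct descendants.
    node_value: dict[tuple, float] = {}
    def fill(node: tuple) -> float:
        kids = sorted(children[node])
        if not kids:
            node_value[node] = leaf_value.get(node, 0.0)
        else:
            node_value[node] = float(np.mean([fill(k) for k in kids]))
        return node_value[node]
    fill(())

    # Depth-first traversal assigns each leaf a column, so every subtree
    # occupies a contiguous block of columns (a simplified stand-in for the
    # paper's hierarchical-clustering reordering).
    leaves = [n for n in children if not children[n] and n]
    image = np.zeros((n_levels, len(leaves)))
    col = 0
    def place(node: tuple):
        nonlocal col
        kids = sorted(children[node])
        if not kids:
            for depth in range(1, len(node) + 1):
                image[depth - 1, col] = node_value[node[:depth]]
            col += 1
        for k in kids:
            place(k)
    place(())
    return image

img = build_image({"k__Bacteria;p__Firmicutes;c__Bacilli": 0.4,
                   "k__Bacteria;p__Bacteroidetes;c__Bacteroidia": 0.6})
print(img.shape)  # (7, 2)
```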

Table 3. Details of the sequential datasets.

Table 4. Mean performance over 10 CVs with standard deviation on the external test sets; the standard deviation is computed across CV folds.

Table 5. Features can be added to iMic's learning. Average AUCs of iMic-CNN2 with and without non-microbial features, as well as the average results of naive models with non-microbial features. The results are the average AUCs on an external test set over 10 CVs ± their standard deviations (stds).
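A minimal sketch of how such non-microbial covariates could be appended, assuming a PyTorch model in the spirit of iMic-CNN2 (the class name, layer sizes, and exact architecture are illustrative, not the authors' code): the microbiome image passes through a CNN, is flattened, concatenated with the extra features, and fed to two fully connected layers with a binary output.

```python
# Hedged sketch: concatenating non-microbial covariates with the CNN features.
import torch
import torch.nn as nn

class IMicWithCovariates(nn.Module):
    def __init__(self, n_levels: int, n_taxa: int, n_extra: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),  # hypothetical channel/kernel sizes
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        flat = 8 * (n_levels // 2) * (n_taxa // 2)
        self.fc = nn.Sequential(
            nn.Linear(flat + n_extra, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # binary phenotype logit
        )

    def forward(self, image: torch.Tensor, extra: torch.Tensor) -> torch.Tensor:
        h = self.conv(image).flatten(start_dim=1)
        return self.fc(torch.cat([h, extra], dim=1))

model = IMicWithCovariates(n_levels=8, n_taxa=256, n_extra=3)
logit = model(torch.randn(4, 1, 8, 256), torch.randn(4, 3))
print(logit.shape)  # torch.Size([4, 1])
```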

Figure 2. iMic copes with the ML challenges above better than other methods.

(a) Average test AUC (over 10 CVs) as a function of the simulated sparsity level, where the first point is the AUC at the original sparsity level (72%, "baseline") on the Cirrhosis dataset. iMic has the highest AUC at all simulated sparsity levels (purple line). The error bars represent the standard errors. (b) Average change in AUC (AUC minus baseline AUC) as a function of the sparsity level on the Cirrhosis dataset. (c) Overall average AUC change over all the other datasets apart from Cirrhosis. (d) Average AUC as a function of the number of samples in the training set (Cirrhosis dataset). The error bars represent the standard errors of each model over the 10 CVs. (e) Average change in AUC (AUC minus baseline AUC) as a function of the percentage of samples in the training set. (f) Overall average AUC change over all the algorithms that managed to learn (baseline AUC > 0.55) as a function of the percentage of samples in the training set. (g) Importance of ordering taxa. The x-axis represents the average AUC over 10 CVs and the y-axis represents the different datasets used. The dark purple bars represent the AUC on the images without taxa reordering, while the light purple bars represent the AUC on the images with the dendrogram reordering, with standard errors. All the differences between the AUCs are significant after Benjamini-Hochberg correction (p-value < 0.001). All the AUCs are calculated on an external test set for each CV. Similar results were obtained on the 10 CVs.
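The two robustness checks in panels (a-f) can be sketched as follows. This is an illustrative protocol only: a random forest stands in for the compared models, the data are synthetic stand-ins for a preprocessed ASV table, and the function names are hypothetical, not the authors' code.

```python
# Sketch of the robustness protocol: (i) raise the sparsity of the count table
# by zeroing random non-zero entries, (ii) shrink the training set, then
# re-measure the test AUC in each setting.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def add_sparsity(X: np.ndarray, target_sparsity: float) -> np.ndarray:
    """Zero random non-zero entries until the requested fraction of zeros is reached."""
    X = X.copy()
    current = (X == 0).mean()
    if target_sparsity <= current:
        return X
    nz = np.argwhere(X != 0)
    n_drop = int((target_sparsity - current) * X.size)
    drop = nz[rng.choice(len(nz), size=min(n_drop, len(nz)), replace=False)]
    X[drop[:, 0], drop[:, 1]] = 0
    return X

def auc_at(X_tr, y_tr, X_te, y_te, sparsity=None, train_frac=1.0):
    if sparsity is not None:
        X_tr, X_te = add_sparsity(X_tr, sparsity), add_sparsity(X_te, sparsity)
    n = max(2, int(train_frac * len(y_tr)))
    idx = rng.choice(len(y_tr), size=n, replace=False)
    clf = RandomForestClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

# Toy data standing in for a preprocessed ASV table and a binary phenotype.
X = rng.poisson(0.5, size=(200, 100)).astype(float)
y = (X[:, :5].sum(1) + rng.normal(0, 1, 200) > 2).astype(int)
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]
for s in (None, 0.8, 0.9):
    print("sparsity", s, "AUC", round(auc_at(X_tr, y_tr, X_te, y_te, sparsity=s), 3))
```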

Figure 3. Interpretation of iMic’s results.

(a, b) Cladogram projections: To visualize the taxa contributing to each class, the healthy class (a) and the CD class (b), we projected the most significant microbes back onto the cladogram. The purple points on the cladograms represent taxa in the top decile of the gradients. Taxa in bold are important taxa that are consistent with the literature. (c, d) Grad-Cam images: Each image represents the average contribution of each input value to the gradients of the neural network back-propagation, as computed by the Grad-Cam algorithm. Grad-Cam was applied after the first CNN layer. The results presented here are from the CD dataset. (c) represents the average gradients for the healthy subjects of the cohort and (d) represents the average gradients for the CD subjects. The color reflects the average value of the gradients: blue represents low gradients and yellow represents high gradients, using the 'viridis' colormap. The differences between the two heatmaps represent the contribution of different taxa to the prediction of different phenotypes. Note that the main contribution to the classification is at the genus and family levels (rows 6 and 5). Similar results were obtained for the other datasets (Supp. Mat. Fig. S11). (e-h) Interpretation tests on the CD dataset (e), the IBD dataset (f), the Cirrhosis dataset (g), and the Ravel dataset (h). Average AUC values over 10 CVs on the external test set. The x-axis represents the fraction of removed columns. The dark bars represent the performance when all of the columns with Grad-Cam scores below this fraction were removed, and the light bars represent the performance when the columns with scores above this fraction were removed. The black line represents the average AUC over 10 CVs of the original model with all the input columns. Results from the other datasets were similar; see Supp. Mat. Fig. S11. Removing the top-scoring columns always reduced the performance, while removing the bottom-scoring columns increased or did not change the AUC.
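A minimal sketch of the interpretation procedure, assuming a PyTorch model whose first convolutional layer preserves the spatial dimensions of the input image and which returns a single logit (function names and layout are hypothetical, not the authors' code): Grad-Cam is taken after the first CNN layer, averaged into a per-column score, and columns below or above a chosen quantile are zeroed out before re-evaluating the trained model.

```python
# Hedged sketch: Grad-Cam after the first conv layer, plus column ablation.
import torch
import torch.nn as nn

def gradcam_after_first_conv(model: nn.Module, first_conv: nn.Module,
                             image: torch.Tensor) -> torch.Tensor:
    """Return a (height, width) heatmap of gradient-weighted activations."""
    feats = {}
    def hook(_, __, output):
        output.retain_grad()          # keep the gradient of this intermediate tensor
        feats["a"] = output
    handle = first_conv.register_forward_hook(hook)
    logit = model(image)              # assumes a single-logit output
    logit.sum().backward()
    handle.remove()
    a, g = feats["a"], feats["a"].grad
    weights = g.mean(dim=(2, 3), keepdim=True)   # channel weights: GAP of gradients
    cam = torch.relu((weights * a).sum(dim=1))   # weighted sum over channels
    return cam.mean(dim=0)                       # average over the batch

def mask_columns(images: torch.Tensor, cam: torch.Tensor,
                 frac: float, keep_top: bool) -> torch.Tensor:
    """Zero out taxon columns below (or above) the given Grad-Cam quantile."""
    col_score = cam.mean(dim=0)                  # one score per taxon column
    cutoff = torch.quantile(col_score, frac)
    keep = col_score >= cutoff if keep_top else col_score < cutoff
    return images * keep.view(1, 1, 1, -1)
```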

Figure 4. 3D learning.

(a) iMic 3D architecture: The ASV frequencies of each snapshot are preprocessed and combined into images as in the static iMic. The images from the different time points are combined into a 3D image, which is the input of a 3-dimensional CNN followed by two fully connected layers that return the predicted phenotype. (b) Performance of 3D learning vs phyLoSTM: The AUCs of 3D-iMic are consistently higher than those of phyLoSTM, the current state-of-the-art for these datasets, on all the tags and datasets we checked (n=5; two-sided t-test, p-value <0.0005). The standard errors among the CVs are also shown. To visualize the three-dimensional gradients (as in Figure 3), we studied a CNN with a time window of 3 (i.e., 3 consecutive images combined using convolution). We projected the Grad-Cam images onto the R, G, and B channels of an image. Each channel represents a different time point, where R = earliest, G = middle, and B = latest time point. (c, d) Images after Grad-Cam: Each pixel represents the value of the back-propagated gradients after the CNN layer. The 2-dimensional image is the combination of the three channels above (i.e., the gradients of the first/second/third time step are in red/green/blue). The left image is for normal-birth subjects in the DiGiulio dataset, and the right image is for pre-term-birth subjects. (e, f) Grad-Cam projection: Projection of the above heatmaps onto the cladogram as in Figure 3. Taxa in bold are important taxa that are consistent with the literature.
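A hedged sketch of the 3D variant in panel (a), written in PyTorch with illustrative layer sizes and names (not the paper's exact architecture): the per-time-point microbiome images are stacked along a depth axis and passed through a Conv3d block followed by two fully connected layers that return the phenotype logit.

```python
# Sketch of a 3D-CNN over stacked per-time-point microbiome images.
import torch
import torch.nn as nn

class IMic3D(nn.Module):
    def __init__(self, n_times: int, n_levels: int, n_taxa: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),   # pool only over the taxonomy image, not time
        )
        flat = 8 * n_times * (n_levels // 2) * (n_taxa // 2)
        self.fc = nn.Sequential(nn.Linear(flat, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, time, taxonomic levels, taxon columns)
        return self.fc(self.conv(x).flatten(start_dim=1))

model = IMic3D(n_times=3, n_levels=8, n_taxa=128)
logit = model(torch.randn(2, 1, 3, 8, 128))
print(logit.shape)  # torch.Size([2, 1])
```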

Table 6. Notations.

Supplemental material

Supplemental Material (MS Word, 3.6 MB)

Data availability statement

All datasets are available at https://github.com/oshritshtossel/iMic/tree/master/Raw_data.