Posterior contraction rate of sparse latent feature models with application to proteomics

Tong Li, Department of Statistics, Columbia University, New York, NY, USA

Tianjian Zhou (corresponding author), Department of Statistics, Colorado State University, Fort Collins, CO, USA

https://orcid.org/0000-0002-7196-4232

Kam-Wah Tsui, Department of Statistics, University of Wisconsin–Madison, Madison, WI, USA

Lin Wei, Research Institute, NorthShore University HealthSystem, Evanston, IL, USA

Yuan Ji, Department of Public Health Sciences, University of Chicago, Chicago, IL, USA

Figures & data

Figure 1. An example of a tree structure $T$, which is a directed graph with random variables at the nodes (marked as circles). The entries of the $k$th column of $Z$, the $z_{jk}$'s, are at the leaves. The lengths of all edges of $T$, the $t_{i}$'s and $\eta_{l}$'s, are marked on the figure. In particular, each $\eta_{l}$ is the length of the edge between a leaf ($z_{jk}$, shaded nodes) and its parent node ($\zeta_{l}$, dotted nodes). The total edge length $S(T)$ is the sum of the lengths of all edges of $T$. In this example, $S(T) = \sum_{1 \leq i \leq 6} t_{i} + (2\eta_{1} + \eta_{2} + 3\eta_{3} + \eta_{4} + 4\eta_{5} + \eta_{6})$. The condition in case (2) of Lemma 3.3 in Section 3 means $\inf_{1 \leq l \leq 6} \eta_{l} \geq \eta_{0}$ for some $\eta_{0} > 0$.
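The total edge length in the Figure 1 example can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that the coefficients 2, 1, 3, 1, 4, 1 in the formula count the leaves attached to each parent node $\zeta_{l}$, and all numeric edge lengths and the function names are hypothetical values chosen for illustration.

```python
from typing import Sequence


def total_edge_length(
    t: Sequence[float], eta: Sequence[float], leaf_counts: Sequence[int]
) -> float:
    """Return S(T): the sum of the t_i plus one edge of length eta_l per leaf.

    leaf_counts[l] is assumed to be the number of leaves whose parent is
    zeta_l, so eta_l is counted that many times.
    """
    if len(eta) != len(leaf_counts):
        raise ValueError("eta and leaf_counts must have the same length")
    leaf_part = sum(c * e for c, e in zip(leaf_counts, eta))
    return float(sum(t) + leaf_part)


def leaf_edges_bounded_below(eta: Sequence[float], eta0: float) -> bool:
    """Check the condition inf_l eta_l >= eta0 for a finite set of edges."""
    return min(eta) >= eta0


# Hypothetical edge lengths; only the multiplicities come from the caption.
t = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
eta = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
leaf_counts = [2, 1, 3, 1, 4, 1]

S = total_edge_length(t, eta, leaf_counts)
bounded = leaf_edges_bounded_below(eta, eta0=0.5)
```

For a finite tree the infimum is simply the minimum of the $\eta_{l}$'s, which is why `min` suffices here.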

Table 1. Simulation results for different combinations of n and p.

Figure 2. Simulation results for different combinations of $n$ and $p$. Plot of $\| Z_{n} Z_{n}^{T} - Z_{n}^{*} Z_{n}^{*T} \| / \epsilon_{n}$ versus $n$, where $\| Z_{n} Z_{n}^{T} - Z_{n}^{*} Z_{n}^{*T} \|$ is the spectral norm of the residual of the similarity matrix, and $\epsilon_{n}$ is the posterior contraction rate defined in Theorem 3.4. The ratio tends toward zero as $n$ increases, in agreement with the theoretical results. The vertical error bars represent one standard error.
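The ratio plotted in Figure 2 could be computed along the following lines. This is an illustrative sketch using NumPy, not the authors' simulation code. The function name is hypothetical, and the rate $\epsilon_{n}$ is passed in as a plain number because its definition (Theorem 3.4) is not reproduced in this section.

```python
import numpy as np


def similarity_error_ratio(Z: np.ndarray, Z_star: np.ndarray, eps_n: float) -> float:
    """Return ||Z Z^T - Z* Z*^T|| / eps_n, with ||.|| the spectral norm.

    Z and Z_star must have the same number of rows; their column counts
    may differ, since both similarity matrices are square in the row count.
    """
    Z = np.asarray(Z, dtype=float)
    Z_star = np.asarray(Z_star, dtype=float)
    if Z.shape[0] != Z_star.shape[0]:
        raise ValueError("Z and Z_star must have the same number of rows")
    residual = Z @ Z.T - Z_star @ Z_star.T
    # ord=2 on a 2-D array gives the largest singular value (spectral norm).
    return float(np.linalg.norm(residual, ord=2)) / eps_n


# Toy example: the estimate misses one feature of the second object.
Z_star = np.array([[1, 0], [0, 1]])
Z_est = np.array([[1, 0], [0, 0]])
ratio = similarity_error_ratio(Z_est, Z_star, eps_n=0.5)
```

Working with $Z Z^{T}$ rather than $Z$ itself makes the comparison invariant to the order of the feature columns.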

Table 2. Simulation results for different s.

Figure 3. The inferred binary feature matrix $\hat{Z}$ for the TCGA RPPA dataset. The dataset consists of 100 patients, with 20 patients for each of the 5 cancer types: BRCA, DLBC, GBM, KIRC and LUAD. A shaded gray rectangle indicates that the corresponding patient $j$ possesses feature $k$, i.e., the corresponding matrix element $\hat{Z}_{jk} = 1$. The columns are in descending order of the number of objects possessing each feature. The rows are reordered for better display.
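The column ordering described in the Figure 3 caption can be reproduced for any binary feature matrix. The sketch below is an illustration with a hypothetical function name and a small made-up matrix, not the TCGA RPPA estimate. The row reordering is not specified in the caption and is therefore not attempted.

```python
import numpy as np


def order_columns_by_prevalence(Z_hat: np.ndarray):
    """Sort columns by the number of rows with a 1, in descending order.

    Returns the reordered matrix and the column permutation used.
    Ties keep their original left-to-right order (stable sort).
    """
    Z_hat = np.asarray(Z_hat)
    counts = Z_hat.sum(axis=0)
    order = np.argsort(-counts, kind="stable")
    return Z_hat[:, order], order


# Made-up 3 x 3 example: feature counts are 2, 0 and 3.
Z_toy = np.array([[1, 0, 1],
                  [0, 0, 1],
                  [1, 0, 1]])
Z_sorted, col_order = order_columns_by_prevalence(Z_toy)
```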

Supplemental material
