Theory and Methods

Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer

Pages 1274-1285 | Received 31 Mar 2021, Accepted 15 Feb 2023, Published online: 12 Apr 2023
 

Abstract

Transfer learning provides a powerful tool for incorporating data from related studies into a target study of interest. In epidemiology and medical studies, the classification of a target disease can borrow information from other related diseases and populations. In this work, we consider transfer learning for high-dimensional Generalized Linear Models (GLMs). We propose a novel algorithm, TransHDGLM, that integrates data from the target study and the source studies. The minimax rate of convergence for estimation is established, and the proposed estimator is shown to be rate-optimal. Statistical inference for the target regression coefficients is also studied. Asymptotic normality for a debiased estimator is established, which can be used for constructing coordinate-wise confidence intervals of the regression coefficients. Numerical studies show significant improvement in estimation and inference accuracy over GLMs that only use the target data. The proposed methods are applied to a real data study concerning the classification of colorectal cancer using gut microbiomes, and are shown to enhance the classification accuracy in comparison to methods that only use the target data. Supplementary materials for this article are available online.
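
As a rough illustration of how asymptotic normality turns into coordinate-wise confidence intervals, the sketch below builds generic Wald-type intervals from a vector of point estimates and their standard errors. It is not the TransHDGLM procedure: the function name `coordinatewise_ci` and its arguments are hypothetical, and the debiased estimates and standard errors are assumed to be supplied by some upstream step that is not reproduced here.

```python
from statistics import NormalDist


def coordinatewise_ci(estimates, std_errors, level=0.95):
    """Generic Wald-type intervals: estimate_j +/- z * se_j per coordinate.

    Hypothetical helper for illustration only. `estimates` and `std_errors`
    are assumed to come from an approximately normal (e.g. debiased)
    estimator computed elsewhere.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    if len(estimates) != len(std_errors):
        raise ValueError("estimates and std_errors must have equal length")
    # Two-sided standard normal quantile, e.g. about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return [(b - z * s, b + z * s) for b, s in zip(estimates, std_errors)]


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    for lower, upper in coordinatewise_ci([0.8, -0.1], [0.2, 0.3]):
        print(f"[{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero suggests the corresponding coefficient is nonzero at the chosen level, provided the normal approximation is adequate for the estimator being used.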

Supplementary Materials

In the supplementary materials, we provide the proofs of theorems and more results for numerical experiments and data applications.

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

This research was supported by NIH grants R01GM123056 and R01GM129781. Sai Li’s research was also supported by NSFC (grant no. 12201630), the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China. Linjun Zhang’s research was also supported in part by NSF grant DMS-2015378. Tony Cai’s research was also supported in part by NSF grant DMS-2015259.
