Research Article

MFAI: A Scalable Bayesian Matrix Factorization Approach to Leveraging Auxiliary Information

Received 06 Mar 2023, Accepted 11 Feb 2024, Published online: 25 Mar 2024
 

Abstract

In many practical situations, matrix factorization methods suffer from poor data quality, such as high data sparsity and low signal-to-noise ratio (SNR). Here, we consider matrix factorization that uses auxiliary information, which is abundantly available in real-world applications, to overcome the challenges caused by poor data quality. Unlike existing methods, which mainly rely on simple linear models to combine auxiliary information with the main data matrix, we propose MFAI, which integrates gradient boosted trees into the probabilistic matrix factorization framework to leverage auxiliary information effectively. MFAI thus inherits several salient features of gradient boosted trees, such as the ability to model nonlinear relationships flexibly and robustness to irrelevant features and missing values in the auxiliary information. The parameters in MFAI can be determined automatically under the empirical Bayes framework, making it adaptive in its use of auxiliary information and immune to overfitting. Moreover, by exploiting variational inference, MFAI is computationally efficient and scalable to large datasets. We demonstrate the advantages of MFAI through comprehensive numerical results from simulation studies and real data analyses. Our approach is implemented in the R package mfair, available at https://github.com/YangLabHKUST/mfair. Supplementary materials for this article are available online.
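
The idea of letting a tree ensemble inform the latent factors can be illustrated with a minimal sketch. This is not the authors' implementation and not the mfair API. It assumes a rank-1 model with a single auxiliary covariate, replaces the variational empirical Bayes inference with simple alternating MAP updates, and uses hand-written regression stumps as a stand-in for gradient boosted trees. All function names (`fit_stump`, `predict_stump`, `boost`, `fit_rank1`), the fixed precision `beta`, and the simulated data are hypothetical.

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares depth-1 regression tree of residual r on a 1-D feature x."""
    best_sse, best = np.inf, (None, r.mean(), r.mean())
    for t in np.unique(x)[:-1]:  # candidate thresholds; right side is never empty
        left = x <= t
        lm, rm = r[left].mean(), r[~left].mean()
        sse = ((r[left] - lm) ** 2).sum() + ((r[~left] - rm) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (t, lm, rm)
    return best

def predict_stump(stump, x):
    t, lm, rm = stump
    if t is None:  # no valid split: constant prediction
        return np.full(x.shape, lm)
    return np.where(x <= t, lm, rm)

def boost(x, z, n_trees=20, lr=0.3):
    """Gradient boosting with squared loss: each stump fits the current residual."""
    F = np.zeros_like(z)
    for _ in range(n_trees):
        F += lr * predict_stump(fit_stump(x, z - F), x)
    return F

def fit_rank1(Y, mask, x, beta=1.0, n_iter=10, seed=0):
    """Alternating MAP updates for Y ~ z w^T with prior z_i ~ N(F(x_i), 1/beta).

    Assumes unit noise precision and a N(0, 1) prior on each w_j (illustrative).
    """
    rng = np.random.default_rng(seed)
    M = mask.astype(float)
    Y0 = np.where(mask, Y, 0.0)          # zero out unobserved entries
    z = rng.normal(size=Y.shape[0])
    for _ in range(n_iter):
        w = (Y0.T @ z) / (M.T @ z**2 + 1.0)          # column factors
        F = boost(x, z)                               # trees predict z from x
        z = (Y0 @ w + beta * F) / (M @ w**2 + beta)   # row factors shrunk toward F
    return z, w, F

# Simulated example: the row factor depends nonlinearly on the covariate x.
rng = np.random.default_rng(1)
n, m = 60, 40
x = rng.uniform(-2, 2, size=n)
z_true = np.sin(2 * x)
w_true = rng.normal(size=m)
Y = np.outer(z_true, w_true) + 0.5 * rng.normal(size=(n, m))
mask = rng.random((n, m)) < 0.3   # roughly 30% of entries observed
mask[0, :] = False                # row 0 has no observed entries at all

z, w, F = fit_rank1(Y, mask, x)
```

In this sketch, a row with no observed entries (row 0 here) receives a factor equal to the tree prediction from its covariate, which is the mechanism by which auxiliary information can compensate for sparsity.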

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

This work is supported in part by Hong Kong Research Grant Council Grants 16307818, 16301419, 16308120, and 16307221; Hong Kong Innovation and Technology Fund Grant PRP/029/19FX; The Hong Kong University of Science and Technology Startup Grants R9405 and Z0428 from the Big Data Institute; and City University of Hong Kong Startup Grant 7200746. The computation tasks for this work were performed using the X-GPU cluster supported by the Research Grants Council Collaborative Research Fund Grant C6021-19EF.
