Abstract
As edge devices become increasingly powerful, data analytics is gradually moving from a centralized to a decentralized regime in which edge computing resources are exploited to process more of the data locally. This regime of analytics is termed Federated Data Analytics (FDA). Despite the recent success stories of FDA, most of the literature focuses exclusively on deep neural networks. In this work, we take a step back to develop an FDA treatment for one of the most fundamental statistical models: linear regression. Our treatment is built upon hierarchical modeling, which allows borrowing strength across multiple groups. To this end, we propose two federated hierarchical model structures that provide a shared representation across devices to facilitate information sharing. Notably, our proposed frameworks are capable of providing uncertainty quantification, variable selection, hypothesis testing, and fast adaptation to new unseen data. We validate our methods on a range of real-life applications, including condition monitoring for aircraft engines. The results show that our FDA treatment for linear models can serve as a competitive benchmark model for the future development of federated algorithms.
Additional information
Notes on contributors
Xubo Yue
Xubo Yue is a PhD candidate in the Department of Industrial & Operations Engineering at the University of Michigan. His research focuses on federated and distributed data analytics. Currently, he is developing federated data analytics methods that rethink how both prescriptive and predictive analytics are achieved within IoT-enabled systems, specifically manufacturing and renewable energy. He has received several best paper awards from the Institute for Operations Research and the Management Sciences (INFORMS), the Institute of Industrial and Systems Engineers (IISE), and other renowned organizations.
Raed Al Kontar
Raed Al Kontar is an assistant professor in the Department of Industrial & Operations Engineering at the University of Michigan and an affiliate of the Michigan Institute for Data Science. Raed’s research focuses on distributed and federated probabilistic modeling. Raed obtained an undergraduate degree in civil & environmental engineering from the American University of Beirut in 2014, and a master’s degree in statistics in 2017 and a PhD degree in industrial & systems engineering in 2018, both from the University of Wisconsin-Madison. Raed received the NSF CAREER award in 2022. His research is currently supported by both NSF and NIH.
Ana María Estrada Gómez
Ana María Estrada Gómez is an assistant professor in the School of Industrial Engineering at Purdue University. She received a B.Sc. in industrial engineering and a B.Sc. in mathematics from Universidad de los Andes in 2013 and 2015, respectively. She also holds an M.Sc. in industrial engineering from Universidad de los Andes (2015) and an M.Sc. in statistics from Georgia Tech (2018). In 2021, she received her PhD in industrial engineering with a specialization in statistics from Georgia Tech. Her research interests lie in developing efficient methodologies and algorithms for modeling, monitoring, and diagnosing complex systems that collect high-dimensional data, using statistics and machine learning tools. The methods that she has developed have been applied in the manufacturing, environmental, and healthcare sectors. She is the recipient of the SPES + Q&P Best Student Paper Award from ASA, the QSR Best Poster Award from INFORMS, and the IISE Doctoral Colloquium Best Poster Award. At Georgia Tech, she was recognized with the Graduate Teaching Fellowship, granted by the Center for Teaching and Learning, and with the Stewart Fellowship, awarded by the School of Industrial and Systems Engineering. She has also been appointed as a Latina Trailblazer in Engineering Fellow by Purdue’s College of Engineering.