Abstract
Predicting the impending failure of hard disk drives (HDDs) is crucial for avoiding the loss of essential data and service downtime. However, most HDD failure prediction approaches rely on the labelled data itself to evaluate the failure rate, so the fact that HDDs deteriorate gradually is neither described nor exploited adequately. Moreover, most work on Self-Monitoring, Analysis and Reporting Technology (SMART) attributes applies simple, traditional machine learning and statistical methods to HDD failure prediction. We therefore propose Dab (hard Drive failure prediction based on deep Auto-coder and Big data learning), a novel two-level prediction model that exploits SMART data for better online HDD failure prediction by constructing two detection sub-models, one for anomalies and one for health degree. By offering better accuracy, performance and prediction earnings together with proactive fault tolerance, Dab reduces the false alarm rate (FAR) and maintenance cost and improves the failure detection rate (FDR), reliability and robustness of large-scale storage systems.
Acknowledgments
The authors would like to thank the anonymous reviewers for their helpful comments on this paper.
Data availability statement
The data that support the findings of this study are openly available in the Backblaze Hard Drive Data and Stats at https://www.backblaze.com/b2/hard-drive-test-data.html (Backblaze, 2020).
Disclosure statement
No potential conflict of interest was reported by the authors.