Applications and Case Studies

A Bayesian Reliability Analysis of Neutron-Induced Errors in High Performance Computing Hardware

Pages 429-440 | Received 01 Sep 2011, Published online: 01 Jul 2013
