Abstract
Complex inference from simulation ensembles used in uncertainty quantification leads to the twin computational challenges of managing large amounts of data and performing CPU-intensive computing. While algorithmic innovations using surrogates, localization, and parallelization can make the problem feasible, one is still left with very large data-handling and compute tasks. The difficulty of dealing with large data is compounded when data warehousing and data mining are intertwined with computationally expensive tasks. We present an approach to this problem that uses a mix of hardware suited to each task in a carefully orchestrated workflow. The computing environment is an integration of a Netezza database appliance and a high-performance computing cluster, and is based on the simple idea of segregating the data-intensive and compute-intensive tasks and assigning the right architecture to each. We lay out the computing model and the new computational scheme adopted to generate probabilistic hazard maps.
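To make the task-segregation idea concrete, the following is a minimal sketch (not from the paper): the DSN name, table and column names, and the `run_simulation` stub are all hypothetical. It routes a set-oriented query to the database appliance via ODBC, while a local process pool stands in for the HPC cluster's job scheduler on the compute-intensive side.

```python
# Sketch of segregating data-intensive and compute-intensive work.
# Hypothetical throughout: the "NZSQL" DSN, the table/column names, and the
# simulation stub. A real deployment would submit jobs to an HPC scheduler
# rather than a local process pool.
from concurrent.futures import ProcessPoolExecutor

import pyodbc


def fetch_sample_points(dsn="NZSQL"):
    """Data-intensive step: let the database appliance do the set-oriented
    filtering and return only the reduced ensemble inputs."""
    conn = pyodbc.connect(dsn=dsn)
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT sample_id, volume, bed_friction "
            "FROM ensemble_inputs WHERE batch = 1"
        )
        return cur.fetchall()
    finally:
        conn.close()


def run_simulation(row):
    """Compute-intensive step (stub): one CPU-heavy simulator run per sample."""
    sample_id, volume, bed_friction = row
    # ... invoke the flow simulator here ...
    return sample_id, 0.0  # placeholder hazard value


if __name__ == "__main__":
    rows = fetch_sample_points()
    # The process pool is a stand-in for dispatching to the cluster.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_simulation, rows))
```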
Acknowledgements
This research was supported in part by NSF grants ACI 1118260 and DMS 1228217. We would like to acknowledge helpful discussions with Taruna Seth on the Netezza implementation.