Research Article

Data-driven uncertainty quantification for systematic coarse-grained models

Pages 348-368 | Received 31 Dec 2019, Accepted 05 Apr 2020, Published online: 06 Nov 2020
 

ABSTRACT

In this work, we present methodologies for quantifying confidence in bottom-up coarse-grained models of molecular and macromolecular systems. Coarse-graining methods have been used extensively in recent decades to extend the length and time scales accessible to simulation. However, the quantification of errors induced by the limited availability of fine-grained data is not yet established. Here, we employ rigorous statistical methods to deduce guarantees for the optimal coarse models obtained via approximations of the multi-body potential of mean force, using the relative entropy, relative entropy rate minimization, and force-matching methods. Specifically, we present and apply statistical approaches, such as the bootstrap and the jackknife, to infer confidence sets from a limited number of samples, i.e., molecular configurations. Moreover, we estimate asymptotic confidence intervals assuming adequate sampling of the phase space. We demonstrate the need for non-asymptotic methods and quantify confidence sets through two applications. The first is a two-scale fast/slow diffusion process projected onto the slow process. With this benchmark example, we establish the methodology for both independent and time-series data. Second, we apply these uncertainty quantification approaches to a polymeric bulk system. We consider an atomistic polyethylene melt as the prototype system for developing coarse-graining tools for macromolecular systems. For this system, we estimate the coarse-grained force field and present confidence levels with respect to the number of available microscopic data.
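To illustrate the two resampling ideas named in the abstract, the following is a minimal sketch of a percentile bootstrap confidence interval and a jackknife standard error for a generic estimator. It is not the implementation used in the article: the function names `bootstrap_ci` and `jackknife_se`, the use of the sample mean as a stand-in for a coarse-grained parameter estimate, and the synthetic Gaussian data are all assumptions made for illustration. The sketch also assumes independent samples; correlated time-series data would require a block-based variant.

```python
import numpy as np


def bootstrap_ci(samples, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval (hypothetical helper).

    Assumes the rows of `samples` are independent draws.
    """
    rng = np.random.default_rng(seed)
    n = len(samples)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample n indices with replacement and re-evaluate the estimator.
        idx = rng.integers(0, n, size=n)
        stats[b] = estimator(samples[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi


def jackknife_se(samples, estimator):
    """Jackknife (leave-one-out) standard error (hypothetical helper)."""
    n = len(samples)
    # Evaluate the estimator on each leave-one-out subsample.
    loo = np.array(
        [estimator(np.delete(samples, i, axis=0)) for i in range(n)]
    )
    return np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))


if __name__ == "__main__":
    # Synthetic data standing in for a scalar quantity computed per
    # molecular configuration (an assumption for this sketch).
    rng = np.random.default_rng(42)
    x = rng.normal(loc=1.0, scale=0.5, size=200)

    lo, hi = bootstrap_ci(x, np.mean)
    se = jackknife_se(x, np.mean)
    print(f"estimate = {x.mean():.4f}")
    print(f"95% bootstrap interval = [{lo:.4f}, {hi:.4f}]")
    print(f"jackknife standard error = {se:.4f}")
```

For the sample mean, the jackknife standard error coincides with the usual estimate (sample standard deviation divided by the square root of the sample size), which gives a convenient sanity check before applying the same routines to a more involved estimator.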

Acknowledgments

The research of M.K. was partially supported by NSF TRIPODS CISE-1934846 and by the Air Force Office of Scientific Research (AFOSR) under the grant FA-9550-18-1-0214. The research of T.J. was partially supported by the National Science Foundation (NSF) under the grant DMS-1515712 and by the Air Force Office of Scientific Research (AFOSR) under the grant FA-9550-18-1-0214. V.H. acknowledges support by project “SimEA”, funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 810660.

E.K. acknowledges support by the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under grant agreement No [52].

Supplementary Material

Supplemental data for this article can be accessed on the publisher’s website.
