References
- Bergen B, Hülsemann F. Hierarchical hybrid grids: A framework for efficient multigrid on high performance architectures. Technical Report, Lehrstuhl für Systemsimulation, Universität Erlangen 5. 2003.
- Bergen B, Hülsemann F, Rüde U. Is 1.7×1010 Unknowns the largest finite element system that can be solved today? Proc. 2005 ACM/IEEE Conf. on Supercomputing, SC '05, ACM, Seattle, WA; 2005, p. 5–5.
- Gmeiner B, Huber M, John L, et al. A quantitative performance study for Stokes solvers at the extreme scale. J Comput Sci. 2016;17:509–521. doi: 10.1016/j.jocs.2016.06.006
- Feichtinger C, Donath S, Köstler H, et al. WaLBerla: HPC software design for computational engineering simulations. J Comput Sci. 2011;2:105–112. doi: 10.1016/j.jocs.2011.01.004
- Köstler H, Rüde U. The CSE software challenge–covering the complete stack. IT-Inform Technol. 2013;55:91–96. doi: 10.1524/itit.2013.0010
- Godenschwager C, Schornbaum F, Bauer M, et al. A framework for hybrid parallel flow simulations with a trillion cells in complex geometries. Proc. Int. Conf. on High Performance Computing, Networking, Storage and Analysis, SC '13, ACM, Denver, CO; 2013, p. 35:1–35:12.
- Schornbaum F, Rüde U. Massively parallel algorithms for the lattice Boltzmann method on nonuniform grids, SIAM J Sci Comput. 38 (2016), C96–C126.
- Bergen B, Hülsemann F. Hierarchical hybrid grids: data structures and core algorithms for multigrid. Numer Linear Algebra Appl. 2004;11:279–291. doi: 10.1002/nla.382
- Waluga C, Wohlmuth B, Rüde U. Mass-corrections for the conservative coupling of flow and transport on collocated meshes. J Comput Phys. 2016;305:319–332. doi: 10.1016/j.jcp.2015.10.044
- Gradl T. Data structures and algorithms for the optimization of hierarchical hybrid multigrid methods [doctoral thesis]. Erlangen (Germany): Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU); 2015.
- Bergen BK. Hierarchical Hybrid Grids: Data structures and core algorithms for efficient finite element simulations on supercomputers. Erlangen (Germany): SCS Publishing House; 2006.
- Bergen B, Gradl T, Rüde U, et al. A massively parallel multigrid method for finite elements. Comput Sci Eng. 2006;8:56–62. doi: 10.1109/MCSE.2006.102
- Gmeiner B, Rüde U, Stengel H, et al. Performance and scalability of hierarchical hybrid multigrid solvers for stokes systems. SIAM J Sci Comput. 2015;37:C143–C168. doi: 10.1137/130941353
- Gmeiner B, Rüde U, Stengel H, et al. Towards textbook efficiency for parallel multigrid. Numer Math Theory Methods Appl. 2015;8:22–46. doi: 10.4208/nmtma.2015.w10si
- Bauer S, Mohr M, Rüde U, et al. A two-scale approach for efficient on-the-fly operator assembly in massively parallel high performance multigrid codes. Appl Numer Math. 2017;122:14–38. doi: 10.1016/j.apnum.2017.07.006
- Bauer S, Drzisga D, Mohr M, et al. A stencil scaling approach for accelerating matrix-free finite element implementations. SIAM J Sci Comput. 2017; CoRR abs/1709.06793.
- Elman HC, Silvester DJ, Wathen AJ. Finite elements and fast iterative solvers: with applications in incompressible fluid dynamics. Oxford: OUP Oxford; 2014.
- Schornbaum F, Rüde U. Extreme-scale block-structured adaptive mesh refinement. SIAM J Sci Comput. 2018;40:C358–C387. doi: 10.1137/17M1128411
- Kohl N, Hötzer J, Schornbaum F, et al. A scalable and extensible checkpointing scheme for massively parallel simulations. Int J High Perform Comput Appl. 2018;0:1–19.
- Huber M, Gmeiner B, Rüde U, et al. Resilience for massively parallel multigrid solvers. SIAM J Sci Comput. 2016;38:S217–S239. doi: 10.1137/15M1026122
- Bastian P, Blatt M, Dedner A, et al. A generic grid interface for parallel and adaptive scientific computing. Part I: abstract framework. Computing. 2008;82:103–119. doi: 10.1007/s00607-008-0003-x
- Bastian P, Blatt M, Dedner A, et al. A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE. Computing. 2008;82:121–138. doi: 10.1007/s00607-008-0004-9
- Dedner A, Klöfkorn R, Nolte M, et al. A generic interface for parallel and adaptive discretization schemes: abstraction principles and the DUNE-FEM module. Computing. 2010;90:165–196. doi: 10.1007/s00607-010-0110-3
- Arndt D, Bangerth W, Davydov D, et al. The deal.II library, version 8.5. J Numer Math. 2017;25:137–146. doi: 10.1515/jnma-2017-0058
- Bangerth W, Hartmann R, Kanschat G. deal.II – a general purpose object oriented finite element library. ACM Trans Math Softw. 2007;33:24/1–24/27. doi: 10.1145/1268776.1268779
- Kirk BS, Peterson JW, Stogner RH, et al. libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations. Eng Comput. 2006;22:237–254. doi: 10.1007/s00366-006-0049-3
- Vogel A, Reiter S, Rupp M, et al. UG 4: a novel flexible software system for simulating PDE based models on high performance computers. Comp Vis Sci. 2013;16:165–179. doi: 10.1007/s00791-014-0232-9
- Cantwell C, Moxey D, Comerford A, et al. Nektar++: an open-source spectral/hp element framework. Comput Phys Commun. 2015;192:205–219. doi: 10.1016/j.cpc.2015.02.008
- Lottes JW, Fischer PF, Kerkemeier SG. Nek5000 web page; 2008. Avialble from: http://nek5000.mcs.anl.gov.
- Burstedde C, Wilcox LC, Ghattas O. p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees. SIAM J Sci Comput. 2011;33:1103–1133. doi: 10.1137/100791634
- Dubey A, Almgren A, Bell J, et al. A survey of high level frameworks in block-structured adaptive mesh refinement packages. J Parallel Distrib Comput. 2014;74:3217–3227. doi: 10.1016/j.jpdc.2014.07.001
- Peplinski A, Fischer PF, Schlatter P. Parallel performance of H-type adaptive mesh refinement for Nek5000. Proc. 2016 Exascale Appl. Softw. Conf., EASC '16, Stockholm, Sweden, ACM, 2016, p. 4:1–4:9.
- Acun B, Gupta A, Jain N, et al. Parallel programming with migratable objects: Charm++ in practice. Proc. Int. Conf. on High Performance Computing, Networking, Storage and Analysis, SC '14, New Orleans, Louisana, IEEE Press, Piscataway, NJ, USA, 2014, pp. 647–658.
- Karypis G, Kumar V. A parallel algorithm for multilevel graph partitioning and sparse matrix ordering. J Parallel Distrib Comput. 1998;48:71–95. doi: 10.1006/jpdc.1997.1403
- Chevalier C, Pellegrini F. PT-Scotch: a tool for efficient parallel graph ordering. Parallel Comput. 2008;34:318–331. doi: 10.1016/j.parco.2007.12.001
- Boman EG, Çatalyürek UV, Chevalier C, et al. The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing: partitioning, ordering and coloring. Sci Program. 2012;20:129–150.
- Bank RE, Sherman AH. A refinement algorithm and dynamic data structure for finite element meshes. Austin: Computer Science Department, University of Texas at Austin; 1980.
- Kuckuk S, Köstler H. Automatic Generation of Massively Parallel Codes from ExaSlang. Computation. 2016;4:27. doi: 10.3390/computation4030027
- Kowarschik M, Rüde U, Weiss C, et al. Cache-aware multigrid methods for solving Poisson's equation in two dimensions. Computing. 2000;64:381–399. doi: 10.1007/s006070070032
- Kowarschik M, Rüde U, Thürey N, et al. Performance optimization of 3D multigrid on hierarchical memory architectures. Int. Workshop Appl. Parallel Comput.; Springer; 2002; Berlin. p. 307–316.
- Lawson CL, Hanson RJ, Kincaid DR, et al. Basic linear algebra subprograms for Fortran usage. ACM Trans Math Softw. 1979;5:308–323. doi: 10.1145/355841.355847
- Balay S, Gropp WD, McInnes LC, et al. Efficient management of parallelism in object oriented numerical software libraries. In: Arge E, Brauste AM, Langtangen HP, editors. Modern software tools in scientific computing. Basel: Birkhäuser Press; 1997. p. 163–202.
- John L, Rüde U, Wohlmuth B, et al. On the analysis of block smoothers for saddle point problems, arXiv preprint arXiv:1612.01333; 2016.
- Brezzi F, Douglas J. Stabilized mixed methods for the Stokes problem. Numer Math. 1988;53:225–235. doi: 10.1007/BF01395886
- Brandt A, Livne OE. Multigrid techniques: 1984 guide with applications to fluid dynamics Vol. 67. Philadelphia (PA): SIAM; 2011.