Original Articles

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

Pages 221-256 | Received 01 Dec 2006, Accepted 01 Oct 2006, Published online: 06 Apr 2009


Read on this site (1)

Luc Buatois, Guillaume Caumon & Bruno Lévy. (2009) Concurrent number cruncher: a GPU implementation of a general sparse linear solver. International Journal of Parallel, Emergent and Distributed Systems 24:3, pages 205-223.

Articles from other publishers (80)

Alessio Netti, Yang Peng, Patrik Omland, Michael Paulitsch, Jorge Parra, Gustavo Espinosa, Udit Agarwal, Abraham Chan & Karthik Pattabiraman. (2023) Mixed precision support in HPC applications: What about reliability?. Journal of Parallel and Distributed Computing 181, pages 104746.
Ruiheng Li, Jinpeng Wang, Wenxin Kong, Nian Yu, Tianyang Li & Chao Wang. (2023) An adaptive hybrid grids finite-element approach for plane wave three-dimensional electromagnetic modeling. Computers & Geosciences 180, pages 105437.
Noel Chalmers, Abhishek Mishra, Damon McDougall & Tim Warburton. (2023) HipBone: A performance-portable graphics processing unit-accelerated C++ version of the NekBone benchmark. The International Journal of High Performance Computing Applications 37:5, pages 560-577.
Barbara G. Simpson, Minjie Zhu, Akiri Seki & Michael Scott. (2023) Challenges in GPU-Accelerated Nonlinear Dynamic Analysis for Structural Systems. Journal of Structural Engineering 149:3.
Christoph Klein & Robert Strzodka. (2023) Preconditioning Sparse Matrices with Alternating and Multiplicative Operator Splittings. SIAM Journal on Scientific Computing 45:1, pages A25-A48.
Jan Ackmann, Peter D. Dueben, Tim Palmer & Piotr K. Smolarkiewicz. (2022) Mixed‐Precision for Linear Solvers in Global Geophysical Flows. Journal of Advances in Modeling Earth Systems 14:9.
Nicholas J. Higham & Theo Mary. (2022) Mixed precision algorithms in numerical linear algebra. Acta Numerica 31, pages 347-414.
Laurence Kedward & Christian B. Allen. (2022) Summary of Investigations into Finite Volume Methods on GPUs.
Richard J. Clancy, Matt Menickelly, Jan Hückelheim, Paul Hovland, Prani Nalluri & Rebecca Gjini. (2022) Computational Science – ICCS 2022, pages 445-459.
Laurence Kedward & Christian B. Allen. (2021) Optimisation of a Finite-Volume Test-bench Code for Highly Parallel Architectures.
Martin Kronbichler. (2021) Efficient High-Order Discretizations for Computational Fluid Dynamics, pages 57-115.
Takeshi Iwashita, Kengo Suzuki & Takeshi Fukaya. (2020) An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.
Shengquan Wang, Chao Wang, Yong Cai & Guangyao Li. (2020) A novel parallel finite element procedure for nonlinear dynamic problems using GPU and mixed-precision algorithm. Engineering Computations 37:6, pages 2193-2211.
Khalid Ahmad, Hari Sundar & Mary Hall. (2019) Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs. ACM Transactions on Architecture and Code Optimization 16:4, pages 1-24.
Hartwig Anzt, Goran Flegar, Thomas Grützmacher & Enrique S Quintana-Ortí. (2019) Toward a modular precision ecosystem for high-performance computing. The International Journal of High Performance Computing Applications 33:6, pages 1069-1078.
Martin Kronbichler & Karl Ljungkvist. (2019) Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors. ACM Transactions on Parallel Computing 6:1, pages 1-32.
Fernando Fernandes dos Santos, Caio Lunardi, Daniel Oliveira, Fabiano Libano & Paolo Rech. (2019) Reliability Evaluation of Mixed-Precision Architectures.
Thomas Grützmacher & Hartwig Anzt. (2019) Euro-Par 2018: Parallel Processing Workshops, pages 434-443.
Andrew Dawson, Peter D. Düben, David A. MacLeod & Tim N. Palmer. (2017) Reliable low precision simulations in land surface models. Climate Dynamics 51:7-8, pages 2657-2666.
James Shee, Evan J. Arthur, Shiwei Zhang, David R. Reichman & Richard A. Friesner. (2018) Phaseless Auxiliary-Field Quantum Monte Carlo on Graphical Processing Units. Journal of Chemical Theory and Computation 14:8, pages 4109-4121.
Nileshchandra K Pikle, Shailesh R Sathe & Arvind Y Vyavhare. (2018) GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review. Sādhanā 43:7.
Michael O Lam & Jeffrey K Hollingsworth. (2016) Fine-grained floating-point precision analysis. The International Journal of High Performance Computing Applications 32:2, pages 231-245.
Shane Fogerty, Siddhartha Bishnu, Yuliana Zamora, Laura Monroe, Steve Poole, Michael Lam, Joe Schoonover & Robert Robey. (2017) Thoughtful Precision in Mini-Apps.
Peter D. Düben, Aneesh Subramanian, Andrew Dawson & T. N. Palmer. (2017) A study of reduced numerical precision to make superparameterization more competitive using a hardware emulator in the OpenIFS model. Journal of Advances in Modeling Earth Systems 9:1, pages 566-584.
Marijn P. Zwier & Wessel W. Wits. (2017) Physics in Design: Real-time Numerical Simulation Integrated into the CAD Environment. Procedia CIRP 60, pages 98-103.
Tobias Thornes, Peter Düben & Tim Palmer. (2017) On the use of scale‐dependent precision in Earth System modelling. Quarterly Journal of the Royal Meteorological Society 143:703, pages 897-908.
Roberto Olivares‐Amaya, Adrian Jinich, Mark A. Watson & Alán Aspuru‐Guzik. (2016) Electronic Structure Calculations on Graphics Processing Units, pages 259-278.
Amir M. Mirzendehdel & Krishnan Suresh. (2015) A Deflated Assembly Free Approach to Large-Scale Implicit Structural Dynamics. Journal of Computational and Nonlinear Dynamics 10:6.
Andrew D. Brown, Rob Mills, Kier James Dugan, Jeff S. Reeve & Steve B. Furber. (2015) Reliable computation with unreliable computers. IET Computers & Digital Techniques 9:4, pages 230-237.
Cássio Sozinho Amorim, Kazuto Ebihara, Ai Yamakage, Yukio Tanaka & Masatoshi Sato. (2015) Majorana braiding dynamics in nanowires. Physical Review B 91:17.
Jean Chamberlain Chedjou & Kyandoghere Kyamakya. (2015) A Universal Concept Based on Cellular Neural Networks for Ultrafast and Flexible Solving of Differential Equations. IEEE Transactions on Neural Networks and Learning Systems 26:4, pages 749-762.
Dimitar Lukarski, Hartwig Anzt, Stanimire Tomov & Jack Dongarra. (2014) Hybrid Multi-elimination ILU Preconditioners on GPUs.
Doron Sabo, Oded Barzelay, Shlomo Weiss & Miriam Furst. (2014) Fast evaluation of a time-domain non-linear cochlear model on GPUs. Journal of Computational Physics 265, pages 97-112.
Krzysztof Banaś, Przemysław Płaszewski & Paweł Macioł. (2014) Numerical integration on GPUs for higher order finite elements. Computers & Mathematics with Applications 67:6, pages 1319-1344.
Rong Tian. (2014) Simulation at Extreme-Scale: Co-Design Thinking and Practices. Archives of Computational Methods in Engineering 21:1, pages 39-58.
Peter Huthwaite. (2014) Accelerated finite element elastodynamic simulations using the GPU. Journal of Computational Physics 257, pages 687-707.
Ronan Mendonça Amorim & Rodrigo Weber dos Santos. (2013) Solving the cardiac bidomain equations using graphics processing units. Journal of Computational Science 4:5, pages 370-376.
Rashid Hassani, Amirreza Fazely, Riaz-Ul-Ahsan Choudhury & Peter Luksch. (2013) Analysis of Sparse Matrix-Vector Multiplication Using Iterative Method in CUDA.
S. P. Vanka. (2013) 2012 Freeman Scholar Lecture: Computational Fluid Dynamics on Graphics Processing Units. Journal of Fluids Engineering 135:6.
Serban Georgescu, Peter Chow & Hiroshi Okuda. (2013) GPU Acceleration for FEM-Based Structural Analysis. Archives of Computational Methods in Engineering 20:2, pages 111-121.
S. Galal, O. Shacham, J. S. Brunhaver, Jing Pu, A. Vassiliev & M. Horowitz. (2013) FPU Generator for Design Space Exploration.
Dominik Göddeke, Dimitri Komatitsch, Markus Geveler, Dirk Ribbrock, Nikola Rajovic, Nikola Puzovic & Alex Ramirez. (2013) Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster. Journal of Computational Physics 237, pages 132-150.
Jianfei Zhang & Lei Zhang. (2013) Efficient CUDA Polynomial Preconditioned Conjugate Gradient Solver for Finite Element Computation of Elasticity Problems. Mathematical Problems in Engineering 2013, pages 1-12.
Doron Sabo, Shlomo Weiss & Miriam Furst. (2013) A Parallel Algorithm for a Physiological Non-linear Model of the Cochlea. Procedia Computer Science 18, pages 682-691.
Jiangyong Ren, ChaoWei Wang, Yingrui Wang & Rong Tian. (2013) High Performance Computing, pages 151-165.
Hartwig Anzt, Maribel Castillo, Juan C. Fernández, Vincent Heuveline, Francisco D. Igual, Rafael Mayo & Enrique S. Quintana-Ortí. (2011) Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors. Computer Science - Research and Development 27:4, pages 299-307.
Pablo Igounet, Ernesto Dufrechou, Martin Pedemonte & Pablo Ezzatti. (2012) A Study on Mixed Precision Techniques for a GPU-based SIP Solver.
Shaojing Li, Ruinan Chang, A. Boag & V. Lomakin. (2012) Fast Electromagnetic Integral-Equation Solvers on Graphics Processing Units. IEEE Antennas and Propagation Magazine 54:5, pages 71-87.
Choon Lih Hoo, Sallehuddin Mohamed Haris & Nik Abdullah Nik Mohamed. (2012) A floating point conversion algorithm for mixed precision computations. Journal of Zhejiang University SCIENCE C 13:9, pages 711-718.
C. Bonati, G. Cossu, M. DʼElia & P. Incardona. (2012) QCD simulations with staggered fermions on GPUs. Computer Physics Communications 183:4, pages 853-863.
Hartwig Anzt, Piotr Luszczek, Jack Dongarra & Vincent Heuveline. (2012) Euro-Par 2012 Parallel Processing, pages 908-919.
Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2012) Applied Parallel and Scientific Computing, pages 237-247.
O. Fluck, C. Vetter, W. Wein, A. Kamen, B. Preim & R. Westermann. (2011) A survey of medical image registration on graphics hardware. Computer Methods and Programs in Biomedicine 104:3, pages e45-e57.
Tetsu Narumi, Tsuyoshi Hamada, Keigo Nitadori, Ryuji Sakamaki & Kenji Yasuoka. (2011) Fast Quasi Double-Precision Method with Single-Precision Hardware to Accelerate Scientific Applications. International Journal of Computational Methods 08:03, pages 561-581.
Shuhan Qi, Xuan Wang & Shaohuai Shi. (2011) Mixed Precision Method for GPU-based FFT.
Hartwig Anzt, Vincent Heuveline, Bjorn Rocker, Maribel Castillo, Juan C. Fernández, Rafael Mayo & Enrique S. Quintana-Orti. (2011) Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear Systems.
Tao Yuan, Zhu Mingfa, Xiao Limin, Ruan Li, Dongyi Guan, Siming Chen & Ding Yi. (2011) Research on the Accuracy of Single Precision on Graphics Processing Unit.
M. Papadrakakis, G. Stavroulakis & A. Karatarakis. (2011) A new era in scientific computing: Domain decomposition methods in hybrid CPU–GPU architectures. Computer Methods in Applied Mechanics and Engineering 200:13-16, pages 1490-1508.
Dominik Goddeke & Robert Strzodka. (2011) Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid. IEEE Transactions on Parallel and Distributed Systems 22:1, pages 22-32.
Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 58-70.
Björn Rocker, Mariana Kolberg & Vincent Heuveline. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 394-407.
Serban Georgescu & Hiroshi Okuda. (2010) Software Automatic Tuning, pages 103-119.
Serban Georgescu & Hiroshi Okuda. (2010) Conjugate gradients on multiple GPUs. International Journal for Numerical Methods in Fluids 64:10-12, pages 1254-1273.
Dimitri Komatitsch, Gordon Erlebacher, Dominik Göddeke & David Michéa. (2010) High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster. Journal of Computational Physics 229:20, pages 7692-7714.
Joseph M. Elble, Nikolaos V. Sahinidis & Panagiotis Vouzis. (2010) GPU computing with Kaczmarz’s and other iterative algorithms for linear systems. Parallel Computing 36:5-6, pages 215-231.
Paweł Macioł, Przemysław Płaszewski & Krzysztof Banaś. (2010) 3D finite element numerical integration on GPUs. Procedia Computer Science 1:1, pages 1093-1100.
Emanouil Atanassov, Aneta Karaivanova & Sofiya Ivanovska. (2010) Large-Scale Scientific Computing, pages 459-466.
R. Lamb, M. Crossley & S. Waller. (2009) A fast two-dimensional floodplain inundation model. Proceedings of the Institution of Civil Engineers - Water Management 162:6, pages 363-370.
Anirudh Maringanti, Viraj Athavale & Sachin B. Patkar. (2009) Acceleration of conjugate gradient method for circuit simulation using CUDA.
Danny van Dyk, Markus Geveler, Sven Mallach, Dirk Ribbrock, Dominik Göddeke & Carsten Gutwenger. (2009) HONEI: A collection of libraries for numerical computations targeting multiple processor architectures. Computer Physics Communications 180:12, pages 2534-2543.
Eddie Wadbro & Martin Berggren. (2009) Megapixel Topology Optimization on a Graphics Processing Unit. SIAM Review 51:4, pages 707-721.
Dominik Goddeke, Sven H.M. Buijssen, Hilmar Wobker & Stefan Turek. (2009) GPU acceleration of an unmodified parallel finite element Navier-Stokes solver.
Dimitri Komatitsch, David Michéa & Gordon Erlebacher. (2009) Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA. Journal of Parallel and Distributed Computing 69:5, pages 451-460.
Dominique Aubert, Mehdi Amini & Romaric David. (2009) Computational Science – ICCS 2009, pages 874-883.
Hugo Leclerc, Jean-Noël Périé, Stéphane Roux & François Hild. (2009) Computer Vision/Computer Graphics Collaboration Techniques, pages 161-171.
Olaf Schenk, Matthias Christen & Helmar Burkhart. (2008) Algorithmic performance studies on graphics processing units. Journal of Parallel and Distributed Computing 68:10, pages 1360-1369.
J. H. van Hateren. (2008) Fast Recursive Filters for Simulating Nonlinear Dynamic Systems. Neural Computation 20:7, pages 1821-1846.
Z.A. Taylor, M. Cheng & S. Ourselin. (2008) High-Speed Nonlinear Finite Element Analysis for Surgical Simulation Using Graphics Processing Units. IEEE Transactions on Medical Imaging 27:5, pages 650-663.
Dominik Göddeke, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick McCormick, Sven H.M. Buijssen, Matthias Grajewski & Stefan Turek. (2007) Exploring weak scalability for FEM calculations on a GPU-enhanced cluster. Parallel Computing 33:10-11, pages 685-699.
John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krüger, Aaron E. Lefohn & Timothy J. Purcell. (2007) A Survey of General‐Purpose Computation on Graphics Hardware. Computer Graphics Forum 26:1, pages 80-113.
