International Journal of Parallel, Emergent and Distributed Systems Volume 22, 2007 - Issue 4: Applied parallel computing. Guest Editors: Ulrich Rüde and Frank Hülsemann

Original Articles

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

Dominik Göddeke Universität Dortmund, Fachbereich Mathematik, Vogelpothsweg 87, 44 227, Dortmund, Germany. Correspondence: [email protected]

Robert Strzodka Stanford University, Max Planck Center, 353 Serra Street, Stanford, CA, 94305, USA

Stefan Turek Universität Dortmund, Fachbereich Mathematik, Vogelpothsweg 87, 44 227, Dortmund, Germany

Pages 221-256 | Received 01 Dec 2006, Accepted 01 Oct 2006, Published online: 06 Apr 2009

https://doi.org/10.1080/17445760601122076

Citations (81)

Read on this site (1)

Luc Buatois, Guillaume Caumon & Bruno Lévy. (2009) Concurrent number cruncher: a GPU implementation of a general sparse linear solver. International Journal of Parallel, Emergent and Distributed Systems 24:3, pages 205-223.

Articles from other publishers (80)

Alessio Netti, Yang Peng, Patrik Omland, Michael Paulitsch, Jorge Parra, Gustavo Espinosa, Udit Agarwal, Abraham Chan & Karthik Pattabiraman. (2023) Mixed precision support in HPC applications: What about reliability?. Journal of Parallel and Distributed Computing 181, pages 104746.

Ruiheng Li, Jinpeng Wang, Wenxin Kong, Nian Yu, Tianyang Li & Chao Wang. (2023) An adaptive hybrid grids finite-element approach for plane wave three-dimensional electromagnetic modeling. Computers & Geosciences 180, pages 105437.

Noel Chalmers, Abhishek Mishra, Damon McDougall & Tim Warburton. (2023) HipBone: A performance-portable graphics processing unit-accelerated C++ version of the NekBone benchmark. The International Journal of High Performance Computing Applications 37:5, pages 560-577.

Barbara G. Simpson, Minjie Zhu, Akiri Seki & Michael Scott. (2023) Challenges in GPU-Accelerated Nonlinear Dynamic Analysis for Structural Systems. Journal of Structural Engineering 149:3.

Christoph Klein & Robert Strzodka. (2023) Preconditioning Sparse Matrices with Alternating and Multiplicative Operator Splittings. SIAM Journal on Scientific Computing 45:1, pages A25-A48.

Jan Ackmann, Peter D. Dueben, Tim Palmer & Piotr K. Smolarkiewicz. (2022) Mixed‐Precision for Linear Solvers in Global Geophysical Flows. Journal of Advances in Modeling Earth Systems 14:9.

Nicholas J. Higham & Theo Mary. (2022) Mixed precision algorithms in numerical linear algebra. Acta Numerica 31, pages 347-414.

Laurence Kedward & Christian B. Allen. (2022) Summary of Investigations into Finite Volume Methods on GPUs.

Richard J. Clancy, Matt Menickelly, Jan Hückelheim, Paul Hovland, Prani Nalluri & Rebecca Gjini. (2022) Computational Science – ICCS 2022, pages 445-459.

Laurence Kedward & Christian B. Allen. (2021) Optimisation of a Finite-Volume Test-bench Code for Highly Parallel Architectures.

Martin Kronbichler. (2021) Efficient High-Order Discretizations for Computational Fluid Dynamics, pages 57-115.

Takeshi Iwashita, Kengo Suzuki & Takeshi Fukaya. (2020) An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.

Shengquan Wang, Chao Wang, Yong Cai & Guangyao Li. (2020) A novel parallel finite element procedure for nonlinear dynamic problems using GPU and mixed-precision algorithm. Engineering Computations 37:6, pages 2193-2211.

Khalid Ahmad, Hari Sundar & Mary Hall. (2019) Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs. ACM Transactions on Architecture and Code Optimization 16:4, pages 1-24.

Hartwig Anzt, Goran Flegar, Thomas Grützmacher & Enrique S Quintana-Ortí. (2019) Toward a modular precision ecosystem for high-performance computing. The International Journal of High Performance Computing Applications 33:6, pages 1069-1078.

Martin Kronbichler & Karl Ljungkvist. (2019) Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors. ACM Transactions on Parallel Computing 6:1, pages 1-32.

Fernando Fernandes dos Santos, Caio Lunardi, Daniel Oliveira, Fabiano Libano & Paolo Rech. (2019) Reliability Evaluation of Mixed-Precision Architectures.

Thomas Grützmacher & Hartwig Anzt. (2019) Euro-Par 2018: Parallel Processing Workshops, pages 434-443.

Andrew Dawson, Peter D. Düben, David A. MacLeod & Tim N. Palmer. (2017) Reliable low precision simulations in land surface models. Climate Dynamics 51:7-8, pages 2657-2666.

James Shee, Evan J. Arthur, Shiwei Zhang, David R. Reichman & Richard A. Friesner. (2018) Phaseless Auxiliary-Field Quantum Monte Carlo on Graphical Processing Units. Journal of Chemical Theory and Computation 14:8, pages 4109-4121.

Nileshchandra K Pikle, Shailesh R Sathe & Arvind Y Vyavhare. (2018) GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review. Sādhanā 43:7.

Michael O Lam & Jeffrey K Hollingsworth. (2016) Fine-grained floating-point precision analysis. The International Journal of High Performance Computing Applications 32:2, pages 231-245.

Shane Fogerty, Siddhartha Bishnu, Yuliana Zamora, Laura Monroe, Steve Poole, Michael Lam, Joe Schoonover & Robert Robey. (2017) Thoughtful Precision in Mini-Apps.

Peter D. Düben, Aneesh Subramanian, Andrew Dawson & T. N. Palmer. (2017) A study of reduced numerical precision to make superparameterization more competitive using a hardware emulator in the OpenIFS model. Journal of Advances in Modeling Earth Systems 9:1, pages 566-584.

Marijn P. Zwier & Wessel W. Wits. (2017) Physics in Design: Real-time Numerical Simulation Integrated into the CAD Environment. Procedia CIRP 60, pages 98-103.

Tobias Thornes, Peter Düben & Tim Palmer. (2017) On the use of scale‐dependent precision in Earth System modelling. Quarterly Journal of the Royal Meteorological Society 143:703, pages 897-908.

Roberto Olivares‐Amaya, Adrian Jinich, Mark A. Watson & Alán Aspuru‐Guzik. (2016) Electronic Structure Calculations on Graphics Processing Units, pages 259-278.

Amir M. Mirzendehdel & Krishnan Suresh. (2015) A Deflated Assembly Free Approach to Large-Scale Implicit Structural Dynamics. Journal of Computational and Nonlinear Dynamics 10:6.

Andrew D. Brown, Rob Mills, Kier James Dugan, Jeff S. Reeve & Steve B. Furber. (2015) Reliable computation with unreliable computers. IET Computers & Digital Techniques 9:4, pages 230-237.

Cássio Sozinho Amorim, Kazuto Ebihara, Ai Yamakage, Yukio Tanaka & Masatoshi Sato. (2015) Majorana braiding dynamics in nanowires. Physical Review B 91:17.

Jean Chamberlain Chedjou & Kyandoghere Kyamakya. (2015) A Universal Concept Based on Cellular Neural Networks for Ultrafast and Flexible Solving of Differential Equations. IEEE Transactions on Neural Networks and Learning Systems 26:4, pages 749-762.

Dimitar Lukarski, Hartwig Anzt, Stanimire Tomov & Jack Dongarra. (2014) Hybrid Multi-elimination ILU Preconditioners on GPUs.

Doron Sabo, Oded Barzelay, Shlomo Weiss & Miriam Furst. (2014) Fast evaluation of a time-domain non-linear cochlear model on GPUs. Journal of Computational Physics 265, pages 97-112.

Krzysztof Banaś, Przemysław Płaszewski & Paweł Macioł. (2014) Numerical integration on GPUs for higher order finite elements. Computers & Mathematics with Applications 67:6, pages 1319-1344.

Rong Tian. (2014) Simulation at Extreme-Scale: Co-Design Thinking and Practices. Archives of Computational Methods in Engineering 21:1, pages 39-58.

Peter Huthwaite. (2014) Accelerated finite element elastodynamic simulations using the GPU. Journal of Computational Physics 257, pages 687-707.

Ronan Mendonça Amorim & Rodrigo Weber dos Santos. (2013) Solving the cardiac bidomain equations using graphics processing units. Journal of Computational Science 4:5, pages 370-376.

Rashid Hassani, Amirreza Fazely, Riaz-Ul-Ahsan Choudhury & Peter Luksch. (2013) Analysis of Sparse Matrix-Vector Multiplication Using Iterative Method in CUDA.

S. P. Vanka. (2013) 2012 Freeman Scholar Lecture: Computational Fluid Dynamics on Graphics Processing Units. Journal of Fluids Engineering 135:6.

Serban Georgescu, Peter Chow & Hiroshi Okuda. (2013) GPU Acceleration for FEM-Based Structural Analysis. Archives of Computational Methods in Engineering 20:2, pages 111-121.

S. Galal, O. Shacham, J. S. Brunhaver, Jing Pu, A. Vassiliev & M. Horowitz. (2013) FPU Generator for Design Space Exploration.

Dominik Göddeke, Dimitri Komatitsch, Markus Geveler, Dirk Ribbrock, Nikola Rajovic, Nikola Puzovic & Alex Ramirez. (2013) Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster. Journal of Computational Physics 237, pages 132-150.

Jianfei Zhang & Lei Zhang. (2013) Efficient CUDA Polynomial Preconditioned Conjugate Gradient Solver for Finite Element Computation of Elasticity Problems. Mathematical Problems in Engineering 2013, pages 1-12.

Doron Sabo, Shlomo Weiss & Miriam Furst. (2013) A Parallel Algorithm for a Physiological Non-linear Model of the Cochlea. Procedia Computer Science 18, pages 682-691.

Jiangyong Ren, ChaoWei Wang, Yingrui Wang & Rong Tian. (2013) High Performance Computing, pages 151-165.

Hartwig Anzt, Maribel Castillo, Juan C. Fernández, Vincent Heuveline, Francisco D. Igual, Rafael Mayo & Enrique S. Quintana-Ortí. (2011) Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors. Computer Science - Research and Development 27:4, pages 299-307.

Pablo Igounet, Ernesto Dufrechou, Martin Pedemonte & Pablo Ezzatti. (2012) A Study on Mixed Precision Techniques for a GPU-based SIP Solver.

Shaojing Li, Ruinan Chang, A. Boag & V. Lomakin. (2012) Fast Electromagnetic Integral-Equation Solvers on Graphics Processing Units. IEEE Antennas and Propagation Magazine 54:5, pages 71-87.

Choon Lih Hoo, Sallehuddin Mohamed Haris & Nik Abdullah Nik Mohamed. (2012) A floating point conversion algorithm for mixed precision computations. Journal of Zhejiang University SCIENCE C 13:9, pages 711-718.

C. Bonati, G. Cossu, M. DʼElia & P. Incardona. (2012) QCD simulations with staggered fermions on GPUs. Computer Physics Communications 183:4, pages 853-863.

Hartwig Anzt, Piotr Luszczek, Jack Dongarra & Vincent Heuveline. (2012) Euro-Par 2012 Parallel Processing, pages 908-919.

Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2012) Applied Parallel and Scientific Computing, pages 237-247.

O. Fluck, C. Vetter, W. Wein, A. Kamen, B. Preim & R. Westermann. (2011) A survey of medical image registration on graphics hardware. Computer Methods and Programs in Biomedicine 104:3, pages e45-e57.

Tetsu Narumi, Tsuyoshi Hamada, Keigo Nitadori, Ryuji Sakamaki & Kenji Yasuoka. (2011) Fast Quasi Double-Precision Method with Single-Precision Hardware to Accelerate Scientific Applications. International Journal of Computational Methods 08:03, pages 561-581.

Shuhan Qi, Xuan Wang & Shaohuai Shi. (2011) Mixed Precision Method for GPU-based FFT.

Hartwig Anzt, Vincent Heuveline, Björn Rocker, Maribel Castillo, Juan C. Fernández, Rafael Mayo & Enrique S. Quintana-Ortí. (2011) Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear Systems.

Tao Yuan, Zhu Mingfa, Xiao Limin, Ruan Li, Dongyi Guan, Siming Chen & Ding Yi. (2011) Research on the Accuracy of Single Precision on Graphics Processing Unit.

M. Papadrakakis, G. Stavroulakis & A. Karatarakis. (2011) A new era in scientific computing: Domain decomposition methods in hybrid CPU–GPU architectures. Computer Methods in Applied Mechanics and Engineering 200:13-16, pages 1490-1508.

Dominik Göddeke & Robert Strzodka. (2011) Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid. IEEE Transactions on Parallel and Distributed Systems 22:1, pages 22-32.

Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 58-70.

Björn Rocker, Mariana Kolberg & Vincent Heuveline. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 394-407.

Serban Georgescu & Hiroshi Okuda. (2010) Software Automatic Tuning, pages 103-119.

Serban Georgescu & Hiroshi Okuda. (2010) Conjugate gradients on multiple GPUs. International Journal for Numerical Methods in Fluids 64:10-12, pages 1254-1273.

Dimitri Komatitsch, Gordon Erlebacher, Dominik Göddeke & David Michéa. (2010) High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster. Journal of Computational Physics 229:20, pages 7692-7714.

Joseph M. Elble, Nikolaos V. Sahinidis & Panagiotis Vouzis. (2010) GPU computing with Kaczmarz’s and other iterative algorithms for linear systems. Parallel Computing 36:5-6, pages 215-231.

Paweł Macioł, Przemysław Płaszewski & Krzysztof Banaś. (2010) 3D finite element numerical integration on GPUs. Procedia Computer Science 1:1, pages 1093-1100.

Emanouil Atanassov, Aneta Karaivanova & Sofiya Ivanovska. (2010) Large-Scale Scientific Computing, pages 459-466.

R. Lamb, M. Crossley & S. Waller. (2009) A fast two-dimensional floodplain inundation model. Proceedings of the Institution of Civil Engineers - Water Management 162:6, pages 363-370.

Anirudh Maringanti, Viraj Athavale & Sachin B. Patkar. (2009) Acceleration of conjugate gradient method for circuit simulation using CUDA.

Danny van Dyk, Markus Geveler, Sven Mallach, Dirk Ribbrock, Dominik Göddeke & Carsten Gutwenger. (2009) HONEI: A collection of libraries for numerical computations targeting multiple processor architectures. Computer Physics Communications 180:12, pages 2534-2543.

Eddie Wadbro & Martin Berggren. (2009) Megapixel Topology Optimization on a Graphics Processing Unit. SIAM Review 51:4, pages 707-721.

Dominik Göddeke, Sven H.M. Buijssen, Hilmar Wobker & Stefan Turek. (2009) GPU acceleration of an unmodified parallel finite element Navier-Stokes solver.

Dimitri Komatitsch, David Michéa & Gordon Erlebacher. (2009) Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA. Journal of Parallel and Distributed Computing 69:5, pages 451-460.

Dominique Aubert, Mehdi Amini & Romaric David. (2009) Computational Science – ICCS 2009, pages 874-883.

Hugo Leclerc, Jean-Noël Périé, Stéphane Roux & François Hild. (2009) Computer Vision/Computer Graphics Collaboration Techniques, pages 161-171.

Olaf Schenk, Matthias Christen & Helmar Burkhart. (2008) Algorithmic performance studies on graphics processing units. Journal of Parallel and Distributed Computing 68:10, pages 1360-1369.

J. H. van Hateren. (2008) Fast Recursive Filters for Simulating Nonlinear Dynamic Systems. Neural Computation 20:7, pages 1821-1846.

Z.A. Taylor, M. Cheng & S. Ourselin. (2008) High-Speed Nonlinear Finite Element Analysis for Surgical Simulation Using Graphics Processing Units. IEEE Transactions on Medical Imaging 27:5, pages 650-663.

Dominik Göddeke, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick McCormick, Sven H.M. Buijssen, Matthias Grajewski & Stefan Turek. (2007) Exploring weak scalability for FEM calculations on a GPU-enhanced cluster. Parallel Computing 33:10-11, pages 685-699.

John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krüger, Aaron E. Lefohn & Timothy J. Purcell. (2007) A Survey of General‐Purpose Computation on Graphics Hardware. Computer Graphics Forum 26:1, pages 80-113.
