Alessio Netti, Yang Peng, Patrik Omland, Michael Paulitsch, Jorge Parra, Gustavo Espinosa, Udit Agarwal, Abraham Chan & Karthik Pattabiraman. (2023) Mixed precision support in HPC applications: What about reliability?. Journal of Parallel and Distributed Computing 181, pages 104746.
Ruiheng Li, Jinpeng Wang, Wenxin Kong, Nian Yu, Tianyang Li & Chao Wang. (2023) An adaptive hybrid grids finite-element approach for plane wave three-dimensional electromagnetic modeling. Computers & Geosciences 180, pages 105437.
Noel Chalmers, Abhishek Mishra, Damon McDougall & Tim Warburton. (2023) HipBone: A performance-portable graphics processing unit-accelerated C++ version of the NekBone benchmark. The International Journal of High Performance Computing Applications 37:5, pages 560-577.
Barbara G. Simpson, Minjie Zhu, Akiri Seki & Michael Scott. (2023) Challenges in GPU-Accelerated Nonlinear Dynamic Analysis for Structural Systems. Journal of Structural Engineering 149:3.
Christoph Klein & Robert Strzodka. (2023) Preconditioning Sparse Matrices with Alternating and Multiplicative Operator Splittings. SIAM Journal on Scientific Computing 45:1, pages A25-A48.
Jan Ackmann, Peter D. Dueben, Tim Palmer & Piotr K. Smolarkiewicz. (2022) Mixed‐Precision for Linear Solvers in Global Geophysical Flows. Journal of Advances in Modeling Earth Systems 14:9.
Nicholas J. Higham & Theo Mary. (2022) Mixed precision algorithms in numerical linear algebra. Acta Numerica 31, pages 347-414.
Laurence Kedward & Christian B. Allen. (2022) Summary of Investigations into Finite Volume Methods on GPUs.
Richard J. Clancy, Matt Menickelly, Jan Hückelheim, Paul Hovland, Prani Nalluri & Rebecca Gjini. (2022) Computational Science – ICCS 2022, pages 445-459.
Laurence Kedward & Christian B. Allen. (2021) Optimisation of a Finite-Volume Test-bench Code for Highly Parallel Architectures.
Martin Kronbichler. (2021) Efficient High-Order Discretizations for Computational Fluid Dynamics, pages 57-115.
Takeshi Iwashita, Kengo Suzuki & Takeshi Fukaya. (2020) An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.
Shengquan Wang, Chao Wang, Yong Cai & Guangyao Li. (2020) A novel parallel finite element procedure for nonlinear dynamic problems using GPU and mixed-precision algorithm. Engineering Computations 37:6, pages 2193-2211.
Khalid Ahmad, Hari Sundar & Mary Hall. (2019) Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs. ACM Transactions on Architecture and Code Optimization 16:4, pages 1-24.
Hartwig Anzt, Goran Flegar, Thomas Grützmacher & Enrique S Quintana-Ortí. (2019) Toward a modular precision ecosystem for high-performance computing. The International Journal of High Performance Computing Applications 33:6, pages 1069-1078.
Martin Kronbichler & Karl Ljungkvist. (2019) Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors. ACM Transactions on Parallel Computing 6:1, pages 1-32.
Fernando Fernandes dos Santos, Caio Lunardi, Daniel Oliveira, Fabiano Libano & Paolo Rech. (2019) Reliability Evaluation of Mixed-Precision Architectures.
Thomas Grützmacher & Hartwig Anzt. (2019) Euro-Par 2018: Parallel Processing Workshops, pages 434-443.
Andrew Dawson, Peter D. Düben, David A. MacLeod & Tim N. Palmer. (2017) Reliable low precision simulations in land surface models. Climate Dynamics 51:7-8, pages 2657-2666.
James Shee, Evan J. Arthur, Shiwei Zhang, David R. Reichman & Richard A. Friesner. (2018) Phaseless Auxiliary-Field Quantum Monte Carlo on Graphical Processing Units. Journal of Chemical Theory and Computation 14:8, pages 4109-4121.
Nileshchandra K Pikle, Shailesh R Sathe & Arvind Y Vyavhare. (2018) GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review. Sādhanā 43:7.
Michael O Lam & Jeffrey K Hollingsworth. (2016) Fine-grained floating-point precision analysis. The International Journal of High Performance Computing Applications 32:2, pages 231-245.
Shane Fogerty, Siddhartha Bishnu, Yuliana Zamora, Laura Monroe, Steve Poole, Michael Lam, Joe Schoonover & Robert Robey. (2017) Thoughtful Precision in Mini-Apps.
Peter D. Düben, Aneesh Subramanian, Andrew Dawson & T. N. Palmer. (2017) A study of reduced numerical precision to make superparameterization more competitive using a hardware emulator in the OpenIFS model. Journal of Advances in Modeling Earth Systems 9:1, pages 566-584.
Marijn P. Zwier & Wessel W. Wits. (2017) Physics in Design: Real-time Numerical Simulation Integrated into the CAD Environment. Procedia CIRP 60, pages 98-103.
Tobias Thornes, Peter Düben & Tim Palmer. (2017) On the use of scale‐dependent precision in Earth System modelling. Quarterly Journal of the Royal Meteorological Society 143:703, pages 897-908.
Roberto Olivares‐Amaya, Adrian Jinich, Mark A. Watson & Alán Aspuru‐Guzik. (2016) Electronic Structure Calculations on Graphics Processing Units, pages 259-278.
Amir M. Mirzendehdel & Krishnan Suresh. (2015) A Deflated Assembly Free Approach to Large-Scale Implicit Structural Dynamics. Journal of Computational and Nonlinear Dynamics 10:6.
Andrew D. Brown, Rob Mills, Kier James Dugan, Jeff S. Reeve & Steve B. Furber. (2015) Reliable computation with unreliable computers. IET Computers & Digital Techniques 9:4, pages 230-237.
Cássio Sozinho Amorim, Kazuto Ebihara, Ai Yamakage, Yukio Tanaka & Masatoshi Sato. (2015) Majorana braiding dynamics in nanowires. Physical Review B 91:17.
Jean Chamberlain Chedjou & Kyandoghere Kyamakya. (2015) A Universal Concept Based on Cellular Neural Networks for Ultrafast and Flexible Solving of Differential Equations. IEEE Transactions on Neural Networks and Learning Systems 26:4, pages 749-762.
Dimitar Lukarski, Hartwig Anzt, Stanimire Tomov & Jack Dongarra. (2014) Hybrid Multi-elimination ILU Preconditioners on GPUs.
Doron Sabo, Oded Barzelay, Shlomo Weiss & Miriam Furst. (2014) Fast evaluation of a time-domain non-linear cochlear model on GPUs. Journal of Computational Physics 265, pages 97-112.
Krzysztof Banaś, Przemysław Płaszewski & Paweł Macioł. (2014) Numerical integration on GPUs for higher order finite elements. Computers & Mathematics with Applications 67:6, pages 1319-1344.
Rong Tian. (2014) Simulation at Extreme-Scale: Co-Design Thinking and Practices. Archives of Computational Methods in Engineering 21:1, pages 39-58.
Peter Huthwaite. (2014) Accelerated finite element elastodynamic simulations using the GPU. Journal of Computational Physics 257, pages 687-707.
Ronan Mendonça Amorim & Rodrigo Weber dos Santos. (2013) Solving the cardiac bidomain equations using graphics processing units. Journal of Computational Science 4:5, pages 370-376.
Rashid Hassani, Amirreza Fazely, Riaz-Ul-Ahsan Choudhury & Peter Luksch. (2013) Analysis of Sparse Matrix-Vector Multiplication Using Iterative Method in CUDA.
S. P. Vanka. (2013) 2012 Freeman Scholar Lecture: Computational Fluid Dynamics on Graphics Processing Units. Journal of Fluids Engineering 135:6.
Serban Georgescu, Peter Chow & Hiroshi Okuda. (2013) GPU Acceleration for FEM-Based Structural Analysis. Archives of Computational Methods in Engineering 20:2, pages 111-121.
S. Galal, O. Shacham, J. S. Brunhaver, Jing Pu, A. Vassiliev & M. Horowitz. (2013) FPU Generator for Design Space Exploration.
Dominik Göddeke, Dimitri Komatitsch, Markus Geveler, Dirk Ribbrock, Nikola Rajovic, Nikola Puzovic & Alex Ramirez. (2013) Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster. Journal of Computational Physics 237, pages 132-150.
Jianfei Zhang & Lei Zhang. (2013) Efficient CUDA Polynomial Preconditioned Conjugate Gradient Solver for Finite Element Computation of Elasticity Problems. Mathematical Problems in Engineering 2013, pages 1-12.
Doron Sabo, Shlomo Weiss & Miriam Furst. (2013) A Parallel Algorithm for a Physiological Non-linear Model of the Cochlea. Procedia Computer Science 18, pages 682-691.
Jiangyong Ren, ChaoWei Wang, Yingrui Wang & Rong Tian. (2013) High Performance Computing, pages 151-165.
Hartwig Anzt, Maribel Castillo, Juan C. Fernández, Vincent Heuveline, Francisco D. Igual, Rafael Mayo & Enrique S. Quintana-Ortí. (2011) Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors. Computer Science - Research and Development 27:4, pages 299-307.
Pablo Igounet, Ernesto Dufrechou, Martin Pedemonte & Pablo Ezzatti. (2012) A Study on Mixed Precision Techniques for a GPU-based SIP Solver.
Shaojing Li, Ruinan Chang, A. Boag & V. Lomakin. (2012) Fast Electromagnetic Integral-Equation Solvers on Graphics Processing Units. IEEE Antennas and Propagation Magazine 54:5, pages 71-87.
Choon Lih Hoo, Sallehuddin Mohamed Haris & Nik Abdullah Nik Mohamed. (2012) A floating point conversion algorithm for mixed precision computations. Journal of Zhejiang University SCIENCE C 13:9, pages 711-718.
C. Bonati, G. Cossu, M. DʼElia & P. Incardona. (2012) QCD simulations with staggered fermions on GPUs. Computer Physics Communications 183:4, pages 853-863.
Hartwig Anzt, Piotr Luszczek, Jack Dongarra & Vincent Heuveline. (2012) Euro-Par 2012 Parallel Processing, pages 908-919.
Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2012) Applied Parallel and Scientific Computing, pages 237-247.
O. Fluck, C. Vetter, W. Wein, A. Kamen, B. Preim & R. Westermann. (2011) A survey of medical image registration on graphics hardware. Computer Methods and Programs in Biomedicine 104:3, pages e45-e57.
Tetsu Narumi, Tsuyoshi Hamada, Keigo Nitadori, Ryuji Sakamaki & Kenji Yasuoka. (2011) Fast Quasi Double-Precision Method with Single-Precision Hardware to Accelerate Scientific Applications. International Journal of Computational Methods 08:03, pages 561-581.
Shuhan Qi, Xuan Wang & Shaohuai Shi. (2011) Mixed Precision Method for GPU-based FFT.
Hartwig Anzt, Vincent Heuveline, Bjorn Rocker, Maribel Castillo, Juan C. Fernández, Rafael Mayo & Enrique S. Quintana-Orti. (2011) Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear Systems.
Tao Yuan, Zhu Mingfa, Xiao Limin, Ruan Li, Dongyi Guan, Siming Chen & Ding Yi. (2011) Research on the Accuracy of Single Precision on Graphics Processing Unit.
M. Papadrakakis, G. Stavroulakis & A. Karatarakis. (2011) A new era in scientific computing: Domain decomposition methods in hybrid CPU–GPU architectures. Computer Methods in Applied Mechanics and Engineering 200:13-16, pages 1490-1508.
Dominik Goddeke & Robert Strzodka. (2011) Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid. IEEE Transactions on Parallel and Distributed Systems 22:1, pages 22-32.
Hartwig Anzt, Vincent Heuveline & Björn Rocker. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 58-70.
Björn Rocker, Mariana Kolberg & Vincent Heuveline. (2011) High Performance Computing for Computational Science – VECPAR 2010, pages 394-407.
Serban Georgescu & Hiroshi Okuda. (2010) Software Automatic Tuning, pages 103-119.
Serban Georgescu & Hiroshi Okuda. (2010) Conjugate gradients on multiple GPUs. International Journal for Numerical Methods in Fluids 64:10-12, pages 1254-1273.
Dimitri Komatitsch, Gordon Erlebacher, Dominik Göddeke & David Michéa. (2010) High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster. Journal of Computational Physics 229:20, pages 7692-7714.
Joseph M. Elble, Nikolaos V. Sahinidis & Panagiotis Vouzis. (2010) GPU computing with Kaczmarz’s and other iterative algorithms for linear systems. Parallel Computing 36:5-6, pages 215-231.
Paweł Macioł, Przemysław Płaszewski & Krzysztof Banaś. (2010) 3D finite element numerical integration on GPUs. Procedia Computer Science 1:1, pages 1093-1100.
Emanouil Atanassov, Aneta Karaivanova & Sofiya Ivanovska. (2010) Large-Scale Scientific Computing, pages 459-466.
R. Lamb, M. Crossley & S. Waller. (2009) A fast two-dimensional floodplain inundation model. Proceedings of the Institution of Civil Engineers - Water Management 162:6, pages 363-370.
Anirudh Maringanti, Viraj Athavale & Sachin B. Patkar. (2009) Acceleration of conjugate gradient method for circuit simulation using CUDA.
Danny van Dyk, Markus Geveler, Sven Mallach, Dirk Ribbrock, Dominik Göddeke & Carsten Gutwenger. (2009) HONEI: A collection of libraries for numerical computations targeting multiple processor architectures. Computer Physics Communications 180:12, pages 2534-2543.
Eddie Wadbro & Martin Berggren. (2009) Megapixel Topology Optimization on a Graphics Processing Unit. SIAM Review 51:4, pages 707-721.
Dominik Goddeke, Sven H.M. Buijssen, Hilmar Wobker & Stefan Turek. (2009) GPU acceleration of an unmodified parallel finite element Navier-Stokes solver.
Dimitri Komatitsch, David Michéa & Gordon Erlebacher. (2009) Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA. Journal of Parallel and Distributed Computing 69:5, pages 451-460.
Dominique Aubert, Mehdi Amini & Romaric David. (2009) Computational Science – ICCS 2009, pages 874-883.
Hugo Leclerc, Jean-Noël Périé, Stéphane Roux & François Hild. (2009) Computer Vision/Computer Graphics Collaboration Techniques, pages 161-171.
Olaf Schenk, Matthias Christen & Helmar Burkhart. (2008) Algorithmic performance studies on graphics processing units. Journal of Parallel and Distributed Computing 68:10, pages 1360-1369.
J. H. van Hateren. (2008) Fast Recursive Filters for Simulating Nonlinear Dynamic Systems. Neural Computation 20:7, pages 1821-1846.
Z.A. Taylor, M. Cheng & S. Ourselin. (2008) High-Speed Nonlinear Finite Element Analysis for Surgical Simulation Using Graphics Processing Units. IEEE Transactions on Medical Imaging 27:5, pages 650-663.
Dominik Göddeke, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick McCormick, Sven H.M. Buijssen, Matthias Grajewski & Stefan Turek. (2007) Exploring weak scalability for FEM calculations on a GPU-enhanced cluster. Parallel Computing 33:10-11, pages 685-699.
John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krüger, Aaron E. Lefohn & Timothy J. Purcell. (2007) A Survey of General‐Purpose Computation on Graphics Hardware. Computer Graphics Forum 26:1, pages 80-113.