International Journal of Parallel, Emergent and Distributed Systems, Volume 22, 2007 - Issue 4: Applied parallel computing. Guest Editors: Ulrich Rüde and Frank Hülsemann

Original Articles

Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

Dominik Göddeke (corresponding author) Universität Dortmund, Fachbereich Mathematik, Vogelpothsweg 87, 44 227, Dortmund, Germany

Robert Strzodka Stanford University, Max Planck Center, 353 Serra Street, Stanford, CA, 94305, USA

Stefan Turek Universität Dortmund, Fachbereich Mathematik, Vogelpothsweg 87, 44 227, Dortmund, Germany

Pages 221-256 | Received 01 Dec 2006, Accepted 01 Oct 2006, Published online: 06 Apr 2009

https://doi.org/10.1080/17445760601122076


References

Wilkes, M. 2000. “The memory gap (keynote)”. In Solving the Memory Wall Problem Workshop. http://www.ece.neu.edu/conf/wall2k/wilkes1.pdf
Ho, C.H., Leong, P., Luk, W., Wilton, S. and Lopez-Buedo, S. 2006. “Virtual embedded blocks: a methodology for evaluating embedded elements in FPGAs”. In IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'06).
Dekker, T.J. 1971. A floating-point technique for extending the available precision. Numerische Mathematik, 18: 224–242.
Knuth, D.E. 1997. The Art of Computer Programming, Volume 2 (3rd ed.): Seminumerical Algorithms, Boston, MA: Addison-Wesley Longman Publishing Co., Inc.
Møller, O. 1965. Quasi double-precision in floating point addition. BIT, 5(1): 37–50.
Hida, Y., Li, X.S. and Bailey, D.H. 2001. “Algorithms for quad-double precision floating point arithmetic”. In Proceedings of the 15th Symposium on Computer Arithmetic, Edited by: Burgess, N. and Ciminiera, L. 155–162.
Li, X.S., Demmel, J.W., Bailey, D.H., Henry, G., Hida, Y., Iskandar, J., Kahan, W., Kang, S.Y., Kapur, A., Martin, M.C., Thompson, B.J., Tung, T. and Yoo, D.J. 2002. Design, implementation and testing of extended and mixed precision BLAS. ACM Transactions on Mathematical Software, 28(2): 152–205.
Priest, D.M. 1991. “Algorithms for arbitrary precision floating point arithmetic”. In 10th IEEE Symposium on Computer Arithmetic, 132–143.
Shewchuk, J.R. 1997. Adaptive precision floating-point arithmetic and fast robust geometric predicates. Discrete & Computational Geometry, 18(3): 305–363, October.
Bailey, D.H., Hida, Y., Jeyabalan, K., Li, X.S. and Thompson, B. 2006. High-precision software directory. http://crd.lbl.gov/~dhbailey/mpdist/
Free Software Foundation, Inc. 2006. GNU Multiple Precision Arithmetic Library, 4.2.1 edition. http://www.swox.com/gmp
Wilkinson, J.H. 1963. Rounding Errors in Algebraic Processes, New York, NY: Dover Publications, Incorporated.
Martin, R.S., Peters, G. and Wilkinson, J.H. 1966. Handbook series linear algebra: iterative refinement of the solution of a positive definite system of equations. Numerische Mathematik, 8: 203–216.
Bowdler, H.J., Martin, R.S., Peters, G. and Wilkinson, J.H. 1966. Handbook series linear algebra: solution of real and complex systems of linear equations. Numerische Mathematik, 8: 217–234.
Demmel, J., Hida, Y., Kahan, W., Li, X.S., Mukherjee, S. and Riedy, E.J. 2006. Error bounds from extra precise iterative refinement. ACM Transactions on Mathematical Software, 32(2): 325–351, June.
Zielke, G. and Drygalla, V. 2003. Genaue Lösung linearer Gleichungssysteme. GAMM-Mitteilungen, 2(1): 7–107.
Turner, K. and Walker, H.F. 1992. Efficient high accuracy solutions with GMRES(m). SIAM Journal on Scientific and Statistical Computing, 13(3): 815–825.
Geddes, K.O. and Zheng, W.W. 2003. “Exploiting fast hardware floating point in high precision computation”. In ISSAC'03: Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, 111–118. New York, NY: ACM Press.
Langou, J., Langou, J., Luszczek, P., Kurzak, J., Buttari, A. and Dongarra, J.J. 2006. “Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)”. In Proceedings of the ACM/IEEE SuperComputing 2006 (SC'06), to appear.
Stewart, G.W. 1973. Introduction to Matrix Computations, San Diego: Academic Press.
Higham, N.J. 2002. Accuracy and Stability of Numerical Algorithms, 2nd ed., Philadelphia, PA: Society for Industrial and Applied Mathematics.
Hartenstein, R. 2001. A decade of reconfigurable computing: a visionary retrospective. Design, Automation and Test in Europe (DATE), March 2001.
Hartenstein, R. 2003. “Data-stream-based computing: models and architectural resources”. In International Conference on Microelectronics, Devices and Materials (MIDEM 2003), October.
Sankaralingam, K., Nagarajan, R., Liu, H., Kim, C., Huh, J., Burger, D., Keckler, S.W. and Moore, C.R. 2003. “Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture”. In ISCA 2003, 422–433.
Guo, Z., Najjar, W., Vahid, F. and Vissers, K. 2004. “A quantitative analysis of the speedup factors of FPGAs over processors”. In ACM/IEEE International Symposium on Field-Programmable Gate Arrays.
Taylor, M.B., Kim, J.S., Miller, J., Wentzlaff, D., Ghodrat, F., Greenwald, B., Hoffman, H., Johnson, P., Lee, J., Lee, W., Ma, A., Saraf, A., Seneski, M., Shnidman, N., Strumpen, V., Frank, M., Amarasinghe, S.P. and Agarwal, A. 2002. The raw microprocessor: a computational fabric for software circuits and general purpose programs. IEEE Micro, 22(2): 25–35.
Suh, J., Kim, E., Crago, S.P., Srinivasan, L. and French, M.C. 2003. “A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels”. In Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA-03), Volume 31 of Computer Architecture News, Edited by: DeGroot, D. 410–421. New York, NY: ACM Press, June.
Strzodka, R. 2004. Hardware Efficient PDE Solvers in Quantized Image Processing. PhD thesis, University of Duisburg-Essen.
Fatahalian, K., Knight, T.J., Houston, M., Erez, M., Horn, D.R., Leem, L., Park, J.Y., Ren, M., Aiken, A., Dally, W.J. and Hanrahan, P. 2006. “Sequoia: programming the memory hierarchy”. In Proceedings of the ACM/IEEE SuperComputing 2006 (SC'06), to appear.
Clearspeed. 2006. CSX600. www.clearspeed.com/downloads/CSX600Processor.pdf
IBM, Sony, Toshiba. Cell BE. http://www.ibm.com/developerworks/power/cell
Mercury. Cell BE. http://www.mc.com/cell/
Williams, S., Shalf, J., Oliker, L., Kamil, S., Husbands, P. and Yelick, K. 2006. “The potential of the cell processor for scientific computing”. In CF '06: Proceedings of the 3rd Conference on Computing Frontiers, 9–20. New York, NY: ACM Press.
AGEIA. PhysX. http://www.ageia.com/products/physx.html
Göddeke, D. and Strzodka, R. Scientific computing on graphics hardware, tutorial at the 6th International Conference on Computational Science (ICCS 2006).
Owens, J.D., Luebke, D., Govindaraju, N., Harris, M.J., Krüger, J., Lefohn, A.E. and Purcell, T. 2005. “A survey of general-purpose computation on graphics hardware”. In Eurographics 2005, State of the Art Reports, 21–51.
GPGPU. General-purpose computation using graphics hardware. http://www.gpgpu.org
Strzodka, R., Doggett, M. and Kolb, A. 2005. Scientific computation for simulations on programmable graphics hardware. Simulation Modelling Practice and Theory, Special Issue: Programmable Graphics Hardware, 13(8): 667–680.
Hillesland, K. and Lastra, A. 2004. “GPU floating-point paranoia”. In Proceedings of GP2.
Daumas, M., Da Graça, G. and Defour, D. 2006. “Caractéristiques arithmétiques des processeurs graphiques”. In Symposium en Architecture de Machines.
Da Graça, G. and Defour, D. 2006. “Implementation of float–float operators on graphics hardware”. In 7th Conference on Real Numbers and Computers, RNC7, 23–32.
Hitz, M.A. and Payne, B.R. 2006. “Implementation of residue number systems on GPUs”. In ACM SIGGRAPH Conference Abstracts and Applications.
Thall, A. 2006. “Extended-precision floating-point numbers for GPU computation”. In ACM SIGGRAPH Conference Abstracts and Applications.
Dale, K., Sheaffer, J.W., Kumar, V.V., Luebke, D.P., Humphreys, G. and Skadron, K. 2006. “Applications of small-scale reconfigurability to graphics processors”. In Proceedings of the International Workshop on Applied Reconfigurable Computing (ARC2006), Berlin: Springer.
Belanovic, P. and Leeser, M. 2002. “A library of parameterized floating-point modules and their use”. In FPL'02: Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications, 657–666. London: Springer-Verlag.
Fang, F., Chen, T. and Rutenbar, R. 2002. Lightweight floating-point arithmetic: case study of inverse discrete cosine transform. EURASIP Journal on Signal Processing, Special Issue on Applied Implementation of DSP and Communication Systems: 879–892.
Gaffar, A.A., Mencer, O., Luk, W. and Cheung, P.Y.K. 2004. “Unifying bit-width optimisation for fixed-point and floating-point designs”. In 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM04), 79–88.
Liang, J., Tessier, R. and Mencer, O. 2003. “Floating point unit generation and evaluation for FPGAs”. In FCCM'03: Proceedings of the 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 185. Washington, DC: IEEE Computer Society.
Dido, J., Geraudie, N., Loiseau, L., Payeur, O., Savaria, Y. and Poirier, D. 2002. “A flexible floating-point format for optimizing data-paths and operators in FPGA based DSPs”. In FPGA'02: Proceedings of the 2002 ACM/SIGDA 10th International Symposium on Field Programmable Gate Arrays, 50–55. New York, NY: ACM Press.
Govindu, G., Zhuo, L., Choi, S. and Prasanna, V. 2004. “Analysis of high-performance floating-point arithmetic on FPGAs”. In 18th International Parallel and Distributed Processing Symposium (IPDPS04), Workshop 3, 149b.
Matousek, R., Tichy, M., Pohl, Z., Kadlec, J., Softley, C. and Coleman, N. 2002. “Logarithmic number systems and floating-point arithmetics on FPGA”. In 12th International Conference on Field Programmable Logic and Applications, 627–636. London: Springer-Verlag.
Haselman, M., Beauchamp, M., Wood, A., Hauck, S., Underwood, K. and Hemmert, K.S. 2005. “A comparison of floating point and logarithmic number systems on FPGAs”. In 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'05), 181–190.
Bondalapati, K. and Prasanna, V.K. 2002. Reconfigurable computing systems. Proceedings of the IEEE.
Compton, K. and Hauck, S. 2002. Reconfigurable computing: a survey of systems and software. ACM Computing Surveys, 34(2): 171–210.
Turek, S. 1999. Efficient Solvers for Incompressible Flow Problems: An Algorithmic and Computational Approach, Berlin: Springer.
Grajewski, M., Köster, M., Kilian, S. and Turek, S. 2005. Numerical analysis and practical aspects of a robust and efficient grid deformation method in the finite element context. Ergebnisberichte des Instituts für Angewandte Mathematik, Nr. 294, FB Mathematik, Universität Dortmund.
Industrial Light & Magic. OpenEXR, implementation of the half data type.
Strzodka, R. and Göddeke, D. 2006. “Pipelined mixed precision algorithms on FPGAs for fast and accurate PDE solvers from low precision components”. In IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006).
Göddeke, D., Strzodka, R. and Turek, S. 2005. “Accelerating double precision FEM simulations with GPUs”. In 18th Symposium Simulations Technique, Edited by: Hülsemann, F., Kowarschik, M. and Rüde, U. 139–144. Erlangen: SCS Publishing House e.V. Frontiers in Simulation, ASIM 2005.
Göddeke, D., Becker, Ch. and Turek, S. 2006. “Integrating GPUs as fast co-processors into the parallel FE package FEAST”. In Proceedings of the 19th Symposium on Simulation Technique, Edited by: Becker, M. and Szczerbicka, H. 277–282.
Altieri, M., Becker, Ch. and Turek, S. 1999. “On the realistic performance of linear algebra components in iterative solvers”. In High Performance Scientific and Engineering Computing: Proceedings of the International FORTWIHR Conference on HPSEC, volume 8 of Lecture Notes in Computational Science and Engineering, Edited by: Bungartz, H.-J., Durst, F. and Zenger, Chr. 3–12. Berlin: Springer.
Kilian, S. 2001. Ein verallgemeinertes Gebietszerlegungs-/Mehrgitterkonzept auf Parallelrechnern. PhD thesis, Universität Dortmund.
Becker, Ch., Kilian, S. and Turek, S. 2002. “Hardware-oriented numerics and concepts for PDE software”. In FUTURE 1095, 1–23. Amsterdam: Elsevier. International Conference on Computational Science ICCS2002.
Strzodka, R. and Göddeke, D. 2006. “Mixed precision methods for convergent iterative schemes”. In Proceedings of the Workshop on Edge Computing Using New Commodity Architectures.
