
Concurrent number cruncher: a GPU implementation of a general sparse linear solver

Pages 205-223 | Received 10 Sep 2007, Accepted 27 Jun 2008, Published online: 02 Jun 2009

References

  • AMD, AMD Core Math Library (ACML), http://www.developer.amd.com/acml.jsp
  • Barrett, R. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd ed., Philadelphia: SIAM.
  • Bolz, J. 2003. Sparse matrix solvers on the GPU: conjugate gradients and multigrid. ACM Trans. Graph. (TOG), 22: 917–924.
  • M. Botsch, D. Bommes, and L. Kobbelt, Efficient linear system solvers for mesh processing, IMA Conference on Mathematics of Surfaces XI, Lecture Notes in Computer Science (LNCS) 3604 (2005), pp. 62–83
  • L. Buatois, G. Caumon, and B. Lévy, Concurrent Number Cruncher: An efficient sparse linear solver on the GPU, High Performance Computation Conference (HPCC'07) Lecture Notes in Computer Science (LNCS), 2007
  • I. Buck, K. Fatahalian, and P. Hanrahan, GPUBench: Evaluating GPU performance for numerical and scientific applications, in Proceedings of the ACM Workshop on General-purpose Computing on Graphics Processors, 2004
  • Buck, I. 2004. Brook for GPUs: stream computing on graphics hardware. ACM Trans. Graph. (TOG), 23: 777–786.
  • E. Cuthill and J. McKee, Reducing the bandwidth of sparse symmetric matrices, in Proceedings of the 24th National Conference (1969), pp. 157–172
  • K. Fatahalian, J. Sugerman, and P. Hanrahan, Understanding the efficiency of GPU algorithms for matrix–matrix multiplication, in HWWS '04: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware (2004), pp. 133–137
  • Fernando, R. and Kilgard, M. 2003. The Cg Tutorial: The Definitive Guide to Programmable Real-time Graphics, Boston: Addison-Wesley Longman Publishing Co., Inc.
  • Floater, M.S. and Hormann, K. 2005. “Surface parameterization: a tutorial and survey”. In Multiresolution in Geometric Modelling, Edited by: Dodgson, N.A., Floater, M.S. and Sabin, M.A. 157–186. Heidelberg: Springer-Verlag.
  • Galoppo, N. 2005. LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware. In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (SC), p. 3.
  • N. Gibbs, W. Poole, and P. Stockmeyer, An algorithm for reducing the bandwidth and profile of a sparse matrix, Technical Report, College of William and Mary, Williamsburg, VA, Department of Mathematics, 1974
  • D. Göddeke, R. Strzodka, and S. Turek, Accelerating double precision FEM simulations with GPUs, Proceedings of the ASIM 2005 – 18th Symposium on Simulation Technique, 2005
  • Turek, S. 2007. Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations. Int. J. Parallel, Emergent Distrib. Syst., 22: 221–256.
  • GPGPU, General-Purpose computation on GPUs, http://www.gpgpu.org
  • Hestenes, M. and Stiefel, E. 1952. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand., 49: 409–436.
  • INTEL, Math Kernel Library (MKL), http://www.intel.com/software/products/mkl
  • INTEL, Math Kernel Library (MKL) – LINPACK SMP benchmark package, http://www.intel.com/cd/software/products/asmo-na/eng/266857.htm
  • J. Jung and D. O'Leary, Cholesky decomposition and linear programming on a GPU, Workshop on Edge Computing Using New Commodity Architectures (EDGE), 2006
  • Krüger, J. and Westermann, R. 2003. Linear algebra operators for GPU implementation of numerical algorithms. ACM Trans. Graph. (TOG), 22: 908–916.
  • B. Lévy, Numerical methods for digital geometry processing, Israel Korea Bi-National Conference, 2005
  • B. Lévy et al., Least squares conformal maps for automatic texture atlas generation, ACM SIGGRAPH'02, San-Antonio, Texas, USA, 2002
  • Mallet, J. 1992. Discrete smooth interpolation (DSI). Comput. Aided Des., 24: 263–270.
  • McCool, M. and DuToit, S. 2004. Metaprogramming GPUs with Sh, Wellesley: AK Peters.
  • Microsoft, Direct3d reference, http://www.msdn.microsoft.com
  • A. Nealen et al., Laplacian mesh optimization, in Proc. ACM GRAPHITE 2006, pp. 381–389
  • NVIDIA CUDA (Compute Unified Device Architecture), (2006), http://www.developer.nvidia.com/object/cuda.html
  • Peercy, M., Segal, M. and Gerstmann, D. 2006. A performance-oriented data-parallel virtual machine for GPUs. ACM SIGGRAPH'06.
  • Rost, R. 2004. OpenGL Shading Language, Reading: Addison-Wesley Professional.
  • M. Segal and K. Akeley, The OpenGL graphics system: A specification, version 2.0 (2004), http://www.opengl.org
  • J. Shewchuk, An introduction to the conjugate gradient method without the agonizing pain, Technical Report, CMU School of Computer Science, (1994), ftp://www.warp.cs.cmu.edu/quake-papers/painless-conjugate-gradient.ps
  • O. Sorkine and D. Cohen-Or, Least-squares meshes, Proc. Shape Model. Int. (2004), pp. 191–199
  • R. Strzodka and D. Göddeke, Pipelined mixed precision algorithms on FPGAs for fast and accurate PDE solvers from low precision components, Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'06) 2006, pp. 259–270
