Algorithmic optimizations of a conjugate gradient solver on shared memory architectures

Pages 345-363 | Received 01 Apr 2005, Accepted 21 Nov 2005, Published online: 31 Jan 2007
