GPU-accelerated preconditioned GMRES method for two-dimensional Maxwell's equations

Pages 2122-2144 | Received 15 Aug 2016, Accepted 01 Nov 2016, Published online: 03 Feb 2017
