Abstract
In this paper, a conjugate residual squared (CRS) method for solving linear systems with non-symmetric coefficient matrices is proposed. Moreover, based on the ideas of Gu et al. [An improved bi-conjugate residual algorithm suitable for distributed parallel computing, Appl. Math. Comput. 186 (2007), pp. 1243–1253], we present an improved conjugate residual squared (ICRS) method designed for distributed parallel environments. By changing the computation sequence of the CRS method, the improved method reduces the two global synchronization points to one. All inner products in an iteration are independent, so the communication time required for the inner products can be overlapped with useful computation. Theoretical analysis shows that the ICRS method has better parallelism and scalability than the CRS method. Finally, numerical experiments show that the ICRS method achieves better parallel performance and higher scalability than the CRS method, with an improvement in communication of up to 47.33%, which agrees with our theoretical analysis.
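As a toy illustration of the synchronization argument only, and not of the CRS or ICRS recurrences themselves, the following Python sketch simulates a block-distributed vector in a single process and counts global reductions. It assumes that each inner product costs one all-reduce unless independent inner products are packed into the same message. The names `ReductionCounter`, `allreduce_sum`, `split`, `two_dots_separate` and `two_dots_fused` are hypothetical and do not come from the paper or from any MPI library.

```python
class ReductionCounter:
    """Hypothetical stand-in for an all-reduce; counts synchronization points."""

    def __init__(self):
        self.calls = 0

    def allreduce_sum(self, partials_per_rank):
        # partials_per_rank holds one list of local partial sums per rank.
        # Each call models one global synchronization point.
        self.calls += 1
        return [sum(column) for column in zip(*partials_per_rank)]


def local_dot(x, y):
    # Inner product of the locally owned slices; needs no communication.
    return sum(a * b for a, b in zip(x, y))


def split(v, nranks):
    # Contiguous block distribution of a vector across simulated ranks.
    n = len(v)
    bounds = [n * r // nranks for r in range(nranks + 1)]
    return [v[bounds[r]:bounds[r + 1]] for r in range(nranks)]


def two_dots_separate(comm, xs, ys, us, ws):
    # Two reductions: each inner product has its own synchronization point.
    a = comm.allreduce_sum([[local_dot(x, y)] for x, y in zip(xs, ys)])[0]
    b = comm.allreduce_sum([[local_dot(u, w)] for u, w in zip(us, ws)])[0]
    return a, b


def two_dots_fused(comm, xs, ys, us, ws):
    # One reduction: independent inner products share a single message.
    partials = [[local_dot(x, y), local_dot(u, w)]
                for x, y, u, w in zip(xs, ys, us, ws)]
    a, b = comm.allreduce_sum(partials)
    return a, b
```

Calling both functions on the same distributed data returns the same pair of inner products, while the counter records two reductions in the first case and one in the second. Fusing is possible only when neither inner product depends on the result of the other, which is the independence property the abstract attributes to the ICRS method.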
Acknowledgements
The authors would like to thank the referees and Editor E.H. Twizell for their helpful and detailed suggestions for revising this manuscript. This research was supported by NSFC (10771030), the Scientific and Technological Key Project of the Chinese Ministry of Education (107098), the PhD Programs Fund of Chinese Universities (20070614001), the Sichuan Province Project for Applied Basic Research (2008JY0052) and the Project for Academic Leader and Group of UESTC. In particular, T.-X. Gu is supported in part by the Natural Science Foundation of China (10571017) and the Foundation of National Key Laboratory of Computational Physics.