Abstract
While new power-efficient computer architectures exhibit spectacular theoretical peak performance, they require specific conditions to operate efficiently, which makes porting complex algorithms a challenge. Here, we report results for the semi-implicit method for pressure-linked equations (SIMPLE) and the pressure implicit with operator splitting (PISO) methods implemented on a graphics processing unit (GPU). We examine the advantages and disadvantages of a full port compared with a partial acceleration of these algorithms on unstructured meshes. We found that the full-port strategy requires adjusting the internal data structures to the new hardware and proposed a convenient format for storing them on GPUs. Our implementation is validated on standard steady and unsteady problems, and its computational efficiency is assessed by comparing its results and run times with those of standard software (OpenFOAM) run on a central processing unit (CPU). The results show that a server-class GPU outperforms a server-class dual-socket multi-core CPU system running essentially the same algorithm by up to a factor of 4.
Acknowledgements
TT and KZ prepared this publication as part of the project of the City of Wrocław entitled “Green Transfer”, an academia-to-business knowledge transfer project co-financed by the European Union under the European Social Fund, under the Operational Programme Human Capital (OP HC): sub-measure 8.2.1. ZK and MM acknowledge support from the Polish Ministry of Science and Higher Education, Grant No. N N519 437939. F. Rikhtegar and V. Kurtcuoglu (ETH Zurich) provided the artery data. This research was supported in part by PL-Grid Infrastructure. A Tesla C2070 GPU donation from Nvidia is also gratefully acknowledged.