Research Article

A parallel strategy to accelerate neighborhood operation for raster data coordinating CPU and GPU

Zhixin Yu, Chen Zhou & Manchun Li
Received 15 Feb 2023, Accepted 16 Oct 2023, Published online: 07 Nov 2023
 

ABSTRACT

This study presents an asynchronous parallel strategy that coordinates the central processing unit (CPU) and graphics processing unit (GPU) to accelerate neighborhood operations (NOs) on raster data. Specifically, we propose a data partitioning method called multi-anchor task queuing and a task scheduling method called bi-direction task scheduling, which allow the CPU and GPU to locate the data blocks they are responsible for rapidly and to handle their tasks concurrently via a bi-direction merge. Moreover, we optimize the organization of the threads distributed among the CPU and GPU. Experimental results show that when a 1.7 GB raster dataset is processed, the speedup ratio achieved by the proposed parallel algorithm reaches 29.63, which is 19% and 18% higher than those of the GPU parallel algorithm and the standard asynchronous parallel algorithm, respectively. Additionally, the load balance index is below 0.085, which is significantly better than the value achieved by a conventional algorithm. Thus, the strategy achieves a higher speedup ratio and a more adaptable load balance, thereby accelerating the NO more efficiently. Further, the impacts of the data volume, computational intensity, organization mode of the GPU threads, and granularity of the GPU stream on the parallel efficiency are evaluated and discussed. We also test the efficiency of four other common NOs with our strategy.
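The abstract does not give the details of bi-direction task scheduling, so the following is only a minimal sketch of the general idea as it might be read: two workers claim data blocks from opposite ends of one queue until they meet. It is a single-threaded simulation, not the authors' implementation, and it uses no real CPU or GPU code. The names `Split`, `simulateBiDirection`, `frontCost`, and `backCost` are hypothetical, and the fixed per-block cost for each worker is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical result type: how many blocks each worker processed and the
// simulated time at which the slower of the two workers finished.
struct Split {
    std::size_t frontCount;  // blocks claimed from the front (e.g. by the CPU)
    std::size_t backCount;   // blocks claimed from the back (e.g. by the GPU)
    double makespan;         // simulated finishing time of the last worker
};

// Simulates two workers draining one queue of blockCount blocks from
// opposite ends. Assumption: every block costs frontCost time units on the
// front worker and backCost time units on the back worker.
Split simulateBiDirection(std::size_t blockCount, double frontCost, double backCost) {
    assert(frontCost > 0.0 && backCost > 0.0);

    std::size_t front = 0;          // next unclaimed block index at the front
    std::size_t back = blockCount;  // one past the last unclaimed block
    double frontTime = 0.0;         // time at which the front worker is free
    double backTime = 0.0;          // time at which the back worker is free
    Split result{0, 0, 0.0};

    // Whichever worker becomes free first claims the next block from its own
    // end; the loop stops when the two cursors meet.
    while (front < back) {
        if (frontTime <= backTime) {
            ++front;
            frontTime += frontCost;
            ++result.frontCount;
        } else {
            --back;
            backTime += backCost;
            ++result.backCount;
        }
    }
    result.makespan = std::max(frontTime, backTime);
    return result;
}
```

For example, `simulateBiDirection(10, 4.0, 1.0)` assigns 2 blocks to the slower front worker and 8 to the faster back worker, and both finish at time 8.0. Because neither end is given a fixed share in advance, the meeting point shifts toward the slower worker, which is one plausible reason such a scheme could keep the load balanced.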

Acknowledgments

The authors sincerely thank the anonymous reviewers and editors for their valuable feedback and constructive comments, which greatly contributed to improving this paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

CRediT authorship contribution statement

Zhixin Yu: Conceptualization, Methodology, Software, Visualization, Writing – original draft.

Chen Zhou: Conceptualization, Data Curation, Supervision, Validation, Writing – review & editing.

Manchun Li: Supervision, Writing – review & editing.

Data availability statement

The computer code and sample dataset that support the findings of this study are available at https://www.doi.org/10.17605/OSF.IO/AG3QC. The code was developed in C++. Running it requires a multi-core CPU and a CUDA-enabled GPU. It is recommended to build and run the code with OpenMP 2.0, CUDA 11.2, and GDAL 3.2.0 or later.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [grant numbers 42271414 and 41901318].
