International Journal of Parallel, Emergent and Distributed Systems Volume 34, 2019 - Issue 2

Original Articles

Supernode transformation on GPGPUs

Yong Chen, Department of Computer Engineering, Santa Clara University, Santa Clara, CA, USA (corresponding author)

Weijia Shang, Department of Computer Engineering, Santa Clara University, Santa Clara, CA, USA

Pages 181-202 | Received 18 Aug 2016, Accepted 12 Feb 2017, Published online: 16 Mar 2017

https://doi.org/10.1080/17445760.2017.1296147

References

Shang W, Fortes J. Time optimal linear schedules for algorithms with uniform dependencies. IEEE Trans Comput. 1991 Jun;40(6):723–742. doi:10.1109/12.90251
Hirschberg D. A linear space algorithm for computing maximal common subsequences. Commun ACM. 1975 Jun;18(6):341–343. doi:10.1145/360825.360861
Irigoin F, Triolet R. Supernode partitioning. Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages; San Diego, CA; 1988. p. 319–329.
Sinharoy B, Szymanski B. Finding optimum wavefront of parallel computation. J Parallel Algor Appl. 1994;2(1–2):5–26. doi:10.1080/10637199408915404
Hodzic E, Shang W. On supernode transformation with minimized total running time. IEEE Trans Parallel Distrib Syst. 1998 May;9(5):417–428. doi:10.1109/71.679213
Hodzic E, Shang W. On time optimal supernode shape. IEEE Trans Parallel Distrib Syst. 2002 Dec;1220–1233. doi:10.1109/TPDS.2002.1158261
Goumas G, Sotiropoulos A, Koziris N. Minimizing completion time for loop tiling with computation and communication overlapping. Proceedings of IEEE Int'l Parallel and Distributed Processing Symposium (IPDPS'01); 2001 Apr; San Francisco, CA; 2001.
Athanasaki M, Sotiropoulos A, Tsoukalas G, et al. Pipelined scheduling of tiled nested loops onto clusters of SMPs using memory mapped network interfaces. Proceedings of the 2002 ACM/IEEE Conference on Supercomputing (SC2002); 2002 Nov; Baltimore, MD; 2002.
Cohen A, Girbal S, Parello D, et al. Facilitating the search for compositions of program transformations. ACM ICS 2005: Proceedings of the 19th Annual International Conference on Supercomputing; New York, NY; 2005. p. 151–160.
Feautrier P. Some efficient solutions to the affine scheduling problem. I. One-dimensional time. Int J Parallel Prog. 1992;21(5):313–347. doi:10.1007/BF01407835
Girbal S, Vasilache N, Bastoul C, et al. Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies. Int J Parallel Prog. 2006;34(3):261–317. doi:10.1007/s10766-006-0012-3
Lim A, Liao S, Lam M. Blocking and array contraction across arbitrarily nested loops using affine partitioning. Proceedings of the Eighth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming; Snowbird, UT; 2001. p. 103–112.
Lim A, Cheong G, Lam M. An affine partitioning algorithm to maximize parallelism and minimize communication. Proceedings of the 13th International Conference on Supercomputing; Rhodes; 1999. p. 228–237.
Ahmed N, Mateev N, Pingali K. Synthesizing transformations for locality enhancement of imperfectly-nested loop nests. Int J Parallel Prog. 2001 Oct;29(5):493–544.
Bondhugula U, Hartono A, Ramanujam J, et al. A practical automatic polyhedral parallelizer and locality optimizer. PLDI 2008: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation; 2008 Jun; Tucson, AZ; 2008.
Ohta H, Saito Y, Kainaga M. Optimal tile size adjustment in compiling general DOACROSS loop nests. In: ACM Press, editor. ICS '95: Proceedings of the 9th International Conference on Supercomputing. Barcelona: ACM Press; 1995. p. 270–279.
Calland PY, Dongarra J, Robert Y. Tiling with limited resources. Proceedings Conference Application Specific Systems, Architectures, and Processors. Zurich: IEEE Computer Society; 1997. p. 229–238.
Boulet P, Dongarra J, Robert Y, et al. Tiling for heterogeneous computing platforms. Knoxville, TN: University of Tennessee; 1997 (Technical Report UT-CS-97-373).
Athanasaki M, Koukis E, Koziris N. Scheduling of tiled iteration spaces onto a cluster with a fixed number of SMP nodes. Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing. Coruna: IEEE; 2004.
Cormen T, Leiserson C, Rivest R, et al. Introduction to algorithms. Cambridge, MA: MIT Press; 2001.
Ukiyama N, Imai H. Parallel multiple alignments and their implementation on CM5. Genome Inform; Yokohama. 1993 Dec;4:103–108.
Yang J, Xu Y, Shang Y. An efficient parallel algorithm for longest common subsequence problem on GPUs. WCE 2010: Proceedings of the World Congress on Engineering; London; 2010. p. 499–504.
Jeffrey A. Complex analysis and applications. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC; 2005 Nov. p. 22–23.
Nvidia CUDA Programming Guide 2.3. Santa Clara, CA: Nvidia Corporation; 2009.