International Journal of Parallel, Emergent and Distributed Systems Volume 29, 2014 - Issue 1

Simple memory machine models for GPUs

Koji Nakano, Department of Information Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi-Hiroshima, 739-8527, Japan

Pages 17-37 | Received 02 May 2012, Accepted 12 Sep 2012, Published online: 27 Nov 2012

https://doi.org/10.1080/17445760.2012.731507

References

A.V. Aho, J.D. Ullman, and J.E. Hopcroft, Data Structures and Algorithms, Addison Wesley, Boston, 1983.
S.G. Akl, Parallel Sorting Algorithms, Academic Press, Orlando, FL, 1985.
K.E. Batcher, Sorting networks and their applications, in Proceedings of the AFIPS Spring Joint Computer Conference, American Federation of Information Processing Societies, Vol. 32, 1968, pp. 307–314.
R.H. Bisseling, Parallel Scientific Computation: A Structured Approach Using BSP and MPI, Oxford University Press, Oxford, 2004.
D. Culler, R. Karp, D. Patterson, A. Sahay, K.E. Schauser, E. Santos, R. Subramonian, and T. von Eicken, LogP: Towards a realistic model of parallel computation, in Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, 1993, pp. 1–12.
M.J. Flynn, Some computer organizations and their effectiveness, IEEE Trans. Comput. C-21 (1972), pp. 948–960.
A. Gibbons and W. Rytter, Efficient Parallel Algorithms, Cambridge University Press, Cambridge, 1988.
A. Gottlieb, R. Grishman, C.P. Kruskal, K.P. McAuliffe, L. Rudolph, and M. Snir, The NYU Ultracomputer – designing an MIMD shared memory parallel computer, IEEE Trans. Comput. C-32 (1983), pp. 175–189.
N.K. Govindaraju, S. Larsen, J. Gray, and D. Manocha, A memory model for scientific algorithms on graphics processors, in Proceedings of the ACM/IEEE Conference on Supercomputing, ACM, 2006, p. 89.
A. Grama, G. Karypis, V. Kumar, and A. Gupta, Introduction to Parallel Computing, Addison Wesley, Boston, 2003.
W.W. Hwu, GPU Computing Gems Emerald Edition, Morgan Kaufmann, MA, 2011.
Y. Ito, K. Ogawa, and K. Nakano, Fast ellipse detection algorithm using Hough transform on the GPU, in Proceedings of International Conference on Networking and Computing, IEEE Computer Society, December, 2011, pp. 313–319.
D.H. Lawrie, Access and alignment of data in an array processor, IEEE Trans. Comput. C-24 (1975), pp. 1145–1155.
F.T. Leighton, Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, Morgan Kaufmann, MA, 1991.
D. Man, K. Uda, Y. Ito, and K. Nakano, A GPU implementation of computing Euclidean distance map with efficient memory access, in Proceedings of International Conference on Networking and Computing, IEEE Computer Society, December, 2011, pp. 68–76.
D. Man, K. Uda, H. Ueyama, Y. Ito, and K. Nakano, Implementations of a parallel algorithm for computing Euclidean distance map in multicore processors and GPUs, Int. J. Netw. Comput. 1 (2011), pp. 260–276.
K. Nakano, Optimal sorting algorithms on bus-connected processor arrays, IEICE Trans. Fundam. E76-A (1993), pp. 2008–2015.
K. Nishida, Y. Ito, and K. Nakano, Accelerating the dynamic programming for the matrix chain product on the GPU, in Proceedings of International Conference on Networking and Computing, IEEE Computer Society, December, 2011, pp. 320–326.
K. Nishida, Y. Ito, and K. Nakano, Accelerating the dynamic programming for the optimal polygon triangulation on the GPU, in Proceedings of International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP, LNCS 7439), Springer, September, 2012, pp. 1–15.
NVIDIA Corporation, NVIDIA CUDA C best practice guide version 3.1 (document can be downloaded from http://developer.nvidia.com/cuda/nvidia-gpu-computing-documentation), 2010.
NVIDIA Corporation, NVIDIA CUDA C programming guide version 4.0 (document can be downloaded from http://developer.nvidia.com/cuda/nvidia-gpu-computing-documentation), 2011.
M.J. Quinn, Parallel Computing: Theory and Practice, McGraw-Hill, New York, 1994.
G. Ruetsch and P. Micikevicius, Optimizing matrix transpose in CUDA, NVIDIA technical report, 2009.
S. Ryoo, C.I. Rodrigues, S.S. Baghsorkhi, S.S. Stone, D.B. Kirk, and W.-M.W. Hwu, Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, in Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, 2008, pp. 73–82.
A. Uchida, Y. Ito, and K. Nakano, Fast and accurate template matching using pixel rearrangement on the GPU, in Proceedings of International Conference on Networking and Computing, IEEE Computer Society, December, 2011, pp. 153–159.
R. Vaidyanathan and J.L. Trahan, Dynamic Reconfiguration: Architectures and Algorithms, Kluwer Academic/Plenum Publishers, New York, 2004.
D.T. Wang, Modern DRAM memory systems: Performance analysis and a high performance, power-constrained DRAM scheduling algorithm, Ph.D. thesis, University of Maryland, USA, 2005.
R.J. Wilson, Introduction to Graph Theory, 3rd ed., Longman, Harlow, Essex, 1985.
Xilinx Inc, Virtex-5 FPGA users guide (document can be downloaded from http://www.xilinx.com/support/), 2009.