International Journal of Parallel, Emergent and Distributed Systems Volume 25, 2010 - Issue 4: Performance evaluation of ubiquitous computing and networked systems

Original Articles

Accurately measuring overhead, communication time and progression of blocking and nonblocking collective operations at massive scale

Torsten Hoefler (corresponding author), Open Systems Laboratory, Indiana University, 501 N. Morton Street, Bloomington, IN, 47404, USA

Timo Schneider, Open Systems Laboratory, Indiana University, 501 N. Morton Street, Bloomington, IN, 47404, USA

Andrew Lumsdaine, Open Systems Laboratory, Indiana University, 501 N. Morton Street, Bloomington, IN, 47404, USA

Pages 241-258 | Received 01 Dec 2008, Accepted 11 Feb 2009, Published online: 09 Jul 2010

https://doi.org/10.1080/17445760902894688

References

Agarwal, S., Garg, R. and Vishnoi, N. 2005. “The impact of noise on the scaling of collectives: A theoretical approach”. In 12th Annual IEEE International Conference on High Performance Computing, Goa, India.
Alam, S.R., Bhatia, N. and Vetter, J.S. 2007. “An exploration of performance attributes for symbolic modeling of emerging processing devices”. In Lecture Notes in Computer Science, Edited by: Perrott, R.H., Chapman, B.M., Subhlok, J., de Mello, R.F. and Yang, L.T. Vol. 4782, 683–694. New York: Springer.
Bönisch, T., Resch, M.M. and Berger, H. 1997. “Benchmark evaluation of the message-passing overhead on modern parallel architectures”. In Parallel Computing: Fundamentals, Applications and New Directions, Proceedings of the Conference ParCo97, 411–418.
Culler, D., Karp, R., Patterson, D., Sahay, A., Schauser, K.E., Santos, E., Subramonian, R. and von Eicken, T. 1993. “LogP: Towards a realistic model of parallel computation”. In Principles and Practice of Parallel Programming, 1–12.
Culler, D., Liu, L.T., Martin, R.P. and Yoshikawa, C. February 1996. LogP performance assessment of fast network interfaces. IEEE Micro, Vol. 16: 35–43.
Gropp, W. and Lusk, E.L. 1999. “Reproducible measurements of MPI performance characteristics”. In Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, 11–18. London, UK: Springer-Verlag.
Hoefler, T. and Lumsdaine, A. 2008. “Message progression in parallel computing – To thread or not to thread?”. In Proceedings of the 2008 IEEE International Conference on Cluster Computing, Tsukuba, Japan: IEEE Computer Society.
Hoefler, T. and Lumsdaine, A. 2008. “Optimizing non-blocking collective operations for InfiniBand”. In Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium (IPDPS). 04
Hoefler, T., Janisch, R. and Rehm, W. 2006. “Parallel scaling of Teter's minimization for ab initio calculations”. In Presented at the Workshop HPC Nano in Conjunction With SC'06.
Hoefler, T., Mehlan, T., Mietke, F. and Rehm, W. 2006. “Fast barrier synchronization for InfiniBand”. In Proceedings, 20th International Parallel and Distributed Processing Symposium IPDPS (CAC 06).
Hoefler, T., Mehlan, T., Mietke, F. and Rehm, W. April 2006. “LogfP – A model for small messages in InfiniBand”. In Proceedings, 20th International Parallel and Distributed Processing Symposium IPDPS (PMEO-PDS 06).
Hoefler, T., Lichei, A. and Rehm, W. 2007. Low-overhead LogGP parameter assessment for modern interconnection networks.
Hoefler, T., Lumsdaine, A. and Rehm, W. 2007. “Implementation and performance analysis of non-blocking collective operations for MPI”. In Proceedings of Supercomputing'07.
Hoefler, T., Mehlan, T., Lumsdaine, A. and Rehm, W. 2007. “Netgauge: A network performance measurement framework”. In Proceedings of the High Performance Computing and Communications, 3rd International Conference, HPCC, Houston, USA, September 26–28, Vol. 4782, 659–671. New York: Springer.
Intel Corporation. 1997. Intel Application Notes – Using the RDTSC Instruction for Performance Monitoring. Technical report, Intel.
Iskra, K., Beckman, P., Yoshii, K. and Coghlan, S. 2006. “The influence of operating systems on the performance of collective operations at extreme scale”. In Proceedings of Cluster Computing, 2006 IEEE International Conference.
Kohno, T., Broido, A. and Claffy, K. 2005. Remote physical device fingerprinting. IEEE Trans. Depend. Secure Comput., 2(2): 93–108.
Mucci, P.J., London, K. and Thurman, J. 1998. The MPIBench report. Technical report, CEWES/ERDC MSRC/PET.
Murdoch, S. 2006. “Hot or not: Revealing hidden services by their clock skew”. In Proceedings of the 13th ACM Conference on Computer and Communications Security, 27–36.
Pallas GmbH. 2000. Pallas MPI Benchmarks – PMB, Part MPI-1. Technical report.
Pjesivac-Grbovic, J. 2007. Open MPI collective operation performance on Thunderbird. Technical Report UT-CS-07-594, The University of Tennessee, Computer Science Department, Knoxville.
Pjesivac-Grbovic, J., Angskun, T., Bosilca, G., Fagg, G.E., Gabriel, E. and Dongarra, J.J. April 2005. “Performance analysis of MPI collective operations”. In Proceedings of the 19th International Parallel and Distributed Processing Symposium, Denver, CO.
Rabenseifner, R. 2000. “Automatic MPI counter profiling”. In 42nd CUG Conference.
Saini, S., Ciotti, R., Gunney, B.T.N., Spelce, T.E., Koniges, A.E., Dossa, D., Adamidis, P.A., Rabenseifner, R., Tiyyagura, S.R., Müller, M. and Fatoohi, R. 2006. “Performance evaluation of supercomputers using HPCC and IMB benchmarks”. In Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, 25–29 April.
Shroff, M. and van de Geijn, R. 1999. CollMark MPI Collective Communication Benchmark. Available at: citeseer.ist.psu.edu/shroff00collmark.html
Vadhiyar, S.S., Fagg, G.E. and Dongarra, J. 2000. “Automatically tuned collective communications”. In Supercomputing '00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM), 3. Washington, DC: IEEE Computer Society.
Worsch, T., Reussner, R. and Augustin, W. 2002. “On benchmarking collective MPI operations”. In Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, 271–279. London: Springer-Verlag.
Yu, W., Buntinas, D., Graham, R.L. and Panda, D.K. 2004. “Efficient and scalable barrier over Quadrics and Myrinet with a new NIC-based collective message passing protocol”. In 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), CD-ROM/Abstracts Proceedings, 26–30 April 2004, Santa Fe, New Mexico, USA.
