Abstract
A major challenge of using volunteer computing (VC) for big data problems is its opportunistic environment. In such a dynamic, unreliable, and heterogeneous environment, many uncertain factors may impair big data processing. This paper explores these factors and their impact, aiming to expose the baseline (unoptimized) performance that the undedicated VC environment can achieve under their impact. Our investigation is four-fold. First, we define a number of impact factors to represent the opportunistic features of the VC environment. Second, we model a distributed hash table-based MapReduce approach to process big data. Third, we propose an ideal computing environment and then inject the impact factors into the running MapReduce computation to quantitatively evaluate how performance is impaired. Finally, we analyze the evaluation results to identify the causes of the impact and predict the potential for optimization. Developers who plan to construct MapReduce frameworks using commodity computers or volunteered cycles on the public internet will benefit from the evaluation results when considering performance requirements.
Disclosure statement
No potential conflict of interest was reported by the authors.
Additional information
Notes on contributors
Wei Li
Dr Wei Li holds a PhD degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, China. He currently works for the School of Engineering & Technology, Central Queensland University, Australia. His research interests include dynamic software architecture, P2P volunteer computing, and multi-agent systems. Dr Wei Li has been a peer reviewer for a number of international journals, including IEEE Transactions on Software Engineering, Elsevier's Journal of Systems and Software, and John Wiley & Sons' Journal of Software Maintenance and Evolution: Research and Practice, and a program committee member of more than 30 international conferences.
William W. Guo
Dr William W. Guo is currently a professor in applied mathematics and computation at Central Queensland University, Australia. His research interests include applied mathematics and computational intelligence, simulation and modelling, data mining, and STEM education.
Michael Li
Dr Michael Li received a PhD degree in 2003 from the University of Newcastle, Australia. Currently he is a senior lecturer with the School of Engineering and Technology, Central Queensland University, Australia. His research interests include optimization algorithms, neural networks, probabilistic modelling, and Markov Chain Monte Carlo simulation.