Evaluating performance variations cross cloud data centres using multiview comparative workload traces analysis

Pages 1582-1608 | Received 22 Jul 2021, Accepted 24 Nov 2021, Published online: 11 Jun 2022
