Research Article

Beyond the Screen With DanceSculpt: A 3D Dancer Reconstruction and Tracking System for Learning Dance

Received 14 Dec 2023, Accepted 23 May 2024, Published online: 19 Jun 2024


  • Anderson, F., Grossman, T., Matejka, J., & Fitzmaurice, G. (2013). YouMove: Enhancing movement training with an augmented reality mirror [Paper presentation]. Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology (pp. 311–320). https://doi.org/10.1145/2501988.2502045
  • Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., & Davis, J. (2005). SCAPE: Shape completion and animation of people [Paper presentation]. ACM SIGGRAPH 2005 Papers (pp. 408–416). https://doi.org/10.1145/1073204.1073207
  • Aristidou, A., Stavrakis, E., Charalambous, P., Chrysanthou, Y., & Loizidou Himona, S. (2015). Folk dance evaluation using Laban movement analysis. Journal on Computing and Cultural Heritage, 8, 1–19. https://doi.org/10.1145/2755566
  • Bangor, A., Kortum, P. T., & Miller, J. T. (2008). An empirical evaluation of the system usability scale. International Journal of Human-Computer Interaction, 24(6), 574–594. https://doi.org/10.1080/10447310802205776
  • Cao, X. (2023). Case study of China’s compulsory education system: AI apps and extracurricular dance learning. International Journal of Human–Computer Interaction. Advance online publication. https://doi.org/10.1080/10447318.2023.2188539
  • Cheng, Z.-Q., Chen, Y., Martin, R. R., Wu, T., & Song, Z. (2018). Parametric modeling of 3D human body shape—A survey. Computers & Graphics, 71(2018), 88–100. https://doi.org/10.1016/j.cag.2017.11.008
  • Choi, J., Massey, K., Hwaryoung Seo, J., & Kicklighter, C. (2021). Balletic VR: Integrating art, science, and technology for dance science education [Paper presentation]. 10th International Conference on Digital and Interactive Arts, Aveiro, Portugal (pp. 1–6). https://doi.org/10.1145/3483529.3483704
  • Doersch, C., & Zisserman, A. (2019). Sim2real transfer learning for 3d human pose estimation: Motion to the rescue. Advances in Neural Information Processing Systems, 32(2019), 12949–12961. https://dl.acm.org/doi/10.5555/3454287.3455447
  • Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale [Paper presentation]. The International Conference on Learning Representations (ICLR) 2021. https://doi.org/10.48550/arXiv.2010.11929
  • Fieraru, M., Zanfir, M., Pirlea, S. C., Olaru, V., & Sminchisescu, C. (2021). Aifit: Automatic 3d human-interpretable feedback models for fitness training [Paper presentation]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2021) (pp. 9919–9928). IEEE/CVF. https://doi.org/10.1109/CVPR46437.2021.00979
  • Goel, S., Pavlakos, G., Rajasegaran, J., Kanazawa, A., & Malik, J. (2023). Humans in 4D: Reconstructing and tracking humans with transformers [Paper presentation]. ICCV (International Conference on Computer Vision) 2023 (pp. 14783–14794). IEEE/CVF. https://doi.org/10.1109/ICCV51070.2023.01358
  • Guest, A. H. (1990). Dance notation. Perspecta, 26, 203–214. https://doi.org/10.2307/1567163
  • Hanna, J. L. (2008). A nonverbal language for imagining and learning: Dance education in K–12 curriculum. Educational Researcher, 37(8), 491–506. https://doi.org/10.3102/0013189X08326032
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition [Paper presentation]. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
  • Hong, J. C., Chen, M. L., & Ye, J. H. (2020). Acceptance of YouTube applied to dance learning. International Journal of Information and Education Technology, 10(1), 7–13. https://doi.org/10.18178/ijiet.2020.10.1.1331
  • Hou, Y. (2022). The collision of digital tools and dance education during the period of COVID-19. In 2022 2nd International Conference on Modern Educational Technology and Social Sciences (ICMETSS 2022) (pp. 844–852). Atlantis Press. https://doi.org/10.2991/978-2-494069-45-9_102
  • Kanazawa, A., Black, M. J., Jacobs, D. W., & Malik, J. (2018). End-to-end recovery of human shape and pose [Paper presentation]. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7122–7131).
  • Kanazawa, A., Zhang, J. Y., Felsen, P., & Malik, J. (2019). Learning 3d human dynamics from video [Paper presentation]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, (CVPR 2019) (pp. 5614–5623). https://doi.org/10.1109/CVPR.2019.00576
  • Kocabas, M., Athanasiou, N., & Black, M. J. (2020). Vibe: Video inference for human body pose and shape estimation [Paper presentation]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, 2020) (pp. 5253–5263). https://doi.org/10.1109/CVPR42600.2020.00530
  • Kocabas, M., Huang, C.-H. P., Hilliges, O., & Black, M. J. (2021). PARE: Part attention regressor for 3D human body estimation [Paper presentation]. Proceedings of the IEEE/CVF international conference on computer vision (pp. 11127–11137).
  • Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2, 83–97. https://doi.org/10.1002/nav.3800020109
  • Laugwitz, B., Held, T., & Schrepp, M. (2008). Construction and evaluation of a user experience questionnaire. In HCI and Usability for Education and Work: 4th Symposium of the Workgroup Human-Computer Interaction and Usability Engineering of the Austrian Computer Society, USAB 2008, Graz, Austria, November 20–21, 2008. Proceedings 4 (pp. 63–76). Springer. https://doi.org/10.1007/978-3-540-89350-9_6
  • Lee, G.-H., & Lee, S.-W. (2021). Uncertainty-aware human mesh recovery from video by learning part-based 3d dynamics [Paper presentation]. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV, 2021) (pp. 12375–12384). https://doi.org/10.1109/ICCV48922.2021.01215
  • Lee, S., & Lee, K. (2023). CheerUp: A real-time ambient visualization of cheerleading pose similarity [Paper presentation]. Companion Proceedings of the 28th International Conference on Intelligent User Interfaces (pp. 72–74). Association for Computing Machinery (ACM). https://doi.org/10.1145/3581754.3584135
  • Lee, S.-H., Lee, D.-W., Jun, K., Lee, W., & Kim, M. S. (2022). Markerless 3D skeleton tracking algorithm by merging multiple inaccurate skeleton data from multiple RGB-D sensors. Sensors (Basel, Switzerland), 22(9), 3155. https://doi.org/10.3390/s22093155
  • Li, R., Yang, S., Ross, D. A., & Kanazawa, A. (2021). AI choreographer: Music conditioned 3D dance generation with AIST++ [Paper presentation]. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2021) (pp. 13401–13412). https://arxiv.org/abs/2101.08779
  • Li, R., Zhao, J., Zhang, Y., Su, M., Ren, Z., Zhang, H., & Li, X. (2023). FineDance: A fine-grained choreography dataset for 3D full body dance generation [Paper presentation]. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023) (pp. 10234–10243). https://doi.org/10.1109/ICCV51070.2023.00939
  • Li, Z., Xu, B., Huang, H., Lu, C., & Guo, Y. (2022). Deep two-stream video inference for human body pose and shape estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022) (pp. 430–439). IEEE Computer Society. https://doi.org/10.1109/WACV51458.2022.00071
  • Liu, J., Saquib, N., Zhu-Tian, C., Kazi, R. H., Wei, L.-Y., Fu, H., & Tai, C.-L. (2022). PoseCoach: A customizable analysis and visualization system for video-based running coaching [Paper presentation]. IEEE Transactions on Visualization and Computer Graphics (pp. 1–15). https://doi.org/10.1109/tvcg.2022.3230855
  • Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., & Black, M. J. (2023). SMPL: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, 2(6), 851–866. https://doi.org/10.1145/2816795.2818013
  • Luo, Z., Golestaneh, S. A., & Kitani, K. M. (2020). 3D human motion estimation via motion compression and refinement [Paper presentation]. Proceedings of the Asian Conference on Computer Vision (ACCV). Springer. https://doi.org/10.1007/978-3-030-69541-5_20
  • Muhammad, Z.-U.-D., Huang, Z., & Khan, R. (2022). A review of 3D human body pose estimation and mesh recovery. Digital Signal Processing, 128(2022), 103628. https://doi.org/10.1016/j.dsp.2022.103628
  • Ng, L. H. X., Tan, J. Y. H., Tan, D. J. H., & Lee, R. K.-W. (2021). Will you dance to the challenge? predicting user participation of TikTok challenges [Paper presentation]. Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 356–360). Association for Computing Machinery (ACM). https://doi.org/10.1145/3487351.3488276
  • Piitulainen, R., Hämäläinen, P., & Mekler, E. D. (2022). Vibing together: Dance experiences in social virtual reality [Paper presentation]. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (pp. 1–18). https://doi.org/10.1145/3491102.3501828
  • Rajasegaran, J., Pavlakos, G., Kanazawa, A., & Malik, J. (2021). Tracking people with 3D representations. Advances in Neural Information Processing Systems (NeurIPS 2021), 34, 23703–23713. https://doi.org/10.48550/arXiv.2111.07868
  • Rajasegaran, J., Pavlakos, G., Kanazawa, A., & Malik, J. (2022). Tracking people by predicting 3D appearance, location and pose [Paper presentation]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2740–2749). IEEE/CVF.
  • Shen, X., Yang, Z., Wang, X., Ma, J., Zhou, C., & Yang, Y. (2023). Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [Paper presentation]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) (pp. 8887–8896). IEEE/CVF. https://doi.org/10.1109/CVPR52729.2023.00858
  • Singh, B., Beatty, J. C., & Ryman, R. (1983). A graphics editor for Benesh movement notation [Paper presentation]. Proceedings of the 10th annual conference on Computer Graphics and Interactive Techniques (pp. 51–62). Association for Computing Machinery (ACM). https://doi.org/10.1145/800059.801132
  • Sukel, K. E., Catrambone, R., Essa, I., & Brostow, G. (2003). Presenting movement in a computer-based dance tutor. International Journal of Human-Computer Interaction, 15(3), 433–452. https://doi.org/10.1207/S15327590IJHC1503_08
  • Sun, Y., Bao, Q., Liu, W., Mei, T., & Black, M. J. (2023). TRACE: 5D temporal regression of avatars with dynamic cameras in 3D environments [Paper presentation]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) (pp. 8856–8866). IEEE/CVF. https://doi.org/10.48550/arXiv.2306.02850
  • Tian, Y., Zhang, H., Liu, Y., & Wang, L. (2023). Recovering 3d human mesh from monocular images: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12), 15406–15425. https://doi.org/10.1109/TPAMI.2023.3298850
  • Tsuchida, S., Mao, H., Okamoto, H., Suzuki, Y., Kanada, R., Hori, T., Terada, T., & Tsukamoto, M. (2022). Dance practice system that shows what you would look like if you could master the dance [Paper presentation]. Proceedings of the 8th international conference on movement and computing (pp. 1–8). Association for Computing Machinery (ACM). https://doi.org/10.1145/3537972.3537991
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30(2017), 6000–6010. https://dl.acm.org/doi/10.5555/3295222.3295349
  • Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., Liu, W., & Xiao, B. (2020). Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10), 3349–3364. https://doi.org/10.1109/TPAMI.2020.2983686
  • Whittier, C. (2006). Laban movement analysis approach to classical ballet pedagogy. Journal of Dance Education, 6(4), 124–132. https://doi.org/10.1080/15290824.2006.10387325
  • Wu, Y., Kirillov, A., Massa, F., Lo, W.-Y., & Girshick, R. (2019). Detectron2. https://github.com/facebookresearch/detectron2
  • Xiao, B., Wu, H., & Wei, Y. (2018). Simple baselines for human pose estimation and tracking [Paper presentation]. Proceedings of the European Conference on Computer Vision (ECCV 2018) (pp. 466–481). Springer Science and Business Media. https://doi.org/10.1007/978-3-030-01231-1_29
  • Xu, X., Chen, H., Moreno-Noguer, F., Jeni, L. A., & De la Torre, F. (2020). 3D human shape and pose from a single low-resolution image with self-supervised learning. In Computer Vision–ECCV 2020: 16th European Conference, Proceedings, Part IX 16. Springer. https://doi.org/10.1007/978-3-030-58545-7_17
  • Ye, V., Pavlakos, G., Malik, J., & Kanazawa, A. (2023). Decoupling human and camera motion from videos in the wild [Paper presentation]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) (pp. 21222–21232). IEEE/CVF. https://doi.org/10.1109/CVPR52729.2023.02033
  • Zanfir, A., Bazavan, E. G., Zanfir, M., Freeman, W. T., Sukthankar, R., & Sminchisescu, C. (2021). Neural descent for visual 3D human pose and shape [Paper presentation]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 14484–14493).
  • Zhang, H., Tian, Y., Zhou, X., Ouyang, W., Liu, Y., Wang, L., & Sun, Z. (2021). PyMAF: 3D human pose and shape regression with pyramidal mesh alignment feedback loop [Paper presentation]. Proceedings of the IEEE/CVF international conference on computer vision (pp. 11446–11456).
  • Zhou, Q., Cheng Chua, C., Knibbe, J., Goncalves, J., & Velloso, E. (2021). Dance and choreography in HCI: A two-decade retrospective [Paper presentation]. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–14). Association for Computing Machinery (ACM). https://doi.org/10.1145/3411764.3445804
  • Zhou, Q., Li, M., Zeng, Q., Aristidou, A., Zhang, X., Chen, L., & Tu, C. (2023). Let’s all dance: Enhancing amateur dance motions. Computational Visual Media, 9(3), 531–550. https://doi.org/10.1007/s41095-022-0292-6