
Graph convolutional network with STC attention and adaptive normalization for skeleton-based action recognition

Pages 636-646 | Received 07 Jun 2022, Accepted 10 Mar 2023, Published online: 22 Mar 2023

References

  • Wang L, Xiong Y, Wang Z, et al. Temporal segment networks: towards good practices for deep action recognition, in: Proc. European conference on computer vision, 2016, pp. 20–36.
  • Veeriah V, Zhuang N, Qi GJ. Differential recurrent neural networks for action recognition, in: Proc. IEEE international conference on computer vision, 2015, pp. 4041–4049.
  • Poppe R. A survey on vision-based human action recognition, in: Image and vision computing, 2010, 28(6): 976–990.
  • Ke Q, Bennamoun M, An S, et al. Learning clip representations for skeleton-based 3d action recognition, in: IEEE Transactions on Image Processing, 2018, 27(6): 2842–2855.
  • Liu H, Tu J, Liu M. Two-stream 3d convolutional neural network for skeleton-based action recognition, 2017, arXiv:1705.08106.
  • Li B, Dai Y, Cheng X, et al. Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN, in: 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2017, pp. 601–604.
  • Ke Q, Bennamoun M, An S, et al. A new representation of skeleton sequences for 3d action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 3288–3297.
  • Kim TS, Reiter A. Interpretable 3d human action analysis with temporal convolutional networks, in: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), 2017, pp. 1623–1631.
  • Zheng W, Li L, Zhang Z, et al. Skeleton-based relational modeling for action recognition, 2018, arXiv:1805.02556.
  • Du Y, Wang W, Wang L. Hierarchical recurrent neural network for skeleton based action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2015, pp. 1110–1118.
  • Li S, Li W, Cook C, et al. Independently recurrent neural network (indrnn): Building a longer and deeper rnn, in: Proc. IEEE conference on computer vision and pattern recognition, 2018, pp. 5457–5466.
  • Monti F, Boscaini D, Masci J, et al. Geometric deep learning on graphs and manifolds using mixture model cnns, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 5115–5124.
  • Li C, Zhong Q, Xie D, et al. Skeleton-based action recognition with convolutional neural networks, in: 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2017, pp. 597–600.
  • Li M, Chen S, Chen X, et al. Actional-structural graph convolutional networks for skeleton-based action recognition, in: Proc. IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 3595–3603.
  • Shi L, Zhang Y, Cheng J, et al. Skeleton-based action recognition with directed graph neural networks, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7912–7921.
  • Shi L, Zhang Y, Cheng J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition, in: Proc. IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 12026–12035.
  • Yan S, Xiong Y, Lin D. Spatial temporal graph convolutional networks for skeleton-based action recognition, in: Thirty-second AAAI conference on artificial intelligence, 2018, pp. 1–10.
  • Aggarwal JK, Ryoo MS. Human activity analysis: A review, in: ACM Computing Surveys (CSUR), 2011, 43(3): 1–43.
  • Maas AL, Hannun AY, Ng AY. Rectifier nonlinearities improve neural network acoustic models, in: Proc. ICML, 2013, 30(1): 3.
  • Song S, Lan C, Xing J, et al. An end-to-end spatio-temporal attention model for human action recognition from skeleton data, in: Proc. AAAI conference on artificial intelligence, 2017, 31(1).
  • Cheng K, Zhang Y, He X, et al. Skeleton-based action recognition with shift graph convolutional network, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 183–192.
  • Song YF, Zhang Z, Shan C, et al. Stronger, faster and more explainable: A graph convolutional baseline for skeleton-based action recognition, in: Proc. 28th ACM International Conference on Multimedia, 2020, pp. 1625–1633.
  • Weinland D, Ronfard R, Boyer E. A survey of vision-based methods for action representation, segmentation and recognition, in: Computer vision and image understanding, 2011, 115(2): 224–241.
  • Liu J, Shahroudy A, Xu D, et al. Spatio-temporal lstm with trust gates for 3d human action recognition, in: European conference on computer vision, Springer, Cham, 2016, pp. 816–833.
  • Wen YH, Gao L, Fu H, et al. Graph CNNs with motif and variable temporal block for skeleton-based action recognition, in: Proc. AAAI Conference on Artificial Intelligence, 2019, 33(01): 8989–8996.
  • Li B, Li X, Zhang Z, et al. Spatio-temporal graph routing for skeleton-based action recognition, in: Proc. AAAI Conference on Artificial Intelligence, 2019, 33(01): 8561–8568.
  • Si C, Chen W, Wang W, et al. An attention enhanced graph convolutional lstm network for skeleton-based action recognition, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 1227–1236.
  • Li C, Cui Z, Zheng W, et al. Spatio-temporal graph convolution for skeleton based action recognition, in: Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • Cheng K, Zhang Y, Cao C, et al. Decoupling gcn with dropgraph module for skeleton-based action recognition, in: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXIV 16, Springer International Publishing, 2020, pp. 536–553.
  • Veličković P, Cucurull G, Casanova A, et al. Graph attention networks, 2017, arXiv:1710.10903.
  • Liu J, Wang G, Duan LY, et al. Skeleton-based human action recognition with global context-aware attention LSTM networks, in: IEEE Transactions on Image Processing, 2017, 27(4): 1586–1599.
  • Liu J, Shahroudy A, Wang G, et al. Skeleton-based online action prediction using scale selection network, in: IEEE transactions on pattern analysis and machine intelligence, 2019, 42(6): 1453–1467.
  • Song YF, Zhang Z, Wang L. Richly activated graph convolutional network for action recognition with incomplete skeletons, in: 2019 IEEE International Conference on Image Processing (ICIP), 2019, pp. 1–5.
  • Liu Z, Zhang H, Chen Z, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2020, pp. 143–152.
  • Baradel F, Wolf C, Mille J. Human action recognition: pose-based attention draws focus to hands, in: Proc. IEEE international conference on computer vision workshops, 2017, pp. 604–613.
  • Shi L, Zhang Y, Cheng J, et al. Skeleton-based action recognition with multi-stream adaptive graph convolutional networks, in: IEEE Transactions on Image Processing, 2020, 29: 9532–9545.
  • Shahroudy A, Liu J, Ng TT, et al. NTU RGB+D: A large scale dataset for 3d human activity analysis, in: Proc. IEEE conference on computer vision and pattern recognition, 2016, pp. 1010–1019.
  • Liu J, Shahroudy A, Perez M, et al. NTU RGB+D 120: A large-scale benchmark for 3d human activity understanding, in: IEEE transactions on pattern analysis and machine intelligence, 2019, 42(10): 2684–2701.
  • Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift, in: International conference on machine learning, 2015, pp. 448–456.
  • Ioffe S. Batch renormalization: Towards reducing minibatch dependence in batch-normalized models, 2017, arXiv:1702.03275.
  • Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution, in: European conference on computer vision, 2016, pp. 694–711.
  • Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization, in: Proc. IEEE international conference on computer vision, 2017, pp. 1501–1510.
  • Ba JL, Kiros JR, Hinton GE. Layer normalization, 2016, arXiv:1607.06450.
  • Luo P, Ren J, Peng Z, et al. Differentiable learning-to-normalize via switchable normalization, 2018, arXiv:1806.10779.
  • Zhou J, Cui G, Hu S, et al. Graph neural networks: A review of methods and applications, in: AI open, 2020, 1: 57–81.
  • Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks, in: IEEE transactions on neural networks and learning systems, 2020, 32(1): 4–24.
  • Li C, Zhong Q, Xie D, et al. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation, 2018, arXiv:1804.06055.
  • Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks, 2016, arXiv:1609.02907.
  • Niepert M, Ahmed M, Kutzkov K. Learning convolutional neural networks for graphs, in: International conference on machine learning, 2016, pp. 2014–2023.
  • Plizzari C, Cannici M, Matteucci M. Skeleton-based action recognition via spatial and temporal transformer networks, in: Computer Vision and Image Understanding, 2021, 208: 103219.
  • Bai R, Li M, Meng B, et al. GCsT: Graph convolutional skeleton transformer for action recognition, 2021, arXiv:2109.02860.
  • Zhang P, Lan C, Zeng W, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2020, pp. 1112–1121.
  • Liu W, Ma X, Zhou Y, et al. p-Laplacian regularization for scene recognition. IEEE Trans Cybern. 2019;49(8):2927–2940. doi:10.1109/TCYB.2018.2833843.
  • Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
  • Zhang B, Xiao J, Jiao J, et al. Affinity attention graph neural network for weakly supervised semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
  • Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Advances in neural information processing systems, 2017, 30.
  • Wang F, Jiang M, Qian C, et al. Residual attention network for image classification, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 3156–3164.
  • Ying X, Wang Q, Li X, et al. Multi-attention object detection model in remote sensing images based on multi-scale. IEEE Access. 2019;7:94508–94519.
  • Bertasius G, Wang H, Torresani L. Is space-time attention all you need for video understanding?, in: Proc. ICML, 2021, 2(3): 4.
  • Chu X, Yang W, Ouyang W, et al. Multi-context attention for human pose estimation, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 1831–1840.
