References
- Wang L, Xiong Y, Wang Z, et al. Temporal segment networks: towards good practices for deep action recognition, in: Proc. European conference on computer vision, 2016, pp. 20–36.
- Veeriah V, Zhuang N, Qi GJ. Differential recurrent neural networks for action recognition, in: Proc. IEEE international conference on computer vision, 2015, pp. 4041–4049.
- Poppe R. A survey on vision-based human action recognition, in: Image and vision computing, 2010, 28(6): 976–990.
- Ke Q, Bennamoun M, An S, et al. Learning clip representations for skeleton-based 3d action recognition, in: IEEE Transactions on Image Processing, 2018, 27(6): 2842–2855.
- Liu H, Tu J, Liu M. Two-stream 3d convolutional neural network for skeleton-based action recognition, 2017, arXiv:1705.08106.
- Li B, Dai Y, Cheng X, et al. Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN, in: 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2017, pp. 601–604.
- Ke Q, Bennamoun M, An S, et al. A new representation of skeleton sequences for 3d action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 3288–3297.
- Kim TS, Reiter A. Interpretable 3d human action analysis with temporal convolutional networks, in: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), 2017, pp. 1623–1631.
- Zheng W, Li L, Zhang Z, et al. Skeleton-based relational modeling for action recognition, 2018, arXiv:1805.02556.
- Du Y, Wang W, Wang L. Hierarchical recurrent neural network for skeleton based action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2015, pp. 1110–1118.
- Li S, Li W, Cook C, et al. Independently recurrent neural network (indrnn): Building a longer and deeper rnn, in: Proc. IEEE conference on computer vision and pattern recognition, 2018, pp. 5457–5466.
- Monti F, Boscaini D, Masci J, et al. Geometric deep learning on graphs and manifolds using mixture model cnns, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 5115–5124.
- Li C, Zhong Q, Xie D, et al. Skeleton-based action recognition with convolutional neural networks, in: 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2017, pp. 597–600.
- Li M, Chen S, Chen X, et al. Actional-structural graph convolutional networks for skeleton-based action recognition, in: Proc. IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 3595–3603.
- Shi L, Zhang Y, Cheng J, et al. Skeleton-based action recognition with directed graph neural networks, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7912–7921.
- Shi L, Zhang Y, Cheng J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition, in: Proc. IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 12026–12035.
- Yan S, Xiong Y, Lin D. Spatial temporal graph convolutional networks for skeleton-based action recognition, in: Thirty-second AAAI conference on artificial intelligence, 2018, pp. 7444–7452.
- Aggarwal JK, Ryoo MS. Human activity analysis: A review, in: ACM Computing Surveys (CSUR), 2011, 43(3): 1–43.
- Maas AL, Hannun AY, Ng AY. Rectifier nonlinearities improve neural network acoustic models, in: Proc. ICML, 2013, 30(1): 3.
- Song S, Lan C, Xing J, et al. An end-to-end spatio-temporal attention model for human action recognition from skeleton data, in: Proc. AAAI conference on artificial intelligence, 2017, 31(1).
- Cheng K, Zhang Y, He X, et al. Skeleton-based action recognition with shift graph convolutional network, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 183–192.
- Song YF, Zhang Z, Shan C, et al. Stronger, faster and more explainable: A graph convolutional baseline for skeleton-based action recognition, in: Proc. 28th ACM International Conference on Multimedia, 2020, pp. 1625–1633.
- Weinland D, Ronfard R, Boyer E. A survey of vision-based methods for action representation, segmentation and recognition, in: Computer vision and image understanding, 2011, 115(2): 224–241.
- Liu J, Shahroudy A, Xu D, et al. Spatio-temporal lstm with trust gates for 3d human action recognition, in: Proc. European conference on computer vision, 2016, pp. 816–833.
- Wen YH, Gao L, Fu H, et al. Graph CNNs with motif and variable temporal block for skeleton-based action recognition, in: Proc. AAAI Conference on Artificial Intelligence, 2019, 33(01): 8989–8996.
- Li B, Li X, Zhang Z, et al. Spatio-temporal graph routing for skeleton-based action recognition, in: Proc. AAAI Conference on Artificial Intelligence, 2019, 33(01): 8561–8568.
- Si C, Chen W, Wang W, et al. An attention enhanced graph convolutional lstm network for skeleton-based action recognition, in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 1227–1236.
- Li C, Cui Z, Zheng W, et al. Spatio-temporal graph convolution for skeleton based action recognition, in: Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
- Cheng K, Zhang Y, Cao C, et al. Decoupling gcn with dropgraph module for skeleton-based action recognition, in: Proc. European conference on computer vision, 2020, pp. 536–553.
- Veličković P, Cucurull G, Casanova A, et al. Graph attention networks, 2017, arXiv:1710.10903.
- Liu J, Wang G, Duan LY, et al. Skeleton-based human action recognition with global context-aware attention LSTM networks, in: IEEE Transactions on Image Processing, 2018, 27(4): 1586–1599.
- Liu J, Shahroudy A, Wang G, et al. Skeleton-based online action prediction using scale selection network, in: IEEE transactions on pattern analysis and machine intelligence, 2019, 42(6): 1453–1467.
- Song YF, Zhang Z, Wang L. Richly activated graph convolutional network for action recognition with incomplete skeletons, in: 2019 IEEE International Conference on Image Processing (ICIP), 2019, pp. 1–5.
- Liu Z, Zhang H, Chen Z, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2020, pp. 143–152.
- Baradel F, Wolf C, Mille J. Human action recognition: pose-based attention draws focus to hands, in: Proc. IEEE international conference on computer vision workshops, 2017, pp. 604–613.
- Shi L, Zhang Y, Cheng J, et al. Skeleton-based action recognition with multi-stream adaptive graph convolutional networks, in: IEEE Transactions on Image Processing, 2020, 29: 9532–9545.
- Shahroudy A, Liu J, Ng TT, et al. NTU RGB+D: A large scale dataset for 3d human activity analysis, in: Proc. IEEE conference on computer vision and pattern recognition, 2016, pp. 1010–1019.
- Liu J, Shahroudy A, Perez M, et al. NTU RGB+D 120: A large-scale benchmark for 3d human activity understanding, in: IEEE transactions on pattern analysis and machine intelligence, 2019, 42(10): 2684–2701.
- Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift, in: International conference on machine learning, 2015, pp. 448–456.
- Ioffe S. Batch renormalization: Towards reducing minibatch dependence in batch-normalized models, 2017, arXiv:1702.03275.
- Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution, in: European conference on computer vision, 2016, pp. 694–711.
- Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization, in: Proc. IEEE international conference on computer vision, 2017, pp. 1501–1510.
- Ba JL, Kiros JR, Hinton GE. Layer normalization, 2016, arXiv:1607.06450.
- Luo P, Ren J, Peng Z, et al. Differentiable learning-to-normalize via switchable normalization, 2018, arXiv:1806.10779.
- Zhou J, Cui G, Hu S, et al. Graph neural networks: A review of methods and applications, in: AI open, 2020, 1: 57–81.
- Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks, in: IEEE transactions on neural networks and learning systems, 2020, 32(1): 4–24.
- Li C, Zhong Q, Xie D, et al. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation, 2018, arXiv:1804.06055.
- Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks, 2016, arXiv:1609.02907.
- Niepert M, Ahmed M, Kutzkov K. Learning convolutional neural networks for graphs, in: International conference on machine learning, 2016, pp. 2014–2023.
- Plizzari C, Cannici M, Matteucci M. Skeleton-based action recognition via spatial and temporal transformer networks, in: Computer Vision and Image Understanding, 2021, 208: 103219.
- Bai R, Li M, Meng B, et al. GCsT: Graph convolutional skeleton transformer for action recognition, 2021, arXiv:2109.02860.
- Zhang P, Lan C, Zeng W, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition, in: Proc. IEEE conference on computer vision and pattern recognition, 2020, pp. 1112–1121.
- Liu W, Ma X, Zhou Y, et al. p-Laplacian regularization for scene recognition, in: IEEE Transactions on Cybernetics, 2019, 49(8): 2927–2940.
- Zhang B, Xiao J, Jiao J, et al. Affinity attention graph neural network for weakly supervised semantic segmentation, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
- Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need, in: Advances in neural information processing systems, 2017, 30.
- Wang F, Jiang M, Qian C, et al. Residual attention network for image classification, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 3156–3164.
- Ying X, Wang Q, Li X, et al. Multi-attention object detection model in remote sensing images based on multi-scale, in: IEEE Access, 2019, 7: 94508–94519.
- Bertasius G, Wang H, Torresani L. Is space-time attention all you need for video understanding?, in: International conference on machine learning, 2021.
- Chu X, Yang W, Ouyang W, et al. Multi-context attention for human pose estimation, in: Proc. IEEE conference on computer vision and pattern recognition, 2017, pp. 1831–1840.