Research Article

Reconsideration of multi-stage deep network for human pose estimation

Pages 600-612 | Received 10 Jun 2020, Accepted 09 Mar 2021, Published online: 19 Apr 2021

References

  • Alani AA. 2017. Arabic handwritten digit recognition based on restricted Boltzmann machine and convolutional neural networks. Information. 8(4):142. Dec. doi:https://doi.org/10.3390/info8040142.
  • Andriluka M, Pishchulin L, Gehler P, Schiele B. 2014. 2D human pose estimation: new benchmark and state of the art analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 3686–3693.
  • Andriluka M, Roth S, Schiele B. 2009. Pictorial structures revisited: people detection and articulated pose estimation. In: 2009 IEEE conference on computer vision and pattern recognition; Jun 20; IEEE. p. 1014–1021.
  • Belagiannis V, Zisserman A. 2017. Recurrent human pose estimation. In: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017); May 30; IEEE. p. 468–475.
  • Bulat A, Tzimiropoulos G. 2016. Human pose estimation via convolutional part heatmap regression. In: European Conference on Computer Vision; Oct 8; Cham: Springer. p. 717–732.
  • Carreira J, Agrawal P, Fragkiadaki K, Malik J. 2016. Human pose estimation with iterative error feedback. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 4733–4742.
  • Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL. 2017 Apr 27. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell. 40(4):834–848. doi:https://doi.org/10.1109/TPAMI.2017.2699184.
  • Chen Y, Shen C, Chen H, Wei XS, Liu L, Yang J. 2019 Feb 26. Adversarial learning of structure-aware fully convolutional networks for landmark localization. IEEE Trans Pattern Anal Mach Intell. 42(7):1654–1669.
  • Chou CJ, Chien JT, Chen HT. 2018. Self adversarial training for human pose estimation. In: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC); Nov 12; IEEE. p. 17–30.
  • Chu X, Yang W, Ouyang W, Ma C, Yuille AL, Wang X. 2017. Multi-context attention for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 1831–1840.
  • Fan X, Zheng K, Lin Y, Wang S. 2015. Combining local appearance and holistic view: dual-source deep neural networks for human pose estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 1347–1355.
  • He K, Zhang X, Ren S, Sun J. 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 770–778.
  • Honari S, Molchanov P, Tyree S, Vincent P, Pal C, Kautz J. 2018. Improving landmark localization with semi-supervised learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 1546–1555.
  • Insafutdinov E, Pishchulin L, Andres B, Andriluka M, Schiele B. 2016. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model. In: European Conference on Computer Vision; Oct 8; Cham: Springer. p. 34–50.
  • Iqbal U, Molchanov P, Breuel T, Gall J, Kautz J. 2018. Hand pose estimation via latent 2.5D heatmap regression. In: Proceedings of the European Conference on Computer Vision (ECCV). p. 118–134.
  • Johnson S, Everingham M. 2011. Learning effective human pose estimation from inaccurate annotation. In: CVPR 2011; Jun 20; IEEE. p. 1465–1472.
  • Lifshitz I, Fetaya E, Ullman S. 2016. Human pose estimation using deep consensus voting. In: European Conference on Computer Vision; Oct 8; Cham: Springer. p. 246–260.
  • Liu W, Chen J, Li C, Qian C, Chu X, Hu X. 2018. A cascaded inception of inception network with attention modulated feature fusion for human pose estimation. In: Thirty-Second AAAI Conference on Artificial Intelligence; Apr 27.
  • Luvizon DC, Tabia H, Picard D. 2019. Human pose regression by combining indirect part detection and contextual information. Comput Graph. 85:15–22. doi:https://doi.org/10.1016/j.cag.2019.09.002.
  • Martinel N, Foresti GL, Micheloni C. 2018. Wide-slice residual networks for food recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV); Mar 12; IEEE. p. 567–576.
  • Newell A, Yang K, Deng J. 2016. Stacked hourglass networks for human pose estimation. In: European conference on computer vision; Oct 8; Cham: Springer. p. 483–499.
  • Papandreou G, Zhu T, Kanazawa N, Toshev A, Tompson J, Bregler C, Murphy K. 2017. Towards accurate multi-person pose estimation in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 4903–4911.
  • Pavlakos G, Zhou X, Derpanis KG, Daniilidis K. 2017. Coarse-to-fine volumetric prediction for single-image 3D human pose. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 7025–7034.
  • Pishchulin L, Andriluka M, Gehler P, Schiele B. 2013. Poselet conditioned pictorial structures. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 588–595.
  • Pishchulin L, Insafutdinov E, Tang S, Andres B, Andriluka M, Gehler PV, Schiele B. 2016. DeepCut: joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 4929–4937.
  • Rafi U, Leibe B, Gall J, Kostrikov I. 2016. An efficient convolutional network for human pose estimation. In: Wilson RC, Hancock ER, Smith WAP, editors. Proceedings of the British Machine Vision Conference (BMVC); Sep; BMVA Press. p. 109.1–109.11.
  • Simonyan K, Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv Preprint arXiv:1409.1556. Sep 4.
  • Su Z, Ye M, Zhang G, Dai L, Sheng J. 2019. Cascade feature aggregation for human pose estimation. arXiv Preprint arXiv:1902.07837. Feb 21.
  • Sun K, Xiao B, Liu D, Wang J. 2019. Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 5693–5703.
  • Sun X, Shang J, Liang S, Wei Y. 2017. Compositional human pose regression. In: Proceedings of the IEEE International Conference on Computer Vision. p. 2602–2611.
  • Sun X, Xiao B, Wei F, Liang S, Wei Y. 2018. Integral human pose regression. In: Proceedings of the European Conference on Computer Vision (ECCV). p. 529–545.
  • Szegedy C, Ioffe S, Vanhoucke V, Alemi AA. 2017. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Thirty-first AAAI conference on artificial intelligence; Feb 12.
  • Tompson J, Goroshin R, Jain A, LeCun Y, Bregler C. 2015. Efficient object localization using convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 648–656.
  • Tompson JJ, Jain A, LeCun Y, Bregler C. 2014. Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in neural information processing systems; p. 1799–1807.
  • Wei SE, Ramakrishna V, Kanade T, Sheikh Y. 2016. Convolutional pose machines. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. p. 4724–4732.
  • Xiao B, Wu H, Wei Y. 2018. Simple baselines for human pose estimation and tracking. In: Proceedings of the European conference on computer vision (ECCV). p. 466–481.
  • Yang W, Ouyang W, Li H, Wang X. 2016. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 3073–3082.
  • Yang Y, Baker S, Kannan A, Ramanan D. 2012. Recognizing proxemics in personal photos. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition; Jun 16; IEEE. p. 3522–3529.
  • Yu X, Zhou F, Chandraker M. 2016. Deep deformation network for object landmark localization. In: European Conference on Computer Vision; Oct 8; Cham: Springer. p. 52–70.
  • Yu Y, Liu F. 2018. A two-stream deep fusion framework for high-resolution aerial scene classification. Comput Intell Neurosci. 2018:1–13. doi:https://doi.org/10.1155/2018/8639367.
