Review

Image-based laparoscopic tool detection and tracking using convolutional neural networks: a review of the literature

References

  • Bodenstedt S, Allan M, Agustinos A, et al. Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery. arXiv: Comput Vis Pattern Recogn. 2018.
  • Zhao Z, Voros S, Chen Z, et al. Surgical tool tracking based on two CNNs: from coarse to fine. J Eng. 2019;2019(14):467–472.
  • Jin Y, Li H, Dou Q, et al. Multi-task recurrent convolutional network with correlation loss for surgical video analysis. Med Image Anal. 2020;59:1–14.
  • Bouget D, Allan M, Stoyanov D, et al. Vision-based and marker-less surgical tool detection and tracking: a review of the literature. Med Image Anal. 2017;35:633–654.
  • LeCun Y, Bengio Y, Hinton GE. Deep learning. Nature. 2015;521(7553):436–444.
  • Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, et al, editors. Neural information processing systems. Red Hook (NY): Curran Associates Inc.; 2012. p. 1097–1105.
  • Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Mortensen E, Fidler S, editors. Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2014. p. 580–587.
  • Nwoye CI, Mutter D, Marescaux J, et al. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. Int J Comput Assist Radiol Surg. 2019;14(6):1059–1067.
  • Jin A, Yeung S, Jopling J, et al. Tool detection and operative skill assessment in surgical videos using region-based convolutional neural networks. In: Medioni G, Hoogs A, McCloskey S, editors. IEEE Winter Conference on Applications of Computer Vision. Washington (DC): IEEE Computer Society; 2018. p. 691–699.
  • Du X, Kurmann T, Chang P, et al. Articulated multi-instrument 2-D pose estimation using fully convolutional networks. IEEE Trans Med Imaging. 2018;37(5):1276–1287.
  • Sarikaya D, Corso JJ, Guru KA. Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection. IEEE Trans Med Imaging. 2017;36(7):1542–1549.
  • Chen ZR, Zhao ZJ, Cheng XL. Surgical instruments tracking based on deep learning with lines detection and spatio-temporal context. Proceedings of 2017 Chinese Automation Congress. New York (NY): IEEE; 2017. p. 2711–2714.
  • Zhang K, Zhang L, Liu Q, et al. Fast visual tracking via dense spatio-temporal context learning. In: Fleet D, Pajdla T, Schiele B, et al., editors. Lecture notes in computer science. Proceedings of 2014 European Conference on Computer Vision. Cham; Heidelberg: Springer; 2014. p. 127–141.
  • Padoy N, Blum T, Feussner H, et al. On-line recognition of surgical activity for monitoring in the operating room. In: Goker MH, editor. IAAI'08: Proceedings of the 20th National Conference on Innovative Applications of Artificial Intelligence. Palo Alto (CA): AAAI Press; 2008. p. 1718–1724.
  • Twinanda AP, Shehata S, Mutter D, et al. EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging. 2017;36(1):86–97.
  • Stauder R, Okur A, Peter L, et al. Random forests for phase detection in surgical workflow analysis. In: Stoyanov D, Collins DL, Sakuma I, Abolmaesumi P, Jannin P, editors. Information processing in computer-assisted interventions. IPCAI 2014. Lecture notes in computer science. Vol 8498. Cham; Heidelberg: Springer; 2014. p. 148–157.
  • Sahu M, Mukhopadhyay A, Szengel A, et al. Tool and phase recognition using contextual CNN features. arXiv: Computer Vision and Pattern Recognition. 2016.
  • Wang S, Raju A, Huang J. Deep learning based multi-label classification for surgical tool presence detection in laparoscopic videos. Proceedings of 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017). Melbourne (VIC): IEEE; 2017. p. 620–623.
  • Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2014.
  • Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Boston, MA. Washington (DC): IEEE Computer Society; 2015. p. 1–9.
  • Sahu M, Mukhopadhyay A, Szengel A, et al. Addressing multi-label imbalance problem of surgical tool detection using CNN. Int J Comput Assist Radiol Surg. 2017;12(6):1013–1020.
  • Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Mortensen E, Fidler S, editors. IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2015. p. 3431–3440.
  • Garcia-Peraza-Herrera LC, Li W, Gruijthuijsen C, et al. Real-time segmentation of non-rigid surgical tools based on deep learning and tracking. In: Peters T, Yang GZ, Navab N, Mori K, Luo X, Reichl T, et al., editors. Lecture notes in computer science. Cham; Heidelberg: Springer; 2017. p. 84–95.
  • Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2017;39(6):1137–1149.
  • Zeiler MD, Fergus R. Visualizing and understanding convolutional networks. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T, editors. Lecture notes in computer science. Cham; Heidelberg: Springer; 2014. p. 818–833.
  • Nakazawa A, Harada K, Mitsuishi M, et al. Real-time surgical needle detection using region-based convolutional neural networks. Int J Comput Assist Radiol Surg. 2020;15(1):41–47.
  • Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection. In: Russakovsky O, editor. IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2016. p. 779–788.
  • Choi B, Jo K, Choi S, et al. Surgical-tools detection based on convolutional neural network in laparoscopic robot-assisted surgery. In: Proceedings of Annual International Conference of the IEEE Engineering in Medicine and Biology Society. New York (NY): IEEE; 2017. p. 1756–1759.
  • Jo K, Choi Y, Choi J, et al. Robust real-time detection of laparoscopic instruments in robot surgery using convolutional neural networks with motion vector prediction. Appl Sci. 2019;9(14):2865. DOI:10.3390/app9142865
  • Redmon J, Farhadi A. YOLO9000: better, faster, stronger. In: Mortensen E, editor. IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2017. p. 6517–6525.
  • Mishra K, Sathish R, Sheet D. Learning latent temporal connectionism of deep residual visual abstractions for identifying surgical tools in laparoscopy procedures. In: Mortensen E, editor. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Washington (DC): IEEE Computer Society; 2017. p. 2233–2240.
  • Al Hajj H, Lamard M, Conze P, et al. Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks. Med Image Anal. 2018;47:203–218.
  • Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF, editors. Lecture notes in computer science. Cham; Heidelberg: Springer; 2015. p. 234–241.
  • Kurmann T, Marquez Neila P, Du X, et al. Simultaneous recognition and pose estimation of instruments in minimally invasive surgery. In: Descoteaux M, Maier-Hein L, Franz A, Jannin P, Collins DL, Duchesne S, editors. Medical image computing and computer assisted intervention - MICCAI 2017. Lecture notes in computer science. Vol 10434. Cham; Heidelberg: Springer; 2017. p. 505–513.
  • Hou R, Chen C, Shah M. An end-to-end 3D convolutional neural network for action detection and segmentation in videos. arXiv: Computer Vision and Pattern Recognition. 2017.
  • Maturana D, Scherer S. VoxNet: a 3D convolutional neural network for real-time object recognition. In: Wang Z, Papanikolopoulos N, editors. IEEE International Conference on Intelligent Robots and Systems. New York (NY): IEEE; 2015. p. 922–928.
  • Funke I, Mees ST, Weitz J, et al. Video-based surgical skill assessment using 3D convolutional neural networks. Int J Comput Assist Radiol Surg. 2019;14(7):1217–1225.
  • Colleoni E, Moccia S, Du X, et al. Deep learning based robotic tool detection and articulation estimation with spatio-temporal layers. IEEE Robot Autom Lett. 2019;4(3):2714–2721.
  • Laina I, Rieke N, Rupprecht C, et al. Concurrent segmentation and localization for tracking of surgical instruments. In: Descoteaux M, Maier-Hein L, Franz A, Jannin P, Collins DL, Duchesne S, editors. Medical image computing and computer-assisted intervention - MICCAI 2017. Lecture notes in computer science. Vol 10434. Cham; Heidelberg: Springer; 2017. p. 664–672.
  • He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: Mortensen E, Saenko K, editors. IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2016. p. 770–778.
  • Mondal SS, Sathish R, Sheet D. Multitask learning of temporal connectionism in convolutional networks using a joint distribution loss function to simultaneously identify tools and phase in surgical videos. arXiv. 2019. p. 15.
  • Ma X, Hovy E. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Erk K, Smith NA, editors. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin (Germany): Association for Computational Linguistics; 2016. p. 1064–1074.
  • Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation. In: Leibe B, Matas J, Sebe N, Welling M, editors. Lecture notes in computer science. Cham; Heidelberg: Springer; 2016. p. 483–499.
  • Zhao Z, Cai T, Chang F, et al. Real-time surgical instrument detection in robot-assisted surgery using a convolutional neural network cascade. Healthc Technol Lett. 2019;6(6):275–279.
  • Liu Y, Zhao Z, Chang F, et al. An anchor-free convolutional neural network for real-time surgical tool detection in robot-assisted surgery. IEEE Access. 2020;8:78193–78201.
  • Howard AG, Zhu M, Chen B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv. 2017. p. 9.
  • Jia Z, Huang X, Chang EI, et al. Constrained deep weak supervision for histopathology image segmentation. IEEE Trans Med Imaging. 2017;36(11):2376–2388.
  • Zhou Y, Zhu Y, Ye Q, et al. Weakly supervised instance segmentation using class peak response. In: Mortensen E, Brendel W, editors. IEEE Conference on Computer Vision and Pattern Recognition. Washington (DC): IEEE Computer Society; 2018. p. 3791–3800.
  • Vardazaryan A, Mutter D, Marescaux J, et al. Weakly-supervised learning for tool localization in laparoscopic videos. In: Stoyanov D, Taylor Z, Balocco S, Sznitman R, editors. Intravascular imaging and computer assisted stenting and large-scale annotation of biomedical data and expert label synthesis. LABELS 2018, CVII 2018, STENT 2018. Lecture notes in computer science. Vol 11043. Cham; Heidelberg: Springer; 2018. p. 169–179.
  • Attia M, Hossny M, Nahavandi S, et al. Surgical tool segmentation using a hybrid deep CNN-RNN auto encoder-decoder. Proceedings of 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). New York (NY): IEEE; 2017. p. 3373–3378.