Research Article

LapFormer: surgical tool detection in laparoscopic surgical video using transformer architecture

Pages 302-307 | Received 16 Sep 2020, Accepted 07 Oct 2020, Published online: 21 Oct 2020
 

ABSTRACT

One of the most essential steps in surgical workflow analysis is recognising the presence of surgical tools. We propose LapFormer, a method for detecting the presence of surgical tools in laparoscopic surgery videos. The novelty of LapFormer is its use of a Transformer architecture, a feed-forward neural network architecture with an attention mechanism that is growing in popularity for natural language processing, to analyse inter-frame correlation in videos instead of using the recurrent neural network family. To the best of our knowledge, no method using a Transformer architecture for analysing laparoscopic surgery videos has previously been proposed. We evaluate our method on the Cholec80 dataset, which contains 80 videos of cholecystectomy surgeries. We confirm that our proposed method outperforms conventional methods such as single-frame analysis with convolutional neural networks or multiple-frame analysis with recurrent neural networks by 20.3 and 17.3 points in macro-F1 score, respectively. We also conduct an ablation study on how the hyper-parameters of the Transformer block in our proposed method affect detection performance.
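
The comparison above is reported in macro-F1 score. As a minimal sketch of how such a score can be computed for multi-label tool presence, the Python below takes the unweighted mean of per-tool F1 scores over binary frame labels. The function name `macro_f1`, the toy labels, and the convention of scoring a tool as 0.0 when it has no true or predicted positives are assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-class F1 over binary multi-label annotations.

    Each row is one frame; each column is one tool (1 = present, 0 = absent).
    """
    num_classes = len(y_true[0])
    f1_scores: List[float] = []
    for c in range(num_classes):
        pairs = [(t[c], p[c]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no positives at all scores 0.0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy example (hypothetical labels): 4 frames, 3 tools.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
score = macro_f1(y_true, y_pred)  # per-tool F1: 1.0, 2/3, 0.0
```

Because every tool contributes equally to the average, a rarely used tool that is detected poorly lowers the score as much as a frequently used one, which is why macro averaging suits class-imbalanced tool presence data.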

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Satoshi Kondo

Satoshi Kondo received his B.S., M.S. and Ph.D. degrees from Osaka Prefecture University in 1990, 1992 and 2004, respectively. He joined Matsushita Electric Industrial Co., Ltd. (now Panasonic Corporation) in 1992, where he mainly developed video coding and computer vision technologies. He holds over 100 patents on the H.264/MPEG-4 AVC video coding standard. Since 2014, he has been with Konica Minolta, Inc., where he is now the head of the AI technology development department. His current research interests are in the fields of image processing and computer vision, especially for medical images.
