ABSTRACT
Recognizing surgical tool presence is one of the most essential steps in surgical workflow analysis. We propose LapFormer, a method to detect the presence of surgical tools in laparoscopic surgery videos. The novelty of LapFormer lies in using a Transformer architecture, a feed-forward neural network architecture with an attention mechanism that has grown popular in natural language processing, to model inter-frame correlation in videos instead of relying on the recurrent neural network family. To the best of our knowledge, no method using a Transformer architecture for analysing laparoscopic surgery videos has been proposed. We evaluate our method on the Cholec80 dataset, which contains 80 videos of cholecystectomy surgeries. We confirm that our proposed method outperforms conventional methods, namely single-frame analysis with convolutional neural networks and multi-frame analysis with recurrent neural networks, by 20.3 and 17.3 points in macro-F1 score, respectively. We also conduct an ablation study on how the hyper-parameters of the Transformer block in our proposed method affect detection performance.
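The core idea the abstract describes, replacing recurrent models with attention to relate frames to one another, can be illustrated with a minimal sketch of scaled dot-product self-attention over per-frame feature vectors. This is an assumption-laden illustration, not LapFormer's actual implementation: the projection matrices here are random stand-ins for learned weights, and the feature dimensions are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(frames, d_k=16, seed=0):
    """Scaled dot-product self-attention across video frames.

    frames: (T, d) array of per-frame features, e.g. CNN embeddings.
    Returns (T, d_k) attended features and the (T, T) attention map,
    whose entry (i, j) weights how much frame j informs frame i.
    The projections Wq/Wk/Wv are random here purely for illustration;
    in a trained Transformer they are learned parameters.
    """
    rng = np.random.default_rng(seed)
    d = frames.shape[1]
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q, K, V = frames @ Wq, frames @ Wk, frames @ Wv
    scores = Q @ K.T / np.sqrt(d_k)   # pairwise inter-frame similarity
    attn = softmax(scores, axis=-1)   # each row sums to 1
    return attn @ V, attn
```

Unlike a recurrent network, which consumes frames sequentially, every frame here attends to every other frame in a single feed-forward pass, which is the property the abstract highlights.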
Disclosure statement
No potential conflict of interest was reported by the author(s).
Additional information
Notes on contributors
Satoshi Kondo
Satoshi Kondo received his B.S., M.S. and Ph.D. degrees from Osaka Prefecture University in 1990, 1992 and 2004, respectively. He joined Matsushita Electric Industrial Co., Ltd. (now Panasonic Corporation) in 1992. While with Panasonic Corporation, he mainly developed video coding and computer vision technologies. He holds over 100 patents on the H.264/MPEG-4 AVC video coding standard. Since 2014, he has been with Konica Minolta, Inc., where he is currently the head of the AI technology development department. His current research interests are in the fields of image processing and computer vision, especially for medical images.