Research Article

Automatic excavator action recognition and localisation for untrimmed video using hybrid LSTM-Transformer networks

Pages 353-372 | Received 15 Mar 2023, Accepted 26 Nov 2023, Published online: 13 Dec 2023
 

ABSTRACT

In mining and construction, excavators are integral to earth-moving operations. Accurate knowledge of excavator activities may be used in productivity analysis to streamline delivery. This paper presents a computer vision-based method for excavator action detection which can automatically infer the occurrence and duration of excavator actions from untrimmed video captured from the excavator cab. The model uses a three-stage architecture consisting of a VGG16 feature extractor, a four-stage Transformer Encoder-Long Short-Term Memory (LSTM) module, and a post-processing component. The model’s predictive performance has been validated on the largest dataset among similar studies, comprising 567,000 frames filmed on site during the day and at night. When tested on night and daytime videos, the model achieves accuracies of 90% and 70%, respectively, highlighting strong potential for practical implementation of the Transformer-LSTM network in excavator action detection. This study presents the first application of the combined Transformer-LSTM network for action detection in computer vision.
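
The abstract does not describe the post-processing component in detail. As an illustration only, the following minimal sketch shows one way per-frame action predictions could be turned into an occurrence and duration for each action. The function name `frames_to_segments`, the parameters `fps` and `min_frames`, and the action labels are hypothetical and are not taken from the paper.

```python
from itertools import groupby


def frames_to_segments(labels, fps=25.0, min_frames=1):
    """Collapse per-frame labels into (label, start_s, end_s) segments.

    Illustrative assumption, not the paper's method: consecutive frames
    with the same predicted label form one segment, and runs shorter
    than `min_frames` are dropped as likely noise.
    """
    segments = []
    index = 0  # frame index at which the current run starts
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        if length >= min_frames:
            start_s = index / fps
            end_s = (index + length) / fps
            segments.append((label, start_s, end_s))
        index += length
    return segments


# Hypothetical per-frame predictions at 2 frames per second.
predictions = ["dig"] * 4 + ["swing"] * 2 + ["dump"] * 2
for label, start_s, end_s in frames_to_segments(predictions, fps=2.0):
    print(f"{label}: {start_s:.1f}s to {end_s:.1f}s")
```

In this sketch, dropping short runs leaves gaps in the timeline rather than merging neighbouring segments; a practical system might instead smooth the label sequence before segmenting it.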

Acknowledgments

This work was supported by the Australian Centre for Field Robotics and the Rio Tinto Centre for Mine Automation.

Disclosure statement

This work was supported by the Rio Tinto Centre for Mine Automation and the Australian Centre for Field Robotics, the University of Sydney.

Data availability statement

Due to commercial restrictions, supporting data is not available.

Additional information

Funding

This work was supported by the Rio Tinto Centre for Mine Automation, Australian Centre for Field Robotics, The University of Sydney, Australia.
