Research Article

Monocular 3D object detection with thermodynamic loss and decoupled instance depth

Article: 2316022 | Received 05 Jun 2023, Accepted 02 Feb 2024, Published online: 13 Feb 2024
 

Abstract

Monocular 3D detection aims to recover the 3D information of objects from a single image. Mainstream methods mainly use the L1 loss or L1-like losses to supervise instance depth prediction, but they have not achieved satisfactory results. One main reason is that the L1 loss and L1-like losses do not accurately reflect the fit between the predicted instance depth and the corresponding ground truth. Another is that the instance depth is hard to learn directly from the RGB image. To address these problems, this paper proposes a novel thermodynamic loss based on the principle of free energy minimisation, together with a novel depth decoupling method. The proposed method is called the monocular 3D object detection network with thermodynamic loss and decoupled instance depth (TDN). In TDN, the optimisation of the instance depth prediction is regarded as a thermodynamic process, and the thermodynamic loss is designed according to the principle of free energy minimisation. TDN decouples the instance depth into three different depths. The final instance depth is obtained by combining the thermodynamic loss with these different types of depth.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Hubei University of Technology Graduate Research Innovation Project (4306.22019) and the National Natural Science Foundation of China (No. 61300127).