Research Article

Randomly connected neural networks for self-supervised monocular depth estimation

Pages 390-399 | Received 20 Oct 2021, Accepted 21 Oct 2021, Published online: 10 Nov 2021

Figures & data

Figure 1. The proposed encoder-decoder architecture. The solid and dotted lines denote forward propagation and skip connections, respectively. The purple lines signify the output left and right disparity maps, generated at four scales, with each successive scale increasing in resolution by a factor of 2.
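A minimal PyTorch sketch of how a decoder can emit left/right disparity pairs at four scales, as in Figure 1. The module and variable names (DispHead, decoder_feats) and channel widths are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DispHead(nn.Module):
    """Predicts a 2-channel (left, right) disparity map from one decoder feature map."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 2, kernel_size=3, padding=1)

    def forward(self, x):
        # Sigmoid bounds disparities to (0, 1); they are typically rescaled
        # afterwards by a maximum-disparity factor.
        return torch.sigmoid(self.conv(x))

# One head per decoder scale; each successive scale doubles spatial resolution.
channels = (256, 128, 64, 32)
heads = nn.ModuleList(DispHead(c) for c in channels)
decoder_feats = [torch.randn(1, c, 32 * 2**i, 32 * 2**i) for i, c in enumerate(channels)]
disparities = [head(f) for head, f in zip(heads, decoder_feats)]  # 4 (left, right) pairs
```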

Figure 2. The process of generating a randomly connected deep-learning architecture. The red and the grey nodes are the input and the output nodes of the block, respectively. The black nodes are the convolution operations. The purple module is a multi-head self-attention module.
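The caption does not specify the random-graph generator, so the following is only a plausible sketch: an Erdős–Rényi graph oriented into a DAG (in the spirit of randomly wired networks), with a single input node feeding all sources and all sinks feeding a single output node, matching the red/grey nodes in Figure 2. The black nodes would then be realised as convolutions and the purple node as multi-head self-attention.

```python
import networkx as nx

def random_dag_block(num_nodes=8, edge_prob=0.4, seed=0):
    """Sample an undirected Erdos-Renyi graph, orient edges low->high to get a DAG,
    then attach dedicated input and output nodes."""
    g = nx.erdos_renyi_graph(num_nodes, edge_prob, seed=seed)
    dag = nx.DiGraph((u, v) if u < v else (v, u) for u, v in g.edges())
    dag.add_nodes_from(range(num_nodes))  # keep isolated nodes
    sources = [n for n in dag.nodes if dag.in_degree(n) == 0]
    sinks = [n for n in dag.nodes if dag.out_degree(n) == 0]
    dag.add_edges_from(("in", s) for s in sources)   # red input node
    dag.add_edges_from((t, "out") for t in sinks)    # grey output node
    return dag

block = random_dag_block()
print(nx.is_directed_acyclic_graph(block))  # True
```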

Figure 3. The proposed learn-able skip connection methodology. (a) The network topology of a standard U-Net; (b) the proposed new network topology. Solid lines and dotted lines denote forward propagation and skip connections, respectively.
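The excerpt does not define how the skip connections are made learn-able; one common realisation, shown here purely as an assumption, is a learnable scalar gate that scales each encoder feature map before it is concatenated into the decoder.

```python
import torch
import torch.nn as nn

class GatedSkip(nn.Module):
    """Skip connection scaled by a learnable gate in (0, 1) before concatenation."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5 at init

    def forward(self, decoder_feat, encoder_feat):
        gate = torch.sigmoid(self.alpha)
        return torch.cat([decoder_feat, gate * encoder_feat], dim=1)

skip = GatedSkip()
fused = skip(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))  # -> (1, 128, 32, 32)
```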

Figure 4. Discriminator model. The solid black lines and the dotted lines denote forward propagation and skip connections, respectively. The skip connection inputs are concatenated with the forward propagating feature map prior to being processed by the next layer. The pink lines show the extraction of multi-scale feature maps.
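A sketch of a convolutional discriminator that also returns its intermediate feature maps at every scale, as the pink lines in Figure 4 suggest; the layer widths are assumptions, and the skip-connection concatenations from the figure are omitted here for brevity.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Strided-conv discriminator that returns multi-scale intermediate features."""
    def __init__(self, in_channels=3, widths=(32, 64, 128, 256)):
        super().__init__()
        blocks, c = [], in_channels
        for w in widths:
            blocks.append(nn.Sequential(
                nn.Conv2d(c, w, kernel_size=4, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True)))
            c = w
        self.blocks = nn.ModuleList(blocks)
        self.head = nn.Conv2d(c, 1, kernel_size=3, padding=1)

    def forward(self, x):
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)          # multi-scale feature maps (pink lines)
        return self.head(x), feats   # real/fake logits + features

disc = Discriminator()
logits, feats = disc(torch.randn(1, 3, 256, 256))
```

Returning the intermediate features is what enables a feature-matching term such as the L_feat in Equation 5 below.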

Figure 5. Depth maps generated by the proposed model on the SCARED and Hamlyn test splits, shown in rows A and B, respectively. LS signifies the presence of learn-able skip connections and noLS their absence. All models were trained with the standard $L_{ssim}$ loss (Equation 1) unless $L_{Final}$ is specified, which signifies training with the proposed loss function (Equation 5):

$$L_{ssim} = \delta_1 L_{recon} + \delta_2 L_{smooth} + \delta_3 L_{LRcons} \tag{1}$$

$$L_{final} = \delta_1 L_{recon} + \delta_2 L_{smooth} + \delta_3 L_{LRcons} + \delta_4 L_{adv_G} + \delta_5 L_{feat} \tag{5}$$

The colour bar displays the depth range in mm.
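Equation 5 is a straightforward weighted sum, sketched below; the delta values shown are placeholders, not the weights used in the paper.

```python
def final_loss(l_recon, l_smooth, l_lr_cons, l_adv_g, l_feat,
               deltas=(1.0, 0.1, 1.0, 0.01, 0.01)):
    """Weighted sum of the five terms in Equation 5 (placeholder delta weights)."""
    d1, d2, d3, d4, d5 = deltas
    return d1 * l_recon + d2 * l_smooth + d3 * l_lr_cons + d4 * l_adv_g + d5 * l_feat
```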

Figure 6. Comparison of depth maps generated by our model with Godard et al. (2017) and Pilzer et al. (2018) on the Hamlyn test split. The last column re-displays the depth produced by our proposed method (column 2), with the green boxes highlighting the key areas where more details are visible in our depth maps. The colour bar displays the depth range in mm.

Table 1. Mean absolute depth error in mm for test dataset 1 from SCARED. The compared methods are the self-supervised methods submitted to this challenge. (M) signifies a Monodepth method and (S) a Stereodepth method. The L subscript denotes the loss function used for training, and (no LS) indicates that no learn-able skip connections were used in the model. Best results are highlighted in bold.

Table 2. Mean absolute depth error in mm for test dataset 2 from SCARED. The compared methods are the self-supervised methods submitted to this challenge. (M) signifies a Monodepth method and (S) a Stereodepth method. The L subscript denotes the loss function used for training, and (no LS) indicates that no learn-able skip connections were used in the model. Best results are highlighted in bold.
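Tables 1 and 2 report mean absolute depth error in mm. A minimal NumPy sketch of that metric, under the assumption that it is averaged over pixels with valid (positive) ground-truth depth:

```python
import numpy as np

def mean_abs_depth_error(pred_mm, gt_mm):
    """Mean absolute depth error in mm over pixels with valid ground truth (> 0)."""
    valid = gt_mm > 0
    return np.abs(pred_mm[valid] - gt_mm[valid]).mean()

err = mean_abs_depth_error(np.random.rand(256, 320) * 100,
                           np.random.rand(256, 320) * 100)
```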

Table 3. SSIM index on the reconstructed images generated via the estimated disparity maps on the Hamlyn test dataset; a higher value indicates better performance. (M) signifies a Monodepth method and (S) a Stereodepth method. In the parameter counts, M denotes millions of parameters. Methods A, B, C, D and E are Geiger et al. (2010), Yamaguchi et al. (2014), Ye et al. (2017), Godard et al. (2017) and Pilzer et al. (2018), respectively.
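Table 3 scores the reconstructed images with the SSIM index. One standard way to compute it is scikit-image's structural_similarity; the paper's exact SSIM settings (window size, data range) are not given here, so the values below are assumptions.

```python
import numpy as np
from skimage.metrics import structural_similarity

# Reconstructed vs. original RGB frames in [0, 1]; higher SSIM is better.
recon = np.random.rand(256, 320, 3)
target = np.random.rand(256, 320, 3)
score = structural_similarity(recon, target, channel_axis=-1, data_range=1.0)
```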