Target Article

Vectorizing planar roof structure from very high resolution remote sensing images using transformers

Pages 1-15 | Received 13 Jul 2023, Accepted 04 Dec 2023, Published online: 17 Dec 2023

Figures & data

Figure 1. False detections caused by false-positive vertex or edge candidates, marked with yellow dotted lines.

Figure 2. The overall architecture of Roof-Former, which consists of three steps: (1) Image encoding and edge node initialization (yellow); (2) Image feature fusion with enhanced segmentation refinement branch (blue); and (3) Structural reasoning with Transformer decoders (green).

Figure 3. The corner detection model, adapted from an edge detection architecture.

Figure 4. Illustration of the proposed AFFM. C and r denote the channel number and the channel reduction ratio, respectively. The refined feature X′ is enhanced by extracting the local channel context (L(X), blue box) and the global channel context (G(X), green box) in MS-CAM.
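
The relation between X, L(X), G(X) and X′ named in the Figure 4 caption can be written out as a short sketch. It assumes that the MS-CAM inside AFFM follows the formulation commonly used for multi-scale channel attention; the caption itself does not confirm this. The symbols \mathcal{B} (batch normalization), \delta (ReLU), \sigma (sigmoid), g (global average pooling), and the two point-wise convolutions PWConv_1 (C to C/r channels) and PWConv_2 (C/r back to C channels) are assumptions introduced here for illustration.

```latex
% Sketch only: assumes the commonly used MS-CAM formulation, not confirmed by the caption.
% X has C channels; PWConv_1 reduces to C/r channels, PWConv_2 restores C channels.
\begin{align}
L(X) &= \mathcal{B}\Big(\mathrm{PWConv}_2\big(\delta\big(\mathcal{B}(\mathrm{PWConv}_1(X))\big)\big)\Big)
  && \text{local channel context} \\
G(X) &= \mathcal{B}\Big(\mathrm{PWConv}_2\big(\delta\big(\mathcal{B}(\mathrm{PWConv}_1(g(X)))\big)\big)\Big)
  && \text{global channel context} \\
X' &= X \otimes \sigma\big(L(X) \oplus G(X)\big)
  && \text{refined feature}
\end{align}
% \oplus: broadcasting addition (G(X) has one value per channel, L(X) one per pixel and channel)
% \otimes: element-wise multiplication
```

Under this reading, the reduction ratio r only controls the width of the bottleneck between the two point-wise convolutions; the refined feature X′ keeps the same shape as X.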

Table 1. Quantitative results on the VWB and Enschede datasets, where APH and FH denote the heatmap-based average precision and F1-score for both vertex and edge primitives.
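
As a reminder of how an F1-score such as the one reported in Table 1 is formed, the following minimal sketch computes it from match counts. It assumes the true-positive, false-positive, and false-negative counts are already available; the heatmap-based matching protocol that produces them is not described on this page and is not reproduced here. The function name `f1_score` is hypothetical.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall.

    Assumption: the counts come from some matching step between predicted
    and labelled primitives (vertices or edges); that step is not shown.
    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    predicted = true_positives + false_positives   # all predicted primitives
    actual = true_positives + false_negatives      # all labelled primitives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Usage: 8 correct detections, 2 spurious detections, 2 missed primitives.
score = f1_score(8, 2, 2)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method cannot reach a high F1-score by over-predicting candidates, which is relevant when false-positive vertices or edges are the main failure mode.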

Figure 5. Sample results on the Enschede dataset (first two rows) and the VWB dataset (third and fourth rows). Roof-Former produces better reconstruction results for massive and complex buildings, with output closer to the label in the second column. In comparison, the results of other methods such as ConvMPN and HEAT contain geometric imperfections such as narrow triangles, self-intersecting edges, and collinear edges.

Table 2. Ablation study for the components of Roof-Former, evaluated with vertex or edge F1 scores.

Figure 6. Sample reconstruction results of building objects based upon our extracted roof structure and nDSM.

Data availability statement

The experiments in this paper are based on two publicly available datasets, which are available through Nauata and Furukawa (2020) and Zhao, Persello, and Stein (2022). Inquiries regarding the datasets should be directed to the original authors.