Research Article

A comparative study on extraction of buildings from Quickbird-2 satellite imagery with & without fusion

Article: 1291118 | Received 15 Oct 2016, Accepted 31 Jan 2017, Published online: 24 Feb 2017

Abstract

Extraction of buildings from very high resolution satellite imagery is a challenging task. Many automatic algorithms have been proposed to extract buildings from remote sensing imagery, but most detect only rectangular buildings effectively (i.e. buildings of the same size and shape). In this paper, an attempt is made to extract buildings of different shape, size, color and pattern from Quickbird-2 imagery. In the automatic method, an adaptive k-means clustering algorithm first classifies the pixels into a number of classes, followed by morphological operators to extract the buildings. A manual method is also implemented to extract the building features. Both the automatic and manual methods are applied to the original multispectral (MS) image and to the fused image obtained by fusing the Quickbird-2 panchromatic (Pan) image with the MS image using the Fuze Go method. The performance of both methods for building extraction is evaluated using qualitative and metric analysis. The experimental results show that both methods perform reasonably well. However, improving the spatial resolution of the original MS image by fusion helps to determine building information more precisely, both spatially and spectrally.

Public Interest Statement

Remote sensing satellites capture images of the Earth's surface that contain information on both man-made and natural features. These images help the viewer to visually interpret changes in features over time, and they can be deployed for various applications, including building extraction. Extraction of buildings has significant applications in urban mapping and urban planning, and also helps to assess the destruction caused by natural disasters such as floods and earthquakes. Therefore, persistent attention is paid to remote sensing satellite imagery for the extraction of buildings. Many algorithms have been developed for extracting buildings from satellite imagery, the majority of which extract buildings accurately and effectively.

1. Introduction

Extraction of buildings from remote sensing satellite imagery is a challenging problem. Recently, improvements in the spatial and spectral resolution of remote sensing satellite imagery have driven researchers to develop different algorithms (i.e. automatic and semi-automatic) for the extraction of buildings from very high resolution satellite imagery. Detection of buildings from satellite imagery has significant applications in urban mapping, urban planning, urban change detection analysis, target detection and geographic information systems (GIS) (Shorter & Kasparis, Citation2009). Further, detection of buildings is very important for assessing the extent of destruction caused by natural disasters such as floods and earthquakes, and by military operations.

Many factors that appear in satellite imagery make the extraction of buildings more complex, even though new sensors provide imagery with improved resolutions. Factors such as scene complexity, building variability and sensor resolution (Mayer, Citation1999) affect the overall accuracy of building detection. The building is one of the most significant man-made features, and one that is time-consuming and costly to extract because of its variability, complexity and abundance in urban areas (Chaudhuri, Kushwaha, Samal, & Agarwal, Citation2016).

Generally, very high resolution satellite imagery is necessary to extract detailed spatial and spectral information about buildings. The satellite captures images of the Earth with Pan and MS sensors. The Pan sensor offers images with high spatial resolution, whereas the MS sensors offer images with high spectral resolution but low spatial resolution compared to the Pan image (Zhang & Mishra, Citation2013). Generally, the Pan band covers a wide spectral wavelength range, whereas each MS band covers a narrow range of wavelengths. Thus, there is a trade-off between the sensors in terms of spatial resolution, spectral resolution, swath width, etc., caused by technical and budget limitations. In practice, acquiring an MS image at the spatial resolution of the Pan image is expensive. Pan-sharpening or image fusion methods have therefore been developed to obtain an image with both high spatial and high spectral resolution.

To meet this goal, various image fusion or pan-sharpening methods have been proposed to improve the spatial resolution of the MS image, such as principal component analysis (PCA) (Chavez & Kwarteng, Citation1989), hyperspectral color space (HCS) (Padwick, Deskevich, Pacifici, & Smallwood, Citation2010), high pass filter (HPF) (Chavez, Sides, & Anderson, Citation1991), Gram-Schmidt (GS) (Laben & Brower, Citation2000), Ehlers (Ehlers, Kolouch, Lohman, & Dennert-Möller, Citation1984) and subtractive resolution merge (SRM) (Ashraf, Brabyn, & Hicks, Citation2012). Some hybrid pan-sharpening methods are also widely used nowadays, such as wavelet-principal component analysis (W-PCA) and wavelet-intensity hue saturation (W-IHS). These hybrid methods work on the principle of wavelet decomposition (King & Wang, Citation2001), and more details can be found in González-Audícana, Otazu, Fors, and Seco (Citation2005) and Ranchin and Wald (Citation1993). Numerous algorithms for extracting buildings from satellite imagery have been proposed by various researchers, as follows.

Attarzadeh and Momeni (Citation2012) proposed an object-based algorithm in which stable and variable features, obtained from inherent qualities and threshold analysis, were utilised jointly. Visual analysis indicates that the algorithm can detect the major rectangular buildings in Quickbird imagery. Wang, Yuan, et al. (Citation2013) detected rectangular buildings using mean shift segmentation, the scale invariant feature transform (corner detection) and an adaptive windowed Hough transform. Wang, Qin, et al. (Citation2013) adopted a bilateral filter, a line segment detector and a perceptual grouping approach. All of the algorithms mentioned above detect only rectangular buildings from RGB images.

Ghaffarian and Ghaffarian (Citation2014b) used a double threshold method, parallelepiped supervised classification and morphological operators for building detection from Google Earth images. Their method can detect buildings without being influenced by their geometric characteristics, and it also provides the training data samples to the supervised classification automatically. However, the method classifies non-building features as buildings when they have equivalent spectral values.

Chaudhuri et al. (Citation2016) detected buildings from high resolution panchromatic (Pan) imagery of Quickbird and Ikonos using a morphological operator, a multiseed-based clustering technique and adaptive threshold based segmentation. The approach relies on the shadows of the buildings to extract them accurately; images of low-rise buildings in urban areas do not have sufficient shadows, and in that situation the buildings are not detected accurately. Liasis and Stavrou (Citation2016) developed a new active contour model to extract buildings from RGB images; their method detects buildings with arbitrary shapes and sizes. A limitation of the method was that some non-building objects, such as bridges and roads, were classified as buildings. Further, buildings close to each other were classified as a single building.

In previous studies, buildings have been extracted from Google Earth images, panchromatic images, and multispectral images composed of the R, G and B bands. The majority of algorithms work efficiently for detecting buildings of the same shape (i.e. rectangular or square), colour and size. Only a few algorithms are available for detecting buildings with arbitrary shapes and sizes. In the present study, buildings of different shapes, colours and sizes are extracted from a multispectral image composed of the R, G, B and NIR bands.

One objective of the present work was to extract buildings of different shapes, sizes and colours from the original multispectral (MS) imagery of Quickbird-2. Two approaches were adopted to extract the buildings: (1) automatic and (2) manual, and the results of both were compared qualitatively. In the automatic approach, the vegetation portion was first removed from the input image. Secondly, an adaptive k-means clustering algorithm was adopted to cluster the pixels into different classes. Finally, the morphological fill and open operators were applied to extract the buildings. In the manual approach, an area of interest (AOI) was created from the input image; the generated AOI was then used to subset the features of interest (i.e. buildings) from the image. The other important objective was to determine the effectiveness, for building extraction with the automatic and manual methods, of improving the spatial resolution of the original MS image by fusing the Pan and MS imageries of Quickbird-2. The results of both methods were compared qualitatively and discussed.

2. Materials and methodology

2.1. Data used

The high-resolution imaging satellite Quickbird-2 was launched on 18 October 2001. Quickbird-2 acquires five bands: panchromatic, blue, green, red and near-infrared (NIR). The spectral response of Quickbird-2 imagery is shown in Figure 1. The satellite sensor captures the Pan image with a high spatial resolution of 0.60 m and the MS image with high spectral resolution but a lower spatial resolution of 2.4 m. The data cover the Opera House in Sydney, Australia (33° 51′ 25′′ S, 151° 12′ 55′′ E) and were provided by DigitalGlobe. The wavelength ranges of the four bands (blue, green, red and NIR) fall within the Pan band, and all four bands are layer-stacked to obtain the MS image. The Quickbird-2 imagery covers features such as commercial buildings, urban areas, roads, vehicles, water, roofs, trees and grass. In the MS image the shapes of the vehicles and building roofs are not easily identifiable; on the other hand, they are easily recognizable in the Pan image. Therefore, enhancing the spatial resolution of the Quickbird-2 MS image will help to extract the buildings with high spatial and spectral information.

Figure 1. Spectral response of Quickbird-2 satellite imagery.


2.2. Methodology

The methodology for extracting the buildings using the automatic and manual methods from the Quickbird-2 satellite imagery is shown in Figures 2 and 3.

Figure 2. Methodology for extracting the building automatically.


Figure 3. Methodology for extracting the building manually.


2.2.1. Methodology for the extraction of buildings automatically

The methodology for extracting buildings automatically is explained in detail below. The algorithm runs automatically, without pre-classification or any training sets; however, some initial algorithm parameters must be set by the user.

2.2.1.1. Removal of vegetation portion

A multispectral image combining the blue, green, red and near-infrared bands was used. From the input image it is observed that vegetation is quite dominant compared to the other features; therefore the vegetation portion is removed based on intensity values. Threshold values of red > 120, green < 100 and blue < 100 were used for the removal of vegetation. These values need to be adjusted manually by the user; in our experiments, the values mentioned above gave satisfactory results.
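The thresholding step above amounts to a few array comparisons. A minimal NumPy sketch follows; the red > 120, green < 100, blue < 100 values come from the text, but the band display order and the decision to zero out the flagged pixels are assumptions for illustration:

```python
import numpy as np

def remove_vegetation(img):
    """Zero out vegetation pixels in an H x W x 4 (blue, green, red, NIR) array.

    The thresholds red > 120, green < 100, blue < 100 are the ones reported
    in the text; which pixels they flag depends on the band display order,
    so the band layout used here is an assumption.
    """
    blue, green, red = img[..., 0], img[..., 1], img[..., 2]
    veg = (red > 120) & (green < 100) & (blue < 100)  # pixels treated as vegetation
    out = img.copy()
    out[veg] = 0
    return out, veg

# Two-pixel demonstration: one pixel matches the thresholds, one does not.
demo = np.zeros((1, 2, 4), dtype=np.uint8)
demo[0, 0] = [40, 60, 200, 220]   # flagged by the thresholds -> removed
demo[0, 1] = [120, 150, 90, 80]   # kept
masked, veg = remove_vegetation(demo)
```

Returning the mask alongside the masked image lets the later clustering step ignore the removed pixels rather than treating the zeros as a class of their own.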

2.2.1.2. Adaptive K-means clustering

The adaptive K-means clustering algorithm functions by automatically choosing the appropriate K elements from the input image (Bhatia, Citation2004); further information on adaptive clustering can be found in Chen, Luo, and Parker (Citation1998) and Pappas (Citation1992). The algorithm automatically determines the K elements and generates groups of clusters (i.e. features with the same intensity value are grouped together). Generally, the algorithm classifies each pixel into a cluster based on its intensity value. First, the algorithm computes the distance between the selected element and each of the clusters. Before computing distances, it is important to normalize the distance properties so that no single property, or set of properties, dominates the distance computation. The Euclidean distance is adequate for determining the distance between two elements. If the input data encompass n dimensions, then the distance between two elements A1 = (A11, A12, …, A1n) and A2 = (A21, A22, …, A2n) is given by:

(1) Distance(A1, A2) = √[(A11 − A21)² + (A12 − A22)² + … + (A1n − A2n)²]

Given this distance function, the algorithm proceeds as follows. The distance between every pair of clusters is computed and stored in a two-dimensional array as a triangular matrix. The minimum distance dmin between any two clusters (say Bm1 and Bm2), and hence the two nearest clusters, are identified. For any un-clustered element Ei, the distance of Ei from every cluster is computed. To assign the element Ei to the appropriate cluster, one of the following three cases applies.

(i)

If the distance of the element Ei from some cluster is zero, then allocate Ei to that cluster and examine the next un-clustered element.

(ii)

If the distance of the element Ei from its nearest cluster is less than dmin, then allocate Ei to that nearest cluster. Allocating Ei to a cluster may change the cluster's centroid; therefore the centroid is recalculated over all elements in the affected cluster. Further, the distances of the disturbed cluster from the other clusters, the minimum distance between any two clusters, and the identity of the two nearest clusters are also recomputed.

(iii)

If dmin is less than the distance of the element from its nearest cluster, the two closest clusters (e.g. Cm1 and Cm2) are selected and Cm2 is merged into Cm1. The elements of cluster Cm2 are moved and its representation is deleted. The element Ei then forms a new cluster in the freed slot, the distances between all clusters are re-determined, and the two nearest clusters are identified again.

The above steps are iterated until all elements have been clustered.
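To make cases (i)-(iii) concrete, the following is an illustrative one-dimensional sketch of the assignment/merge loop. It is our simplification, not the authors' implementation: clusters are plain lists of intensity values, and `d_init` is an assumed stand-in for dmin before two clusters exist:

```python
import numpy as np

def adaptive_clustering_1d(values, d_init):
    """Illustrative 1-D sketch of cases (i)-(iii); not the authors' implementation.

    Each cluster is a list of member values; centroids are recomputed from the
    members whenever they are needed.
    """
    clusters = []
    for v in values:
        if not clusters:
            clusters.append([v])
            continue
        cents = np.array([np.mean(c) for c in clusters])
        d = np.abs(cents - v)
        nearest = int(np.argmin(d))
        if d[nearest] == 0:                       # case (i): zero distance
            clusters[nearest].append(v)
            continue
        if len(clusters) > 1:                     # current d_min and closest pair
            pair = np.abs(cents[:, None] - cents[None, :])
            np.fill_diagonal(pair, np.inf)
            i, j = np.unravel_index(np.argmin(pair), pair.shape)
            d_min = pair[i, j]
        else:
            d_min = d_init
        if d[nearest] < d_min:                    # case (ii): join nearest cluster
            clusters[nearest].append(v)
        else:                                     # case (iii): merge the two closest
            if len(clusters) > 1:                 # clusters, then open a new one for v
                a, b = sorted((int(i), int(j)))
                clusters[a].extend(clusters.pop(b))
            clusters.append([v])
    return clusters

clusters = adaptive_clustering_1d([1, 1, 10, 11, 30], d_init=5)
```

With this input the outlier 30 triggers case (iii): the two existing clusters are merged and 30 opens a new cluster, which is how the method keeps K adaptive.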

2.2.2. Morphological fill and open operation

2.2.2.1. Fill operation

It is used to fill holes in the image I, where a hole is defined as an area of dark pixels surrounded by lighter pixels.

The following matlab syntax is used to fill holes in the image:

(2) I2 = imfill(I, 'holes')

where I is the binary input image. The fill operation identifies areas of dark pixels bounded by lighter pixels and fills them, producing another binary image I2.
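Outside MATLAB, the same fill operation can be reproduced with SciPy; a minimal sketch of Equation (2) on a small binary array:

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

# A binary rooftop mask with a one-pixel void in the middle,
# as left behind by the classification step.
roof = np.array([[1, 1, 1],
                 [1, 0, 1],
                 [1, 1, 1]], dtype=bool)

filled = binary_fill_holes(roof)  # SciPy analogue of MATLAB's imfill(I, 'holes')
```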

2.2.2.2. Open operation

The morphological area open operator is normally applied to a binary image. It is used to remove features smaller than p pixels while retaining the large structures in the image.

The following matlab syntax is used to extract the objects from the input image:

(3) IM = bwareaopen(I, p)

In our experiments we used a threshold value of 600 pixels (p = 600), which was found to be appropriate for the extraction of the majority of buildings.
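An equivalent of the area open step of Equation (3), written here with SciPy's connected-component labelling (the 3 × 3 structuring element reproduces the 8-connectivity that bwareaopen uses by default for 2-D inputs):

```python
import numpy as np
from scipy.ndimage import label

def area_open(bw, p):
    """Remove connected components smaller than p pixels (cf. bwareaopen)."""
    labels, n = label(bw, structure=np.ones((3, 3), dtype=int))  # 8-connectivity
    keep = np.zeros(n + 1, dtype=bool)
    for k in range(1, n + 1):
        keep[k] = (labels == k).sum() >= p
    return keep[labels]

# A 6-pixel blob and an isolated pixel; p = 4 keeps only the blob.
bw = np.zeros((5, 5), dtype=bool)
bw[0:2, 0:3] = True
bw[4, 4] = True
opened = area_open(bw, 4)
```

In the paper p = 600, matched to the building footprints at the image's resolution; the tiny p here is only for the toy example.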

To evaluate the performance of the automatic algorithm, the following two metrics proposed in Lin and Nevatia (Citation1998) were used. The performance of the algorithm was compared with ground truth data derived manually.

(4) Detection percentage DP = 100 × TP/(TP + TN)

(5) Branch factor BF = 100 × FP/(TP + FP)

where true positive TP denotes buildings detected both by the automatic algorithm and manually, false positive FP denotes buildings detected by the algorithm but not manually, and true negative TN denotes buildings extracted by the manual approach but not by the automatic algorithm. A building is counted as detected if at least a small portion of it is detected by the automatic algorithm (Chaudhuri et al., Citation2016). The two metrics are computed by comparing the buildings detected manually with those detected by the automatic algorithm: DP measures how many of the buildings in the image are extracted by the automatic algorithm, and BF measures how many buildings are found erroneously.
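Equations (4) and (5) reduce to a few lines of code; the counts below are hypothetical, for illustration only:

```python
def detection_metrics(tp, fp, tn):
    """Detection percentage (Eq. 4) and branch factor (Eq. 5).

    tp: buildings detected both automatically and manually,
    fp: buildings detected only by the algorithm,
    tn: buildings detected only manually (missed by the algorithm).
    """
    dp = 100.0 * tp / (tp + tn)
    bf = 100.0 * fp / (tp + fp)
    return dp, bf

# Hypothetical counts: 10 correct detections, 2 spurious, 2 missed.
dp, bf = detection_metrics(tp=10, fp=2, tn=2)
```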

The above procedure is used to extract buildings from the original MS image of Quickbird-2 and from the pan-sharpened image (i.e. the fused image generated using the Fuze Go method), and the results of both are compared using metrics and qualitative analysis.

2.2.3. Pan-sharpening

Pan-sharpening is the process of transferring the spatial resolution of the Pan image to the MS image to obtain a single fused, or pan-sharpened, image with both high spatial and high spectral resolution (Zhang, Citation2010). During pan-sharpening, the two key quality aspects of the fused image are the enhancement of spatial resolution and the preservation of spectral information. In other words, an effective pan-sharpening algorithm (here, Fuze Go) should not distort the spectral information of the MS image while enhancing its spatial resolution.

2.2.3.1. Fuze Go

The Fuze Go method produces a pan-sharpened MS image through the following process. The MS bands whose spectral range falls within that of the Pan band are selected. Standard deviation, mean and covariance are calculated for the selected MS bands and the Pan band. Histogram standardization is then applied: using the means and standard deviations, all of the selected MS bands and the Pan image are standardized. Band weights calculated from the covariance matrix are used to simulate a synthetic Pan band from the selected MS bands. Finally, the ratio determined from the standardized Pan band, the standardized MS bands and the synthetic Pan band is applied to obtain the fused image. Further details can be found in Zhang (Citation2004). A general flowchart for the Fuze Go method is shown in Figure 4.
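The Fuze Go algorithm itself is proprietary, but the synthetic-Pan/ratio idea in the steps above can be sketched generically. In this sketch the band weights come from a least-squares fit rather than the method's covariance-derived weights, and the histogram standardization step is omitted, so it is only a rough analogue:

```python
import numpy as np

def ratio_pansharpen(ms, pan):
    """Generic synthetic-Pan ratio fusion (illustrative, not Fuze Go itself).

    ms : (H, W, B) multispectral image resampled to the Pan grid.
    pan: (H, W) panchromatic image.
    """
    h, w, b = ms.shape
    flat = ms.reshape(-1, b).astype(float)
    # Band weights: least-squares fit of the MS bands to the Pan band.
    weights, *_ = np.linalg.lstsq(flat, pan.reshape(-1).astype(float), rcond=None)
    synthetic_pan = (flat @ weights).reshape(h, w)
    ratio = pan / (synthetic_pan + 1e-12)     # per-pixel injection ratio
    return ms * ratio[..., None]              # modulate every MS band

# Sanity check: if Pan is exactly a weighted sum of the MS bands,
# the fusion leaves the MS image unchanged.
rng = np.random.default_rng(0)
ms = rng.uniform(1.0, 2.0, size=(4, 4, 2))
pan = ms.mean(axis=2)
fused = ratio_pansharpen(ms, pan)
```

Because every band is multiplied by the same per-pixel ratio, the spectral proportions between bands are preserved while the spatial detail of the Pan image is injected, which is the behaviour the QNR index below is designed to verify.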

Figure 4. A common flowchart for the Fuze Go method.


2.2.4. Quality analysis

The main aim of the Fuze Go algorithm is to preserve the relevant information present in the input images and to reduce spatial and spectral distortion in the fused image. Therefore, the performance of the Fuze Go method is evaluated with the Quality with No Reference (QNR) index (Alparone et al., Citation2008).

2.2.4.1. QNR

It computes the spectral and spatial distortions, Dλ and Ds, in the fused image without requiring a reference image. The ideal value of QNR is one, which indicates the best spatial and spectral performance of the fused image. The QNR index is defined as:

(6) QNR = (1 − Dλ)^α · (1 − Ds)^β

where α and β are set to one, so that equal weight is given to the spatial and the spectral quality.

(7) Dλ = [1/(N(N − 1))] ∑l=1..N ∑r=1..N, r≠l |Q(MSl^Low, MSr^Low) − Q(MSl^Fused, MSr^Fused)|

(8) Ds = (1/N) ∑l=1..N |Q(MSl^Low, Pan^Low) − Q(MSl^Fused, Pan^High)|

For the spectral distortion, the index Q is calculated between each pair of MS bands (inter-band) at both the low and the fused resolution. For the spatial distortion, the Q index is calculated between each MS band and the Pan image at both the low and the high resolution.

The purpose of the inter-band computations at the two scales is as follows: if the spectral relationships between MS bands differ across scales, spectral distortion is indicated. Likewise, the band-to-Pan calculations at the two scales determine whether the spatial relationship between the MS bands and the Pan image changes across scales, indicating spatial distortion. The Q index calculated between corresponding images across scales should remain constant.
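The building blocks of Equations (6)-(8) are straightforward to express in code. The sketch below computes Q globally over whole images for brevity, whereas QNR implementations usually average Q over sliding windows:

```python
import numpy as np

def q_index(x, y):
    """Universal Image Quality Index Q (the Q used inside D_lambda and D_s)."""
    x, y = x.astype(float).ravel(), y.astype(float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))

def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """Equation (6): QNR = (1 - D_lambda)^alpha * (1 - D_s)^beta."""
    return (1 - d_lambda) ** alpha * (1 - d_s) ** beta

# Q equals one for identical images, and QNR approaches one as the
# spectral and spatial distortions approach zero.
a = np.arange(16.0).reshape(4, 4)
q_same = q_index(a, a)
score = qnr(0.05, 0.03)
```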

2.2.5. Methodology for extracting the building manually

The methodology for extracting buildings manually from the Quickbird-2 satellite imagery is shown in Figure 3.

First, an area of interest (AOI) is generated by drawing a polygon over the building of interest on the input image. Second, the created AOI is used to subset the buildings from the input image.
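The subset step amounts to masking: keep the pixels inside the digitised region and discard the rest. A dependency-free sketch with a rectangular AOI (ERDAS AOIs can be arbitrary polygons; the rectangle here is a simplification):

```python
import numpy as np

def subset_aoi(img, row0, row1, col0, col1):
    """Keep only the pixels inside a rectangular AOI, zeroing everything else."""
    out = np.zeros_like(img)
    out[row0:row1, col0:col1] = img[row0:row1, col0:col1]
    return out

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
sub = subset_aoi(img, 1, 3, 1, 3)   # a 2 x 2 AOI over the image centre
```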

The same procedure is applied to both input images (the original MS image and the image obtained after improving its spatial resolution), and the results are compared qualitatively.

3. Results and discussion

The original MS and Pan images are shown in Figures 5a and 5b. The original MS image, combining the blue, green, red and near-infrared bands, contains different features such as buildings, roads, vehicles and vegetation. It is important to note that the pattern, shape, size and spectral reflectance of the buildings vary from one to another. It is also visible that the colour reflectance of the roads is similar to that of the buildings. Therefore, the attempt is made to extract buildings of different size, shape, colour and pattern.

Figure 5a. Original MS image of Quickbird-2.


Figure 5b. Original Pan image of Quickbird-2.


The automatic approach for extracting buildings from the original MS image is shown in Figures 6a-6f. First, the vegetation portion of the image is removed (Figure 6a). Second, the adaptive k-means clustering algorithm automatically classifies the pixels, based on their intensity values, into five different classes (Figure 6b). The majority of buildings were observed in class two. It is further observed that only the rooftops, which appear comparatively brighter, fall under classes four and five. Since the majority of the building area falls under class two, even after removing the pixels in classes four and five we were still able to identify the majority of buildings. From the classified image it is notable that buildings with the same intensity value are clustered into one class. It is also noted that the intensity values of some buildings are close to those of the roads, and hence they are segregated into the same class.

Figure 6a. Removal of vegetation.


Figure 6b. Adaptive k-means.


Figure 6c. Conversion of classified image into binary image.


Figure 6d. Morphological fill operation.


Figure 6e. Morphological area open operation.


Figure 6f. Conversion of binary image (area open) to RGB.


The majority of buildings were found only in class two; therefore a binary image was created for class two alone, as shown in Figure 6c, and the small portions of buildings present in the other classes were ignored. The figure clearly indicates that some portions of the roads are identified as buildings owing to similar intensity values and spectral reflectance; the same behaviour is noted in the literature (Ghaffarian & Ghaffarian, Citation2014a; Liasis & Stavrou, Citation2016). Since some building rooftops contain voids, to reduce the potential error the voids were filled using the morphological fill operator (voids remaining in the buildings after the classification process were identified with reference to the original Pan image), as shown in Figure 6d. This fill operation also helps to restore some pixels lost in the preceding steps. To extract the buildings from the image, the morphological area open operator was applied; the result is shown in Figure 6e. The extracted buildings in RGB colour mode are shown in Figure 6f.

To determine the effectiveness of improving the spatial resolution of the original MS image for building extraction, the 0.60 m original Pan image and the 2.4 m MS image were pan-sharpened using the Fuze Go method, yielding a single image with both high spatial and high spectral information, shown in Figure 7. During the transfer of spatial information from the Pan image to the MS image, the method may generate spatial and spectral distortion. To evaluate the quality of the pan-sharpened image, the statistical index QNR was adopted; its ideal value of one indicates the best pan-sharpened image with high spatial and spectral information. The obtained QNR value of 0.9173 indicates that the Fuze Go method generates a pan-sharpened image with high spatial and spectral information.

Figure 7. Pan-sharpened image of Fuze Go method.


The same automatic methodology was adopted to extract buildings from the pan-sharpened image, as shown in Figures 8a-8f. The removal of the vegetation portion is shown in Figure 8a, and the classification of the different features using adaptive k-means clustering in Figure 8b. The conversion of the classified image into a binary image, followed by the morphological fill and open operators, is shown in Figures 8c-8e. The extracted buildings in RGB colour mode are shown in Figure 8f.

Figure 8a. Removal of vegetation portion.


Figure 8b. Adaptive K-means clustering.


Figure 8c. Conversion of classified image into binary image.


Figure 8d. Morphological fill operation.


Figure 8e. Morphological area open operation.


Figure 8f. Conversion of binary image (area open) to RGB.


To evaluate the performance of the automatic algorithm, both metrics and visual analysis were used. The total number of buildings present in the input image is twelve, as shown in Figure 9. Table 1 shows the performance of the automatic algorithm using the two metrics DP and BF. The building detection percentages of the automatic algorithm for the before-fusion and after-fusion images are reasonable for such a challenging MS image. The branch factor indicates the percentage of buildings found erroneously; it is notable that some portions of the roads are identified as buildings owing to similar intensity values and spectral reflectance. However, the loss of information is higher in the pan-sharpened image than in the original image. The consequences of resolution for building detection are discussed in Segl and Kaufmann (Citation2001): the common challenges in detecting buildings at pixel resolutions of 1 m or finer are a low signal-to-noise ratio and weak object signals.

Figure 9. Total number of buildings in original MS image.


Table 1. Evaluation of automatic algorithm using two metrics DP and BF

Generally, it is well understood that some loss of information is inevitable when extracting buildings with any automatic algorithm. Whether the loss is higher or lower depends entirely on the scene complexity, building variability and building abundance in the urban area. If there are significant differences in feature size, pattern and shape, loss of information ensues. In our case, the majority of buildings were found in class two; therefore the binary image was created for class two alone, and the small portions of buildings present in the other classes were ignored. However, the morphological fill operation helps to restore some of the pixels lost in this process. If the threshold value for the morphological open operation is too large or too small, it may lead to over- or under-segmentation, respectively.

The visual comparison of the spatial and spectral information of the buildings extracted by the automatic approach (i.e. before and after fusion) is shown in Figures 10 and 11. The red circles indicate sample locations used to differentiate the extracted buildings in terms of spatial and spectral information.

Figure 10. Visual analysis of spatial and spectral information of building for original MS image.


Figure 11. Visual analysis of spatial and spectral information of building for Pan-sharpened image.


The circles A, B and C mark sample extracted buildings that differ in pattern, colour, size and shape. In circle A of Figure 11, a small white portion on top of the building is clearly visible both spatially and spectrally, whereas the same object is not clearly visible in Figure 10. Circle B of Figure 11, marking the rooftop of a building, is clearly interpretable both spatially and spectrally compared to circle B in Figure 10. Moreover, circle C in Figure 11 shows that a building rooftop with tiny structures is clearly visible in size, shape and colour, whereas the same tiny structures are not identifiable in Figure 10.

Therefore, it is evident that the improvement in spatial and spectral information helps to determine building information more effectively. However, loss of information is visible in both images (Figures 10 and 11) owing to factors such as the pattern, size, shape and colour of the buildings, which differ from one another, and to some buildings having the same colour reflectance as the roads. Moreover, the detailed spatial and spectral information about the buildings is greater in the pan-sharpened image than in the original MS image.

To extract the buildings without any loss of information, the manual extraction method was adopted using ERDAS Imagine 2014 software. First, an area of interest (AOI) was created around the features of interest (i.e. buildings) in the input image. Second, the same AOI was used to subset those features from the image. The same methodology was adopted to extract the buildings from the original MS image and from the pan-sharpened image, as shown in Figures 12 and 13. The manual method extracts all the buildings without any loss of information, and the pan-sharpened image, with its high spatial and spectral information, helps to extract the building information very effectively. Nevertheless, manual extraction of buildings is time consuming and depends on the user's experience in digitizing building boundaries.

Figure 12. Manual extraction of building for original MS image.


Figure 13. Manual extraction of building for Pan-sharpened image.


Comparing the results of the automatic and manual methods shows that the automatic algorithm works efficiently when the features of interest share the same pattern and size. Some loss of information in the output image is common regardless of the input (i.e. whether the features share the same size, shape and pattern or not). Here, the loss of information in the automatic algorithm arises from variation in building size, shape, pattern and color. The manual method loses less information when extracting buildings, but it requires considerably more time to complete.
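The automatic pipeline summarized above (pixel clustering followed by morphological cleanup) can be sketched as below. This is a simplified stand-in under explicit assumptions: plain 1-D k-means replaces the adaptive k-means used in the paper, a fixed 3x3 structuring element is used for the morphological opening, the brightest cluster is assumed to correspond to buildings, and the 8x8 image is hypothetical.

```python
def kmeans_1d(values, k, iters=20):
    """Plain k-means on scalar pixel intensities (stand-in for the adaptive variant)."""
    uniq = sorted(set(values))
    centers = [uniq[i * (len(uniq) - 1) // (k - 1)] for i in range(k)]  # spread initial centres
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[min(range(k), key=lambda j: abs(v - centers[j]))].append(v)
        centers = [sum(b) / len(b) if b else centers[i] for i, b in enumerate(buckets)]
    return centers

def erode(mask):
    """Binary erosion with a 3x3 window; out-of-bounds neighbours count as background."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
             for c in range(w)] for r in range(h)]

def dilate(mask):
    """Binary dilation with a 3x3 window."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
             for c in range(w)] for r in range(h)]

def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))

# Hypothetical 8x8 image: dark ground (~20), a 4x4 bright roof (~200), one noise pixel.
img = [[20] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        img[r][c] = 200
img[0][7] = 200  # isolated bright pixel (noise)

pixels = [v for row in img for v in row]
centers = kmeans_1d(pixels, k=2)
bright = max(range(2), key=lambda i: centers[i])  # assume buildings form the bright cluster
mask = [[int(abs(v - centers[bright]) < abs(v - centers[1 - bright])) for v in row]
        for row in img]
cleaned = opening(mask)
```

The opening removes the isolated bright pixel while preserving the compact roof block, which mirrors how the morphological operators in the pipeline suppress small non-building responses after classification.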

4. Conclusions

In this paper, automatic and manual methods for the extraction of buildings from Quickbird-2 imagery are presented and compared. In the first phase, both methods are applied to the original MS image. In the second phase, buildings are extracted after improving the spatial resolution of the original MS image by fusion, and the results of both methods on the two images (i.e. the original MS image and the pan-sharpened image) are compared. The effectiveness of improving the spatial resolution for building extraction is evaluated both qualitatively and quantitatively.

The results of the automatic method show that most buildings are detected correctly in both the original MS image and the pan-sharpened image, although loss of information is evident in both. The results of the manual method indicate that buildings are extracted with minimal loss of information in comparison with the automatic method.

For both the automatic and manual methods, the spatial and spectral information of buildings is more clearly identifiable in the pan-sharpened image. Therefore, improving the spatial resolution of the original MS image increases the spatial and spectral information available for building extraction.

When the features of interest in an input image differ from one another in shape, size and color, the manual method is recommended in order to reduce the loss of information. However, its effectiveness depends on the user's experience, and it is a time-consuming process.

It should be noted that automatic algorithms perform very effectively when all buildings are rectangular. In our case the building shapes differ from one another; nevertheless, the performance of the automatic approach presented here for extracting buildings of different shape, size and color is reasonable.

Additional information

Funding

The authors received no direct funding for this research.

Notes on contributors

Jagalingam Pushparaj

Jagalingam Pushparaj is a research student at National Institute of Technology Karnataka (NITK), Surathkal, India. He received his Bachelor of Engineering Degree in Computer Science and Engineering from Anna University, India. His Masters was in Remote Sensing and GIS from Adhiyamaan Engineering College, Hosur, India. His research interests are Pan-sharpening, Bathymetry studies and Remote Sensing & GIS applications.

Arkal Vittal Hegde

Arkal Vittal Hegde is a professor of Applied Mechanics and Hydraulics Department at National Institute of Technology Karnataka, Surathkal, India. He received his Bachelor of Engineering Degree in Civil Engineering from the Karnataka Regional Engineering College (Now NITK). He received his Master Degree in Offshore Engineering from Indian Institute of Technology Bombay, India, and Doctor of Philosophy (PhD) Degree from Mangalore University, India. His research interests are Soft computing applications in coastal engineering, Coastal zone management and coastal erosion, Breakwaters, Coastal wetlands, etc.

References

  • Alparone, L., Aiazzi, B., Baronti, S., Garzelli, A., Nencini, F., & Selva, M. (2008). Multispectral and panchromatic data fusion assessment without reference. Photogrammetric Engineering & Remote Sensing, 74, 193–200. doi:10.14358/PERS.74.2.193
  • Ashraf, S., Brabyn, L., & Hicks, B. J. (2012). Image data fusion for the remote sensing of freshwater environments. Applied Geography, 32, 619–628. doi:10.1016/j.apgeog.2011.07.010
  • Attarzadeh, R., & Momeni, M. (2012). Object-based building extraction from high resolution satellite imagery. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXIX-B4, 57–60. Retrieved from http://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B4/57/2012/isprsarchives-XXXIX-B4-57-2012.pdf
  • Bhatia, S. K. (2004). Adaptive K-means clustering algorithm. In Proceedings of Florida Artificial Intelligence Research Symposium.
  • Chaudhuri, D., Kushwaha, N. K., Samal, A., & Agarwal, R. C. (2016). Automatic building detection from high-resolution satellite images based on morphology and internal gray variance. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9, 1767–1779. doi:10.1109/JSTARS.2015.2425655
  • Chavez, P. S. T., & Kwarteng, A. Y. (1989). Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis. Photogrammetric Engineering and Remote Sensing, 55, 339–348.
  • Chavez, P., Sides, S. C., & Anderson, J. A. (1991). Comparison of three different methods to merge multiresolution and multispectral data: Landsat TM and SPOT panchromatic. Photogrammetric Engineering & Remote Sensing, 57, 295–303.
  • Chen, C. W., Luo, J., & Parker, K. J. (1998). Image segmentation via adaptive K-mean clustering and knowledge-based morphological operations with biomedical applications. IEEE Transactions on Image Processing, 7, 1673–1683. doi:10.1109/83.730379
  • Ehlers, M. E. D.-M., Kolouch, D., Lohman, P., & Dennert-Möller, E. (1984). Nonrecursive filter techniques in digital processing of remote sensing imagery. Paper presented at the XVth International Conference (pp. 163–175). Retrieved from http://www.isprs.org/proceedings/XXV/congress/part7/163_XXV-part7.pdf
  • Ghaffarian, S., & Ghaffarian, S. (2014a). Automatic building detection based on Purposive FastICA (PFICA) algorithm using monocular high resolution Google Earth images. ISPRS Journal of Photogrammetry and Remote Sensing, 97, 152–159. doi:10.1016/j.isprsjprs.2014.08.017
  • Ghaffarian, S., & Ghaffarian, S. (2014b). Automatic building detection based on supervised classification using high resolution Google Earth images. In ISPRS Journal of Photogrammetry and Remote Sensing (pp. 101–106). Zurich.
  • González-Audícana, M., Otazu, X., Fors, O., & Seco, A. (2005). Comparison between Mallat’s and the ‘à trous’ discrete wavelet transform based algorithms for the fusion of multispectral and panchromatic images. International Journal of Remote Sensing, 26, 595–614. doi:10.1080/01431160512331314056
  • King, R. L., & Wang, J. W. J. (2001). A wavelet based algorithm for pan sharpening Landsat 7 imagery. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2001) (Vol. 2, pp. 849–851). doi:10.1109/IGARSS.2001.976657
  • Laben, C., & Brower, B. (2000). Process for enhancing the spatial resolution of multispectral imagery using pan-sharpening. U.S. Patent 6,011,875. Washington, DC: U.S. Patent and Trademark Office.
  • Liasis, G., & Stavrou, S. (2016). Building extraction in satellite images using active contours and colour features. International Journal of Remote Sensing, 37, 1127–1153. doi:10.1080/01431161.2016.1148283
  • Lin, C., & Nevatia, R. (1998). Building detection and description from a single intensity image. Computer Vision and Image Understanding, 72, 101–121. doi:10.1006/cviu.1998.0724
  • Mayer, H. (1999). Automatic object extraction from aerial imagery–A survey focusing on buildings. Computer Vision and Image Understanding, 74, 138–149. doi:10.1006/cviu.1999.0750
  • Padwick, C., Deskevich, M., Pacifici, F., & Smallwood, S. (2010). Worldview-2 pan-sharpening. ASPRS 2010, 48, 26–30. Retrieved from http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:WorldView-2+Pan-Sharpening#0
  • Pappas, T. N. (1992). An adaptive clustering algorithm for image segmentation. IEEE Transactions on Signal Processing, 40, 901–914. doi:10.1109/78.127962
  • Ranchin, T., & Wald, L. (1993). The wavelet transform for the analysis of remotely sensed images. International Journal of Remote Sensing, 14, 615–619. doi:10.1080/01431169308904362
  • Segl, K., & Kaufmann, H. (2001). Detection of small objects from high-resolution panchromatic satellite imagery based on supervised image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 39, 2080–2083. doi:10.1109/36.951105
  • Shorter, N., & Kasparis, T. (2009). Automatic vegetation identification and building detection from a single nadir aerial image. Remote Sensing, 1, 731–757. doi:10.3390/rs1040731
  • Wang, J., Qin, Q., Chen, L., Ye, X., Qin, X., Wang, J., & Chen, C. (2013). Automatic building extraction from very high resolution satellite imagery using line segment detector. In 2013 IEEE International Geoscience and Remote Sensing Symposium - IGARSS (pp. 212–215). IEEE. doi:10.1109/IGARSS.2013.6721129
  • Wang, M., Yuan, S., & Pan, J. (2013). Building detection in high resolution satellite urban image using segmentation, corner detection combined with adaptive windowed hough transform. In 2013 IEEE International Geoscience and Remote Sensing Symposium - IGARSS (pp. 508–511). IEEE. doi:10.1109/IGARSS.2013.6721204
  • Zhang, J. (2010). Multi-source remote sensing data fusion: Status and trends. International Journal of Image and Data Fusion, 1, 5–24. doi:10.1080/19479830903561035
  • Zhang, Y. (2004). System and method for image fusion. U.S. Patent Application 2004/0141659 A1. Washington, DC: U.S. Patent and Trademark Office.
  • Zhang, Y., & Mishra, R. K. (2013). From UNB PanSharp to Fuze Go–The success behind the pan-sharpening algorithm. International Journal of Image and Data Fusion, 5, 39–53. doi:10.1080/19479832.2013.848475