
Impact of discretization methods on the rough set-based classification of remotely sensed images

Yong Ge, Feng Cao & Ruifang Duan
Pages 330-346 | Received 24 Dec 2009, Accepted 14 May 2010, Published online: 29 Jun 2010

Abstract

In recent years, the rough set (RS) method has come into common use for remote-sensing classification, one of the information extraction techniques for Digital Earth. Discretization of remotely sensed data is an important preprocessing step in classical RS-based remote-sensing classification. Appropriate discretization methods can improve the adaptability of the classification rules and increase the accuracy of the classification. To assess the performance of discretization methods, this article adopts three indicators: the compression capability indicator (CCI), the consistency indicator (CI), and the number of cut points (NCP). An appropriate discretization method for the RS-based classification of a given remotely sensed image can be found by comparing the values of the three indicators and the classification accuracies of the images discretized with the different methods. To investigate the effectiveness of this approach, three discretization methods (Entropy/MDL, Naive, and SemiNaive) are applied to a TM image and the three indicators are calculated for each. Comparing the indicators and the classification accuracies shows that the SemiNaive method greatly reduces the quantity of data while keeping satisfactory classification accuracy.

1. Introduction

Digital Earth, originally proposed by Gore (1998), is an information expression of the real Earth and a new way of understanding the Earth in the twenty-first century (Guo et al. 2009). It is mainly composed of five phases: data extraction, information extraction, knowledge extraction, modeling, and decision making (Chen and van Genderen 2008). Remote-sensing technology provides strong technical support for the data extraction phase, while information extraction techniques, such as image classification, geo-statistical analysis, and data mining, can extract relevant information from these huge data archives and databases. In recent years, rough set (RS) theory, proposed by Pawlak (1982, 1991), has been used in geographical fields such as spatial analysis, information extraction, uncertainty analysis, and geo-knowledge discovery (Ahlqvist et al. 2000, Bittner 2001, Bittner and Stell 2001, Ahlqvist et al. 2003, Berger 2004, Wang et al. 2004, Beaubouef et al. 2007, Ge et al. 2009, Bai et al. 2010). In particular, some applications focus on remotely sensed data preprocessing and on RS-based classification (Pal and Mitra 2002, Wu 2004, Ouyang and Ma 2006, Li et al. 2007, Leung et al. 2007, Li et al. 2008, Xiao and Zhang 2008). For example, Wu (2004), in research on remote-sensing classification using the RS method, investigated data preprocessing, classifier design, and classification evaluation. Leung et al. (2007) used RS to extract classification rules from remotely sensed data; their experimental results demonstrate that the approach can effectively discover the optimal spectral bands and the optimal rule set for a classification task. Lei et al. (2008) used discrete RS to extract texture information rules from a remotely sensed image and added them to its classification. The overall accuracy of classification with texture information extracted by discrete RS was higher than that with texture information extracted by principal components analysis (PCA).

Besides being applied directly in the above studies, RS can also be integrated with other methods, such as support vector machines, neural networks, and fractals, to improve the accuracy of remote-sensing classification (Wu 2001, Liu et al. 2004, Ma and Hasi 2005, Zhang et al. 2005, Das et al. 2006, Zhan et al. 2007). This is particularly attractive because it combines the advantages of RS with those of other data mining methods. Commonly, the gray values of the spectral bands of an 8-bit gray image lie within 0–255, so they are considered continuous (numerical). The term ‘continuous’ is used here to indicate both real- and integer-valued attributes. However, RS theory assumes that all attributes are nominal, so continuous-valued attributes must be discretized (Fayyad and Irani 1992). Discretization of attributes is an important data preprocessing approach in machine learning, particularly for classification problems. Empirical results have shown that the quality of classification methods depends on the discretization method used in the preprocessing step (Nguyen 1998). In past decades, discretization of attributes has received significant attention and many discretization methods have been developed, such as 1RD (Holte 1993), C4.5 (Quinlan 1993), Entropy/MDL (Fayyad and Irani 1992, 1993, Dougherty et al. 1995), Naive (Øhrn 1999), and SemiNaive (Øhrn 1999). In the rule extraction of remote-sensing information based on classical RS, the discretization of remote-sensing data plays an important role. A reasonable discretization method can reduce the size of the remote-sensing data and improve its quality, and the rules extracted with RS after reasonable discretization are more understandable and concise. However, in current applications, the selection and comparison of different discretization methods for information extraction from remotely sensed images have rarely been discussed (Duan et al. 2007, Zhang et al. 2008).

This article investigates the differences between discretization methods and uses three indicators, namely the data compression capability indicator (CCI), the consistency indicator (CI), and the number of cut points (NCP), to assess the efficiency of these methods for the given data (Yue 2006). The impact of the different discretization methods on classification accuracy is then analyzed. The results show that these three indicators, combined with an analysis of their influence on classification accuracy, can help the user choose a discretization method for remote-sensing classification.

2. Discretization of attributes

Attribute values are mainly either nominal (categorical) or continuous (numerical). Nominal values are mainly strings and enumerations, while continuous values are mainly integers and floating-point numbers (Wang 2001). For example, the color attribute of a remotely sensed image is nominal and can be expressed by an enumeration such as red, green, and blue, whereas the gray value of a pixel is continuous and integer-valued. To suit the intelligent methods used in remote-sensing image processing, the continuous values of remote-sensing data should be discretized into nominal values. Sometimes, nominal values are discretized further to obtain more abstract discretized values. In general, discretization is a process of searching for a partition of attribute domains into intervals and unifying the values over each interval (Nguyen 1998). Hence discretization can be defined as a problem of searching for a suitable set of cuts (i.e. boundary points of intervals) on attribute domains (Nguyen 1998). According to different criteria, discretization methods can be roughly classified as global or local, supervised or unsupervised, and static or dynamic. The most commonly used distinction is between supervised and unsupervised methods. Supervised discretization methods use the decision class information in setting cut points, so the discretization result can effectively support further classification. Commonly used supervised methods include 1RD, Entropy/MDL, Naive, and SemiNaive (Dougherty et al. 1995, Øhrn 1999). Unsupervised discretization methods do not consider the decision class information in setting cut points; they mainly comprise Equal Interval and Width. In this article, Entropy/MDL, Naive, and SemiNaive are compared using the three indicators to exemplify the effect of discretization on the remote-sensing classification result. The three discretization methods are introduced as follows.

2.1. Discretization methods of continuous attributes

2.1.1. Entropy/MDL method

The Entropy/MDL method (Fayyad and Irani 1992, 1993, Dougherty et al. 1995) uses the class information entropy of candidate partitions to select bin boundaries for discretization. For a set of instances $S$, let there be $k$ classes $C_1, \ldots, C_k$, and let $P(C_i, S)$ be the proportion of instances in $S$ that have class $C_i$. The class entropy of $S$ is defined as:

$$\mathrm{Ent}(S) = -\sum_{i=1}^{k} P(C_i, S)\,\log_2 P(C_i, S) \qquad (1)$$

Given a set of instances $S$, a feature $A$, and a partition boundary $T$, the class information entropy of the partition induced by $T$, denoted $E(A, T; S)$, is given by:

$$E(A, T; S) = \frac{|S_1|}{|S|}\,\mathrm{Ent}(S_1) + \frac{|S_2|}{|S|}\,\mathrm{Ent}(S_2) \qquad (2)$$

where $|S|$ is the number of instances in the set $S$, $S_1 \subset S$, and $S_2 = S - S_1$.

For a given feature $A$, the boundary $T_{\min}$ that minimizes the entropy function over all possible partition boundaries is selected as a binary discretization boundary. The method is then applied recursively to both of the partitions induced by $T_{\min}$ until the stopping condition, based on the minimal description length (MDL) principle and defined by Fayyad and Irani, is met, thus creating multiple intervals on the feature $A$.

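Recursive partitioning within a set of values $S$ stops iff

$$\mathrm{Gain}(A, T; S) < \frac{\log_2(N - 1)}{N} + \frac{\Delta(A, T; S)}{N} \qquad (3)$$

where $N$ is the number of instances in the set $S$, $\mathrm{Ent}(\cdot)$ is the class entropy defined above,

$$\mathrm{Gain}(A, T; S) = \mathrm{Ent}(S) - E(A, T; S) \qquad (4)$$

$$\Delta(A, T; S) = \log_2(3^k - 2) - \left[k\,\mathrm{Ent}(S) - k_1\,\mathrm{Ent}(S_1) - k_2\,\mathrm{Ent}(S_2)\right] \qquad (5)$$

and $k_i$ is the number of class labels represented in the set $S_i$. Since the partitions along each branch of the recursive discretization are evaluated independently using this criterion, some areas of the continuous space will be partitioned very finely, whereas others (which have relatively low entropy) will be partitioned coarsely.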
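The recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function names are hypothetical, candidate boundaries are assumed to be midpoints between adjacent distinct sorted values, and ties in entropy are resolved in favor of the first candidate.

```python
from collections import Counter
from math import log2


def class_entropy(labels):
    """Class entropy Ent(S) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition_entropy(left, right):
    """E(A, T; S): size-weighted entropy of the two sides of a boundary."""
    n = len(left) + len(right)
    return (len(left) / n) * class_entropy(left) + (len(right) / n) * class_entropy(right)


def mdl_accepts(left, right):
    """True if the boundary is kept, i.e. the MDL stopping condition does not hold."""
    whole = left + right
    n = len(whole)
    gain = class_entropy(whole) - partition_entropy(left, right)
    k, k1, k2 = len(set(whole)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * class_entropy(whole)
                                - k1 * class_entropy(left)
                                - k2 * class_entropy(right))
    return gain >= (log2(n - 1) + delta) / n


def entropy_mdl_cuts(values, labels):
    """Recursively select the entropy-minimizing boundary until MDL stops it."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal attribute values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        e = partition_entropy(left, right)
        if best is None or e < best[0]:
            best = (e, i, left, right)
    if best is None or not mdl_accepts(best[2], best[3]):
        return []
    _, i, left, right = best
    cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # assumed: midpoint boundary
    lower = entropy_mdl_cuts([v for v, _ in pairs[:i]], left)
    upper = entropy_mdl_cuts([v for v, _ in pairs[i:]], right)
    return lower + [cut] + upper
```

For two well-separated classes, such as values 1–20 labelled 'a' and values 101–120 labelled 'b', the sketch keeps a single boundary between the two groups and rejects every further split inside the pure halves, because the gain there is zero.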

2.1.2. Naive method

In the Naive method (Øhrn 1999), the values of each condition attribute $a$ are first sorted, and the instances in the universe are then scanned. For two adjacent instances $x_i$ and $x_j$, the average of their attribute values is taken as a cut point when $a(x_i) \neq a(x_j)$ and $d(x_i) \neq d(x_j)$, that is, when both the attribute values and the decision classes of the two instances differ. The Naive method needs no extra parameters and treats every such cut point as important. However, it does not consider the indiscernibility among instances. Consequently, many important cut points, which should be chosen to keep the indiscernibility unchanged and which contribute greatly to classification, will be ignored. In addition, the resulting cut points can differ considerably when the instances are sorted in a different order. The Naive method adds cut points step by step and usually produces a large set of cut points.
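The scan described above can be sketched as follows. The function name is hypothetical, and the order of instances that share the same attribute value is an assumption: here ties are ordered by Python's tuple sort, whereas the text notes that a different order can give different cut points.

```python
def naive_cuts(values, decisions):
    """Place a cut at the midpoint of two adjacent sorted instances whenever
    both their attribute values and their decision classes differ."""
    pairs = sorted(zip(values, decisions))  # assumed tie order: by decision label
    cuts = []
    for (a_i, d_i), (a_j, d_j) in zip(pairs, pairs[1:]):
        if a_i != a_j and d_i != d_j:
            cuts.append((a_i + a_j) / 2)
    return cuts
```

With gray values 10, 11, 20, 21 and classes w, w, u, u, only the pair (11, 20) differs in both value and class, so the single cut lies midway between them.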

2.1.3. SemiNaive method

The SemiNaive method is similar to the Naive method, but has more logic to handle the case where value-neighboring instances belong to different decision classes (Øhrn 1999). The set of cut points found by the SemiNaive method is a subset of the cut points found by the Naive method: the SemiNaive method scans the Naive cut points and decides which of them are still needed. Suppose $c$ is a cut point of attribute $a$, $x_i$ and $x_j$ are the two neighboring values of $c$, and $D_i$ and $D_j$ are the sets of dominant decision values, where $D_i$ corresponds to the equivalence class containing $x_i$ and $D_j$ to the equivalence class containing $x_j$. The cut point $c$ is deleted from the set found by the Naive method when $D_i \subseteq D_j$ or $D_j \subseteq D_i$; otherwise $c$ is kept as an important cut point. The SemiNaive method can be regarded as an optimization of the Naive method because it removes some redundant cut points. However, compared with the Naive method, it may produce more inconsistent data.
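A minimal sketch of this filtering rule is given below. It rests on two assumptions that are not stated in the text: the dominant decision set of a value is taken to be the most frequent decision class or classes among the instances with that value, and candidate cuts are generated between all neighboring distinct values rather than from a prior Naive pass. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def dominant_decisions(decisions):
    """Set of the most frequent decision values (more than one on a tie)."""
    counts = Counter(decisions)
    top = max(counts.values())
    return {d for d, c in counts.items() if c == top}


def semi_naive_cuts(values, decisions):
    """Keep a cut between neighboring values only if neither dominant
    decision set is contained in the other."""
    by_value = defaultdict(list)
    for v, d in zip(values, decisions):
        by_value[v].append(d)
    distinct = sorted(by_value)
    cuts = []
    for x_i, x_j in zip(distinct, distinct[1:]):
        d_i = dominant_decisions(by_value[x_i])
        d_j = dominant_decisions(by_value[x_j])
        if not (d_i <= d_j or d_j <= d_i):  # <= is the subset test on sets
            cuts.append((x_i + x_j) / 2)
    return cuts
```

In the example used in the test, value 10 is dominated by class w, value 12 is tied between w and u, value 20 is dominated by u, and value 30 by w. The cuts around 12 are dropped because a dominant set is contained in its neighbor's, and only the cut between 20 and 30 survives.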

2.2. Indicators for assessing the discretization method

Discretization is a process of grouping the values of the attributes into intervals in such a way that the knowledge content or the discernibility is not lost (Roy and Pal 2003). Discretizing the attributes can remove redundant data from the database and thereby achieve data compression. To find an appropriate discretization method for classical RS-based remote-sensing classification, this article adopts three indicators, CCI, CI, and NCP, to evaluate discretization methods. Yue (2006) used these indicators to compare different discretization methods, and the experimental results showed that they are effective in analyzing the differences between discretization methods. By comparing the values of these three indicators and the classification accuracies of the remotely sensed images discretized with the different methods, an appropriate discretization method can be identified. The spectral bands of a Landsat TM image are used to exemplify the method. First, the spectral bands of the Region of Interest (ROI) are discretized with the different discretization methods and the cut points of all spectral bands are acquired. The three indicators, defined as follows, are then calculated.

Let $A_{\mathrm{Band\_}1}, A_{\mathrm{Band\_}2}, \ldots, A_{\mathrm{Band\_}7}$ denote the seven bands and $D_{\mathrm{class}}$ denote the class for the ROI image. In RS theory, $A_{\mathrm{Band\_}1}, \ldots, A_{\mathrm{Band\_}7}$ are called condition attributes and $D_{\mathrm{class}}$ is called the decision attribute.

Let

$$A = \bigwedge_{j=1}^{7} A_{\mathrm{Band\_}j}, \qquad A(i) = \bigwedge_{j=1}^{7} A_{\mathrm{Band\_}j}(i), \quad i = 1, \ldots, n,$$

where $n$ is the number of pixels within the ROI, $A_{\mathrm{Band\_}j}(i)$ denotes the spectral value of pixel $i$ in band $j$, and $D_{\mathrm{class}}(i)$ denotes the class value of pixel $i$. For a set $S$, $|S|$ is the number of instances in the set.

2.2.1. Compression capability indicator (CCI)

(6)

In formula (6), d denotes the set derived from the discretized ROI image. Discretization reduces the data size and loses some information, but it makes it possible to generate useful knowledge or rules from the large quantity of data. The CCI reflects the data-processing ability of different discretization methods.

2.2.2. Consistency indicator (CI)

(7)
In formula (7), dI denotes the set of inconsistent objects derived from the discretized ROI image. If two objects in the set have the same A(i) values but different D class(i) values, they are called inconsistent objects. The consistency indicator reflects the degree to which category information is lost owing to the discretization method.
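The two sets that these indicators are built on, the distinct objects and the inconsistent objects among them, can be counted as in the sketch below. It deliberately stops at the counts and does not compute the CCI or CI themselves. The function name and the representation of an object as a `(condition_tuple, decision)` pair are assumptions made for illustration.

```python
from collections import defaultdict


def distinct_and_inconsistent(rows):
    """rows: iterable of (condition_tuple, decision) pairs, one per pixel.

    Returns the set of distinct objects and the subset of them that are
    inconsistent, i.e. share condition values with an object of another class.
    """
    distinct = set(rows)  # identical pixels are counted once
    decisions = defaultdict(set)
    for cond, dec in distinct:
        decisions[cond].add(dec)
    inconsistent = {(c, d) for c, d in distinct if len(decisions[c]) > 1}
    return distinct, inconsistent
```

For example, four pixels of which two are identical and two share band values but differ in class yield three distinct objects, two of which are inconsistent.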

2.2.3. Number of the cut points (NCP)

The NCP of all spectral bands is calculated when the different discretization methods are used to discretize each spectral band. Discretization of the spectral bands is a process of searching for a partition of the spectral band domains into intervals. For example, suppose the gray values of one band of a TM image lie within 16–142 and that 25.5, 70.5, and 125.5 are all the cut points of the band; the band is then divided into four intervals: [16, 25], [26, 70], [71, 125], and [126, 142]. Gray values in the same interval are regarded as indiscernible and are usually assigned the same value. The NCP is an important feature of a discretization method.
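The worked example above can be reproduced with a short sketch using only the Python standard library. The helper names are hypothetical, and the convention that a gray value equal to a cut point falls into the interval on its right is an assumption (it does not arise for half-integer cuts such as those in the example).

```python
from bisect import bisect_right
from math import ceil


def interval_index(value, cuts):
    """Index of the interval that a gray value falls into (cuts must be sorted)."""
    return bisect_right(cuts, value)


def integer_intervals(cuts, lo, hi):
    """Closed integer intervals induced by sorted cuts on the range [lo, hi]."""
    starts = [lo] + [ceil(c) for c in cuts]
    ends = [ceil(c) - 1 for c in cuts] + [hi]
    return list(zip(starts, ends))


cuts = [25.5, 70.5, 125.5]
ncp = len(cuts)  # three cut points give ncp + 1 = four intervals
print(integer_intervals(cuts, 16, 142))
# [(16, 25), (26, 70), (71, 125), (126, 142)]
```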

3. Experimental study

The three indicators defined in Section 2.2 are now applied to evaluate the discretization methods used in remote-sensing classification. The impact of the different discretization methods on the classification of remotely sensed images is then analyzed. Based on the analysis of the three indicators and of this impact, an appropriate discretization method is identified. The flow chart of the experiment is shown in Figure 1. The Entropy/MDL, Naive, and SemiNaive methods are used in this example.

Figure 1.  The flow chart of the experiment.


3.1. Data description

A Landsat TM image of the Yellow River Delta in China acquired on 28 August 1999 is used to substantiate the conceptual discussion and demonstrate the application of the method discussed above. Verification data, obtained by fusing the PANchromatic (PAN) band of Systeme Probatoire d'Observation de la Terre (SPOT) two images acquired on 16 October 2002 with an Enhanced Thematic Mapper (ETM) image acquired on 9 August 2001, are used to test the analytical result. The spatial resolution of this verification image is 10 m. The TM image size is 515×515 pixels and the resolution is 30 m, except for band six, whose spatial resolution is 120 m. The size of the verification data is 1545×1161 pixels. Figure 2 is the 4, 3, 2-band pseudo-color composition image.

Figure 2.  Landsat TM pseudo-color composition image (RGB 4, 3, 2) of the study area acquired on 28 August 1999.


A total of 26,639 pixels are selected from the study area as the ROI by using a random-sampling scheme according to prior knowledge; each pixel has seven spectral values and a class value. The ROI image is shown in Figure 3.

Figure 3.  Sample data collected by using stratified random sampling scheme.


3.2. Discretization of the Region of Interest (ROI) and study area

The spectral bands of the ROI are discretized with the Entropy/MDL, Naive, and SemiNaive methods and the cut points of all spectral bands are acquired. The cut points are sorted, and for each spectral band the values of the ROI pixels are divided into several intervals. With these sorted cut points, the whole image is then discretized. Pixels with values in the same interval are indiscernible, and in this article their values are all set to the average value in that interval. The pseudo-color composition images of the study area discretized with the different discretization methods are shown in Figure 4.
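The replacement of pixel values by interval averages might look like the sketch below for a single band. The text does not say how the average is taken; this sketch assumes it is the mean of the observed pixel values falling in each interval, not the midpoint of the interval. The function name is hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize_band(pixels, cuts):
    """Replace each gray value by the mean of the observed values that share
    its interval (assumed reading of 'average value in that interval')."""
    cuts = sorted(cuts)
    groups = defaultdict(list)
    for p in pixels:
        groups[bisect_right(cuts, p)].append(p)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    return [means[bisect_right(cuts, p)] for p in pixels]
```

With cut points 25.5, 70.5, and 125.5, the gray values 16 and 20 share the first interval and both become 18.0, while 30 and 70 share the second and both become 50.0.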

Figure 4.  Pseudo-color composition image of the study area discretized by Entropy/MDL, Naive, and SemiNaive methods.


The CCI, CI, and NCP of the ROI with the different discretization methods are calculated and shown in Tables 1 and 2. In Table 1, the values in the original data, discretization data, and inconsistent data fields are obtained after removing repeated items. For example, if the values of the condition attributes and the decision attribute of pixel i and pixel j are completely identical, the two items are counted as one item.

Table 1. Data compression situation and consistent situation of the ROI with Entropy/MDL, Naive, and SemiNaive methods.

Table 2. Number of cut points of the ROI.

3.3. Results

After the CCI, CI, and NCP of the different discretization methods have been compared, the impact of the three discretization methods on the classification result of remotely sensed images is analyzed. Here, the original image and the images discretized with the Entropy/MDL, Naive, and SemiNaive methods are classified with RS. The results of the classical RS classifier are then compared with the result of the maximum likelihood classifier (MLC). The MLC, a supervised classifier, is one of the most popular tools for classification in remotely sensed image processing and is widely discussed in the literature. The MLC and RS classification results are shown in Figure 5.

Figure 5.  MLC classification results and RS classification results from original and discretized remotely sensed images with Entropy/MDL, Naive, and SemiNaive methods.


To validate the accuracy of the RS classification results for the original and discretized remotely sensed images, this article presents a group of error matrices. The reference data are selected according to prior knowledge of the same area from the SPOT image with resolution 2.5 m, and a random sampling scheme is used. The sample unit is a single pixel in the SPOT image, representing a 10×10 m² area of the ground data. According to Edwards et al. (1998), at least $n = u^2 p(1-p)/d^2$ samples should be chosen, where n is the minimal number of samples required, α is a parameter determining the confidence level, u is the value corresponding to α in the Gaussian distribution, d is the desired precision, and p is the estimated accuracy of the classification result. When α = 0.05, u = 1.96, and p = 0.5, there should be at least 384 samples. Here 1000 samples are collected, comprising 22 samples for water, 290 samples for agriculture_1, 202 samples for agriculture_2, 121 samples for urban, 18 samples for bottomland, and 347 samples for bareground. The confusion matrices, classification accuracies, and kappa coefficients are then computed. The confusion matrices of the MLC and RS classification results are shown in Table 3. The producer's accuracy, user's accuracy, overall accuracy, and kappa coefficients of the MLC and RS classification results are shown in Table 4.
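The sample-size figure quoted above can be checked with the standard formula $n = u^2 p(1-p)/d^2$. The text does not state the desired precision d; the value d = 0.05 used below is an assumption, chosen because it reproduces the quoted minimum of 384. The function name is hypothetical.

```python
def min_samples(u, p, d):
    """Minimal sample size n = u^2 * p * (1 - p) / d^2."""
    return u ** 2 * p * (1 - p) / d ** 2


n = min_samples(u=1.96, p=0.5, d=0.05)  # d = 0.05 is an assumed precision
print(round(n, 2))  # 384.16

# The 1000 samples actually collected, by class, as listed in the text.
collected = {"water": 22, "agriculture_1": 290, "agriculture_2": 202,
             "urban": 121, "bottomland": 18, "bareground": 347}
print(sum(collected.values()))  # 1000
```

Under this assumption the 1000 collected samples comfortably exceed the minimum.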

Table 3. Confusion matrices for the MLC classification result and the RS classification results of the original and discretized remotely sensed images with the Entropy/MDL, Naive, and SemiNaive methods.

Table 4. Producer's accuracy, user's accuracy, overall accuracy, and kappa coefficients of the MLC classification result and the RS classification results of the original and discretized remotely sensed images with the Entropy/MDL, Naive, and SemiNaive methods.
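For readers who wish to recompute such accuracy measures from a confusion matrix, a minimal sketch using the standard definitions follows. The matrix orientation (rows as classified classes, columns as reference classes) is an assumption, as is the toy matrix in the usage lines; the sketch computes only the overall kappa, not a per-category kappa, and assumes every class has at least one classified and one reference sample.

```python
def accuracy_metrics(matrix):
    """Overall, producer's and user's accuracy and overall kappa.

    matrix[i][j]: number of samples classified as class i whose reference
    class is j (assumed orientation).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]                              # classified totals
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]   # reference totals
    diag = [matrix[i][i] for i in range(k)]
    overall = sum(diag) / n
    users = [diag[i] / row_tot[i] for i in range(k)]
    producers = [diag[j] / col_tot[j] for j in range(k)]
    chance = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    kappa = (overall - chance) / (1 - chance)
    return overall, producers, users, kappa


# Toy two-class matrix, not data from this study.
overall, producers, users, kappa = accuracy_metrics([[40, 10], [10, 40]])
```

For the toy matrix, 80 of 100 samples lie on the diagonal, so the overall accuracy is 0.8; chance agreement is 0.5, which gives a kappa of about 0.6.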

To analyze the effect of the different discretization methods on the RS classification results more clearly, the producer's accuracy, user's accuracy, overall accuracy, and kappa coefficients are compared in Figure 6.

Figure 6.  Assessment accuracies of RS classification results for original and discretized remotely sensed images with different discretization methods; (a) the producer's accuracy of each category; (b) the user's accuracy of each category; (c) the overall accuracy; (d) the kappa coefficients of each category.


4. Discussion

It can be seen from Figure 4 that the image discretized with the Naive method is the most similar to the original image, with the least loss of spectral value information. The images discretized with the SemiNaive and Entropy/MDL methods look different from the original image. The differences between the discretized and original images show intuitively how much spectral information each discretization method retains. The CCI, CI, and NCP for the different discretization methods are compared in Tables 1 and 2. From Tables 1 and 2, it can be seen that the CI of the SemiNaive method is lower than that of the other two discretization methods, while the CI of the Naive method is the highest. The CCI of the SemiNaive method is the highest, while that of the Naive method is the lowest. The NCP shows the same pattern of change as the CCI. Furthermore, although the CI of the Naive method is the highest, its CCI is relatively low, which means that the Naive method cannot significantly reduce the amount of data in the database or improve the efficacy of RS classification. By comparison, the SemiNaive method compresses the data better, but its CI is relatively low.

The confusion matrices of the MLC and RS classification results for the original and discretized images with the Entropy/MDL, Naive, and SemiNaive methods are shown in Table 3. The producer's accuracy, user's accuracy, overall accuracy, and kappa coefficients derived from the confusion matrices are shown in Table 4. The relationship between these accuracy measures and the different discretization methods is shown in Figure 6a–d. Figure 6a shows that the producer's accuracy for water, agriculture_1, and bottomland is the highest for RS classification with the Entropy/MDL method. Only for bareground is the producer's accuracy the highest for MLC classification, and the producer's accuracy for agriculture_1 is the lowest for MLC classification. Figure 6b and c show that the user's accuracy and kappa for water, agriculture_2, bottomland, and bareground are very close for RS classification and MLC classification, although RS classification is lower than MLC classification for agriculture_1 and urban land.

From Figure 6d, it can be seen that the overall accuracies of RS classification with the different discretization methods are lower than that of MLC classification and are very close to one another. Although the overall accuracies are very close, the CCI of the SemiNaive method is much higher than that of the Entropy/MDL and Naive methods, which shows that the data compression capability of the SemiNaive method is much greater. The images discretized with the SemiNaive method still yield a relatively high overall accuracy while the quantity of remotely sensed data is greatly reduced. Although the accuracy of RS classification with the SemiNaive method is lower than with the other two discretization methods, the SemiNaive method greatly reduces the quantity of image data and improves the efficacy of the classification.

5. Conclusion and future works

In this article, the CCI, CI, and NCP are used to evaluate discretization methods, discretization being an indispensable step of classical RS-based classification for a given remotely sensed image. The impact of different discretization methods on remote-sensing classification results was then analyzed. The experimental results show that the CCI of the SemiNaive method is much higher than that of the Entropy/MDL and Naive methods. Although the CI of the SemiNaive method is lower than that of the Entropy/MDL and Naive methods, the overall accuracy of RS classification with the SemiNaive method is very close to the overall accuracy of RS classification with the Entropy/MDL and Naive methods and to the accuracy of MLC classification. At the same time, the NCP of the SemiNaive method is lower than that of the other two methods. The SemiNaive method also reduces the spectral band values at the order-of-magnitude level and greatly improves the efficacy of RS classification. Some problems remain with the discretization methods used in remote-sensing classification: (1) most discretization methods discretize each attribute independently, and the design of a method that discretizes all attributes simultaneously needs further study; (2) remotely sensed data usually have their own features, for example, the values of the spectral bands follow normal distributions or are spatially correlated, so the design of a new discretization method that reflects those features is an important topic for future research. The accuracies of RS classification with the Entropy/MDL, Naive, and SemiNaive methods are lower than the accuracy of MLC classification in this article. This is because the discretization methods used are traditional ones that do not consider the features of remotely sensed images. In future work, we will study discretization methods that take these features into account in order to improve the RS classification accuracy.

Notes on contributors

Yong Ge received BS and MS degrees in surveying and mapping from Wuhan University, Wuhan, PRC, in 1995 and 1998, respectively, and the Ph.D. degree in cartography and geographical information system from the Chinese Academy of Sciences in 2001. Since July 2001, she has been with the State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, where she is currently an associate professor.

Feng Cao is a Ph.D. candidate in the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences. He received a master's degree at the School of Computer and Information Technology, Shanxi University in 2009. His current research is the application of rough set theory in GIS and remote sensing.

Ruifang Duan studied GIS at the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences from 2006 to 2008. She received a master's degree at the School of Computer and Information Technology, Shanxi University in 2008.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 40971222) and the National High Technology Research and Development Program of China (Grant No. 2006AA120106).

References

  • Ahlqvist , O. 2000 . Rough classification and accuracy assessment . International Journal Geographical Information Science , 14 ( 5 ) : 475 – 496 .
  • Ahlqvist , O. 2003 . Rough and fuzzy geographical data integration . International Journal Geographical Information Science , 17 ( 3 ) : 223 – 234 .
  • Bittner , T. 2001 . “ Rough sets in spatio-temporal data mining ” . In Temporal, spatial, and spatio-temporal data mining , Edited by: Roddick , J.F. and Homsby , K. 89 – 104 . Berlin , Heidelberg : Springer .
  • Bittner , T. and Stell , G.J. 2001 . “ Rough sets in approximate spatial reasoning ” . In Rough sets and current trends in computing , Edited by: Ziarko , W. and Yao , Y. 445 – 453 . Berlin , Heidelberg : Springer .
  • Berger , A.P. 2004 . Rough set rule induction for suitability assessment . Environmental Management , 34 ( 4 ) : 546 – 558 .
  • Beaubouef , T. 2007 . Spatial data methods and vague regions: a rough set approach . Applied Soft Computing , 7 ( 1 ) : 425 – 440 .
  • Bai , H.X. 2010 . Using rough set theory to identify villages affected by birth defects: the example of Heshun, Shanxi, China . International Journal Geographical Information Science , 24 ( 4 ) : 559 – 576 .
  • Chen , S.P. and van Genderen , J. 2008 . Digital Earth in support of global change research . International Journal of Digital Earth , 1 ( 1 ) : 43 – 65 .
  • Dougherty , J. , et al. , 1995 . Supervised and unsupervised discretization of continuous features . In : Proceedings of the twelfth international conference on machine learning , 9–12 July, Tahoe City, California. Morgan Kaufmann , 194 – 202 .
  • Das , S. , et al. , 2006 . A hybrid rough set particle swarm algorithm for image pixel classification . In : Proceedings of the sixth international conference on hybrid intelligent systems , 13–15 December, Auckland, New Zealand , 26 – 30 .
  • Duan , R.F. , et al. , 2007 . Experimental study on the discretization on remote sensing data . In : 7th international workshop on geographical information system , 14–15 September, Beijing, China , 383 – 388 .
  • Edwards , T.C. 1998 . Assessing map accuracy in a remotely sensed, ecoregion-scale cover map – a user's perspective . Remote Sensing of Environment , 63 : 73 – 83 .
  • Fayyad , U.M. and Irani , K.B. 1992 . On the handling of continuous-valued attributes in decision tree generation . Machine Learning , 8 : 87 – 102 .
  • Fayyad , U.M. and Irani , K.B. , 1993 . Multi-interval discretization of continuous-valued attributes for classification learning . In : Proceedings of the 13th international joint conference on artificial intelligence , 28 August–3 September, Chambery, France. Morgan Kaufmann , 1022 – 1027 .
  • Ge , Y. 2009 . Rough set-derived measures in image classification accuracy assessment . International Journal of Remote Sensing , 30 ( 20 ) : 5323 – 5344 .
  • Gore , A. , 1998 . The digital earth: understanding our planet in the 21st century . In : Presented at the Californian Science Center . 31 January Los Angeles, CA, USA .
  • Guo , H. 2009 . A digital earth prototype system: DEPS/CAS . International Journal of Digital Earth , 2 ( 1 ) : 3 – 15 .
  • Holte , R.C. 1993 . Very simple classification rules perform well on most commonly used datasets . Machine Learning , 11 : 63 – 90 .
  • Liu , H.J. 2004 . Rough neural network of variable precision . Neural Processing Letters , 19 : 73 – 87 .
  • Leung , Y. 2007 . A rough set approach to the discovery of classification rules in spatial data . International Journal of Geographical Information Science , 21 ( 9 ) : 1033 – 1038 .
  • Li , L.W. , et al. , 2007 . Tolerant rough set on satellite remote sensing data classification . Computer Engineering and Applications , 43 ( 20 ), 11 – 13 . (In Chinese)
  • Li , L.W. , et al. , 2008 . Tolerant rough set processing on uncertainty of satellite remote sensing data classification . Computer Engineering , 34 ( 6 ), 2 – 6 . (In Chinese)
  • Lei , T.C. 2008 . The comparison of PCA and discrete rough set for feature extraction of remotely sensed imagery classification – a case study on rice classification, Taiwan . Computers & Geosciences , 12 : 1 – 14 .
  • Ma , J.W. and Hasi , B. 2005 . Remote sensing data classification using tolerant rough set and neural networks . Science in China Ser. D Earth Science , 48 ( 12 ) : 2251 – 2259 .
  • Nguyen , H.S. 1998 . “ Discretization problem for rough sets methods ” . In Rough sets and current trends in computing , Edited by: Polkowski , L. and Skowron , A. 545 – 552 . London : Springer-Verlag .
  • Ouyang , Y. and Ma , J.W. 2006 . Land cover classification based on tolerant rough set . International Journal of Remote Sensing , 27 ( 14 ) : 3041 – 3047 .
  • Pawlak , Z. 1982 . Rough sets . International Journal of Computer and Information Sciences , 11 ( 5 ) : 341 – 356 .
  • Pawlak , Z. 1991 . Rough sets: theoretical aspects of reasoning about data , Boston : Kluwer Academic .
  • Pal , S.K. and Mitra , P. 2002 . Multispectral image segmentation using the rough-set-initialized EM algorithm . IEEE Transactions on Geoscience and Remote Sensing , 40 ( 11 ) : 2495 – 2501 .
  • Quinlan , J.R. 1993 . C4.5: programs for machine learning , San Francisco , CA : Morgan Kaufmann .
  • Øhrn , A. , 1999 . Discernibility and rough sets in medicine: tools and applications, computer and information science . Thesis (PhD). Norwegian University of Science and Technology .
  • Roy , A. and Pal , R.K. 2003 . Fuzzy discretization of feature space for a rough set classifier . Pattern Recognition Letters , 24 ( 6 ) : 859 – 902 .
  • Wang , G.Y. , 2001 . Rough set theory and knowledge acquisition . Xi'an: Xi'an Jiaotong University Press. (In Chinese)
  • Wu , Z.C. , 2001 . Research on remotely sensed imagery classification using neural network based on rough sets . In : International conferences on info-tech and info-net , 29 October–1 November, Beijing, China , 279 – 284 .
  • Wu , Z.C. , 2004 . Rough sets approach to remotely sensed imagery processing and classification . Thesis (PhD). Wuhan University. (In Chinese)
  • Wang , S.L. 2004 . “ Rough spatial interpretation ” . In Rough sets and current trends in computing , Edited by: Tsumoto , S. 435 – 444 . Berlin , Heidelberg : Springer .
  • Xiao , H. and Zhang , X.B. 2008 . Comparison studies on classification for remotely sensed imagery based on data mining method . WSEAS Transactions on Computers , 5 ( 7 ) : 552 – 558 .
  • Yue , X.D. , 2006 . Research on discretization of continuous features based on rough set theory . Thesis (MA). Shanxi University. (In Chinese)
  • Zhang , G.X. , et al. , 2005 . A hybrid classifier based on rough set theory and support vector machines . In : L. Wang and Y. Jin , eds. Fuzzy systems and knowledge discovery . Berlin , Heidelberg : Springer , 1287 – 1296 .
  • Zhan , Y.J. , et al. , 2007 . Hyperspectral RS image classification based on fractal and rough set . In Second International Conference on Space Information Technology , Wuhan , 6795 (3) , 67954F.1 – 67954F.6 .
  • Zhang , G.F. , et al. , 2008 . A remote sensing feature discretization method accommodating uncertainty in classification systems . In : Proceedings of the 8th international symposium on spatial accuracy assessment in natural resources and environmental sciences , 25–27 June, Shanghai, China , 195 – 202 .
