Editorial

Stewardship and analysis of big Earth observation data

Big Earth observation (EO) data have played an important role in applications such as quantifying global forest change (Hansen et al., 2014), tracking global urban expansion (Liu et al., 2019), and mapping land use and land cover (Felipe-Lucia et al., 2020). It is increasingly clear that big EO data are an important asset, and there is therefore an urgent need to develop effective data-stewardship tools and advanced analytical techniques to make the best use of them. This special issue aims to explore and summarize the new theories, technologies, and methods used in the stewardship and analysis of big EO data.

As with other types of big data, big EO data generally have the following basic characteristics.

  • (1) Volume

According to incomplete statistics, the total volume of data archived by the Earth Observing System Data and Information System (EOSDIS) had reached 32.9 petabytes (PB) by the end of August 2019 (Vietnam, 2019). By November 2020, the volume of data archived by the China National Satellite Meteorological Center (NSMC) had reached 10.276 PB, and the China Center for Resources Satellite Data and Application (CCRSDA) had archived more than 20 PB of remote sensing images by June 2019.

  • (2) Variety

According to the 2020 State of the Satellite Industry report, there were about 664 remote sensing and observation satellites in orbit at the end of 2019. These satellites each carry more than one Earth observation sensor and can collect information about the Earth’s surface, day and night. This means that at least 664 kinds of EO data are continuously transmitted to ground receiving stations.

  • (3) Velocity

With the development of multi-satellite coordination and constellation-based observation, satellite revisit periods are shortening from months or days to hours or even less. For example, the Jilin-1 satellite constellation, which consists of 60 satellites, will have a revisit cycle of 20 minutes by the end of 2020. In addition, remote sensing data are arriving at each data center at an ever-faster rate. For example, the PSM1 sensor on a GF-2 satellite delivers approximately 1.5 TB of data per day.

  • (4) Value

As the spatial and spectral resolution of data increases, satellite sensors can capture more ground features in greater detail. Using remote sensing images, land-use and land-cover change (LUCC) can be investigated, vegetation growth monitored, and military targets on the ground detected. However, to obtain such valuable information, massive numbers of remote sensing images have to be processed, a task similar to extracting gold from sand.

  • (5) Heterogeneity

Owing to the great variety in satellite orbit parameters and sensor specifications, the storage formats, projections, spatial resolutions, and revisit periods of archived data also vary enormously. These differences create great difficulties for data stewardship and management (Fan, Yan, Ma, & Wang, 2017).

The basic characteristics of big EO data listed above pose great challenges for data stewardship and analysis. Efforts are needed to produce novel data-stewardship reference models (Albani & Maggio, 2020), efficient data indexing and retrieval methods (Qu et al., 2020), effective methods of improving data quality (Wang, Li, Luo, & Xie, 2020) and of data mining (Dumitru et al., 2020), and high-performance data-processing systems (Huang, Yang, Tao, & Zhu, 2020; Zhou et al., 2020).

  1. Archived EO data have huge scientific value as they provide long-term uninterrupted observations of the Earth. There is an urgent need to design a novel data-stewardship reference model to guarantee that these high-value data are accessible and exploitable. In data management and governance, the role of data stewardship is to ensure that data policies and standards are put into practice within the steward’s domain (Allen & Cervo, 2015). In relation to owners of EO data and space agencies, data stewardship relates to the preservation and curation of satellite datasets so that they are ready for compilation into long-term data series and for analysis. Specifically, the data stewardship reference model describes a series of space assets to be applied and used, before, during, and/or after the end of an Earth observation mission, in order to ensure Earth observation space datasets are preserved and valorized. The stewardship process starts during the dataset planning phase of the mission and continues in the definition, implementation, operation and maintenance phases.

  2. Big EO data are a typical form of spatio-temporal big data with multi-scale characteristics and pose great challenges for traditional data indexing and retrieval methods and tools. For example, a relational database struggles to cope with multi-term joint searches (searches that combine satellite, sensor, longitude and latitude, and time) over several million records. In addition, different types of EO data generally use different spatio-temporal reference systems, which creates further difficulties for multi-source data indexing and retrieval. Therefore, establishing a unified spatio-temporal reference system for multi-source remote sensing data and adopting new data-management tools are key to improving the efficiency of big data retrieval.

  3. Restricted by conditions such as climate and cloud cover, the EO data collected by remote sensing satellites are often of low quality, which poses great challenges for the extraction of Earth observation information and for data mining (Wei, Yu, Lee, Wang, & Jiang, 2020). Therefore, designing effective data-quality improvement and mining models suited to the characteristics of the data is key to fully mining EO information.

  4. The processing and analysis of massive remote sensing data pose a huge challenge for computing models and computing platforms. The correspondence between remote sensing data products at different levels makes it possible to construct and execute new data-processing workflows (Yan, Wang, Choo, & Jie, 2017), and the emergence of new computing models such as cloud computing provides large-scale, on-demand computing power for big EO data processing (Wang, Ma, Yan, Chang, & Zomaya, 2018). Therefore, constructing high-performance data-processing frameworks and workflows in a cloud computing environment can enable efficient processing and analysis of EO data.
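
As a rough illustration of the multi-term joint search mentioned in point 2, the sketch below indexes scene metadata by satellite, sensor, grid cell, and acquisition day, and then answers a combined query. It is not the method of any paper in this issue. The names `Scene`, `GridIndex`, and `CELL_DEG`, the one-degree cell size, the daily time bucket, and the use of plain longitude/latitude as the unified reference system are all assumptions made for illustration, and bounding boxes that cross the antimeridian are not handled.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta
from math import floor

CELL_DEG = 1.0  # hypothetical grid cell size in degrees (lon/lat assumed)


@dataclass(frozen=True)
class Scene:
    """Minimal, hypothetical metadata record for one archived image."""
    scene_id: str
    satellite: str
    sensor: str
    west: float
    south: float
    east: float
    north: float
    acquired: date


def cells(west, south, east, north):
    """Yield (column, row) indices of every grid cell touched by a bounding box."""
    for col in range(floor(west / CELL_DEG), floor(east / CELL_DEG) + 1):
        for row in range(floor(south / CELL_DEG), floor(north / CELL_DEG) + 1):
            yield col, row


class GridIndex:
    """In-memory index keyed by (satellite, sensor, column, row, day)."""

    def __init__(self):
        self._buckets = defaultdict(set)
        self._scenes = {}

    def add(self, scene):
        """Register a scene under every grid cell its footprint touches."""
        self._scenes[scene.scene_id] = scene
        for col, row in cells(scene.west, scene.south, scene.east, scene.north):
            key = (scene.satellite, scene.sensor, col, row, scene.acquired)
            self._buckets[key].add(scene.scene_id)

    def search(self, satellite, sensor, west, south, east, north, start, end):
        """Return sorted IDs of scenes matching all five search terms."""
        candidates = set()
        day = start
        while day <= end:
            for col, row in cells(west, south, east, north):
                key = (satellite, sensor, col, row, day)
                candidates |= self._buckets.get(key, set())
            day += timedelta(days=1)
        # Cells are coarse, so confirm that each footprint really overlaps the query box.
        hits = []
        for scene_id in candidates:
            s = self._scenes[scene_id]
            if s.west <= east and s.east >= west and s.south <= north and s.north >= south:
                hits.append(scene_id)
        return sorted(hits)
```

Because every search term is part of the key, a query touches only the buckets for the requested cells and days rather than scanning all records. A production system would more likely rely on a dedicated spatio-temporal index or a distributed store, and the cell size and time bucket would need tuning to the data.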

Through the actions listed above, the efficiency and accuracy of the stewardship and analysis of big EO data can be improved. The six papers in this issue discuss in detail the background, implementation, results, and conclusions of studies in which models or methods related to the four points listed above were implemented. We hope that this issue will promote research into big EO data management, stewardship, and high-performance processing and analysis, thus contributing to the sustainable development of human society.

References

  • Albani, M., & Maggio, I. (2020). Long time data series and data stewardship reference model. Big Earth Data, 4(4), 353–366. doi:10.1080/20964471.2020.1800893
  • Allen, M., & Cervo, D. (2015). Multi-domain master data management: Advanced MDM and data governance in practice. Morgan Kaufmann, 95–107. https://doi.org/10.1016/B978-0-12-800835-5.00007-5
  • Dumitru, C. O., Schwarz, G., Pulak-Siwiec, A., Kulawik, B., Albughdadi, M., Lorenzo, J., & Datcu, M. (2020). Understanding satellite images: A data mining module for Sentinel images. Big Earth Data, 4(4), 367–408. doi:10.1080/20964471.2020.1820168
  • Fan, J., Yan, J., Ma, Y., & Wang, L. (2017). Big data integration in remote sensing across a distributed metadata-based spatial infrastructure. Remote Sensing, 10, 7.
  • Felipe-Lucia, M. R., Soliveres, S., Penone, C., Fischer, M., Ammer, C., Boch, S., … Allan, E. (2020). Land-use intensity alters networks between biodiversity, ecosystem functions, and services. Proceedings of the National Academy of Sciences, 117(45), 28140–28149.
  • Hansen, M., Potapov, P., Margono, B., Stehman, S., Turubanova, S., & Tyukavina, A. (2014). Response to comment on “high-resolution global maps of 21st-century forest cover change”. Science, 344(6187), 981.
  • Huang, F., Yang, H., Tao, J., & Zhu, Q. (2020). Universal workflow-based high performance geo-computation service chain platform. Big Earth Data, 4(4), 409–434. doi:10.1080/20964471.2020.1776201
  • Liu, X., Pei, F., Wen, Y., Li, X., Wang, S., Wu, C., … Liu, Z. (2019). Global urban expansion offsets climate-driven increases in terrestrial net primary productivity. Nature Communications, 10(1), 1–8. doi:10.1038/s41467-019-13462-1
  • Qu, T., Wang, L., Yu, J., Yan, J., Xu, G., Li, M., Cheng, C., Hou, K., & Chen, B. (2020). STGI: A spatio-temporal grid index model for marine big data. Big Earth Data, 4(4), 435–450. doi:10.1080/20964471.2020.1844933
  • Vietnam, H. NASA agency report. In Proceedings of the 48th Meeting of the Working Group on Information Systems & Services, Hanoi, Vietnam. 11 October 2019.
  • Wang, L., Ma, Y., Yan, J., Chang, V., & Zomaya, A. Y. (2018). pipsCloud: High performance cloud computing for remote sensing big data management and processing. Future Generation Computer Systems, 78, 353–368.
  • Wang, W., Li, Y., Luo, X., & Xie, S. (2020). Ocean image data augmentation in the USV virtual training scene. Big Earth Data, 4(4), 451–463. doi:10.1080/20964471.2020.1780096
  • Wei, J., Yu, X., Lee, Z., Wang, M., & Jiang, L. (2020). Improving low-quality satellite remote sensing reflectance at blue bands over coastal and inland waters. Remote Sensing of Environment, 250, 112029.
  • Yan, J., Wang, L., Choo, K.-K. R., & Jie, W. (2017). A cloud-based remote sensing data production system. Future Generation Computer Systems, 86(September 2018), 1154–1166.
  • Zhou, G., Wang, X., Chen, W., Xianju, L., & Chen, Z. (2020). Realization and application of geological cloud platform. Big Earth Data, 4(4), 464–478. doi:10.1080/20964471.2020.1820175