Reviewing the discoverability and accessibility to data and information products linked to Essential Climate Variables

Pages 236-252 | Received 24 Jul 2018, Accepted 14 May 2019, Published online: 14 Jun 2019

ABSTRACT

Essential Climate Variables (ECVs) are geophysical records generated from systematic Earth Observations associated with climate variations, changes, and impacts. ECV products support the data and information needs of international frameworks and policies, such as the work of the United Nations Framework Convention on Climate Change (UNFCCC) and the Intergovernmental Panel on Climate Change (IPCC). We map the main networks and initiatives publishing ECVs by presenting an overview of existing satellite-based ECVs, their general data creation characteristics, and their discoverability and accessibility methods from an end-user perspective. We investigate key initiatives providing or coordinating access to ECV data records, such as the Global Climate Observing System (GCOS), the Committee on Earth Observation Satellites (CEOS), the Coordination Group for Meteorological Satellites (CGMS), the Joint Working Group on Climate (WGClimate), Remote Sensing Systems (REMSS), and the European Space Agency Climate Change Initiative (ESA CCI). We find that ECV data discovery and access are difficult and time-consuming due to the lack of common data and metadata catalogues. In addition, the selection of fit-for-purpose data records by end-users requires the implementation of interoperable standards and scalable data infrastructures to allow the generation of tailored applications and data-driven information products in support of decision-making processes.

1. Introduction

Earth Observation (EO) data combined with model-derived information can provide evidence on the past, present, and future state of the Earth's climate. This is necessary to understand the dynamics of climate conditions and their impact on societal systems. Data derived from Earth observations support the daily collection and monitoring of key variables from the atmosphere, oceans and terrestrial systems. This subsequently supports the development of climate science products, policy frameworks, and different climate services (Giuliani et al. 2017). Climate services can be defined as the transformation of climate-related data and information into customized products. These may include projections, forecasts, information products, trend assessments, or decision support tools and information (Street 2016).

Climate data derived from EOs are collected, processed and distributed by several research centers and data warehouses that generate and distribute datasets with different spatial and temporal coverages. In order to guarantee that these observations are sustained, preserved, coordinated and improved, an international and multiagency program endorsed by the United Nations (UN) gathered expert panels from different climate domains (Atmosphere, Oceans and Terrestrial Systems) to establish in 1992 the Global Climate Observing System (GCOS).

GCOS activities consist of coordinating and ensuring the availability and accessibility of climate observations for all potential users (GCOS 2014; Houghton et al. 2012; World Meteorological Organization et al. 2010). The scope of GCOS also includes identifying gaps in observation systems and proposing remedial actions to respond to the data and information needs of international frameworks, such as the United Nations Framework Convention on Climate Change (UNFCCC) and the Intergovernmental Panel on Climate Change (IPCC). One key task of GCOS is to identify the ‘principal observations’ to be addressed by a set of space missions and other EO networks in order to characterize the state of the global climate system, and to support the monitoring and planning of mitigation and adaptation measures (Bojinski et al. 2014; Giuliani et al. 2017; WMO 2014; WMO et al. 2016). These observations are called Essential Climate Variables (ECVs) (Bojinski et al. 2014). The 54 ECVs are geophysical records from systematic observations associated with climate variation and change, along with their impacts. These ECVs are associated with three domains: (1) atmospheric: surface (measurements at standardized heights close to the Earth’s surface), upper-air (measurements up to the stratopause) and chemical composition, (2) oceanic: surface (measurements within the upper 15 m) and sub-surface, and (3) terrestrial (comprising hydrological, cryospheric, biological and ecological subdomains) (Bojinski et al. 2014). Thirty ECVs are currently identified as significantly relying on satellite-based observations, often complemented with in situ measurements (CEOS and ESA 2015).

Satellite and in situ measurements are transformed into ECV products by different EO data processing workflows that routinely convert them into calibrated, validated, documented, peer-reviewed, quality-controlled data records (Barkstrom, Bates, and Privette 2007; Bates et al. 2015). As different ECVs serve different purposes, the generation of data records follows diverse methodologies, standards and dissemination practices, carried out by different data processing, analysis and distribution centers. These processes generally involve international collaborations between coordination networks, space agencies and research institutions or programs across a vast range of data management systems.

ECV data records and their scientific quality are derived from one or more Climate Data Records (CDRs), which consist of time series of EO measurements of sufficient length, consistency, and continuity to determine climate variability and change (Hollmann et al. 2013; Robinson et al. 2004). CDRs are generated from basic measurements of climate observations subjected to quality control and calibration processes with ancillary data or satellite inter-calibration, called Fundamental Climate Data Records (FCDRs) (e.g. radiances and brightness captured by remote sensing instruments) (Robinson et al. 2004).

Thematic Climate Data Records (TCDRs) are model outputs describing geophysical variables derived from CDRs (e.g. global grids of sea surface temperature or monthly precipitation). Finally, when FCDRs or TCDRs are regularly updated with near-real-time observations, Interim Climate Data Records (ICDRs) are generated (CEOS and CGMS 2014, 2017; Hollmann et al. 2013; Su et al. 2018; Yang et al. 2016; Zeng et al. 2015).

ECVs can be composed of a single TCDR or of multiple TCDRs. An example of the latter is the aerosols variable, which includes at least four different TCDRs (CEOS and JAXA 2015; Dowell et al. 2013). Figure 1 presents a generalized workflow illustrating the process of obtaining ECVs from EOs and the possible applications of the different data records created along the way, together with their transformation into applications and information products that support decision-making processes.

Figure 1. Schematic of the data flow and processes required to generate ECVs from FCDRs and TCDRs, and their applications, based on Dowell et al. (2013).

Scientific Data Stewardship is a systematic approach to the observation, production, and preservation of climate information (NRC 2004). It ensures that data and information products are adequately preserved, accessible and reliable throughout the whole data lifecycle, with well-defined documentation to allow easy data access and implementation by end-users (Barkstrom, Bates, and Privette 2007; NCDC 2014). The stewardship of a data record focuses on how adequately the data are managed, preserved and accessed. The quality of data could be reflected in the difficulty of discovering, accessing, understanding, trusting, and using a given data product along with its metadata (Peng, Lawrimore, et al. 2016).

ECVs provide state-of-the-art data records that are the product of scientific consensus and international research collaboration efforts. However, the current mechanisms for discovering and accessing ECVs present several challenges and barriers. Improving data description, discovery, and usability of ECV data and information is critical in ensuring data use, but it requires coordinated efforts from multiple disciplines and the integration of diverse data infrastructures and standards.

Our study provides an overview of the major open data ECV initiatives, catalogues and repositories that provide access to ECV data and information products. We focus on ECVs with a strong contribution from satellite observations (CEOS and ESA 2015). We start by presenting the role of GCOS in selecting ECVs, and the contributions of ECVs to other domains. We follow by presenting the methodology used to review the main data systems and initiatives generating or distributing ECVs, with an in-depth analysis of how findable, accessible, and documented these datasets are. Finally, we discuss how data providers could adapt their services to increase the accessibility of their data products. A list of the acronyms used is found in the Supplementary file.

2. Methodology

We concentrated our research on ECVs with major contributions from satellite-based EOs that are available through data or metadata catalogues, or other publicly accessible platforms. The research workflow followed these steps:

  1. Literature review of the main initiatives or data catalogues that provide access to ECV datasets and information products.

  2. Selection of a subset of initiatives and data catalogues to be analyzed.

  3. Identification of criteria to evaluate the accessibility and usability of datasets.

  4. Compilation of a list of ECVs or ECV CDRs available by data catalogue. This list serves as a tool for comparing features across data catalogues and platforms reviewed during the period from April to June 2018.

  5. Analysis of results.

2.1. Reviewing ECV-related initiatives and data catalogues

We reviewed the GCOS list of ECV dedicated projects (GCOS 2016a), implementation plan, and technical papers (WMO et al. 2004, 2010, 2011, 2016). Additionally, CEOS’ official website (CEOS 2018) was consulted to identify the initiatives and repositories designated for the generation and discovery of ECV data products. Other initiatives and repositories were selected from a wider literature review and semi-structured search engine exploration of publicly accessible data platforms, using the following terms: ‘Essential Climate Variable’ OR ‘ECV’ combined with ‘catalogue’ OR ‘data repository’ OR ‘database’.
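The term combination described above can be sketched in a few lines. This is only an illustration: the grouping into two OR-sets follows the text, but the function name and the quoted two-term query format are assumptions, not the exact queries submitted to any search engine.

```python
from itertools import product

# Terms quoted in the text, grouped into the two OR-sets it describes.
SUBJECT_TERMS = ["Essential Climate Variable", "ECV"]
RESOURCE_TERMS = ["catalogue", "data repository", "database"]


def build_queries(subjects, resources):
    """Return one quoted two-term query per (subject, resource) pair.

    The output format (two double-quoted phrases) is a hypothetical choice.
    """
    return [f'"{s}" "{r}"' for s, r in product(subjects, resources)]


queries = build_queries(SUBJECT_TERMS, RESOURCE_TERMS)
for query in queries:
    print(query)
```

Two subject terms and three resource terms give six distinct queries, which keeps a semi-structured search reproducible.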

The two core portals that provide access to GCOS ECV information are the US National Oceanic and Atmospheric Administration National Centers for Environmental Information (NOAA, NCEI) Global Observing Systems Information Center (GOSIC), and the joint CEOS/CGMS Working Group on Climate (WGClimate) ECV Inventory (WMO et al. 2016, 54). GOSIC’s portal does not hold data records; rather, it provides information about ECV data products and points to the centers holding them (GOSIC 2018). The goal of GOSIC is to be a one-stop entry platform to facilitate access to ECV data and information products identified by GCOS through the Essential Climate Variables Data Access Matrix (GOSIC 2018). This matrix provides a list of ECVs and their respective CDRs generated by different observation systems and research centers. It also describes the GCOS requirements for each variable, and the list of contributing in situ and satellite-based observation systems. Although the aim of this portal is to provide a reference access point to ECVs, it is not yet fully implemented (WMO et al. 2016).

The WGClimate ECV inventory is a structured repository that compiles the most complete list of existing and planned CDRs (as of 31 December 2016). These are generated from satellite-based EOs as a result of the collaborative effort of several experts from space agencies and other research institutions, during the period June 2016 to April 2017 (WGClimate 2018b). The inventory was coordinated by the joint CEOS/CGMS working group. It compiles a set of 913 CDRs (496 existing and 417 planned, as of 2 June 2018) that contribute to the generation of ECV products. This inventory was verified and made openly available on the WGClimate website, and the methodology used to compile it is described in the ECV Inventory Questionnaire Guide (CEOS and CGMS 2016).

2.2. Selection of major initiatives and catalogues

All ECV data catalogues or initiatives identified during the period from April to June 2018 are listed in Table 1.

Table 1. Summary of identified coordination initiatives, catalogues and projects publishing ECV data products.

A subset of the portals and data catalogues reviewed was used to analyze how easy it is to find and use ECV data records. The selection criteria used are listed below:

  • Data catalogues/platforms that provide access to ECV datasets and/or allow filtering their content by using the keyword ‘ECV’ were included.

  • When datasets generated by an initiative were accessible through different data catalogues, the primary catalogue that generated the dataset was included in the analysis.

  • Only catalogues providing access to more than one ECV dataset were considered.

  • This review only includes catalogues and platforms that were accessible during the period from April to June 2018.

Six initiatives were selected for the analysis: QA4ECV, CM-SAF, Obs4MIPs, ESA CCI, GOSIC, and the WGClimate ECV inventory. A summary of the ECVs available by data catalogue/initiative is presented in Figure 2. Of the assessed records, 74% are part of the WGClimate inventory, and about 20% are accessible via the GOSIC platform. ESA CCI and Obs4MIPs represent 2% each, and CM-SAF and QA4ECV about 1% each. All ECV data records available were assessed as part of the accessibility and usability analysis.

Figure 2. List of initiatives that provide access to ECV data products (CDRs) with significant contributions from satellites. Based on datasets available at GOSIC Data Access Matrix (GOSIC 2015), GCOS requirements for ECVs (GCOS 2016b), CEOS Database (CEOS and ESA 2015), WGClimate ECV Inventory Access (WGClimate 2018), ESA CCI (ESA 2015), QA4ECV portal (Scanlon and Nightingale 2017), CM SAF portal from the ‘European Organisation for the Exploitation of Meteorological Satellites’ (EUMETSAT 2018) and Obs4MIPs data catalogue (ESGF 2018).

2.3. Assessing accessibility and usability

To understand the criteria for accessibility and usability that are used to assess the maturity of data records, we conducted a literature review of initiatives and maturity models applied to ECVs or CDRs (Barkstrom, Bates, and Privette 2007; Bates et al. 2015; Edis-Williams 2017; EUMETSAT 2015; NCDC 2014; Peng 2018; Scanlon and Nightingale 2017; Waliser 2017). The maturity models found in the literature are summarized in Table 2. In addition, the accessibility criteria used by the WGClimate ECV inventory were consulted.

Table 2. Overview of maturity assessments related to CDR or ECV initiatives found in the literature.

Our definitions of accessibility and usability are based on NOAA’s satellite-based Climate Data Records Program Stewardship Maturity Matrix (SMM), as presented in section 2.4 (Barkstrom, Bates, and Privette 2007; Bates and Privette 2012; Peng, Lawrimore, et al. 2016). NOAA’s SMM consists of rating individual datasets on five levels of maturity, from Level 1 (low or ad hoc stage) to Level 5 (high or optimal maturity stage).

2.4. Accessibility

Accessibility of ECV data records refers to how easy it is to search and access datasets and documents (e.g. user guidelines and technical papers) over the web through the dedicated online access system (Peng, Lawrimore, et al. 2016). Selected indicators of accessibility of ECV records were based on the WGClimate ECV inventory. Of the eight criteria used by the WGClimate ECV inventory, we adopted three: data record (link), dissemination mechanism, and data and metadata format. The following criteria were not included: access point, data access, FCDR availability, release date, and access conditions. These were considered less relevant for this exploratory analysis.

2.5. Usability

Usability refers to how easy it is to use the datasets based on their format and documentation. This depends on community-specific best practices and standards (e.g. data encoding formats, community standards used, self-describing data), as well as the use of datasets in scientific and decision-making processes. The usability of data records can also be determined by additional information to facilitate the understanding and use of the data product (Peng et al. 2018; Peng, Ritchey, et al. 2016), e.g. through technical papers and user manuals.

Indicators of usability adapted from the WGClimate ECV inventory are: data format, metadata format, Quality Assurance (QA) process, and quantitative maturity index assessment.

2.6. Assessing accessibility and usability

A summary of all the indicators selected from the WGClimate ECV inventory, as well as the additional indicators adopted, is given in the ECV data discovery and access revision matrix (Table 3).

Table 3. ECV data discovery and access revision matrix.

The ECV data discovery and access revision matrix was filled in for each data catalogue or repository visited. The metadata of self-describing ECV records was reviewed when possible. Features included in the WGClimate ECV inventory were not revisited or verified, given that the content was already validated and represents the state-of-the-art of existing CDRs from space agency sponsored activities. Overall, 667 ECV-related features were reviewed from the six user service organizations selected for the analysis. According to WGClimate, ‘User Service Organizations’ are organizations responsible for serving user requests for data records (WGClimate 2018a).

The list of ECV data reviewed and the criteria evaluated were recorded following the structure described in Table 3, using a binary record (‘Yes’ or ‘No’) for the presence or absence of each assessed indicator. Additional information describing the ECV CDRs was also recorded (e.g. name, domain of observation, and type of EO contribution). Inconsistencies in the spelling or names of variables, organizations, data formats and metadata were homogenized to facilitate the analysis (e.g. ‘NetCDF’, ‘netcdf’, ‘NetCDF-4’, ‘NetCDF 4’, and ‘NETCDF’ were all recorded as ‘NetCDF’).
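The homogenization and binary recording described above could be automated along the following lines. This is a minimal sketch under stated assumptions: the function names, the regular expression and the one-entry lookup table are hypothetical and are not the procedure actually used in the review.

```python
import re

# Canonical labels keyed by a simplified spelling (hypothetical lookup table;
# only the NetCDF case is taken from the text).
CANONICAL_FORMATS = {"netcdf": "NetCDF"}


def normalize_format(label):
    """Collapse spelling variants of a data format name to one label."""
    cleaned = label.strip()
    # Drop a trailing version suffix such as "-4" or " 4", then ignore case.
    key = re.sub(r"[\s\-_]*\d*$", "", cleaned).lower()
    # Unknown formats are returned unchanged rather than guessed.
    return CANONICAL_FORMATS.get(key, cleaned)


def to_flag(present):
    """Record the presence or absence of an indicator as 'Yes' or 'No'."""
    return "Yes" if present else "No"


variants = ["NetCDF", "netcdf", "NetCDF-4", "NetCDF 4", "NETCDF"]
normalized = {normalize_format(v) for v in variants}
print(normalized)
```

Returning unknown labels unchanged keeps the mapping conservative: a format is merged only when it is explicitly listed in the lookup table.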

3. Results

The six platforms surveyed provide access to 667 data records from 89 user service organizations or initiatives. The ECV data discovery and access revision matrix list, as well as the assessment of accessibility and usability criteria, are presented in Supplementary Table 1. An overview of the data discoverability, accessibility and usability is presented below.

3.1. Overview of discoverability

We found that information generated by surveyed platforms such as GCOS, GOSIC’s Essential Climate Variables Data Access Matrix, the WGClimate ECV inventory, and the CEOS database/handbook is not interlinked across their respective websites. This hinders access to ECV-related data and information. Information provided by GOSIC does not include references or links to all datasets or data providers listed in the WGClimate inventory, QA4ECV and Obs4MIPs databases. Thus, there is a lack of web-based systems compiling GCOS technical papers, ECV metadata catalogues, technical specifications, maturity assessments, gap analyses and validation processes implemented by different initiatives and projects.

The ten user service organizations providing the largest number of ECV CDRs are: NASA, EUMETSAT, NOAA, NCEI, ICARE, CNES, AERIS, ESA CCI, NSIDC, and Copernicus. NASA’s Earthdata portal and NOAA’s OneStop Portal (https://data.noaa.gov) provide advanced data-browsing catalogues and data exploration tools. These portals do not allow filtering of their content with ECV-specific tags. However, they provide access to various CDRs or ICDRs using Thematic Real-time Environmental Distributed Data Services (THREDDS), Open-source Project for a Network Data Access Protocol (OPeNDAP), and other types of web services.

170 of 667 (25%) assessed records are accessible via alternative online platforms, and 77 of 667 (12%) provide access to user-oriented additional online resources such as visualization platforms, APIs and learning materials (e.g. access routine codes, videos or tutorials). However, these indicators were not captured as part of the WGClimate inventory. Therefore, the number of datasets that are accessible via alternative websites, or that support additional online resources, could be higher.

The ESA CCI project supports access, exploration, visualization and query of data records via the CCI toolbox. This provides users with a graphical user interface and Python API to access climate data across multiple CCI datasets.

The US National Center for Atmospheric Research (NCAR) and University Corporation for Atmospheric Research (UCAR; https://climatedataguide.ucar.edu) provide data record summaries, expert user guidance documentation, metadata descriptions, data access links, key figures and references for Obs4MIPs data records.

3.2. Overview of data accessibility

Different collections of ECV CDRs are scattered across platforms and data hubs that have varying metadata and documentation standards. Out of the six initiatives/catalogues selected for the analysis, only two (Obs4MIPs and CM-SAF) provide search and metadata exploration capabilities, giving access to 21 ECV CDRs (equivalent to 3% of the records included in the analysis). Almost all records assessed are free and open access (633 of 667, 95%). The only restriction for some datasets is a registration requirement in the portal before downloading data records. A small number of assessed records (55 of 667, 8%) specify additional information about the type of license policy in their metadata. However, this parameter was not recorded in the WGClimate inventory. Results are summarized in Figure 3.

Figure 3. Classification of the 667 assessed data records, for each accessibility indicator. Numbers indicate the number of data records in each category.

3.2.1. Challenges and gaps

The following elements were identified as hindering access to ECV products from the assessed platforms and initiatives:

  • There is no centralized data and information discovery platform that allows accessing data and information products related to policies, technical papers, reports about the contributions to policy frameworks, data user providers, or additional ECV specific material.

  • No interface exists to support search functions of the complete list of GCOS ECVs, or to filter datasets by domain, area of contribution, type of observation (in situ, satellite based or combined products), initiative, program, user service provider, satellite, temporal coverage, quality assessment process, and maturity score.

  • The lack of information to differentiate the general characteristics of data records between CDRs, TCDRs, FCDRs or ICDRs (WGClimate 2018a) could hinder the selection of pertinent records by interested users.

  • The lack of unique identifiers and common naming conventions of ECVs (e.g. ‘Fire’, ‘Fire Disturbance’) and respective CDRs/TCDRs, hampers the comparison of datasets across initiatives/platforms, as well as the identification of relevant data records.

  • The absence of metrics or indicators to identify datasets that follow GCOS requirements and standards, maturity or scientific maturity, hinders the process of finding relevant ECV datasets.

3.2.2. Access to systems and data

A small subset of records (86 of 667, 13%) are distributed through standardized web services such as the OGC’s Web Map Service (WMS) and Web Coverage Service (WCS), OPeNDAP or THREDDS, which allow the dynamic harvesting of data. However, most records are distributed via download-only dissemination mechanisms such as FTP or HTTP services. Data records are published in 19 different formats, NetCDF or HDF being the most common (553 of 667, 83%). Ten of 667 (1%) data records do not provide direct access to the data or are accessible only by physical media (disk) or e-mail.
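To illustrate why standardized web services lend themselves to dynamic harvesting, the sketch below builds the GetCapabilities request that OGC services such as WMS and WCS accept as a first discovery step. The endpoint `https://example.org/ows` is a placeholder and not a real ECV service, and the helper name is an assumption; no request is actually sent.

```python
from urllib.parse import urlencode


def capabilities_url(endpoint, service, version=None):
    """Build an OGC GetCapabilities URL for a service type such as WMS or WCS."""
    params = {"service": service, "request": "GetCapabilities"}
    if version:
        # The version parameter is optional; servers usually pick a default.
        params["version"] = version
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
ENDPOINT = "https://example.org/ows"
wms_url = capabilities_url(ENDPOINT, "WMS", "1.3.0")
wcs_url = capabilities_url(ENDPOINT, "WCS")
print(wms_url)
print(wcs_url)
```

Because every compliant server answers the same request with a machine-readable description of its layers or coverages, a harvester can discover content without catalogue-specific code, which is not possible with plain FTP or HTTP download directories.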

338 of 667 products (51%) have, or report to have, a standard metadata scheme such as CF, ACDD, ISO 19115-2, COARDS CF, INSPIRE 2, ECHO or OAI-PMH. A total of 18 records (3%) have additional metadata available as text files. Metadata information was not found for 297 records (45%).
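The percentages quoted above follow directly from the record counts. A short check, using only the counts reported in the text (the dictionary labels and the helper name are illustrative), is:

```python
TOTAL_RECORDS = 667

# Counts as reported in the text; the labels are shortened for illustration.
metadata_counts = {
    "standard metadata scheme": 338,
    "metadata as text files": 18,
    "no metadata found": 297,
}


def share(count, total=TOTAL_RECORDS):
    """Return the share of `count` in `total` as a whole-number percentage."""
    return round(100 * count / total)


shares = {label: share(n) for label, n in metadata_counts.items()}
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```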

While most of the records assessed (89%) provide access to product-specific technical documentation (e.g. product specification document, algorithm theoretical basis documentation, error characterization), only 118 of 667 (18%) provide access to product-specific user guidance documentation. However, these parameters were not captured as part of the WGClimate inventory.

3.3. Overview of data usability

Product-specific technical documentation is available for most of the features assessed (596 of 667, 89%). Still, user guides and user support were found to be supported by only 18% (118 of 667) of the records reviewed. These parameters were not recorded as part of the WGClimate inventory.

Information about data quality assessments, or links to related documents, was present in 506 of 667 (76%) records, while documentation or pointers to information about the maturity assessment of the products was available for 229 of 667 (34%) records. Furthermore, providing additional information and tools to facilitate the use of data records is not a common practice in the platforms and initiatives assessed. Out of the six initiatives/platforms assessed, only the ESA CCI program offers a wide range of products such as the CCI toolbox, or tutorials such as Massive Open Online Courses (MOOCs). An example of such a MOOC is the Essential Climate Variables and megatrends by ESA (ESA 2015), which aims to introduce a diversity of users to the ESA CCI data products and to support visualization, analysis and data processing.

118 of 667 (18%) records assessed provide contact details (e.g. e-mail address or link) or a contact person/organization, excluding the records from the WGClimate inventory, which did not assess this parameter. A summary of the results is illustrated in Figure 4.

Figure 4. Summary of the usability indicators (occurrence of values). A total of 667 data records were assessed.

4. Discussion

Our results confirm that the lack of a unified access point to ECV data and associated information products, and of common standard data and metadata catalogues, can drastically hamper the automatic harvesting of data records. Moreover, the current lack of semantic interoperability (i.e. the ability of services and systems to exchange data in a meaningful/useful way; Harvey et al. 1999) and of a shared ontology (i.e. a common knowledge model that represents a particular domain of interest; Lacasta et al. 2007) adds to this difficulty. Standards and interoperability of data are key components for the discoverability, accessibility and usability of data records. Based on our results, we found that there is a need for a comprehensive catalogue and index of ECV data and information products to facilitate the discoverability, access and use of these data records.

Furthermore, the creation of an up-to-date index of core initiatives and observing systems, their responsibilities and contributions to GCOS, along with an exhaustive inventory of ECVs and related FCDRs, CDRs, and TCDRs, can be of great use. This could help interested users, stakeholders and decision makers to have a better overview of the international efforts undertaken to generate state-of-the-art EOs and data products. In addition, this could support the creation of climate indicators to monitor different policy frameworks, international agreements and targets, such as UNFCCC, the Rio Conventions, the Ramsar Convention, and the Sendai Framework for Disaster Risk Reduction 2015–2030, or support progress towards the UN Sustainable Development Goals (SDGs) and particularly SDG 13: Take urgent action to combat climate change and its impacts (Anderson et al. 2017).

Improving the accessibility of these records would require the implementation of standard data and metadata schemas along with catalogues and repositories by ECV data providers. The discoverability of these products could also be improved by including ECV-specific tags as part of the metadata file, so that browsing and filtering datasets is easier. ECV-specific tags could facilitate access to data records that are already accessible via robust infrastructures and data catalogues such as NASA’s Earthdata, NOAA’s data catalogue, Google Earth Engine, and the GEOSS portal.
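How ECV-specific tags would simplify filtering can be sketched as follows. The records, the `ECV:` keyword prefix and the function name are all hypothetical; no existing catalogue is claimed to use this convention.

```python
# Hypothetical catalogue entries; the "ECV:" keyword prefix is an assumed
# tagging convention used only for this illustration.
records = [
    {"title": "Sea surface temperature CDR",
     "keywords": ["ocean", "ECV:sea-surface-temperature"]},
    {"title": "Aerosol optical depth TCDR",
     "keywords": ["atmosphere", "ECV:aerosols"]},
    {"title": "Land cover map", "keywords": ["terrestrial"]},
]

ECV_PREFIX = "ECV:"


def filter_by_ecv(entries, ecv=None):
    """Return entries carrying any ECV tag, or one specific ECV tag if given."""
    wanted = f"{ECV_PREFIX}{ecv}" if ecv else None
    hits = []
    for entry in entries:
        tags = [k for k in entry.get("keywords", []) if k.startswith(ECV_PREFIX)]
        if tags and (wanted is None or wanted in tags):
            hits.append(entry)
    return hits


all_ecv_records = filter_by_ecv(records)
aerosol_records = filter_by_ecv(records, "aerosols")
print([r["title"] for r in all_ecv_records])
print([r["title"] for r in aerosol_records])
```

With such a tag in place, an untagged record (the land cover entry here) is simply not returned, which is the behaviour a user filtering a large general-purpose catalogue for ECV products would expect.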

The integration of ECV data records into the GEOSS Portal could take advantage of the brokering-based architecture and scalable spatial data infrastructure implemented to discover and access a centralized catalogue of EOs. This would allow discovering and harvesting ECV records using common standard protocols, as well as increasing the visibility of projects generating ECV data records (Nativi et al. 2015).

The selection of fit-for-purpose data records by end-users would require: (1) better access to robust product-specific documentation of user guidelines/support, (2) openly available technical information and product descriptions (e.g. EOs used, algorithms, product resolution, temporal granularity, temporal and spatial coverages), (3) information about uncertainty, quality control and stewardship maturity assessments, (4) the description of known issues and the level of compliance with GCOS requirements, (5) the possibility to search, discover and access FCDRs, CDRs, TCDRs and ICDRs (if these are available) from the same ECV and data provider, (6) better documentation on the FCDRs, CDRs, TCDRs and ICDRs, (7) better information on the provenance of the data records throughout their data cycle, and (8) user feedback information.

Robust user documentation, user support and expert feedback annotation systems were found to be particularly lacking. Solutions exist, such as the EU FP7 project CHARMe (University of Reading et al. 2018) or the OGC standard for Geospatial User Feedback (GUF) (OGC 2016), that could give users a better overview of dataset provenance and would facilitate an informed use of data records. Even though ESA CCI products involve the collaboration of experts from at least ten different research organizations to generate an ECV dataset, the ESA CCI initiative stands out for adopting documentation and training material aimed at a wide range of potential users.

Data usability is linked to the data product license, the ease of generating customized applications (ad hoc and web-based), the ease of exploring the data records (e.g. through visualization tools), and the possibility of generating spatiotemporal analytics such as time series, regional means, or trends (Gorelick et al. 2017; Peng et al. 2015). In this context, the implementation of analysis-ready data catalogs (e.g. Google Earth Engine, Gorelick et al. 2017), API client libraries (e.g. GEO-DAB IIA 2018) and web-responsive applications to access and represent complex multidimensional data structures (e.g. NetCDF/HDF data encoding format) could help overcome the current download-only access to data records. This would allow the automated ingestion of datasets in data processing pipelines, which could pave the way to more user-engaging applications and information products.
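The kind of spatiotemporal analytics mentioned above (regional means, time series and trends) can be illustrated with a small self-contained sketch. The gridded values are invented for the example, the regional mean is unweighted (a real analysis would normally weight grid cells by area), and the function names are assumptions.

```python
def regional_mean(grid):
    """Unweighted mean of a 2-D grid, skipping missing cells (None)."""
    values = [v for row in grid for v in row if v is not None]
    return sum(values) / len(values)


def linear_trend(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Invented yearly 2x2 grids of a temperature-like variable; None = no data.
years = [2000, 2001, 2002, 2003]
grids = [
    [[14.0, 16.0], [15.0, None]],
    [[14.5, 16.5], [15.5, None]],
    [[15.0, 17.0], [16.0, None]],
    [[15.5, 17.5], [16.5, None]],
]

series = [regional_mean(g) for g in grids]  # one regional mean per year
trend = linear_trend(years, series)         # change per year
print(series)
print(trend)
```

An analysis-ready catalogue or API client would let users run this sort of computation against the hosted records directly, instead of first downloading every file.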

5. Conclusion

Our study provided an overview of major initiatives that support access to dedicated ECV products, with a focus on records that have a strong contribution from satellite-based EOs. Different indicators were adopted to assess the data accessibility and usability of ECV data records from an end-user perspective. We identified that the lack of standard metadata information, the scarcity of data dissemination via web services and the deficiency of user support and documentation hinder the discovery, access, and use of ECVs. In addition, multiple data records are available from different data providers that frequently distribute data records without metrics or information about the level of compliance with GCOS requirements or the maturity of the data records.

Improving the discoverability, interoperability, and accessibility of quality-controlled ECV data records using web services could optimize the regular update of inventories of current and planned ECVs and related FCDRs, CDRs, and TCDRs. This could reduce the time invested in finding datasets and could allow linking data records with relevant documentation and user feedback. Furthermore, this could facilitate the browsing, filtering, retrieval and ingestion of these records into automated data processing pipelines, in order to help the generation of community-driven tools, libraries (e.g. Python client libraries) and added-value applications across catalogues. Overall, the improvement of ECV data records is instrumental for better studying how, where, and when climate system characteristics (oceans, land, atmosphere) have changed, and for proposing data-driven mitigation and adaptation measures, while also supporting the development of regional applications to target specific climate services requirements and policy frameworks.

Acknowledgements

We thank Anahí Sebastián Rico Chinchilla and Caryn Saslow for helpful comments on an earlier draft, and two anonymous reviewers for very constructive comments. We acknowledge the support of the European Commission H2020 ERA-PLANET/GEOEssential project No. 689443.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by Horizon 2020 Framework Programme [grant number 689443].

References