Abstract
The HUMBOLDT project aims to implement a Framework for the harmonisation of data and services in the geoinformation domain, under the Infrastructure for Spatial Information in Europe (INSPIRE) Directive and in the context of the Global Monitoring for Environment and Security (GMES) Initiative. The two-pronged approach of HUMBOLDT comprises a technical side of software framework development and an application side of scenario testing and validation. Among the HUMBOLDT Application Scenarios designed to demonstrate the capabilities of the Framework is the one covering Protected Areas themes and use cases. It aims to transform geoinformation, managed by park authorities, into a seamless flow that combines multiple information sources from different governance levels (European, national, regional), and exploits this newly combined information for the purposes of planning, management and tourism promotion. The Scenario constitutes a further step towards the integration of monitoring systems envisaged in the vision of Digital Earth. The Protected Areas Scenario provides examples of the use of the HUMBOLDT tools in Desktop and Web GIS environments, and sets up a server environment that exploits the HUMBOLDT harmonisation framework, taking into account user requirements and needs and easing the road to the establishment of a European Spatial Data Infrastructure (ESDI).
Introduction
Harmonised geoinformation is a basic need for creating a reliable and efficient Spatial Data Infrastructure (at scales ranging from regional to global), in which different data sources and different services for discovery, portrayal and retrieval of geodata are a fundamental asset (Annoni and Smits 2003, Bernard et al. 2005).
The Digital Earth vision (Gore 1999) envisages new perspectives and points of view for all the scientific disciplines and technical sectors linked to geoinformation sharing and utilisation, towards a better knowledge and management of our planet. A crucial point in the Digital Earth vision is the integration of services, tools and data (Grossner et al. 2008).
At the European level, the road to geoinformation sharing and integration is closely tied to the implementation of a European Spatial Data Infrastructure (ESDI) that follows the guidelines contained in the INSPIRE (Infrastructure for Spatial Information in Europe) Directive of the European Union (Commission of the EC 2007). The INSPIRE Directive is a regulatory framework for European Union geodata that aims at enforcing the use of best practices and of easy-to-use, integrated interfaces for the benefit of geoinformation users (virtually every citizen of the EU), thus making the creation of an ESDI possible. In this context, an ESDI is to be composed of a set of interoperable, interacting services, following the Service Oriented Architecture (SOA) paradigm (MacKenzie et al. 2006). Such an architecture matches well the distributed responsibilities for service provision and data management in the geoinformation sector. For an SOA to work, it is essential to select or build on a group of interface standards that are mutually interoperable and complementary (Smits and Friis-Christensen 2007). Any new SDI component should therefore be interoperable with the existing services codified by geospatial standardisation organisations, in particular the Open Geospatial Consortium (OGC) at the international level (McKee 2001). Among these, the Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), Web Processing Service (WPS) and Catalogue Service for the Web (CSW) are the most relevant for the geoinformation domain covered by the HUMBOLDT project and this work. Besides these well-established standards, there are various areas where standardisation has not yet come very far or where there are multiple competing standards (Ziegler and Dittrich 2004).
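To make the role of these standard interfaces more concrete, the following minimal sketch shows how a client might assemble key-value-pair GET requests for two of the OGC services named above. It is an illustration only: the endpoint URL, the helper name `build_ogc_request` and the feature type `pa:ProtectedSite` are assumptions introduced here and do not refer to any HUMBOLDT deployment.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_ogc_request(endpoint, service, request, version, **params):
    """Build a key-value-pair (KVP) GET URL for an OGC web service."""
    query = {"SERVICE": service, "REQUEST": request, "VERSION": version}
    # Additional parameters are passed through with upper-cased keys.
    query.update({key.upper(): value for key, value in params.items()})
    return endpoint + "?" + urlencode(query)


# Hypothetical endpoint, used for illustration only.
ENDPOINT = "https://example.org/ows"

# Ask a WMS for its capabilities document.
wms_capabilities_url = build_ogc_request(
    ENDPOINT, "WMS", "GetCapabilities", "1.3.0"
)

# Ask a WFS for up to ten features of a hypothetical feature type.
wfs_feature_url = build_ogc_request(
    ENDPOINT, "WFS", "GetFeature", "2.0.0",
    typeNames="pa:ProtectedSite", count=10,
)

print(wms_capabilities_url)
print(wfs_feature_url)
```

Because every service is addressed through the same request pattern, a new component only needs to expose these interfaces to become usable by existing clients, which is the interoperability argument made above.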
The HUMBOLDT project focused on the objective of enabling harmonisation instances, especially the ones not covered by existing standardised procedures, as a whole; that is to say from harmonisation instances definition to harmonisation performing. Its developments must therefore be very flexible with respect to their configuration and modes of deployment to fit into existing Spatial Data Infrastructures. The main advancements and outcomes in enabling and making easier geodata harmonisation are related to: management of cross-border geoinformation, enabling of cross-domains geoinformation integration, and enhancement of geodata access.
The data harmonisation process involves both technical and non-technical aspects. This work is mainly centred on the technical and scientific aspects, although a strict distinction between the two often cannot be made because of existing interdependencies: a harmonisation process can be executed either on an organisational level (through a common agreement of all parties involved in the harmonisation process) or on a technical level (by providing all involved parties with a tool that supports a harmonised solution).
In recent years, several projects, technical tools and scientific works have addressed harmonisation issues. An outcome of the RISE project (Eriksson and Hartnor 2006, Portele 2006) was a general data harmonisation methodology that can be applied to spatial data, through the development of harmonised product specifications following steps carried out ‘manually’, that is by human experts.
Other projects dealt mainly with ontological issues, for example the HarmonISA project (Hall 2006), which aimed at developing a set of tools to semi-automatically integrate different land-use datasets through an expert-driven approach supported by software such as ontology editors.
On the other hand, the application of the Model Driven Approach (MDA) to the definition and implementation of harmonisation processes has given momentum to the development of tools and languages for application schema mapping, one of the main issues in geodata harmonisation. One example is UGAS-ShapeChange, a tool for converting ISO 19109 UML models to GML Application Schemas. An important step forward in the state of the art for heterogeneous geodata transformation and transfer was marked by INTERLIS (Gnägi et al. 2006), a conceptual schema language (CSL; a customised profile of ISO 19103/19109) based on MDA (Model Driven Architecture) as well as a system-neutral transfer format. INTERLIS facilitates data transfer between data stores with different data models via transformation to and from a common data model using a common interchange format, and it supports both spatial and non-spatial data. Some solutions have also been developed in an integrated web-based frame as OGC-compliant services useful in cross-border applications, namely the mdWFS (model driven Web Feature Service), which adds a theoretical approach to handling schema translation between data models on-the-fly (Donaubauer et al. 2006).
All those efforts in the field of geodata harmonisation have tackled only one harmonisation issue at a time (e.g. schema mapping, harmonised catalogue search services, language translation and ontology). The HUMBOLDT project, and the work described in this paper, delivered a framework that is both theoretical and a set of software tools that can handle the harmonisation process as a whole, tackling multiple harmonisation issues as instances of the same harmonisation process.
Within the scope of our work and after a first analysis of harmonisation-related studies and efforts, the following working definition for data harmonisation was taken into account in the HUMBOLDT project environment:
Geodata harmonisation implies and means the possibility to combine data from heterogeneous sources into seamlessly integrated, consistent and unambiguous information products, in an easy and repeatable way, adapted to the end-user's requirements and context. (Schulze Althoff and Giger 2009)
• Data format
• Data collection procedures/data quality
• Spatial reference system
• Data/conceptual model: structure and constraints
• Metadata model
• Nomenclature, classification, taxonomy, terminology/vocabulary, thesaurus, ontology (Tikunov et al. 2008)
• Scale, degree/amount of detail, extent (spatial, thematic, temporal)
• Portrayal (legend/classification, style)
• Processing functions, their parameters and formulas/algorithms
Geodata harmonisation within the HUMBOLDT project
The HUMBOLDT project, established with EU funding and support, contributes to the implementation of an ESDI that integrates the diversity of spatial data available for a multitude of European organisations. The main goal of HUMBOLDT is to enable organisations to document, publish and harmonise their spatial information in a way that is as seamless as possible. The software tools and processes created demonstrate the feasibility and advantages of the INSPIRE Directive. Moreover, the overall outcomes of the project contribute to the technological advance of data harmonisation and sharing, thereby fitting into the vision of Digital Earth, especially through the joint EU and ESA initiative Global Monitoring for Environment and Security (GMES), which is the main European contribution to GEOSS (European Commission 2008, GEO 2009). The contribution of the project to GMES consists of demonstrating the usefulness of harmonisation capabilities in application areas that are or will be covered by GMES services, especially downstream services.
The approach of the HUMBOLDT project to solving emerging harmonisation issues and user needs has a two-pronged structure (see Figure 1) and focuses on integrating concrete application requirements with technical innovations, best practices and research results (Villa et al. 2008).
Figure 1. The approach of the HUMBOLDT project to geodata harmonisation with its two-pronged structure made up of a technological side and application momentum converging into the implementation of a Framework architecture.
![Figure 1. The approach of the HUMBOLDT project to geodata harmonisation with its two-pronged structure made up of a technological side and application momentum converging into the implementation of a Framework architecture.](/cms/asset/e25a1841-4a7e-45dd-bf07-7cc05fc81f55/tjde_a_585183_o_f0001g.gif)
The benefits of spatial data harmonised with HUMBOLDT tools are not limited to reducing the effort and cost of implementing the future ESDI and to easier geodata handling. On the technical and scientific side, the project has also brought significant advances in:
• Support to cross-border geoinformation management.
• Enabling geoinformation sharing and integration across application domains, thus affecting scientific fields of analysis not only in directly related geosciences, but also bringing a wide benefit to socio-economic studies, statistical analysis, civil protection and security, medical and epidemiological issues (Vanderhaegen and Muro 2005).
• Overcoming limitations inherent in spatial data availability, from incompatible data formats to semantic gaps caused by missing data and metadata models.
• Enabling access to geospatial services that are currently unavailable or unusable with existing technological solutions, because of inconsistencies in data definitions and formats or a lack of data documentation and modelling.
• Creation of new information through the access to additional data and services, affecting the decision-making process and making it more comprehensive (in the fields of social security, environmental issues and infrastructure planning, for instance).
• Enhancing and facilitating access to and distribution of data and services.
• Border Security: Effective Border Control and Security in Rural Areas
• Urban Planning: European Urban Management Information Systems
• Urban Atlas: Enforcing GMES in Urban Areas Mapping Core Services
• Forest: Saxony & Czech Cross-Border Forest Scenario
• Protected Areas: Management of Protected Areas
• ERiskA: Environmental Risk
• Transboundary Catchments: Cross-border Water Basin Management
• Ocean: Oil/Contaminants Spill Crisis Impact and Management, expanding the experience gained with SeaDataNet (Schaap and Lowry 2010)
• Atmosphere: Integration for Atmospheric Data Distribution
The HUMBOLDT framework for geodata harmonisation
At the core of the development work described here stands the HUMBOLDT Framework, a software architecture targeted at performing harmonisation instances and processes. This software framework is the shell for the various data harmonisation application scenarios tested during the project.
The Architecture of the HUMBOLDT Framework, SOAP-based, has been centred on an approach that comprises the fundamental part of a Mediator Service (see ), a proxy that acts as controller of the service components that are part of the Framework for service integration. It offers a number of standard OGC interfaces like WMS, WFS or WCS to clients. The HUMBOLDT Mediator Service combines a number of different functionalities and hides them behind standard OGC interfaces. It is a Workflow Engine, capable of executing chains of geoprocessing services, as well as a Feature Portrayal Service, dynamically portraying Features and serving them via the OGC WMS interface. The Mediator Service orchestrates a set of more specialised interfaces that are also integrated in the overall architecture (Fitzner and Reitz 2009), and all together they compose the framework itself, as shown in . These components are briefly described as:
The HUMBOLDT GeoModel Editor
An easy-to-use editor for application experts, aiming at collecting all required information on the geodata inputs. The HUMBOLDT GeoModel Editor is producing and providing a graphical and a textual representation of the data model containing basic spatial data types. It was implemented on a model-based framework (Eclipse) and thus is able to support the so-called vertical mapping, which is the serialisation to transfer standards or other representations (e.g. XMI, GML, INTERLIS, ISO19131).
The HUMBOLDT Alignment Editor (HALE)
A tool with a rich graphical user interface for defining mappings between concepts in conceptual schemas (application schemas created with the HUMBOLDT GeoModel Editor), as well as for defining transformations between attributes of these schemas. The HUMBOLDT Alignment Editor has several properties that make it stand out from other data transformation definition tools (Reitz and Kuijper 2009).
The Workflow Design and Construction Service (WDCS)
The Workflow Design and Construction Service is a tool with a graphical user interface that allows users to register, manage and graphically compose geoprocessing components into workflows. These workflows are schematic representations of the harmonisation process chain and are mainly composed of WPS, either encapsulated in the framework or external. The Workflow Frontend therefore offers functionality similar to, for example, the GUI of the ArcGIS Model Builder or a BPEL Workflow Designer.
The Geodatabase/Repository service
Aside from processing tools and transformation components, the Geodatabase and Repository services support enhanced catalogue search through the Information Grounding Service (IGS) and the Model Repository. The IGS is a cascading catalogue in the sense that it holds information on other catalogues and metadata stores, in addition to metadata of data sources. What makes the Information Grounding Service unique is that it returns not only those data sources that directly satisfy a user request but also those that can potentially be transformed to satisfy it. This is a new concept in catalogue services for geospatial information, and it supports the vision of matching the formulation of demand against the provision of directly usable information. The Model Repository is a service component that allows maintenance of application schemas (e.g. those created with the HUMBOLDT GeoModel Editor) and of the mappings between them (e.g. those created with the HUMBOLDT Alignment Editor) for future reuse.
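The search behaviour described for the Information Grounding Service can be illustrated with a small sketch. This is not the IGS implementation: the class `DataSource`, the function `ground_request`, the schema names and the representation of mappings as (source schema, target schema) pairs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    schema: str  # name of the application schema the source is expressed in


def ground_request(requested_schema, sources, mappings):
    """Return (direct, transformable) matches for a requested schema.

    direct: sources already expressed in the requested schema.
    transformable: sources for which a mapping to the requested schema
    is registered, so that they could be transformed on demand.
    """
    direct = [s for s in sources if s.schema == requested_schema]
    transformable = [
        s for s in sources if (s.schema, requested_schema) in mappings
    ]
    return direct, transformable


# Hypothetical catalogue content.
sources = [
    DataSource("regional_parks", "RegionalParks"),
    DataSource("national_sites", "ProtectedSites"),
    DataSource("road_network", "Roads"),
]
# Hypothetical mappings, as might be kept in a repository of alignments.
mappings = {("RegionalParks", "ProtectedSites")}

direct, transformable = ground_request("ProtectedSites", sources, mappings)
print([s.name for s in direct])
print([s.name for s in transformable])
```

In this toy catalogue the request is satisfied directly by one source and indirectly by a second one for which a mapping exists, while the unrelated road dataset is not returned.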
A set of Transformation Services
The working facade of the harmonisation framework, the Transformation Services are in charge of the actual transformation of the data, following the harmonisation requests made using other components. They mainly consist of WPS, such as:
• Coordinate reference system transformation service: The Coordinate Transformation allows transforming coordinates between various geographic reference systems, i.e. geoids and projections.
• Conceptual schema transformation service: The Conceptual Schema Transformer is able to apply a schema transformation to a source dataset expressed in a certain Application Schema.
• Multiple-representation merging service: The Multiple Representation Merging Service (MRM) is capable of fusing Features of datasets with a spatial overlap, such as along a common border, where water bodies are part of both datasets.
• Edge-matching service: The Edge Matching Service aligns edges and points of vector geometries so that they are gapless.
• Language transformation service: The Language Transformation transforms single terms from one language to another and enables the language transformation of datasets.
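As an illustration of what the edge-matching service in the list above has to achieve, the sketch below snaps the vertices of one border line onto the vertices of a reference line when they lie within a tolerance. Simple nearest-vertex snapping is used here as a stand-in; the algorithm actually used by the HUMBOLDT Edge Matching Service is not described in this paper, and the function name, coordinates and tolerance are assumptions.

```python
from math import hypot


def snap_vertices(line, reference, tolerance):
    """Snap each vertex of `line` to the nearest `reference` vertex.

    A vertex is moved only if the nearest reference vertex lies within
    `tolerance`; otherwise it is kept unchanged. `reference` is assumed
    to be non-empty.
    """
    snapped = []
    for x, y in line:
        nearest = min(reference, key=lambda p: hypot(p[0] - x, p[1] - y))
        if hypot(nearest[0] - x, nearest[1] - y) <= tolerance:
            snapped.append(nearest)
        else:
            snapped.append((x, y))
    return snapped


# Two digitisations of the same border, with small gaps between them.
border_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
border_b = [(0.0, 0.05), (1.0, -0.04), (2.0, 0.5)]

aligned_b = snap_vertices(border_b, border_a, tolerance=0.1)
print(aligned_b)
```

The first two vertices of the second border are pulled onto the reference, while the third lies outside the tolerance and is left untouched, which is the kind of case that would need expert review.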
Because the harmonisation software framework is highly modular and flexible, its deployment depends on the user's needs and requirements, covering a broad range of harmonisation issues (e.g. format conversions, multilinguality, schema mapping). Nevertheless, the feasibility and efficiency of the transformation and harmonisation process depend strongly on the availability of a description of the transformation rules at the conceptual schema level. It is therefore of great importance to feed domain and expert knowledge into the implementation of the HUMBOLDT Framework, which is done through continuous interaction between developers and the application experts involved in HUMBOLDT Scenarios.
An application scenario introduction: Protected Areas
Among the HUMBOLDT Application Scenarios (listed in the previous section) designed to demonstrate the capabilities of the Framework, both as a test bed for harmonisation components in real-world conditions and as user community application momentum, is the one covering Protected Areas themes and use cases. The Protected Areas Scenario exploits the valuable background of activities related to Spatial Data Infrastructures for Protected Sites in the framework of NATURE-GIS and NATURA 2000 that have delivered knowledge and expertise to the implementation of guidelines for INSPIRE Data Specifications on Protected Sites (INSPIRE Drafting Team DS 2010).
HUMBOLDT Protected Areas Scenario aims to transform geoinformation, managed by park authorities, into a seamless flow that combines multiple information sources from different governance levels (European, national, regional), and exploits this newly combined information for the purposes of planning, management and tourism promotion.
A Protected Areas Scenario Demonstrator Portal has been developed, and one application case for the Scenario is described in this work. A Desktop and Web GIS environment was set up together with a test server environment, examples of data harmonisation in this domain were created, and the resulting tests with two HUMBOLDT Web Processing Services have been documented.
The Protected Areas Scenario is structured into Use Cases, each applying different harmonisation instances to the requirements of different users of the harmonisation capabilities provided by the HUMBOLDT Framework. In detail, the Use Cases are:
Management of a Protected Area
This use case refers to the management of the area. Users of geographic information are planners and officers, but the management of a Protected Area is a decision-maker's responsibility. The objective is to embed geographic information in a seamless flow that gathers information from all available sources and exploits it for planning and management. The main task is to create plans and manage the protected area.
Tourism valorisation in a Protected Area
This use case refers to the promotion of the area and implies access to geographic information especially by citizens and commercial operators, who are also final users browsing for tourism information. The objective is to embed geographic information in a seamless flow that gathers information from all available sources and exploits it for promotion. The main task is to exploit the area in the best way for enjoying what nature offers.
Applications and use cases developed in the Protected Areas Scenario have a European focus on INSPIRE-compliant data provision, that is the creation of a new data structure for protected areas datasets based on the INSPIRE schema for Protected Sites. The Scenario delivers this side by side with examples of a schema harmonisation case that uses a data model specifically created in the Protected Areas Scenario (see ). The Scenario has explored its relation to the INSPIRE Data Specification for Protected Sites, and Scenario activities contributed actively to putting INSPIRE themes into practice, especially the Annex I Protected Sites data theme.
The main harmonisation issues of the Scenario are related to the mapping of different schemas and transformation of the structure and geometry of datasets of Protected Areas. Portuguese, Spanish and Italian datasets used in the Scenario are currently not based on a Data Model. The creation of a Common Protected Areas Target data model (or the use of existing ones like INSPIRE) is a major need for the schema mapping and transformation tasks. Also, high priority is given to spatial and thematic consistency (i.e. seamless and consistent map layers using edge matching techniques).
Following the open-source approach of the HUMBOLDT Framework, all the operations and processing steps within the Protected Areas Scenario have been performed using open-source tools, from the pre-processing of the datasets to the visualisation of final results. A schema of the architecture of the HUMBOLDT components deployed for the Protected Areas Scenario is given in Figure 4.
Figure 4. Overview of the HUMBOLDT Framework Components architecture, as utilised in Protected Areas Scenario.
![Figure 4. Overview of the HUMBOLDT Framework Components architecture, as utilised in Protected Areas Scenario.](/cms/asset/7700ea82-7413-40fc-a4ff-40bc860bd870/tjde_a_585183_o_f0004g.gif)
The Scenario develops the harmonisation process via active engagement with various stakeholders at the national and transnational levels including national authorities and European agencies. Protected Areas Scenario actors are people/institutions using geodata and geoinformation for preservation, sustainable exploitation, tourism and science/education.
A schematic classification of actors/users involved in the Protected Areas Scenario is structured as follows (the classes of users are summarised in Table 1):
Table 1. Classification of geoinformation users and characteristics in relation to the HUMBOLDT Framework and Protected Areas Scenario.
End users
The end users are involved in browsing geoinformation (aggregated information: the lowest level of access) or geodata (information elements). They are decision makers, tourism operators or citizens (the last ones intended as persons). This group can be split into two categories.
End users of geodata
They are the decision makers who need to access information for taking decisions. In general they browse it but do not process it digitally. Their main needs are: (1) access via a friendly user interface in order to browse data, (2) efficient access to and handling of heterogeneous documents, (3) presenting and discussing the debated issues with public administrators or, vice versa, with citizens.
End users of geoinformation
They are the citizens (as persons) who need to access geoinformation for participation/awareness, personal exploitation, education: they are the final recipients of information by the stakeholders of protected areas, who only browse information and do not have any specific technical skill.
Data integrators
The data integrator is responsible for collecting and analysing relevant data and for delivering derived information in different forms (verbal, reports, prepared maps) to different user groups (e.g. the End User of geodata). They are planners/officers and/or scientific researchers, that is, users involved in data processing. Their main needs are: (1) producing plans at the various levels, (2) exchanging information with other departments, (3) reporting to the other levels of responsibility, (4) communicating to citizens, (5) exporting the outputs of models into web services as structured and complex information.
Data providers
They provide data, ‘on catalogue’ or tuned to specific uses. We consider in this class only commercial operators, because Administrations, which play a major role in data providing, are considered as ‘Data custodian’ and not as a mere provider. They can be divided into Geodata providers and Geoservice providers and they must be able, overall, to deal with all kinds of input and release products according to the requested output. They do not represent their specific own needs but have to meet all needs arising from the other actors.
Data custodians
This class includes people or institutions providing data (harmonised or not), adapted to given standards. It includes different kinds of actors and in general can be considered a ‘super-set’ of data providers. In several cases they are responsible not only for data production (as is the case for data providers) but also for the whole life cycle of geodata: production and documentation (metadata), data modelling and compliance with standards, maintenance and update. At the protected areas level, they are mainly producers of the basic maps and/or producers of the thematic geoinformation related to nature conservation. Data Custodians are responsible for the harmonisation of the available data (in case they come from different regions), for the creation of specific application profiles in case of complex and multilateral tasks, and for the creation of web services for provision of data.
Specifically, the Protected Areas Scenario is intended to provide harmonisation support for the interaction between various levels of work and administration: management bodies, local stakeholders, national authorities, European agencies, cross-border administrative bodies. The needs of these users for harmonised data in the scope of the Protected Areas Scenario, described above, are also summarised in Table 2.
Table 2. Harmonisation issues and needs in the HUMBOLDT Protected Areas Scenario.
The process of data harmonisation aims to make the information shared by the different data providers interoperable. It is important to distinguish data harmonisation on different levels (conceptual schema, logical schema or physical schema). The Scenario relies on a good and representative catalogue of datasets to enable understanding of harmonisation issues and the use of the HUMBOLDT tools, as well as of the need for using and integrating several data layers. A number of interoperability and data harmonisation issues were addressed within the work. The following data harmonisation requirements have been identified by users of protected areas geodata:
• Data formats: There is a need for the creation or modification of Web Services (WMS, WFS) with standardised syntax.
• Spatial reference systems: There is a need for a common reference system.
• Metadata profile: Different metadata profiles were identified for the data made available for the Scenario. For instance, Portuguese metadata is based on the ISO 19139 standard and Italian metadata uses the ISO 19115 profile.
• Conceptual schemas (data models): Since the datasets studied in our Scenario differ in data structure, there is a need for a Common Protected Areas Target data model for heterogeneous data from different protected areas data providers. The approach used is to be as compliant as possible with the INSPIRE Protected Sites data model.
• Classification schemas: Datasets have been created on different classification schemes.
• Scale/resolution: It is important to be able to deal with the different planning and management levels.
• Spatial consistency of data: The geometry of real-world objects must be consistent between different datasets.
• Multiple representation of the ‘same’ spatial objects.
• Terminology and Multilinguality support.
In this context, it was a crucial fact that the development of scenario services and data modelling was meant to be compliant with existing standards (INSPIRE, OGC, UML, etc.), which suggests that it would be possible for the external geospatial community to reutilise the existing components and/or extend them for their customised purposes (e.g. introducing new services, adding new data sources, etc.). This makes a strong point in assessing the transferability and efficiency of HUMBOLDT outcomes, especially for the applications delivered through Scenarios.
From the point of view of the benefits of geodata harmonisation for Protected Areas, the results are very promising. They show the relevance of the Scenario demonstrators in achieving HUMBOLDT objectives and solving specific harmonisation problems, and their usefulness to the INSPIRE and GMES communities. In conclusion, HUMBOLDT Scenarios, the Protected Areas Scenario among them, provide various communities of geodata users with proof of concept that a subset of the identified harmonisation problems can be solved with the HUMBOLDT Framework in an easy and accessible way. In addition, the HUMBOLDT Framework and the Scenario demonstrators have established a foundation for solving the other major geospatial data harmonisation problems tackled during the lifetime of the project, and the flexibility of the architecture allows it to be adapted and reused for harmonisation issues not yet covered during the HUMBOLDT project.
An application scenario example in practice: Protected Areas schema alignment
The Scenario is tested in three areas: one between Portugal and Spain, covering the Douro River Natural Park in Portugal and the Arribes del Duero Park in Spain; one covering the protected areas of the Community of Castile and León, one of the 17 autonomous communities of Spain; and one in Italy, covering the Beigua Regional Park in Liguria. The example of harmonisation with HUMBOLDT presented here, which performs semi-automatic schema mapping and alignment with the HUMBOLDT Alignment Editor (HALE), uses the Protected Areas dataset of the second site listed above: the natural areas of the Community of Castile and León, in Spain.
One of the main objectives of the HUMBOLDT project is to provide tools to map and transform complex database and application schemas. In this sense, the work of the Protected Areas Scenario has been focused on harmonising Protected Areas data from various countries using the HUMBOLDT Alignment Editor.
The example introduced here makes use, in particular, of the following components of the HUMBOLDT Framework, briefly described in previous sections:
• The HUMBOLDT Alignment Editor (HALE), which helps us map and transform complex database and application schemas.
• The Conceptual Schema Translation Service (CST), a Web Processing Service for transforming data from one application schema to another.
In the HUMBOLDT Protected Areas Scenario, data from Portugal, Italy and Spain have been used. The Scenario conceptual data model is based on the relevant data covering the test sites, and the INSPIRE Protected Sites data model has been used as a reference. We have tried to make the model, features and attributes as similar as possible to the INSPIRE model. This means that, when a HUMBOLDT Scenario schema attribute (or feature) shares its meaning with an INSPIRE attribute, we have given our attribute the same name as in the INSPIRE model.
Once the target data schema/model is defined, HALE (briefly introduced in the previous section) helps in establishing the mapping rules for the classes and attributes of a source to a target conceptual schema.
The first step in using HALE for schema mapping, a crucial harmonisation step for geodata, is to load the schemas in the HALE Schema Explorer. We start by importing the Protected Areas dataset into HALE as the source schema; in this case it is provided in vector format (shapefiles).
After loading the source schema we can also load the source data, which lets us view cartographic representations of the source data and of the transformed data alongside each other. We then load the target schema, which in this example is a specifically defined HUMBOLDT Protected Areas schema.
As a second step, after inspecting the schemas, we map the elements by selecting them in the Schema Explorer. Once the mapping is performed using HALE, the matching table created can be used for applying the transformations in the Schema Explorer, after selecting the appropriate mapping function. In this case we used the ‘Rename Attribute’ function for the attribute mappings. To use the ‘Rename Attribute’ function, select the attribute in the source schema that you would like to rename, then select the element in the target schema that the attribute should be copied to. We can also use the ‘Attribute Default Value’ function to fill a field that has no data in the source, such as ‘IUCNCategory.’ When running the ‘Attribute Default Value’ function, a list of available values appears; in this case we chose ‘Protected Landscape/Seascape.’ The results of the requested mapping operations can be checked interactively: the features in the ‘Transformed Data’ view are already transformed using the alignment mapping, and by expanding their attributes you can view the values HALE assigns to them. Figure 5 shows an overview of the HALE interface and visualisation tools.
Figure 5. Example of Schema Mapping and Alignment with HALE, based on Protected Areas data models and attributes. The features in the ‘Transformed Data’ view are already transformed using the alignment mapping. For every attribute, you can view what value HALE assigns to them. In the figure you can see how the data structure has changed for the ‘name’ attribute of a given Protected Area. In this case we use the dataset for the Protected Areas (Red de Espacios Naturales) in the Community of Castile and León, in Spain.
![Figure 5](/cms/asset/fe25213c-19eb-45c0-98bb-89d786112337/tjde_a_585183_o_f0005g.jpg)
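To make the two mapping functions more concrete, the following minimal Python sketch mimics what a ‘Rename Attribute’ rule and an ‘Attribute Default Value’ rule do to a single feature. It is an illustration only and does not reproduce HALE's internal alignment format or API: the rule structure, the `transform` function and the source attribute name `NOMBRE` are assumptions introduced here, while ‘name’, ‘IUCNCategory’ and ‘Protected Landscape/Seascape’ come from the example above.

```python
from typing import Any, Dict, List

# Hypothetical alignment table: one rule per target attribute.
# 'NOMBRE' is an assumed source attribute name, used only for illustration.
ALIGNMENT: List[Dict[str, Any]] = [
    {"function": "rename", "source": "NOMBRE", "target": "name"},
    {"function": "default", "target": "IUCNCategory",
     "value": "Protected Landscape/Seascape"},
]


def transform(feature: Dict[str, Any],
              alignment: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply every alignment rule to one source feature."""
    target: Dict[str, Any] = {}
    for rule in alignment:
        if rule["function"] == "rename":
            # Copy the source value unchanged under the target attribute name.
            target[rule["target"]] = feature.get(rule["source"])
        elif rule["function"] == "default":
            # Fill a target attribute that has no counterpart in the source.
            target[rule["target"]] = rule["value"]
        else:
            raise ValueError(f"unsupported mapping function: {rule['function']}")
    return target


# A made-up source feature; real features would also carry a geometry.
source_feature = {"NOMBRE": "Example Natural Park"}
target_feature = transform(source_feature, ALIGNMENT)
print(target_feature)
```

Keeping the rules in a table that is separate from the code applying them mirrors the division of work described in this section, where the alignment is defined in one tool and executed by another.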
After the schema alignment is performed, the mapping rules delivered by HALE are used as input for schema translation and transformation to the target schemas, using the HUMBOLDT Conceptual Schema Translation Service (CST). Its main feature is the ability to apply a schema transformation to a spatial dataset in order to provide another dataset modelled in the target application schema. Hence, CST actually performs the geospatial transformations defined beforehand, for instance with the HUMBOLDT Alignment Editor (HALE). From the design point of view, CST can be accessed in two different ways: as a Web Processing Service (WPS) through any WPS client, such as Snowflake, Jump or uDig, or through the HUMBOLDT Mediator Service (in this case CST 3-4 can be a library or a WPS).
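As a rough illustration of the WPS access path, the sketch below builds a key-value-pair Execute request of the kind defined by the OGC WPS 1.0.0 standard. It is not taken from the CST documentation: the endpoint URL, the process identifier `SchemaTranslation` and the input names `alignment` and `sourceData` are placeholders, and the inputs a real CST deployment expects, and how they are encoded, may well differ.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def build_wps_execute_url(endpoint: str, process_id: str,
                          inputs: Dict[str, str]) -> str:
    """Build a WPS 1.0.0 key-value-pair Execute request URL."""
    # DataInputs is a semicolon-separated list of name=value pairs,
    # so this simple sketch assumes values contain no semicolons.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "datainputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Placeholder endpoint, process identifier and input names (assumptions).
url = build_wps_execute_url(
    "http://example.org/cst/wps",
    "SchemaTranslation",
    {"alignment": "alignment.xml", "sourceData": "protected_areas.gml"},
)
print(url)

# Decode the query string again to check what a server would receive.
params = parse_qs(urlsplit(url).query)
print(params["datainputs"][0])
```

A generic WPS client would assemble an equivalent request on the user's behalf; the sketch only shows which pieces of information such a request has to carry.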
Summary and outlook
Within the Digital Earth vision, a crucial issue is the interoperability and integration of services, tools and data in a wide range of domains and uses. In the European context, this vision is put into practice in several geoinformation fields, chief among them the activities related to ESDI implementation, following the guidelines of the INSPIRE Directive.
The HUMBOLDT project has taken on the task of providing solutions for the harmonisation of geodata and geoservices, covering the harmonisation concept as a whole.
This paper has described the structure and approach of the HUMBOLDT project, giving a rationale for the HUMBOLDT Framework capabilities and discussing the outcomes of the HUMBOLDT Protected Areas Application Scenario.
The major aim of HUMBOLDT is the implementation of efficient, cost-effective, reliable, generic, interoperable and sustainable solutions for the issue of spatial data harmonisation and integration of geographic services in the framework of an ESDI. This objective is to be reached by putting INSPIRE principles into practice, applying international standards and using, as core reference, the users’ requirements and needs, finally establishing a community of users and developers.
The HUMBOLDT Framework is an architecture of software components and services aimed at managing the harmonisation process of geoinformation within the European context. The methodology of the HUMBOLDT development is based on a dual approach, comprising both a technological and an application side, and on an iterative process of implementation, during which the solutions found are tested and validated on the application side, composed of Scenarios that cover topics of great importance for GMES as well.
Geoinformation has become more and more relevant in supporting decision making during the last two decades. With the rise of geodatabases and digital information sharing, data models were developed and consistency rules were established. Although this migration required large investments, it has paid off in more streamlined procedures and higher data quality. This has reduced costs and increased the efficiency of data management within the data production organisations. A clear example is the reduction in delivery times and costs for providing basic maps and cartography when geoservices are used instead of paper map sheets.
It is sometimes claimed that data is used for making decisions and value is created when decisions are turned into action (Krek and Frank Citation2000). When building up a European SDI, we then face the problem that the data producers bear the major costs of the SDI implementation, while the data users, who reap the benefits, sit in other organisations. The more services in the field of data harmonisation become available, the more efficient and complete the availability and reliability of those data and services can be.
The HUMBOLDT Protected Areas Scenario applications have demonstrated in practice that the framework for data harmonisation described here is a working solution for tackling a variety of geodata harmonisation issues.
Automation is one key aspect in improving the cost efficiency of data management. Unified harmonised data delivered through an open-source, modular and integrable framework, like the software developed and described in this work, is one key element in this automation. Moreover, geospatial data is to be used for decisions, and harmonised data (in this case focused on Protected Sites) provide opportunities for making more efficient decisions (higher level of automation, less uncertainty about the semantics, etc.). In economic terms, we can therefore state that the transaction costs are reduced. Finally, since harmonised data reduce the uncertainty in semantic interpretation, they may also support more reliable decisions. The approach shown in this paper offers a variety of solutions for making this reliable decision making (through the exploitation of heterogeneous data in a unique frame) affordable (almost no cost, open source, strong community of developers, INSPIRE compliance, flexibility and availability of multiple solutions).
Those are the main achievements of the HUMBOLDT approach and tools for geodata harmonisation. Most importantly, HUMBOLDT adds to the state of the art in data harmonisation a framework that is both theoretical and made of software tools, and that can handle the harmonisation process as a whole, tackling multiple harmonisation issues as instances of the same process. This is what makes HUMBOLDT a relevant advancement in making data harmonisation as smooth as possible for the user.
The outcomes and benefits of the results provided by HUMBOLDT (Framework and Scenarios) are mainly related to reducing the effort of implementing the future ESDI, both from a technological point of view and through a cost-effective approach to geodata sharing, and can be summarised briefly in the following points:
• Support for cross-border geoinformation management (all over Europe and beyond)
• Enabling cross-domain applications through geoinformation sharing and integration (affecting the scientific fields of geosciences, social studies and security)
• Overcoming limitations in spatial data availability (incompatible data formats, semantic gaps, lacking data and metadata models)
• Enabling access to geospatial services not available using current technological solutions (due to inconsistencies in data definitions and formats or a lack of data documentation)
• Creating new information through access to additional data and services (thus making decision-making that exploits geospatial data easier)
• Enhancing and facilitating access to and distribution of data and services (affecting investments in the technological and commercial sectors)
The HUMBOLDT project highlights challenges for geosciences research, covering topics in data harmonisation at a continental scale. Nonetheless, the more relevant the challenges to face, the greater the benefits that will arise from their solutions: benefits for specialised and non-specialised users of spatial data, for policy-makers, planners and managers, for European citizens and their organisations, at a level that varies from local to regional to European. These benefits have been shown to be achievable using the described approach and tools for harmonisation.
Notes on contributors
Paolo Villa is an Environmental Engineer and has a PhD in Geodesy and Geomatics from the Polytechnic of Milan. He specialises in Remote Sensing and Environmental Analyses based on geoinformation. He works at the Institute for Electromagnetic Sensing of the Environment of the National Research Council of Italy on the topic of change detection methodologies and applications using mid-resolution satellite data for urban and flood monitoring studies, hyperspectral data processing and SDI management and implementation, including the field of geodata and geoservices harmonisation. His main expertise covers GMES topics in the context of the European SDI implementation.
Roderic Molina Perez is a Geographer and MSc in Geographic Information Technologies with more than 10 years of experience as a GIS technician and consultant, both in Italy and Spain. His current work is focused on geodata integration, e-learning and GIS projects at the European level related to the INSPIRE Directive. As Technical Manager he is involved in the development of key projects at GISIG, a non-profit international association on GIS in Italy. Catalan by birth, he currently lives and works in Genoa.
Mario A. Gomarasca is a researcher and expert in environmental information management and geomatics at the Institute for Electromagnetic Sensing of the Environment of the National Research Council of Italy. He is an Agronomist with expertise in environmental hazard and risk management and currently works on geomatics and geoinformation management and harmonisation. He acquired a specialisation at the International Institute for Geo-Information Science and Earth Observation (ITC), Enschede, The Netherlands (1987), and was Visiting Scientist at Purdue University, Laboratory for the Application of Remote Sensing (LARS).
Emanuele Roccatagliata is Director of the Association GISIG, Geographical Information Systems International Group, promoted in 1992 as a University Enterprise partnership for European co-operation in GIS technology and applications. He graduated in Physics and his role in GISIG is related to the technical aspects of defining and developing the projects, and to the production and revision of training contents. Over the years, he has paid special attention to the themes of nature conservation and protected areas in mountain environments. From 2002 to 2009 he was Secretary-General of ICCOPS, Landscape Natural and Cultural Heritage Observatory, a study centre dedicated to coastal management, heritage and landscape.
Acknowledgements
This paper was partially supported by EC FP6 project HUMBOLDT (Contract SIP5-CT-2006-030962).
References
- Annoni, A. and Smits, P.C. 2003. Main problems in building European environmental spatial data. International Journal of Remote Sensing, 24(20): 3887–3902.
- Bernard, L. 2005. The European geoportal – one step towards the establishment of a European spatial data infrastructure. Computers, Environment and Urban Systems, 29: 15–31.
- Commission of the European Communities, 2007. Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). Official Journal, L 108, 1–14.
- Donaubauer, A., et al., 2006. Model driven approach for accessing distributed spatial data using web services-demonstrated for cross-border GIS applications. Proceedings of the XXIII FIG Congress, 8–13 October 2006, Munich, Germany.
- Eriksson, H. and Hartnor, J., 2006. Data harmonisation requirements (RISE project). Available from: http://www.eurogeographics.org/eng/documents/RISE13_Data_Harmonisation_Requirements_v1.0.pdf [Accessed 20 April 2011].
- European Commission, 2008. Global monitoring for environment and security (GMES): we care for a safer planet. Communication from the Commission. Available from: http://www.gmes.info/pages-principales/library/reference-documents/?no_cache=1&cHash=8caf820e41ff3d7e8827392098d6dcd0 [Accessed 9 February 2011].
- Fitzner, D. and Reitz, T., 2009. A lightweight introduction to the HUMBOLDT framework V3.0. HUMBOLDT Public project report. Available from: http://www.esdi-humboldt.eu/files/0982-a5_2-d3__3_0__a_lightweight_introduction-fhg-igd-004-final.pdf [Accessed 9 February 2011].
- GEO-Group on Earth Observation, 2009. GEOSS strategic targets. Technical report. Available from: http://www.earthobservations.org/documents/geo_vi/12_GEOSS%20Strategic%20Targets%20Rev1.pdf [Accessed 9 February 2011].
- Gnägi, H.R., Morf, A., and Staub, P., 2006. Semantic interoperability through the definition of conceptual model transformations. 9th AGILE International Conference on Geographic Information Science, 20–22 April 2006, Visegrád, Hungary.
- Gore, A. 1999. The digital earth: understanding our planet in the 21st century. Photogrammetric Engineering and Remote Sensing, 65(5): 528.
- Grossner, K., Goodchild, M.F. and Clarke, K. 2008. Defining a digital earth system. Transactions in GIS, 12(1): 145–160.
- Hall, M., 2006. HarmonISA Land-Use viewer system: system documentation and handbook. HarmonISA. Available from: http://harmonisa.uni-klu.ac.at/
- INSPIRE Drafting Team Data Specifications, 2010. INSPIRE data specification on protected sites – guidelines. Technical report. Available from: http://inspire.jrc.ec.europa.eu/documents/Data_Specifications/INSPIRE_DataSpecification_PS_v3.1.pdf [Accessed 9 February 2011].
- Krek, A. and Frank, A. 2000. The production of geographic information – the value tree. Geo-Informations-Systeme-Journal for Spatial Information and Decision Making, 13(3): 10–12.
- MacKenzie, C.M. et al., 2006. Reference model for service oriented architecture 1.0. Technical report. Available from: http://docs.oasis-open.org/soa-rm/v1.0/soa-rm.pdf [Accessed 9 February 2011].
- McKee, L., 2001. OGC's role in the spatial standards world. Technical report. Available from: http://portal.opengeospatial.org/files/index.php?artifact_id=6207&version=1&format=pdf [Accessed 9 February 2011].
- Portele, C., 2006. Methodology & guidelines on use case and schema development (RISE project). Available from: http://www.eurogeographics.org/eng/documents/RISE15_Methodology-Guidelines_v1.0.pdf [Accessed 20 April 2011].
- Reitz, T. and Kuijper, A., 2009. Applying instance visualisation and conceptual schema mapping for geodata harmonisation. In: Advances in GIScience, Proceedings of the 12th AGILE conference, 2–5 June 2009, Hanover, Germany. Lecture Notes in Geoinformation and Cartography. Springer, 173–194.
- Schaap, D.M.A. and Lowry, R.K. 2010. SeaDataNet – Pan-European infrastructure for marine and ocean data management: unified access to distributed data sets. International Journal of Digital Earth, 3(1): 50–69.
- Schulze Althoff, J. and Giger, C., 2009. Concept of data harmonisation process. HUMBOLDT Public project report. Available from: http://www.esdi-humboldt.eu/files/0954-a7_0_d2__concept_of_data_harmonisation_processes-ethz-001-final.pdf [Accessed 9 February 2011].
- Smits, P.C. and Friis-Christensen, A. 2007. Resource discovery in a European spatial data infrastructure. IEEE Transactions on Knowledge and Data Engineering, 19(1): 85–95.
- Tikunov, V.S., Ormeling, F. and Konecny, M. 2008. Atlas information systems and geographical names information systems as contributants to spatial data infrastructure. International Journal of Digital Earth, 1(3): 279–290.
- Vanderhaegen, M. and Muro, E. 2005. Contribution of a European spatial data infrastructure to the effectiveness of EIA and SEA studies. Environmental Impact Assessment Review, 25(2): 123–142.
- Villa, P., Gomarasca, M.A. and Reitz, T. 2008. HUMBOLDT project for data harmonization in the framework of GMES and ESDI: introduction and early achievements. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXVII(B4): 1741–1746.
- Villa, P., Reitz, T., and Gomarasca, M.A., 2007. HUMBOLDT project: implementing a framework for geo-spatial data harmonization and moving towards an ESDI. Geoinformation in Europe. Proceedings of the 27th EARSeL Symposium, 4–7 June 2007, Bolzano, Italy. Amsterdam: Millpress, 29–36.
- Ziegler, P. and Dittrich, K.R., 2004. Three decades of data integration – all problems solved. In: IFIP congress topical sessions, building the information society, IFIP 18th world computer congress, topical sessions, 22–27 August 2004, Toulouse, France, Kluwer, 3–12.