
Enhancing the OGC WPS interface with GeoPipes support for real-time geoprocessing

Pages 48-63 | Received 31 Oct 2016, Accepted 11 Apr 2017, Published online: 05 May 2017

ABSTRACT

Real-time geospatial information is used in various applications such as risk management or alerting services. In particular, the rise of new sensing technologies increases the demand for processing data in real time. Today’s spatial data infrastructures, however, do not meet the requirements for real-time geoprocessing. The OpenGIS® Web Processing Service (WPS) is not designed to process real-time workflows. It has major drawbacks in asynchronous processing and cannot handle (geo) data streams out of the box. In previous papers, we introduced the GeoPipes approach to share spatiotemporal data in real time. We implemented the concept by extending the Message Queue and Telemetry Transport (MQTT) protocol with a spatial and a temporal dimension, which we call GeoMQTT. In this paper, we demonstrate the integration of the GeoPipes idea into the WPS interface to expose standardized real-time geoprocessing services. The proof of concept is illustrated with some exemplary real-time geo processes.

1. Introduction

Real-time phenomena are often monitored with sensors deployed in different environments. New concepts like the Internet-of-Things and Services (IoTS), Smart Cities or Industry 4.0 are built with these new sensor technologies. In addition, with the implementation of the sensor web concept, which states that sensors should be connected to the World Wide Web (WWW) in a standardized way, sensor data becomes accessible by means of the Internet (Jirka, Broering, and Stasch 2009). This easily accessible near real-time data opens numerous opportunities but also sets new requirements in the field of geoprocessing (McCullough, Barr, and James 2011).

Real-time geoprocessing systems are not a novel approach to exploit these opportunities. For instance, systems for real-time air quality or real-time health monitoring are implemented incorporating spatial and temporal dimensions (Resch, Blaschke, and Mittlboeck Citation2010). However, the geoprocessing components are often tailored to the application and do not use a uniform interface. But nowadays, service-oriented Sensor and Spatial Data Infrastructure (SSDI) are often deployed with web services standardized by the Open Geospatial Consortium (OGC). In these OpenGIS web service environments the Web Processing Service (WPS) defines the interface for geoprocessing services (Schut Citation2007). This interface is designed to build offline geoprocessing but is not able to cope with real-time geoprocessing and unbounded data streams. In this paper, we present an approach to process real-time spatiotemporal data with the WPS interface.

According to Appice et al. (Citation2014), one major challenge in processing real-time sensor data is to integrate (geo-) sensor networks (SNs) in computational infrastructures. We already introduced in a previous paper the concept of GeoPipes to share spatiotemporal data in real time between different distributed instances such as sensor nodes (Herle and Blankenbach Citation2016). The concept is implemented by extending the Message Queue and Telemetry Transport (MQTT) protocol with a spatial and temporal dimension, which we call GeoMQTT. Here, this mechanism is coupled with the WPS service to integrate real-time geo events published in the GeoPipes and computational functionalities.

The paper first gives an introduction about real-time geoprocessing and the WPS standard. We briefly describe the concept of GeoPipes and its implementation with GeoMQTT. Subsequently, the GeoPipes extension to the WPS standard is specified and for a proof of the concept some exemplary applications are shown. The paper ends with a conclusion discussing major challenges and future work.

2. Real-time geoprocessing

The diffusion of location-aware IoT devices increases the demand to process data streams in real time. Especially in time-critical applications such as monitoring infrastructures, the need for real-time analysis and response becomes relevant. In real-time geoprocessing, both dimensions, spatial and temporal, are used to retrieve knowledge and support decisions. However, combining the two dimensions in a real-time system brings numerous computational challenges and opportunities for collection, storage and especially processing. According to Nittel (2015), future real-time geoprocessing systems will most likely focus on spatiotemporal analysis instead of spatial analysis over snapshots of spatial data. Therefore, existing methods might require new algorithms and implementations to compute and deliver real-time results from much larger data sets. Typical tasks of these systems include real-time geospatial queries over large amounts of data streams while keeping up with incoming data, finding patterns in data streams, or computing cross-correlations with other streams or with historical data. For these systems, a formal foundation of time and spatiotemporal concepts is required.

McCullough, Barr, and James (Citation2011) define, for instance, a typology of real-time geoprocessing, which they draw from Worboys and Duckham’s (Citation2004) model about representing the dynamic nature of real world phenomena within Geographic Information System (GIS). These representations can be categorized into four different stages:

  1. Static: a single state of the real world like used in traditional GIS.

  2. Snapshot: a collection of timestamped states of dynamic phenomena.

  3. Object lifelines: since the timestamped states on the snapshot level are still static, events between snapshots can go unobserved. The object lifelines stage additionally addresses the changes of state in a single object, such as creation, transformation or deletion.

  4. Events, actions and processes: the final representation of real world phenomena where continuous and instantaneous phenomena can be modelled.

Consequently, ‘real-time’ geoprocessing deals with the processing of spatial data enriched with temporal representations of the snapshot stage or above. McCullough, Barr, and James (Citation2011) simplify their typology into snapshot geoprocessing and stream geoprocessing. While snapshot geoprocessing still refers to static processing of spatiotemporal data where input data is specified once, read in once and the operation has a finite lifetime, stream geoprocessing is a radically different approach. Since geospatial data streams are an unbounded sequence of tuples with a time dimension but also a space dimension, querying and processing these open-ended data streams meet different requirements than processing static finite data. The amount of spatiotemporal data increases rapidly over time, and thus applying data stream models to geospatial data becomes relevant when querying or analysing them in real time (Appice et al. Citation2014).

Different data stream models are applied in data stream systems to be able to query continuously arriving data tuples (Babcock et al. Citation2002). Common techniques are the so-called window approaches. A count-based window model, for instance, decomposes the data stream into non-overlapping windows of fixed size. These windows can be queried once they are completed. After processing, the windows are discarded. The sliding window model, on the other hand, always considers the most recent data of the stream. It has a fixed window size and is similar to a first-in, first-out queue ordered by time, which is updated if a new data tuple arrives. Queries are compiled and executed on that queue.
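The two window models described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in any particular data stream system; the function names `count_based_windows` and `sliding_windows` and the sample readings are assumptions introduced here.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def count_based_windows(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield non-overlapping windows of `size` tuples (hypothetical helper)."""
    window: List[float] = []
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield window      # the window is complete and can be queried
            window = []       # afterwards it is discarded


def sliding_windows(stream: Iterable[float], size: int) -> Iterator[Tuple[float, ...]]:
    """Yield the most recent `size` tuples after every arrival (hypothetical helper)."""
    fifo: Deque[float] = deque(maxlen=size)  # the oldest tuple is evicted automatically
    for item in stream:
        fifo.append(item)
        if len(fifo) == size:
            yield tuple(fifo)


readings = [1, 2, 3, 4, 5, 6, 7]
tumbling = list(count_based_windows(readings, 3))
sliding = list(sliding_windows(readings, 3))
```

In this sketch an incomplete trailing window (the value 7 in the count-based case) is never emitted, which mirrors the statement that windows are queried only once they are completed.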

A data stream model forms the basis for further knowledge discovery in the (geo) data stream. Subsequent analyses could be summarization tasks of the data streams, such as sampling or histograms, or more advanced and complex processing in the field of data stream mining to predict values, cluster or find anomalies (Appice et al. Citation2014).

3. WPS and real-time processing

As mentioned in the introduction, standardized geo web services are used in modern SSDIs to guarantee interoperability. The commonly used interface standard for geospatial processing is the OpenGIS® WPS. It standardizes rules for the inputs and outputs of deployed services, as well as the request methods of a service. Version 1.0.0 of the interface has been an OGC standard since 2007 (Schut 2007). Since version 1.0.0 has some drawbacks, a standard working group was formed in 2009 to evaluate and process change requests and to work on a new interface standard, WPS 2.0. Finally, in 2014 the new standard WPS 2.0 was released (OGC 2016). However, since there is still no server-side implementation of the WPS 2.0 standard, we focus on version 1.0.0 in this paper, keeping in mind that the concept should also be transferable to the new interface.

Several open-source WPS 1.0 server implementations exist, for instance the 52°North WPS server written in Java or the PyWPS server implemented in Python version 3. A recent overview of different implementations and their performances can be found in Poorazizi and Hunter (2015).

3.1. WPS methods

The WPS is a request/response interface like the well-known OGC Web Map Service (WMS) or the Web Feature Service (WFS) interfaces. It defines three core operations, which can be used by clients to interact with the server. Like the WMS or WFS, the WPS uses a GetCapabilities operation to inform the requesting client about service metadata in the form of an XML document that describes the capabilities of a specific server implementation. This includes, for instance, information about the server provider but also about all available processes. Clients are able to request detailed information about each process with the DescribeProcess operation by specifying the process identifier in the request. The returned document describes the process’s title and abstract as well as its requirements, including the input and output parameters and the allowed data formats. Finally, to invoke a process, clients use the Execute operation. The client provides the input parameters in its request. The server parses the request, executes the process with respect to the inputs and returns the result to the client. The requests can be performed using three different methods: first, key-value pair encoding in an HTTP GET; second, an XML document in an HTTP POST; or last, a SOAP/WSDL (Simple Object Access Protocol/Web Service Description Language) approach.
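As an illustration of the key-value pair encoding, the following sketch builds GET URLs for the three core operations. The endpoint `http://example.org/wps`, the process identifier `buffer` and its input `distance` are hypothetical placeholders, and the helper `kvp_url` is an assumption introduced here rather than part of any WPS client library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://example.org/wps"  # hypothetical WPS endpoint


def kvp_url(request: str, **params: str) -> str:
    """Build a WPS 1.0.0 key-value pair request URL (illustrative helper)."""
    query = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        query["version"] = "1.0.0"   # DescribeProcess and Execute name the version
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


capabilities_url = kvp_url("GetCapabilities")
describe_url = kvp_url("DescribeProcess", identifier="buffer")
# DataInputs carries the process inputs as name=value pairs separated by ';'
execute_url = kvp_url("Execute", identifier="buffer", DataInputs="distance=10")

# Decode the Execute URL again to show what a server would receive
execute_params = parse_qs(urlsplit(execute_url).query)
```

`urlencode` percent-encodes the `=` inside the `DataInputs` value, so the server sees a single parameter whose decoded value is `distance=10`.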

3.2. WPS input and output types

The WPS interface was developed to process geospatial data, vector and/or raster data, but can also be used to implement non-spatial processes. Therefore, input and output data can be of three different types. LiteralValue parameters basically represent string data, which is sent directly to the server. Server and client can specify units used and the atomic data type for these parameters. Various data types, such as integer, string or date, can be chosen. The ComplexValue type represents large data sets, which can also be binary. This type is usually used for geospatial raster or vector data sets. WPS servers specify acceptable input rules with an XML schema and MIME types, which the client should follow. For instance, raster data are sent using base64 encoding while vector data are usually encoded in Geography Markup Language (GML) or other formats such as GeoJSON. However, the standard (Schut Citation2007) specifies the content of the ComplexValue data structure as ‘any’. Accordingly, custom types can be defined as well. Last, the BoundingBoxValue can be used to define an area of interest with a left-bottom and a right-top corner using some coordinate reference systems.

3.3. Real-time processing and WPS interface

The WPS interface is based on synchronous HTTP as described and thus has some restrictions in asynchronous real-time processing. For long-lasting complex computations which exceed the HTTP time-out duration, a polling approach is defined for asynchronous operations in the standard. A requesting client is able to poll the server to check the state of the processes, for example, if it has finished or not. According to Resch et al. (Citation2009), the significant overhead in exchanging messages is a major disadvantage of this approach, since the client has to continuously poll the server. They conclude that a notification mechanism seems to be a more suitable and optimised approach. In Westerholt and Resch (Citation2015), a WPS server is extended by such a push mechanism to inform the client about the state of the process execution. To notify the client, the extension uses the WebSocket protocol so that messages can be received in a browser. Although in this prototype, notifications are only used to inform the client, the architectural approach could also be utilized to integrate complex geospatial analysis into event-driven, real-time workflows. This includes the input of live geospatial information into the process itself.

Across the literature, some exemplary real-time applications are described, which process sensor data with the WPS interface. For instance, in Schaeffer et al. (Citation2012) or Kmoch et al. (Citation2016) a simulation process is started by a WPS service first querying a Sensor Observation Service for the latest available sensor data collected in a SN. The WPS service is regularly invoked in a cycle after the previous process is finished. So, there still is a time and methodology gap between the (geo) data streams emitted by SNs and the service processing the near real-time sensor data. To be able to process live geospatial information in a WPS service without polling a database actively (e.g. from a geo data stream published by sensors), a notification mechanism has to be integrated in the process itself. In Foerster, Baranski, and Borsutzky (Citation2012), a full streaming WPS is described utilizing the HTTP live streaming protocol to integrate data streams into the process. In this approach, the WPS is able to receive input data streams, process them and send intermediate results back to the client as an output stream. The streams are represented by the playlist data format, which helps to transport multimedia data chunks.

We follow the latter approach to be able to process geo data streams in the WPS interface. The streams are realized with a push-based spatiotemporal notification mechanism, which is described in the following section.

4. GeoPipes using GeoMQTT

In Herle and Blankenbach (2016), we described the concept of GeoPipes and implemented it using an extension to the MQTT standard, which we call GeoMQTT. In this paper, we couple the GeoPipes concept with WPS 1.0 to provide real-time geoprocessing functionalities. Therefore, we give a brief introduction to the concept and its implementation with GeoMQTT.

A GeoPipe is a push-based mechanism for sharing spatiotemporal data in real time between different types of distributed devices and applications. Producers of spatiotemporal enriched data publish their messages to the pipe while consumers specify their interest in the particular pipe and receive the messages in near real time. GeoPipes basically form a message-oriented middleware for multiple producers and consumers of data with a temporal and spatial dimension. For the implementation, we listed and evaluated some requirements. A simple open protocol, which can be extended by our desired functionalities, was one of the most crucial requirements. Finally, following the discussion above and technical evaluation provided by Herle and Blankenbach (Citation2016), the MQTT protocol (OASIS Citation2014) is a good fit for what we intend to accomplish in the remainder of this paper.

4.1. Message queue and telemetry transport

The MQTT protocol is a lightweight protocol and implements the publish/subscribe interaction scheme, an event-based communication model between publishers and subscribers. Publishers produce a piece of information, an event, and tag it with a so-called topic name. Subscribers register their interest in events with a topic filter. If the topic filter matches the topic name, they receive the event, which is called a notification. The central component of the publish/subscribe system is the broker, which manages the subscriptions registered by the subscribers, evaluates the topic names of incoming events against the topic filters and distributes the events to interested clients. This way, publishers and subscribers are connected by events and notifications but decoupled in time, space and synchronization. Figure 1 shows the architecture of a publish/subscribe system. Multiple subscribers but also multiple publishers are possible. In MQTT, events and notifications are tagged with the topic name.

Figure 1. The publish/subscribe interaction scheme (ad. Herle and Blankenbach Citation2016).


The topic name in MQTT can be hierarchically structured by a topic level separator, a forward slash. For example, a temperature sensor node tags an MQTT message with the topic name room/217/temperature and publishes the room temperature in the payload of the message. Topic filters are used by the subscribers to specify their interest in specific events, tagged by a topic name. A topic filter has the same form as a topic name but can contain so-called wildcards for different topic levels. The single-level wildcard ‘+’ matches any value at exactly one topic level. Additionally, the multi-level wildcard ‘#’ can be placed at the end of the topic filter string to match any number of remaining levels of the hierarchy.
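The matching rules for the two wildcards can be sketched as follows. The function `topic_matches` is an illustrative helper written for this text, not the routine of any specific broker, and it deliberately ignores special cases such as topics beginning with `$`.

```python
def topic_matches(topic_filter: str, topic_name: str) -> bool:
    """Return True if an MQTT-style topic filter matches a topic name (sketch)."""
    filter_levels = topic_filter.split("/")
    name_levels = topic_name.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level and matches everything below
            return i == len(filter_levels) - 1
        if i >= len(name_levels):
            return False              # the filter is longer than the topic name
        if level != "+" and level != name_levels[i]:
            return False              # a literal level differs
    # without '#', filter and name must have the same number of levels
    return len(filter_levels) == len(name_levels)


single = topic_matches("room/+/temperature", "room/217/temperature")
multi = topic_matches("room/#", "room/217/temperature")
too_short = topic_matches("room/+", "room/217/temperature")
```

Here `single` and `multi` are matches, whereas `too_short` is not, because ‘+’ stands for exactly one level.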

MQTT is based on TCP/IP but with the extension MQTT for Sensor Networks (MQTT-SN), it also supports connectionless communication protocols like UDP or ZigBee (Stanford-Clark and Truong Citation2013). This is essential for meeting the requirements of the GeoPipes concept since resource-constrained sensor nodes without TCP/IP interface can also publish messages to GeoPipes. With MQTT-SN and the WPS extension presented here, real-time geoprocessing can be applied to the measurements issued directly by sensor nodes in SNs.

4.2. GeoMQTT

We utilized the MQTT protocol to implement the concept of GeoPipes. The extension is called GeoMQTT since each event is tagged with a timestamp and a geometry besides the topic name. Furthermore, subscribers specify their interest in events with the topic filter inherited from MQTT but also with a temporal and spatial filter. Subscribers are notified and receive a message if it meets all three filters. To achieve this behaviour, we introduced new message types in GeoMQTT, namely the GeoPublish, GeoSubscribe and GeoUnsubscribe messages. Thereby, the original MQTT protocol is not modified and MQTT clients can also connect to GeoMQTT brokers. Additionally, we implemented a conflict handling strategy between both protocols (Herle and Blankenbach Citation2016).

The GeoPublish message can be used by data producers to generate spatiotemporal events. On the one hand, it is tagged with a timestamp in ISO 8601 syntax or in UNIX time. On the other hand, a geometry is specified by the publisher, which can be encoded in different common description languages for geometries such as Well-known Text (WKT), Extended WKT, GeoJSON or GML. The geometries can be defined in any Spatial Reference System (SRS). As in MQTT’s Publish message, the payload of a GeoPublish message can be arbitrary.

The meta information added to the events allows establishing new filter capabilities. The geo subscription mechanism uses a temporal filter and/or a spatial filter as well as the topic filter inherited from ordinary MQTT subscriptions. A GeoSubscribe message is used by clients to indicate their interest in spatiotemporal events restricted by the different filters. The syntax of the temporal filter is based on the ISO8601 intervals and repeating intervals standard (ISO Citation2004). The spatial filter consists of a topological relation and a geometry. The geometry of the subscription is evaluated with the topological relation and the geometry of the spatiotemporal event. The topological relation can be one of the relations defined in the OpenGIS® Simple Feature Access standard (Herring Citation2011) extended by the ‘covers’ and ‘coveredBy’ relation. Like in the GeoPublish message, the geometry can be specified using common description languages and SRS.
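The rule that a notification is sent only if the topic, temporal and spatial filters all match can be sketched as below. This is a strongly simplified illustration under stated assumptions: the event geometry is a single point, the spatial filter is an axis-aligned rectangle with a fixed ‘within’ relation, and the temporal filter is one closed interval. The classes `GeoEvent` and `GeoSubscription` and the functions `topic_ok` and `should_notify` are hypothetical names, not part of GeoMQTT.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass
class GeoEvent:
    topic: str
    timestamp: datetime
    point: Tuple[float, float]  # (x, y); real events may carry any geometry


@dataclass
class GeoSubscription:
    topic_filter: str
    start: datetime             # temporal filter reduced to one interval
    end: datetime
    bbox: Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def topic_ok(topic_filter: str, topic: str) -> bool:
    """Compact MQTT-style match supporting '+' and a trailing '#'."""
    f, t = topic_filter.split("/"), topic.split("/")
    if f[-1] == "#":
        f = f[:-1]
        t = t[:len(f)] if len(t) >= len(f) else t
    return len(f) == len(t) and all(a in ("+", b) for a, b in zip(f, t))


def should_notify(sub: GeoSubscription, event: GeoEvent) -> bool:
    """All three filters must pass before the subscriber is notified."""
    min_x, min_y, max_x, max_y = sub.bbox
    x, y = event.point
    return (
        topic_ok(sub.topic_filter, event.topic)
        and sub.start <= event.timestamp <= sub.end
        and min_x <= x <= max_x
        and min_y <= y <= max_y
    )


sub = GeoSubscription("node/+/temperature",
                      datetime(2017, 1, 1), datetime(2017, 1, 2),
                      (6.0, 50.0, 7.0, 51.0))
inside = GeoEvent("node/7/temperature", datetime(2017, 1, 1, 12), (6.08, 50.78))
outside = GeoEvent("node/7/temperature", datetime(2017, 1, 1, 12), (8.0, 50.78))
```

A full implementation would evaluate arbitrary geometries and topological relations, for which a geometry library would be needed; the rectangle test above only stands in for that step.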

We also developed a GeoMQTT-SN version to bridge sensor nodes in geo sensor networks and the GeoMQTT broker (Herle and Blankenbach Citation2016). Based on that, the sensor nodes can be directly linked to the broker and, ultimately, to geoprocessing services.

5. Integrating GeoPipes in the WPS interface

The WPS interface is not tailored to providing real-time geospatial processes. As shown before, some solutions have been proposed in the literature to overcome these drawbacks. Our approach to performing geoprocessing tasks on an unbounded sequence of data tuples involves connecting the service to GeoPipes. We define InGeoPipes, OutGeoPipes, InPipes and OutPipes, which in our architecture are currently realized with GeoMQTT and MQTT clients, respectively. The architecture concept is illustrated in Figure 2.

Figure 2. Architecture proposal for connecting the WPS interface with GeoPipes.


A service ‘A’ is invoked with an Execute request running at a WPS 1.0 server. In the request, InGeoPipes and OutGeoPipes are defined, which encode GeoMQTT connection details. For each GeoPipe, the service connects to a GeoMQTT broker with a client in one thread. If it is an InGeoPipe, the client subscribes to the specified geo subscription. The service computes a custom function on the established geo data input stream and eventually publishes a message to an OutGeoPipe. A service may connect to different InGeoPipes and OutGeoPipes. Both types of pipe are defined by the user of the service. Unlike Westerholt and Resch (Citation2015) in which the server chooses the event server of the outputs and provides the connection details in the response of the Execute query, the connection details for the OutGeoPipes are also specified by the client and, therefore, an input to the service. To receive results published to the OutGeoPipe, the requesting client has to connect to the pipe itself and independently from the WPS service. The benefits of this design decision include that the WPS standard and the XML structures are not modified at all. It also facilitates fusing different streams and chaining stream processes since the client has control about the endpoints of the stream process. However, this implementation implies that the client has profound knowledge about possibly multiple remote servers, which can be unfavourable in certain applications.

Currently, the only supported protocol is GeoMQTT, since we implemented the concept of GeoPipes first with MQTT. Additionally, we implemented InPipes and OutPipes, which behave like their GeoPipes counterparts but only use ordinary MQTT subscriptions and messages.

5.1. GeoPipes input types

(Geo-)Pipes are of course not a standard input type for WPS services. Since our implementation should be compliant to the WPS 1.0 standard, we use the rule set of data types available for inputs and outputs. The WPS interface defines three different input data types: LiteralData, ComplexData and BoundingBoxData. For the definition of a (Geo-)Pipe the BoundingBoxData input type is not suitable because it strictly expects a predefined XML data structure with geographic coordinates for a rectangular area. However, the LiteralData and ComplexData input types can both be used to model and submit the connection details for a (Geo-)Pipe, which is shown in the following section.

The LiteralData input consists of a simple literal value. A ‘data type’ attribute can be included in the parameter. Typical data types are strings or integers, but it is also possible to choose an ‘anyURI’ data type. So, we can define subscriptions for InPipes/InGeoPipes and topics to publish to for OutPipes/OutGeoPipes in a URI. Table 1 shows the URI syntax for the different versions of the pipes.

Table 1. URI syntax for the different types of pipes.

The scheme of the URI determines the protocol used, which is currently only implemented for GeoMQTT and MQTT. Further, an InGeoPipe, for instance, requires the address and port of the broker and potentially the credentials (client id, password) to log in. The geo subscription details are specified in the path and query of the URI. Whilst the path represents the topic filter, the temporal filter and the spatial filter are defined in the query part. With this URI syntax, it is only possible to define a single geo subscription at a time. In an OutGeoPipe or OutPipe, only a topic name, which is the path of the URI, is defined.
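How a server might take such a URI apart can be sketched with Python's standard library. The exact URI syntax is not reproduced in this text, so the scheme name `geomqtt` and the query keys `temporal`, `relation` and `geometry` below are assumptions made purely for illustration, as are the function name `parse_in_geopipe` and the example broker address.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def parse_in_geopipe(uri: str) -> dict:
    """Split a pipe URI into connection and subscription details (sketch)."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)

    def first(key: str):
        # hypothetical query keys; each is expected at most once
        return query.get(key, [None])[0]

    return {
        "protocol": parts.scheme,                 # selects GeoMQTT or MQTT
        "host": parts.hostname,
        "port": parts.port,
        "client_id": parts.username,              # optional credentials
        "password": parts.password,
        "topic_filter": unquote(parts.path.lstrip("/")),
        "temporal_filter": first("temporal"),
        "spatial_relation": first("relation"),
        "geometry": first("geometry"),
    }


example_uri = (
    "geomqtt://client1:secret@broker.example.org:1883/node/+/temperature"
    "?temporal=2017-01-01T00:00:00Z/2017-01-02T00:00:00Z"
    "&relation=within&geometry=POINT(6.08%2050.78)"
)
pipe = parse_in_geopipe(example_uri)
```

One practical caveat of any URI encoding is that the multi-level wildcard ‘#’ coincides with the URI fragment delimiter, so in this sketch it would have to be percent-encoded as `%23` in the path.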

For a more sophisticated solution, the details of the pipes can be specified as XML in the ComplexData input type. The WPS standard states that the content of the ComplexData data structure can be of any type. Thus, this approach is suitable and valid here. The ComplexData version has some advantages in comparison to the LiteralData version. For instance, it is more flexible because multiple geo subscriptions can be defined in one InGeoPipe. Additionally, with the created XML schema files the input can be validated automatically when the user sends the request. Based on the WPS Best Practices Discussion Paper (Schaeffer 2012), we set up the MIME types for the different types of pipes in Table 2.

Table 2. MIME types of the different types of pipes.

As mentioned before, the connection details are encoded as an XML file. For the InGeoPipe an example is shown in Figure 3.

Figure 3. An XML-encoded InGeoPipe.


Like in the URI case, the address and port of the broker are defined. A login tag can be used to submit credentials (omitted here). The Geosubscribe tag indicates a single geo subscription with topic filter, temporal filter and spatial filter. It can be used multiple times. The XML is parsed by the server, validated against the XML schema file and then used to connect to the broker and register the geo subscriptions. The other pipes are defined similarly. A schema is created for each of the types and handled in the same way. However, output pipes are solely used to publish process results and do not use subscriptions but expect a specified topic name to publish to.

The pipes here are only defined for the MQTT and GeoMQTT cases. However, in the future we also want to support other messaging protocols to facilitate the fusion of streams from different sources and systems. For instance, it could also be useful to define in- and output pipes with the eXtensible Messaging and Presence Protocol (XMPP) or the Advanced Message Queuing Protocol (AMQP).

The WPS 2.0 standard uses the same types of input and output data. Therefore, both versions of defining a pipe, LiteralData and ComplexData, can easily be applied to the newer standard as well.

5.2. Named wildcards mechanism

A special case in MQTT and GeoMQTT are the single-level and multi-level wildcards in the topic filters. Imagine a service subscribes to a topic filter car/+/velocity where the wildcard ‘+’ stands for the id of each individual car. The service computes the acceleration for each car individually and publishes the result to an output pipe with a topic name that is customized to each car with respect to its id, as in Figure 4.

Figure 4. Example for named wildcards.


For these situations, we introduce the named wildcards mechanism in the topic filters of a subscription. In the Execute method of the WPS service, the user is able to assign a name in curly brackets to the wildcards in the topic filters. In the previous example, the topic filter would be car/+{car_id}/velocity. This is useful if we establish an output pipe which uses the {car_id} to distinguish between entities. In the topic name of the output pipe, the user then just needs to set the variable. In this example, the topic name would be car/{car_id}/acceleration. In a similar way, this mechanism can also be used for the multi-level wildcard. In this case, the variable would potentially consist of a string representing multiple topic levels.
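The named wildcards mechanism can be sketched with a regular expression that captures the value behind each named wildcard and substitutes it into the output topic. The helpers `compile_filter` and `output_topic` are illustrative assumptions and not the code of the described server.

```python
import re

# '+' or '#' followed by a name in curly brackets, e.g. '+{car_id}'
_NAMED = re.compile(r"([+#])\{(\w+)\}")


def compile_filter(named_filter: str):
    """Return (plain MQTT topic filter, regex capturing the named wildcards)."""
    plain = _NAMED.sub(r"\1", named_filter)       # strip the '{name}' parts
    pattern_parts = []
    for level in named_filter.split("/"):
        named = _NAMED.fullmatch(level)
        if named:
            wildcard, name = named.groups()
            body = "[^/]*" if wildcard == "+" else ".*"   # one level vs. many
            pattern_parts.append(f"(?P<{name}>{body})")
        elif level == "+":
            pattern_parts.append("[^/]*")
        elif level == "#":
            pattern_parts.append(".*")
        else:
            pattern_parts.append(re.escape(level))
    return plain, re.compile("/".join(pattern_parts))


def output_topic(named_filter: str, out_template: str, incoming_topic: str):
    """Fill the output topic template with values taken from the incoming topic."""
    _, regex = compile_filter(named_filter)
    match = regex.fullmatch(incoming_topic)
    if match is None:
        return None
    return out_template.format(**match.groupdict())


plain_filter, _ = compile_filter("car/+{car_id}/velocity")
result = output_topic("car/+{car_id}/velocity",
                      "car/{car_id}/acceleration",
                      "car/42/velocity")
```

The plain filter (`car/+/velocity`) is what would be sent to the broker in the subscription, while the regular expression is only used locally to recover the variable values from each incoming topic name.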

5.3. Handling WPS drawbacks

As explained, the WPS interface standard is designed to support operations with a finite lifetime by using HTTP requests. In contrast, (geospatial) data streams are an open-ended, unbounded sequence of tuples. Processing these streams implies starting an operation with a (possibly) infinite lifetime. In addition, the set of methods of the WPS interface does not include a method to stop a process. A process which is executed with the Execute method runs until it succeeds or fails. The requesting client is not capable of interfering with the process execution. Thus, in McCullough, Barr, and James (2011) the WPS server implementation was modified by adding a StopExecuting method in order to facilitate the management of continuous computing jobs. However, this is not compliant with the standard and, therefore, not implemented in our server.

To avoid an infinite lifetime of the process or zombie processes our simple approach involves an extra parameter in each Execute request, which limits the lifetime of the process. The time to live (TTL) parameter is specified by the requesting client in seconds. The operation is applied to the streams for this amount of time. Subsequently, the service disconnects from (possibly) multiple brokers and exposes the resulting XML response of the process to the client.
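The TTL idea can be sketched as a processing loop with a deadline. In this minimal illustration a thread-safe queue stands in for the messages delivered by a broker client, and connecting to and disconnecting from brokers is omitted; the function `run_stream_process` and its parameters are assumptions introduced here.

```python
import queue
import time
from typing import Any, Callable


def run_stream_process(messages: "queue.Queue[Any]",
                       handle: Callable[[Any], None],
                       ttl_seconds: float,
                       poll_interval: float = 0.05) -> int:
    """Apply `handle` to incoming messages until the TTL expires (sketch)."""
    deadline = time.monotonic() + ttl_seconds   # monotonic clock: immune to clock changes
    processed = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                                # lifetime exceeded: stop processing
        try:
            message = messages.get(timeout=min(poll_interval, remaining))
        except queue.Empty:
            continue                             # nothing arrived; re-check the deadline
        handle(message)
        processed += 1
    return processed


incoming: "queue.Queue[float]" = queue.Queue()
for value in (20.1, 20.4, 20.2):
    incoming.put(value)

seen = []
started = time.monotonic()
count = run_stream_process(incoming, seen.append, ttl_seconds=0.2)
elapsed = time.monotonic() - started
```

After the loop returns, a real process would disconnect its clients and build the XML response; the returned count is only a stand-in for such a result.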

The preferred way of invoking a WPS process in our approach is the asynchronous mode. This has some advantages in comparison to the synchronous mode. For instance, if the Execute request is accepted by the server, the immediate ProcessAccepted response allows the client to establish a connection to the output pipes to receive the results of the stream process. In synchronous mode, on the other hand, the client has to connect directly to the broker. Furthermore, in asynchronous mode, an Execute request for a long-running stream process (the TTL parameter is set to a high value) does not exceed the HTTP time-out duration. Figure 5 visualizes the workflow of our intended streaming WPS mechanism. The asynchronous mode of the WPS server is utilized here. The data streams are processed for TTL seconds. After the process time is exceeded, the client as well as the process disconnects from the pipes.

Figure 5. Sequence diagram of the proposed workflow.


The described workflow is tailored to the WPS 1.0 interface since the drawbacks of that version are avoided, for example, by the TTL parameter. In WPS 2.0, however, job control is introduced, which enables the client to cancel or release running jobs.

6. Proof-of-concept with selected use cases

The GeoPipes extension for the WPS interface described in the previous section is implemented and tested in PyWPS-4, a server-side implementation of the OGC WPS standard written in Python version 3 (PyWPS Development Team Citation2009). It was chosen since it is very easy to customize for our needs and new processes can be set up with low effort. Furthermore, a GeoMQTT client in Python is already implemented. Thus, processes using GeoMQTT as input and/or output GeoPipes can be implemented easily.

PyWPS-4 also offers some benefits and features we are using in our processes. For instance, input data can be validated in different modes up to a very strict validation using a given XML schema. Additionally, PyWPS-4 is under active development and WPS 2.0 features will be implemented in the future (Cepicky and de Sousa Citation2016). With this server implementation, we set up some real-time geo processes, which are presented in the following sections. The processes use stream data models, such as the sliding window technique explained before. All of the described processes are compliant to the WPS standard, which means that the operation set is not modified and the processes use the TTL parameter to specify a finite runtime.

6.1. Moving average filter

Signals from sensors are often messy or noisy. A moving average or running mean filter can be used to smooth this signal and filter spurious peaks (Duchon and Hale 2012). The moving average filter computes the average of the last n data points:

$$\bar{x}_t = \frac{1}{n} \sum_{i=-k}^{k} x_{t+i} \qquad (1)$$

where $x_t$ is the measured value at time t and k = (n − 1)/2.
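The moving average filter described above can be sketched on a stream as follows. This is a minimal illustration, not the code of our PyWPS process: the function name moving_average is hypothetical, and the stream is modelled as an iterable of numbers. The sketch emits one average as soon as n values are buffered, so each result belongs to the centre of the window of the last n received values.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def moving_average(values: Iterable[float], n: int) -> Iterator[float]:
    """Yield the mean of each full sliding window of size n."""
    if n < 1:
        raise ValueError("window size n must be at least 1")
    # A deque with maxlen drops the oldest value automatically,
    # which is exactly the sliding window behaviour.
    window: Deque[float] = deque(maxlen=n)
    for value in values:
        window.append(value)
        if len(window) == n:
            yield sum(window) / n
```

For example, list(moving_average([1, 2, 3, 4, 5, 6], 3)) evaluates to [2.0, 3.0, 4.0, 5.0].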

We implemented this simple moving average filter as a service in PyWPS. It expects the input parameters specified in Table 3.

Table 3. Inputs and outputs for the moving average filter service.

The process uses a sliding window model with a variable size n, which the client specifies in the Execute request together with the other parameters in Table 3. An InPipe (inputstream) and an OutPipe (movingaverage) are specified as MQTT pipes defined by the protocol’s topic. The output is a literal data value, which is returned if the process succeeded.

As an example, let an SN consist of multiple sensor nodes that measure the air temperature and humidity every second and publish the values with MQTT under the following topics: node/<id>/temperature and node/<id>/humidity, where <id> is a placeholder for the identifier of the sensor node. With the following Execute request, each id-phenomenon pair can be processed with the moving average filter (see Figure 6). The named wildcards mechanism, as explained before, is applied here.

Figure 6. Execute request for the moving average filter.

In this example, the process runs for five minutes (ttl = 300 seconds) and applies a moving average of size five (n = 5) to the InPipe. The process connects to the specified MQTT server and subscribes to the topic node/+/+, where the middle wildcard is named {id} and the tail wildcard {value}. The moving average results are computed by the process and published to the specified MQTT server under the topics node/{id}/movingaverage/{value}, where the named wildcards are replaced by the id of the sensor and the measured phenomenon in the incoming messages.
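The substitution of named wildcards in this example can be sketched as follows. The sketch makes several assumptions: the helper names bind_wildcards and output_topic are hypothetical, the wildcard names are passed as a separate list in the order in which the + wildcards appear, and only the single-level wildcard + is handled (the multi-level wildcard # is left out).

```python
from typing import Dict, List, Optional


def bind_wildcards(topic_filter: str, names: List[str],
                   topic: str) -> Optional[Dict[str, str]]:
    """Match topic against topic_filter and bind each '+' to a name.

    Returns None if the topic does not match the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    if filter_levels.count("+") != len(names):
        raise ValueError("one name is required per '+' wildcard")
    if len(filter_levels) != len(topic_levels):
        return None
    bindings: Dict[str, str] = {}
    remaining = iter(names)
    for pattern, level in zip(filter_levels, topic_levels):
        if pattern == "+":
            bindings[next(remaining)] = level  # wildcard level: remember its value
        elif pattern != level:
            return None  # literal level differs: no match
    return bindings


def output_topic(template: str, bindings: Dict[str, str]) -> str:
    """Fill the named placeholders of an output topic template."""
    return template.format(**bindings)
```

With the filter node/+/+ and the names id and value, an incoming message on node/42/temperature is bound to id = 42 and value = temperature, so the result is published on node/42/movingaverage/temperature.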

6.2. Inverse distance weighting (IDW) with sliding window

The moving average example shows how a data stream delivered by the ordinary MQTT protocol can be processed. With the GeoMQTT extension we introduced a temporal and spatial dimension in each message, so that it is now possible to specify GeoPipes in the WPS Execute requests. This can be used, for instance, in real-time spatial interpolation. Traditional spatial interpolation methods include IDW (Shepard 1968) or kriging (Cressie 1990). While the traditional methods only consider the spatial dimension of measurements, newer approaches also take the temporal dimension into account. For example, Whittier et al. (2013) introduce a spatiotemporal IDW (st-IDW) which includes a temporal ‘distance’ besides the spatial ‘distance’.

In this process, however, we combined the frequently used spatial interpolation method IDW with a sliding window approach. The in- and output parameters are defined in Table 4.

Table 4. Inputs and outputs for IDW sliding window service.

The inputstream is an InGeoPipe, which is specified by the user with a GeoMQTT subscription. The messages received from this pipe are used to interpolate a value at the position given by the location parameter, which expects a geometry in WKT syntax. For instance, let the location parameter be POINT(6 50). A suitable subscription might use a topic filter temperature, an empty temporal filter and a spatial filter (intersects, BUFFER(POINT(6 50), 0.004)). The process now receives all messages which are published inside the buffer of the location and have the topic name temperature. The interpolation is applied with a sliding window of size n to the payload of the messages, and the resulting interpolated values are published every t seconds to the OutGeoPipe interpolated.
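The combination of IDW with a sliding window can be sketched as follows. This is an illustration under stated assumptions rather than our service code: the names idw and SlidingWindowIDW are hypothetical, distances are planar Euclidean distances in the coordinate units of the input (no geodesic correction), and the distance exponent power = 2 is an assumed default.

```python
import math
from collections import deque
from typing import Deque, Iterable, Tuple

Observation = Tuple[float, float, float]  # (x, y, measured value)


def idw(observations: Iterable[Observation], x: float, y: float,
        power: float = 2.0) -> float:
    """Inverse distance weighted estimate at (x, y)."""
    numerator = 0.0
    denominator = 0.0
    for ox, oy, value in observations:
        distance = math.hypot(ox - x, oy - y)
        if distance == 0.0:
            return value  # an observation exactly at the target location
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one observation is required")
    return numerator / denominator


class SlidingWindowIDW:
    """Keeps the last n observations and interpolates at a fixed location."""

    def __init__(self, x: float, y: float, n: int, power: float = 2.0):
        self.x, self.y, self.power = x, y, power
        self.window: Deque[Observation] = deque(maxlen=n)

    def add(self, ox: float, oy: float, value: float) -> None:
        self.window.append((ox, oy, value))  # oldest observation drops out

    def interpolate(self) -> float:
        return idw(self.window, self.x, self.y, self.power)
```

In a service like the one described, add would be called for every message arriving on the InGeoPipe, and interpolate would be called by a timer every t seconds before publishing the result.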

6.3. Dynamic convex hull algorithm

The generic implementation of the GeoPipes extension in our adjusted PyWPS instance allows us to run dynamic geometry algorithms as well. We set up a real-time processing service which calculates the dynamic convex hull of a changing set of points utilizing Overmars and van Leeuwen’s (1981) algorithm. The algorithm is known to be efficient when inserting or deleting points from the set. For our implementation, we used the version of Cisneros (2007). The in- and outputs of the service are specified in Table 5.

Table 5. Inputs and outputs for dynamic convex hull service.

The points-InGeoPipe represents the input stream for the geometries. Only the geometry and timestamp information of the GeoMQTT messages are used. If a geometry is inserted into the set of points, the updated convex hull geometry is published with a GeoMQTT message in the OutGeoPipe defined in convexhull. The ttl_points parameter allows the user to specify how long the geometries of the messages remain in the set of points. After the defined number of seconds, the messages (and their geometries) are deleted from the set of points, and the updated convex hull geometry is published again to the OutGeoPipe.
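The behaviour of a convex hull over points that expire after ttl_points seconds can be sketched as follows. Note that this sketch does not implement the dynamic algorithm of Overmars and van Leeuwen; for brevity it simply recomputes the hull with Andrew's monotone chain algorithm after every change. The class name ExpiringHull is hypothetical, and expiry is evaluated lazily whenever the hull is requested instead of by a timer.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # drop duplicated end points


class ExpiringHull:
    """Convex hull of the points received within the last ttl_points seconds."""

    def __init__(self, ttl_points: float):
        self.ttl_points = ttl_points
        self.points: List[Tuple[float, Point]] = []  # (timestamp, point)

    def insert(self, point: Point, timestamp: float) -> List[Point]:
        self.points.append((timestamp, point))
        return self.hull(timestamp)

    def hull(self, now: float) -> List[Point]:
        # Remove expired points, then recompute the hull from scratch.
        self.points = [(t, p) for t, p in self.points
                       if now - t < self.ttl_points]
        return convex_hull([p for _, p in self.points])
```

Recomputing the hull costs O(m log m) per update for m live points, which is acceptable for a sketch but is the cost that a dynamic algorithm avoids.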

6.4. Online map matching

The dynamic convex hull algorithm is a good example of how we can utilize the spatiotemporal messages received through the GeoPipes. However, it is not a very realistic use case. The last process we present in this paper is used in the research field of trajectory data mining. Trajectory data mining describes the process of knowledge discovery from trajectory data. Ultimately, methods of trajectory data mining can be used to, for example, analyse, query or classify mobility patterns or traffic (Zhen 2015). In the processing chain of trajectory data mining, one essential step is to match the raw and noisy locations from Global Navigation Satellite Systems (GNSS) to a graph, for instance a road network. This pre-processing algorithm is called map matching, or online map matching if it is applied to a data stream. We set up an online geometric map matching algorithm which matches the received GNSS point to the nearest road and issues the projected map-matched location. The input parameters to the WPS service are shown in Table 6.

Table 6. Inputs and outputs for online map matching service.

The gpslocation-InGeoPipe represents the input stream of the data, the mmposition-OutGeoPipe is used to publish the map-matched position, and the roadnetwork is a user-defined road network, which can be specified as a reference pointing to a GML file or a WFS request. With the named wildcard mechanism, the GeoMQTT implementation makes it possible to distinguish between different entities if, for instance, their identifiers are encoded in the topic name. That way, the topic names in the output message can be adjusted to the specific map-matched entity.

Since the implemented simple map matching algorithm only projects the received location onto the nearest street in the road network, it is actually not a stream but a snapshot geoprocessing application. The process could be started with each location individually. However, if we consider more sophisticated online map matching algorithms, which also incorporate the history of GNSS locations and topological features, then it becomes a stream geoprocessing algorithm. For instance, Mattheis et al. (2014) introduced a Hidden Markov Model online map matching algorithm, which also considers the previous states (locations) of the car.
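The simple geometric map matching step, projecting a location onto the nearest street, can be sketched as follows. The sketch assumes that the road network has already been reduced to a list of straight segments in planar coordinates; the function names project_onto_segment and map_match are hypothetical, and reading the network from GML or WFS is left out.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def project_onto_segment(p: Point, a: Point, b: Point) -> Point:
    """Closest point to p on the segment from a to b."""
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate segment
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def map_match(p: Point, roads: List[Segment]) -> Point:
    """Project p onto the nearest road segment."""
    if not roads:
        raise ValueError("road network must contain at least one segment")
    best = roads[0][0]
    best_distance = math.inf
    for a, b in roads:
        q = project_onto_segment(p, a, b)
        distance = math.hypot(q[0] - p[0], q[1] - p[1])
        if distance < best_distance:
            best, best_distance = q, distance
    return best
```

Each call depends only on the current location, which reflects the snapshot character noted above; the linear scan over all segments would be replaced by a spatial index for a realistic road network.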

As mentioned, map matching is a necessary pre-processing step in trajectory data mining. We only implemented this simple algorithm, but with the OutGeoPipe defined, various subsequent services could be set up that subscribe to the GeoPipe to, for example, cluster cars or analyse traffic conditions. In this way, different stages of a mining chain can be chained by coupling the GeoPipes together. This also supports the chaining idea of the original WPS interface.

As a proof-of-concept, we implemented a web application for the simple online map matching algorithm of cars. A screenshot of the client is shown in Figure 7.

Figure 7. Web application for the online map matching service.

The previously discussed input parameters for the Execute request can be specified in the form on the left. After sending the request to the PyWPS server, the web application connects with a GeoMQTT WebSocket client to the OutGeoPipe. Subsequently, the client receives the map-matched positions in GeoMQTT messages with the topic names mapmatched/{id}, where {id} is a placeholder for the corresponding car id. Each newly received location is added to the map. Videos of the web applications and demo services can be found under the web address in Herle (2016).

7. Conclusion and future developments

In the presented architecture, we integrated the GeoPipes concept in the OpenGIS® WPS interface standard to provide real-time processing of live geospatial information. The GeoPipes concept allows the sharing of spatiotemporal data between different publishers and consumers in a push-based manner. With the concept and its implementation in GeoMQTT, we can easily set up geospatial data streams. These streams are used by the WPS extension to receive and publish geospatial data. We specified input and output types for the different pipes and implemented a server instance to handle these data streams. The different deployed exemplary services, for example, the dynamic convex hull or the map matching service, prove that WPS services are capable of dealing with in- and output data streams, if the GeoPipes concept is integrated into the interface. One benefit of applying GeoMQTT as a first realized GeoPipe protocol is that a lot of clients can connect to the system since it is very lightweight and scalable. Furthermore, clients for TCP/IP, connectionless protocols like ZigBee as well as WebSockets are supported. The exposed processes are able to receive the data directly from, for example, low cost sensor nodes and process them. This is especially useful in modern SSDIs.

In the future, the proposed processing approach should be extended to support multiple push-based protocols such as XMPP or AMQP. Currently, only MQTT and GeoMQTT are supported for (Geo-)Pipes. Multiple protocols would improve data stream fusion and integration in real-time processes from different sources. For the sake of compliance with the WPS 1.0 standard, our design approach requires the client to have significant knowledge about remote servers and structures. For instance, the output (Geo-)Pipes are also defined by the client, which might lead to conflicts between different processes. This issue could be solved in the future by shifting responsibilities to the server. Furthermore, in this early version the limitations of the WPS interface are handled in a simple way. A StopExecution request would provide more flexibility, but the server would lose its compliance with the standard. The WPS 2.0 standard promises to overcome these drawbacks. However, an implementation is missing so far. Another enhancement for real-time geoprocessing in our architecture is the binding with grid computing and data stream mining technologies, which we will pursue in the future. Moreover, we are planning to implement a QGIS client to support WPS with GeoPipes in a GIS desktop application.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • Appice, A., A. Ciampi, F. Fumarola, and D. Malerba. 2014. Data Mining Techniques in Sensor Networks. Summarization, Interpolation and Surveillance. London: Springer.
  • Babcock, B., S. Babu, M. Datar, R. Motwani, and J. Widom. 2002. “Models and Issues in Data Stream Systems.” In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS’02), 1–30.
  • Cepicky, J., and L. M. de Sousa. 2016. “New Implementation of OGC Web Processing Service in Python Programming Language. PyWPS-4 and Issues We Are Facing with Processing of Large Raster Data Using OGC WPS.” The international archives of the photogrammetry, remote sensing and spatial information sciences, volume XLI-B7, 2016 XXIII ISPRS Congress, Prague, Czech Republic, 12–19 July 2016.
  • Cisneros, J. A. 2007. “Maintenance of the Convex Hull of a Dynamic Set.” MSc thesis, University of Edinburgh.
  • Cressie, N. 1990. “The Origins of Kriging.” Mathematical Geology 22 (3): 239–252. doi: 10.1007/BF00889887
  • Duchon, C., and R. Hale. 2012. Time Series Analysis in Meteorology and Climatology. An Introduction. Oxford: Wiley.
  • Foerster, T., B. Baranski, and H. Borsutzky. 2012. “Live Geoinformation with Standardized Geoprocessing Services.” In Bridging the Geographic Information Sciences, edited by J. Gensel, D. Josselin, and D. Vandenbrouck, 99–118. Berlin: Springer.
  • Herle, S. 2016. “Streaming WPS.” Accessed October 20 2016. http://www.gia.rwth-aachen.de/GeoPipes.
  • Herle, S., and J. Blankenbach. 2016. “GeoPipes Using GeoMQTT.” In Geospatial Data in a Changing World: Selected Papers of the 19th AGILE Conference on Geographic Information Science, edited by T. Sarjakoski, M. Santos, and T. Sarjakoski, 383–398. Switzerland: Springer.
  • Herring, J. R. 2011. “OpenGIS® Implementation Standard for Geographic Information – Simple Feature Access – Part 1: Common Architecture.” OGC 06-103r4.
  • ISO (International Organization for Standardization) 2004. “Representations of dates and times, ISO 8601:2004.”
  • Jirka, S., A. Broering, and C. Stasch. 2009. “Discovery Mechanisms for the Sensor Web.” Sensors 9 (4): 2661–2681. doi:10.3390/s90402661.
  • Kmoch, A., H. Klug, P. White, and S. Reichel. 2016. “SensorWeb Semantics on MQTT for Responsive Rainfall Recharge Modelling.” In 19th AGILE international conference on Geographic Information Science, Helsinki, Finland.
  • Mattheis, S., K. K. Al-Zahid, B. Engelmann, A. Hildisch, S. Holder, O. Lazarevych, D. Mohr, F. Sedlmeier, and R. Zinck. 2014. “Putting the Car on the Map: A Scalable Map Matching System for the Open Source Community.” In Proceedings of GI-Jahrestagung 2014, edited by E. Ploedereder, L. Grunske, E. Schneider, and D. Ull, 2109–2119. Bonn: Koellen.
  • McCullough, A., S. Barr, and P. James. 2011. “A Typology of Real-time Parallel Geoprocessing for the Sensor Web Era.” In Integrating Sensor Web and Web-based Geoprocessing: CEUR Workshop Proceedings 712, edited by T. Foerster, A. Bröring, B. Baranski, B. Pross, C. Stasch, T. Everding, and S. Maes, 1–5. Utrecht: CEUR.
  • Nittel, S. 2015. “Real-time Sensor Data Streams.” SIGSPATIAL Special 7 (2): 22–28. doi: 10.1145/2826686.2826691
  • OASIS (Organization for the Advancement of Structured Information Standards) 2014. “MQTT Version 3.1.1 OASIS Standard.”
  • OGC (Open Geospatial Consortium) 2016. “Web Processing Service 2.0 Standard Working Group.” Accessed October 20 2016. http://www.opengeospatial.org/projects/groups/wps2.0swg.
  • Overmars, M. H., and J. van Leeuwen. 1981. “Maintenance of Configurations in the Plane.” Journal of Computer and System Science 23 (2): 166–204. doi: 10.1016/0022-0000(81)90012-X
  • Poorazizi, E., and A. Hunter. 2015. “Evaluation of Web Processing Service Frameworks.” OSGeo Journal 14 (1): 29–42.
  • PyWPS Development Team. 2009. “Python Web Processing Service (PyWPS), Software, Version 4.0.0.” Accessed 20 October 2016. http://pywps.org.
  • Resch, B., T. Blaschke, and M. Mittlboeck. 2010. “Live Geography: Interoperable Geo-sensor Webs Facilitating the Vision of Digital Earth.” International Journal on Advances in Networks and Services 3 (3,4): 323–332.
  • Resch, B., G. Sagl, T. Blaschke, and M. Mittlboeck. 2009. “Distributed Web-processing for Ubiquitous Information Services – OGC WPS Critically Revisited.” In Proceedings of the 6th International Conference on Geographic Information Science (GIScience2010), Zurich, Switzerland.
  • Schaeffer, B. 2012. “Web Processing Service Best Practices Discussion Paper.” OGC 12-029.
  • Schaeffer, B., B. Baranski, T. Foerster, and J. Brauner. 2012. “A Service-oriented Framework for Real-time and Distributed Processing.” In Geospatial Free and Open Source Software in the 21th Century, edited by E. Bocher and M. Neteler, 3–20. Berlin: Springer.
  • Schut, P. 2007. “OpenGIS® Web Processing Service.” OGC 05-007r7.
  • Shepard, D. 1968. “A Two-dimensional Interpolation Function for Irregularly-spaced Data.” In Proceedings of the 23rd ACM National Conference, ACM’68, New York, USA, 517–524.
  • Stanford-Clark, A., and H. L. Truong. 2013. “MQTT for Sensor Networks (MQTT-SN) Protocol Specification Version 1.2.” IBM Zurich Research Laboratory, Zurich.
  • Westerholt, R., and B. Resch. 2015. “Asynchronous Geospatial Processing: An Event-driven Push-based Architecture for the OGC Web Processing Service.” Transactions in GIS 19 (3): 455–479. doi: 10.1111/tgis.12104
  • Whittier, J. C., S. Nittel, M. Plummer, and Q. Liang. 2013. “Towards Window Stream Queries Over Continuous Phenomena.” In Proceedings of 4th International ACM SIGSPATIAL Workshop on GeoStreaming (IWGS), edited by F. Banaei-Kashani, A. Basalamah, and C. Zhang, 2–11. Danvers: ACM Digital Library.
  • Worboys, M., and M. Duckham. 2004. GIS: A Computing Perspective. Boca Raton: CRC Press.
  • Zhen, Y. 2015. “Trajectory Data Mining: An Overview.” ACM Transaction on Intelligent Systems and Technology 6 (3): 1–41. doi: 10.1145/2743025
