
A container-based approach for sharing environmental models as web services

Pages 1067-1086 | Received 19 Nov 2020, Accepted 30 Apr 2021, Published online: 26 May 2021

ABSTRACT

As researchers globally work towards a fully digital representation of the earth and its processes – i.e. a true Digital Earth – the need grows for software and systems to link disparate computer simulation models of various parts of the earth in a reliable and highly functional way. Web services have been demonstrated as an effective way to share and reuse models as they enable communication and interoperation among applications via the Internet. However, even using well-designed software tools, it remains a daunting process to publish heterogeneous environmental models as web services and provide long-term maintenance in response to changing computational environments. We present an approach that enables environmental models to be published as long-term functional web services on the same platform regardless of execution mode, programming language, and computational environment conflicts. The approach adopts the OpenGMS Wrapper System (OGMS-WS) for service publishing and Docker containers for model isolation. A streamflow prediction service is developed using this approach and is applied to analyze historical streamflow trends in Bangladesh. We demonstrate that this approach can lower the barrier to deploying heterogeneous environmental models as long-term functional web services, contributing to the development of a Digital Earth.

1. Introduction

The aim of Digital Earth is to provide ‘a laboratory without walls’ for scientists to perform sophisticated modeling activities with a full range of earth data (Gore 1998; Grossner, Goodchild, and Clarke 2008; Guo 2020). In the era of big earth data, technological advances in satellites and computational resources produce ever-increasing volumes of earth observations and climate modeling results, which have opened up new pathways to better understand earth’s environmental systems (Vitolo et al. 2015; Jiang et al. 2019). Big earth data also challenges scientists to analyze data effectively and efficiently and to simulate the dynamic processes of the earth system in support of decision-making.

Additionally, there is a growing need for software and systems to link disparate computer simulation models of various parts of the earth in a reliable and highly functional way. This is often called model ‘chaining’ and requires that computational models of different parts of the earth system can be executed, often in an online environment, in a consistent and reliable manner. Indeed, the concept of chaining models from different disciplines is gaining momentum for earth and environmental modeling because such chains can often better represent reality and potentially answer more questions than the individual models alone (Dubois et al. 2013; Bandaragoda et al. 2019; Chen et al. 2019; Gao et al. 2019; Chen et al. 2020).

Whether to support web-based model chaining, or to facilitate use of models across domains and disciplines, there is a need for standardized approaches and technologies for simplifying both human–computer and computer–computer model interaction. A number of studies have focused on improving model accessibility and interoperability by lowering the entry barrier to model execution, including developing web-based modeling approaches with simple graphical user interfaces (GUIs) (Luo et al. 2004; Wen et al. 2013; Rajib et al. 2016; Omidipoor, Jelokhani-Niaraki, and Samany 2019), linking heterogeneous models through Application Programming Interfaces (APIs) (Gregersen, Gijsbers, and Westen 2007; Peckham, Hutton, and Norris 2013), and publishing models as accessible web services (Jiang et al. 2017; Xiao et al. 2019; Zhang et al. 2019; Li 2020). These methods can save considerable time in organizing hardware and installing software. Compared to the first two methods, the web service method has the advantage that it enables models written in different programming languages or running on different platforms to communicate and exchange data over the Internet (Michaelis and Ames 2009; Castronova, Goodall, and Elag 2013). In fact, multiple studies have already demonstrated the flexibility and effectiveness of integrating environmental models through web services (Goodall, Robinson, and Castronova 2011; Goodall et al. 2013; Qiao et al. 2019b; Zhang et al. 2020; Choi et al. 2021).

Many software tools have been developed to assist with publishing earth and environmental models or analysis workflows as web services, such as GeoServer (2013), Deegree (Müller 2007), PyWPS (Becchi 2007), 52°North WPS (2010), Esri ArcGIS ModelBuilder (Allen 2011), and Tethys WPS Server (Qiao et al. 2019b). However, it remains challenging to publish environmental models as web services without significant effort. First, environmental models vary in format, execution mode, programming language, and supported operating system. Some models are difficult to deploy due to the complex infrastructure required (network, storage, and computing). Collberg et al. (2014) found that less than 50% of published software could be successfully installed. Second, the increasing complexity of environmental models, which represent ever more detail, raises entry barriers. It is challenging for users to understand and execute a new model correctly in a short amount of time, especially when working with models in unfamiliar domains (Yue et al. 2016; Jiang et al. 2017). High complexity and low portability greatly hinder the extensive development of model services. Third, publishing tools are often designed for analysis workflows developed in specific programming languages. For example, 52°North WPS is Java-based and only supports Windows and Linux operating systems. It has some support for Python scripting, but only in connection with the Esri ArcGIS Python library on Windows computers. The heterogeneity of environmental models makes it difficult to develop a comprehensive tool that supports different types of models.

To address the above challenges, Zhang et al. (2019) developed a software tool called the OpenGMS Wrapper System (OGMS-WS) that can share geoanalysis models over the Internet as web services. OGMS-WS supports major operating systems (Linux, Windows, and macOS) and provides a standardized encapsulation method to publish heterogeneous models as RESTful web services. Additionally, it provides service management with a GUI that allows users to interact with the web services, including functions to search, execute, monitor, and control them. Xiao et al. (2019) demonstrated the effectiveness and user-friendliness of OGMS-WS by deploying the Storm Water Management Model (SWMM) as a web service-oriented system. However, OGMS-WS is still not flexible enough to work with environmental models that differ in form, programming language, and supported operating system, especially when they are deployed on the same platform. First, the operating system supported by the model needs to be consistent with that of the OGMS-WS server. In other words, models that support different operating systems cannot be deployed on the same server. However, as the processing power and capacity of servers increase, the need to consolidate various models onto a single server is becoming more important, which requires that models work well across different server environments. Second, even when deployed on a matching OGMS-WS platform, some models are difficult or even impossible to install and execute because of environment or dependency conflicts, such as using different programming environments or requiring the same dependency in different versions. Service developers have to either fix the conflicts (if the model source code is available and under their control) or deploy the models on separate platforms. Moreover, another challenge of publishing model services through OGMS-WS is service maintenance, which is also a common issue in web service development.
Keeping web services functioning properly in the long term requires a lot of effort. Some environmental models are difficult to maintain due to changing computational environments and dependencies. Third-party resources and dependencies are not static; they are updated regularly to fix bugs, add new features, and remove old features. Any of these changes can make the model unable to execute. Zhao et al. (2012) found that the ability to re-execute scientific workflows successfully declines significantly year by year because of changing dependencies and missing third-party resources.

Process isolation, which refers to isolating each process and preventing conflicts between processes on the same platform, is seldom considered in the development of web service publishing tools. Many process isolation technologies have been developed in recent decades. For executing applications that require different operating systems on the same server, virtual machines (VMs) can capture the operating system and everything running on it, provide the functionality of a physical computer, and can be transferred and shared as files. For running isolated applications with conflicts in the same operating system, containers (e.g. Docker) can encapsulate an application and all the related dependencies in a virtual container that can run as lightweight, standalone, and executable software. Compared to VMs, containers are more lightweight, portable, and high-performing because they do not contain an operating system but share the operating system kernel with the host machine (Boettiger 2014). Indeed, Li (2020) used Docker to facilitate R Shiny web app deployment and support simultaneous access by multiple users.

In this study, we propose an approach for sharing encapsulated environmental models over the Internet as web services by leveraging OGMS-WS and Docker. OGMS-WS supports publishing heterogeneous geoanalysis and environmental models as web services, but the lack of model isolation management hampers multiple models being published, especially on the same platform. Docker is used to encapsulate models, making them isolated, portable, and reusable. We choose the Linux operating system for this work because it is the leading and most used operating system on servers, but this container-based approach could also be implemented in other operating systems.

The remainder of this paper is organized as follows. First, Section 2 shows the system design of this approach, as well as the design of an example application for publishing a streamflow prediction service. Section 3 shows the system implementation of this approach and the deployment of the streamflow prediction service. This section also presents the results of using the streamflow prediction service in analyzing Bangladesh historical streamflow trends. Finally, Section 4 summarizes the contributions of this approach and outlines future research opportunities. A software availability section is also provided to encourage the exploration and reuse of the software tools used in this work.

2. Methods

2.1. System design

2.1.1. OpenGMS Wrapper System (OGMS-WS)

OpenGMS is a platform for sharing geographic and environmental data and models among multi-disciplinary users to conduct integrated simulation and solve earth-related questions (Yue et al. 2016; Wang et al. 2018). OGMS-WS is an open source software tool in the OpenGMS platform used to publish models as web services. OGMS-WS uses a standardized method to encapsulate heterogeneous models and deploy them as RESTful web services through a GUI. The GUI also provides users with the ability to interact with hosted model services, including search, execute, monitor, and control.

OGMS-WS is a portable application that can be easily launched from an executable file. To deploy a model as a web service through OGMS-WS (Figure 1), two files are required: a Model Description Language (MDL) file to standardize the model's communication with OGMS-WS, and a model encapsulation file to encapsulate the model. The MDL file (Table 1) describes the model following an encapsulation standard, which summarizes heterogeneous models in three interfaces: model description, model execution, and model deployment. The MDL file provides all the essential information to encapsulate and deploy the model. Additionally, it is used to map the model to a GUI that allows users to manually interact with the model service and to a RESTful API that can be used by third-party applications or clients. Detailed instructions on documenting the MDL file can be found in Xiao et al. (2019). The model encapsulation file defines the interaction between users and the model service, including accepting requests from users and extracting model inputs, invoking and executing the model, and returning model outputs as responses to users when completed.

Figure 1. OGMS-WS workflow diagram.

Table 1. Content of the Model Description Language (MDL) file.

The MDL file and model encapsulation file are packaged with all the model-related files and a license file into a model service deployment package that can be deployed in OGMS-WS as a web service (Figure 2). Model-related files include the model executable files or scripts (model folder), dependency libraries that support model execution (assembly folder), model running instances (instance folder), environment dependencies for model encapsulation (supportive folder), and model test data (testify folder). Ultimately, the service can be deployed by uploading the zipped package file to the ‘Deployment’ module in OGMS-WS (see Figure 3).
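
The package layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five folder names come from the text, while the file names (model.mdl, encapsulation.py, license.txt) and the choice to place the MDL and encapsulation files in the model folder are hypothetical and should be checked against the OGMS-WS documentation.

```python
import tempfile
import zipfile
from pathlib import Path

# Folder names come from the package description; file names are hypothetical.
PACKAGE_FOLDERS = ("model", "assembly", "instance", "supportive", "testify")


def build_package(root: Path, model_files: dict, license_text: str) -> Path:
    """Lay out the package folders under `root` and zip them beside it."""
    for folder in PACKAGE_FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    for name, content in model_files.items():
        (root / "model" / name).write_text(content)
    (root / "license.txt").write_text(license_text)

    archive = root.parent / (root.name + ".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            arcname = path.relative_to(root).as_posix()
            if path.is_dir():
                zf.writestr(arcname + "/", "")  # keep empty folders in the zip
            else:
                zf.write(path, arcname)
    return archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = build_package(
            Path(tmp) / "streamflow_service",
            {"model.mdl": "<!-- MDL content -->", "encapsulation.py": "# entry point"},
            "license text",
        )
        with zipfile.ZipFile(archive) as zf:
            print(zf.namelist())
```

Directory entries are written explicitly so that folders left empty still appear in the archive.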

Figure 2. Generation of a model service deployment package.

Figure 3. OGMS-WS Service Management Page.

Figure 3 shows the home page of OGMS-WS. It contains multiple modules that provide different functionalities. The ‘Local Services Items’ module lists all the services hosted on the OGMS-WS server (Figure 4). After invoking a service, users can check its execution status in the ‘Instances’ module. When the service execution is completed, users can check the configuration, log information, and results of historical executions in the ‘Records’ module. The ‘Data cache’ module backs up uploaded input data. OGMS-WS also provides other useful functions including notifications, system information, and linking remote services.

Figure 4. Local Services Items Module.

2.1.2. Brief introduction to Docker

Docker is an open source project designed to develop, deploy, and run applications using isolated containers (Merkel 2014). Containers allow users to bundle an application with all of the parts it needs as one package. Containers are more lightweight than virtual machines because they share the operating system kernel with the host machine. A typical desktop computer could run no more than a few virtual machines at once but would have no trouble running 100 Docker containers (Boettiger 2014). Docker was initially designed to package applications on Linux systems. Docker images (a container is an instance of an image) share the Linux kernel of the host machine, which means that Docker images must be based on a Linux system with Linux-compatible software. However, Docker Linux containers can also run on platforms that are not based on the Linux kernel (such as macOS and Windows), which is accomplished through a small Linux virtual machine running on the host OS. In recent years, the Docker developer team released Windows containers for packaging Windows applications. Docker Windows containers share the Windows kernel with the host machine and can therefore only be deployed on Windows hosts.

There are many advantages of using Docker in environmental modeling. First, Docker keeps models functional by creating isolated containers in which all the software and dependencies are already installed, configured and tested. Second, Docker supports major platforms (Linux, Windows and macOS), which means Docker can avoid running environment conflicts, enabling models to be deployed across platforms (note that Windows containers can only run in Windows operating systems). Third, a Docker image is created through reading a ‘Dockerfile’, which is a simple script that defines all the commands, necessary dependencies with detailed versions, and the OS to assemble the image. The Dockerfile is a small plain text file that can be easily shared. Docker also provides a public repository (Docker Hub, https://hub.docker.com/) for publishing and sharing Docker images. All of these Docker capabilities significantly improve model sharing and versioning. Last but not least, Docker allows users to link any directory on the host OS to the running Docker container. This allows users to directly use data saved on the host OS and rely on familiar tools and environment for data collecting, preparing, and editing, while executing code in the development environment of the container, avoiding data transferring across different platforms.

2.1.3. OGMS-WS-Docker approach

Figure 5 shows the design of the OGMS-WS-Docker approach. Each web service has an MDL file that documents the service metadata and configures the encapsulation file as the service entry point. When a user sends a request to execute a web service, OGMS-WS assigns the execution instance a unique record ID and creates an independent directory on the server to store input and output files, thereby avoiding multiuser conflicts. Then, the inputs are extracted from the request and saved in the execution instance directory. A Docker container is created from the Docker image of the model, and the execution instance directory is mounted into the container as a bind mount (Docker 2020). The model is executed inside the container by running the ‘docker run’ command. After the execution is completed, the output file, written through the bind mount, is available in the execution instance directory and is sent back to the user as a response. By deploying different models as isolated Docker containers, this approach enables each model to remain functional regardless of changes in the external environment. It enables deployment of earth system models that have environment or dependency conflicts on the same platform.
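
The per-request isolation described above can be illustrated with a minimal Python sketch. It is not the OGMS-WS implementation: the function names, the use of a UUID as the record ID, and the /home mount target are assumptions made for illustration.

```python
import tempfile
import uuid
from pathlib import Path


def create_instance_dir(server_root: Path, inputs: dict) -> tuple:
    """Give one execution request its own record ID and directory."""
    record_id = uuid.uuid4().hex  # hypothetical ID scheme
    instance_dir = server_root / record_id
    instance_dir.mkdir(parents=True)
    for filename, payload in inputs.items():
        (instance_dir / filename).write_bytes(payload)
    return record_id, instance_dir


def bind_mount_spec(instance_dir: Path, container_dir: str) -> str:
    """Format the host:container pair passed to `docker run -v`."""
    return f"{instance_dir.resolve()}:{container_dir}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        record_id, instance_dir = create_instance_dir(
            Path(tmp), {"inputs.zip": b"placeholder bytes"}
        )
        print(record_id, bind_mount_spec(instance_dir, "/home"))
```

Because every request writes into its own directory, two users running the same service at the same time never overwrite each other's inputs or outputs, and anything the container writes under the mount target appears in that directory on the host.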

Figure 5. The OGMS-WS-Docker approach.

2.2. Example application design

This section aims to demonstrate the benefits of the OGMS-WS-Docker approach through an example application of publishing a streamflow prediction system as a web service. The design process is presented to show the workflow of publishing environmental models as web services using this approach. We chose the streamflow prediction system as the example for three reasons. First, it follows the generic pattern of environmental models: reading a set of inputs, executing the model, and generating outputs. Second, the chosen model involves multiple installation steps and has system and dependency requirements. Third, a non-Docker attempt at implementing the system in OGMS-WS failed because of environment conflicts. Hence this model is a good candidate for simplifying installation and use through the proposed Docker-oriented method. Once the streamflow prediction system is published as a web service, users can use it directly without having to set up the system locally.

2.2.1. Streamflow prediction system

We developed an automated streamflow prediction system to map the runoff generated from large-scale grid-based land surface models (LSMs) onto high-resolution vector-based stream networks, then route the results using a vector-based river routing model to provide streamflow forecasts at the river level. The purpose of this streamflow prediction system is to provide flood management support to data-scarce regions by enabling streamflow estimates at specific locations using global climate forecasts. It has been demonstrated that this system has comparable accuracy with the well-established Global Flood Awareness System (GloFAS) but provides streamflow predictions on significantly higher resolution stream networks with much less computational cost (Qiao et al. 2019a). This system currently supports multiple large-scale LSM datasets, including ERA-Interim (Dee et al. 2011), NLDAS (Xia et al. 2012), GLDAS (Lorenz et al. 2015), ERA5 (Karl Hennermann 2018), HIWAT (Gatlin et al. 2018), and COSMO (Rockel, Will, and Hense 2008).

Figure 6 shows the workflow of the streamflow prediction system and its hardware and software requirements. The system contains the Routing Application for Parallel computatIon of Discharge (RAPID) model to simulate the propagation of water flow waves in a river network (David et al. 2011), and RAPIDpy to prepare model inputs from LSM datasets. RAPID is FORTRAN-based and requires the code to be compiled as a library (.dll on Windows, .so on Linux). It also requires the Ubuntu 14.04 operating system and a few dependencies in specific versions for the model to be compiled. RAPIDpy is Python-based and requires multiple version-specific dependencies installed in a specific order. It is recommended to install RAPIDpy through Conda, which is a package manager that allows users to create an isolated environment for a package with its necessary dependencies in specific versions. However, we found that the streamflow prediction system failed to work when deployed directly in OGMS-WS because of conflicting Python environments.

Figure 6. The workflow of the Streamflow Prediction System (gray boxes are inputs).

2.2.2. Streamflow prediction service design

Before encapsulating the streamflow prediction system as required by OGMS-WS, we first dockerized the system and published it as a Docker image on Docker Hub (see Software Availability section). Then, we set up an OGMS-WS environment in a Linux system. After dockerizing the system and setting up the environment, we designed the service, including input/output variables, interactions between end-users and the service, and interactions between the service and OGMS-WS.

OGMS-WS provides a GUI that consists of a brief description of each service and the interactions between users and the service, including interfaces to upload inputs, buttons to run and cancel the service, buttons to download and visualize outputs, and execution monitoring. Figure 7 shows the interactions between the user, service GUI, OGMS-WS, and Docker container throughout the service execution process. First, the user is asked to provide the required input data. The service becomes executable when all of the required inputs are provided. After the user clicks the ‘Run service’ button, a service execution request with the input data is sent to OGMS-WS. All of the services in OGMS-WS support asynchronous execution by default, so the service GUI receives a response immediately after submitting an execute request. The first response confirms that the request has been received and accepted by the server and that a processing job has been created and will be run. After the required input data are verified, the processing job begins. A Docker container is created from the Docker image of the streamflow prediction system (‘Run Docker’ in Figure 7). The system in the container is launched and executed with the input data saved on the server. The service GUI obtains the execution response by repeatedly checking the execution status until the processing job has completed. Next, the execution status on the service GUI changes to ‘succeed’ and the user can download the result by clicking the activated output download button.
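
The asynchronous pattern described above, in which the client repeatedly checks the execution status after an immediate acknowledgement, can be sketched as a small polling loop. This is a generic illustration, not the OGMS-WS client: the get_status callable stands in for whatever status query the service exposes, and apart from ‘succeed’, which appears in the text, the status name ‘running’ is a hypothetical placeholder.

```python
import time
from typing import Callable


def wait_for_job(get_status: Callable[[], str],
                 interval_s: float = 2.0,
                 timeout_s: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll a status function until the job leaves the 'running' state."""
    waited = 0.0
    while True:
        status = get_status()
        if status != "running":  # e.g. 'succeed', or a failure status
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job still running after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Simulated status sequence standing in for real status queries.
    fake_statuses = iter(["running", "running", "succeed"])
    print(wait_for_job(lambda: next(fake_statuses), interval_s=0.01))
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.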

Figure 7. Interactions between users, service interface, OGMS-WS, and the model container.

As described in 2.1.1, deploying a web service in OGMS-WS requires an MDL file that defines the essential information to encapsulate and deploy the model, and a model encapsulation file that defines interactions between users and the model service. Following is a brief description of the inputs, outputs, and encapsulation function of the streamflow prediction service.

(1) Web Service Input and Output Variables

The streamflow prediction system is executed by running a launch function with three required parameters: (1) the full path of the LSM surface and subsurface runoff forcing files (in NetCDF format); (2) the full path of the RAPID model input files, including a file documenting the river network's topological connectivity, two files defining the Muskingum routing parameters (k and x), a weight table describing the area ratio of each LSM grid cell to the intersected catchments that are delineated upon the upstream point of each river reach, and a model initialization file describing the streamflow of each river at the start time of the simulation; (3) the full path in which to save the simulation results. Consequently, we set three input variables and one output for the service (Table 2).

Table 2. Variable design of the streamflow prediction service.

(2) Encapsulation Function

The encapsulation file extracts model inputs from a request, executes the model, and returns outputs as a response when the simulation is completed. On the OGMS-WS server, each execution instance of a web service has a unique record ID and an independent directory for storing the inputs and output results. OGMS-WS provides an encapsulation template file that contains scripts for the interactions between users and the service. Service developers are responsible for encoding the model execution steps. For the streamflow prediction service, executing the model involves the following steps: (1) create a Docker container from the Docker image of the streamflow prediction system, (2) mount the folder with user input files to the Docker container, and (3) run the model. According to the Docker command line interface (CLI), the following command can mount a folder to a Docker container and execute the model file in it.

docker run --name [container name] -w [path to the model file in container] -v [source folder]:[target folder in container] [Docker image] [command to run the model file]

Because the directory of the running service instance is mounted into the Docker container, the model can be executed by running the mounted Python file. For the streamflow prediction service, the Docker command is:

docker run --name rapidpy -w /home/python_file -v ~/mainProcess:/home xhqiao89/rapidpy:1.0 python run_rapid.py
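
When an encapsulation script issues this command from Python, it helps to build the argument list explicitly. The sketch below is an assumption about how that might be written, not the authors' encapsulation file; the helper name docker_run_args is hypothetical, while the flag values are taken from the command above.

```python
import os
import shlex


def docker_run_args(name, workdir, host_dir, container_dir, image, command):
    """Build the argument list for `docker run` with one bind mount."""
    # "~" is only expanded by a shell, so expand it here.
    host_dir = os.path.expanduser(host_dir)
    return [
        "docker", "run",
        "--name", name,                        # container name
        "-w", workdir,                         # working directory in container
        "-v", f"{host_dir}:{container_dir}",   # bind mount host:container
        image,
        *command,
    ]


args = docker_run_args(
    name="rapidpy",
    workdir="/home/python_file",
    host_dir="~/mainProcess",
    container_dir="/home",
    image="xhqiao89/rapidpy:1.0",
    command=["python", "run_rapid.py"],
)
print(shlex.join(args))
# To execute on a host with Docker installed:
#   import subprocess; subprocess.run(args, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when directory names contain spaces.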

The model description file and model encapsulation file are packaged following the structure shown in Figure 2. The model folder contains the MDL file, the model encapsulation file, and two Python files that contain the dependency libraries for model encapsulation. The other folders are all empty because all the files associated with the streamflow prediction model are stored in the Docker container.

2.2.3. Service application in Bangladesh historical streamflow analysis

In this section, we apply the streamflow prediction service to a hydrologic analysis to show how the service can contribute to Digital Earth research. The service was used to generate 20 years (1995–2014) of daily streamflow for each main stream of Bangladesh, providing hydrologic baseline data for flood management and related research in Bangladesh.

Bangladesh is regularly devastated by flooding due to its low elevation and large rivers. Several catastrophic floods have occurred during the past 30 years. Over 75% of the total area of the country was flooded in the 1998 flood. However, it is difficult to build an effective flood early warning system for Bangladesh because of the lack of long-term hydrologic data. Maintaining a stable hydrologic data collection and management system requires equipment, technology, and human resources. For data-scarce countries like Bangladesh, leveraging satellite data and global models can be an efficient way to obtain baseline hydrologic information and supplement other available resources. ERA-Interim/Land is a global land surface reanalysis dataset produced with a parameterization-improved European Centre for Medium-Range Weather Forecasts (ECMWF) model driven by meteorological forcing from the ERA-Interim atmospheric reanalysis and observed precipitation adjustments (Balsamo et al. 2009; Wipfler et al. 2011; Balsamo et al. 2015). ERA-Interim/Land provides integrated and coherent daily land surface estimates from 1980/01/01 to 2014/12/31. It has been widely used in research to provide hydrologic baseline information for data-scarce regions and to initialize weather prediction models (Dee et al. 2011; Balsamo et al. 2015; Albergel et al. 2018). However, the low resolution (80 km) of the ERA-Interim/Land dataset makes it difficult to apply at specific locations on local stream networks. Rather, it provides total runoff volume per 6400 km² grid cell.

The streamflow prediction service can convert the low-resolution ERA-Interim/Land runoff data into streamflow at local rivers, providing historical streamflow estimates for Bangladesh. In this study, we generated 20 years (1995–2014) of historical streamflow in Bangladesh by using the streamflow prediction service and the ERA-Interim/Land dataset. The historical streamflow trend in the main streams of Bangladesh during this period was investigated.
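
To make the conversion from coarse grid cells to river inflow concrete, the following sketch shows one common area-weighted scheme: the runoff depth of each grid cell is multiplied by the area of that cell lying inside a catchment, and the products are summed. This is an illustration of the idea behind the weight table, not the RAPIDpy implementation; all numbers, cell names, and units are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400
GRID_CELL_AREA_KM2 = 80 * 80  # an 80 km cell covers 6400 km^2


def catchment_inflow_m3(runoff_depth_m: dict, overlap_area_km2: dict) -> float:
    """Sum runoff depth times overlap area over the cells touching a catchment."""
    return sum(
        runoff_depth_m[cell] * area_km2 * 1e6  # km^2 -> m^2
        for cell, area_km2 in overlap_area_km2.items()
    )


# Hypothetical daily runoff depths (m) for two grid cells.
runoff = {"cell_1": 0.002, "cell_2": 0.001}
# Hypothetical weight-table rows: area of each cell inside one catchment (km^2).
overlap = {"cell_1": 1200.0, "cell_2": 300.0}

volume = catchment_inflow_m3(runoff, overlap)  # m^3 per day
mean_flow = volume / SECONDS_PER_DAY           # m^3/s
print(f"{volume:.0f} m^3/day, {mean_flow:.2f} m^3/s")
```

With these illustrative values the catchment receives roughly 2.7 million cubic meters in a day, equivalent to a mean inflow of about 31 cubic meters per second, which a routing model would then propagate downstream.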

3. Results

3.1. Software implementation of the streamflow prediction service

We have successfully deployed the streamflow prediction model in OGMS-WS and published it as a web service (http://cosmo.byu.edu:8060/modelser/all). While this particular service is live at the time of publication of this article, it is an experimental server that may not be available long-term. However, all of the code and executables that comprise this live server are provided in the Software Availability section. The service requires three inputs: a zip file of the LSM runoff NetCDF files, a zip file of the RAPID input files, and a ‘run_rapid.py’ Python file to run the model. It returns a NetCDF file with the simulated streamflow in each river segment. The service is open access and can be consumed through either the service GUI or the service API.

Once deployed, the streamflow prediction service appears within OGMS-WS in the local services items, showing its name, version, type, status, permission, accessibility, and allowed operation options. Users can click the ‘invoking’ button to enter the service GUI (Figure 8a). The service GUI first provides an introduction to the service to help users understand its functionality. It then lists service inputs and outputs, including instructions for preparing input data and buttons and popup windows for uploading input data (Figure 8b). The GUI also provides a set of test data that can be directly loaded to test the service or downloaded as an example to help users prepare input data. When users click the ‘Run’ button to execute the service, the GUI shows service execution status, log information, and model output messages. After the execution is completed, users can download the output through the ‘Download’ button (Figure 8c) and visualize the river hydrograph through the ‘Visualization’ button (Figure 9). OGMS-WS supports the visualization of many commonly used data formats in earth and environmental modeling research, including GeoTIFF, ASCII GRID, and NetCDF (Wang et al. 2018).

Figure 8. Streamflow prediction service GUIs.


Figure 9. Streamflow results visualization.


3.2. Bangladesh historical streamflow analysis

Another way to consume the service and execute its model is through its API. The services hosted in OGMS-WS are RESTful web services and support CRUD (create, retrieve, update, delete) operations. They can be invoked with the HTTP methods (GET, POST, DELETE, and PUT) or the URL-based requests that RESTful APIs support. We ran the service with the ERA-Interim/Land runoff reanalysis data for Bangladesh from 1995/01/01 to 2014/12/31 to generate historical streamflow data for the country. A Python-based service client SDK (Zhang et al. 2020) was used to interact with the service, including sending requests with input files to the service and receiving responses with outputs from the service.
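The conventional mapping between CRUD operations and HTTP methods can be sketched as follows. This is a generic illustration built on Python's standard `urllib`, not the client SDK cited above; the host `example.org` and the `build_request` helper are hypothetical, and the request objects are only constructed, never sent.

```python
from urllib.request import Request

# Standard REST convention: each CRUD operation maps to one HTTP method.
CRUD_TO_HTTP = {
    "create": "POST",
    "retrieve": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def build_request(base_url, resource, operation, body=None):
    """Return an unsent urllib Request for a CRUD operation on a resource."""
    method = CRUD_TO_HTTP[operation]
    url = base_url.rstrip("/") + "/" + resource.lstrip("/")
    return Request(url, data=body, method=method)


if __name__ == "__main__":
    # Hypothetical host; no network traffic is generated here.
    req = build_request("http://example.org:8060", "modelser/all", "retrieve")
    print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would issue the call; the actual resource paths and payload formats depend on the OGMS-WS deployment and are not specified here.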

Through the service execution, we obtained 20 years of daily simulated streamflow in the high-resolution river network of Bangladesh (shown as the blue lines in Figure 10). We selected four reach segments (Table 3, and shown as the red circles in Figure 10) located on four major rivers to investigate the changes in streamflow from 1995 to 2014. Reach A is on the Ganges River and reach B is on the Jamuna River. The Ganges River and Jamuna River join to form the Padma River, where reach C is located. Reach D is the last segment of the Meghna River before it joins the Padma River. The Meghna River is the longest river in Bangladesh.

Figure 10. High resolution river network in Bangladesh.


Table 3. Selected reach segments.

Figure 11 shows the simulated streamflow of the four selected reach segments from 1995 to 2014. Figure 12 shows boxplots of their monthly average streamflow over these 20 years. The results show that the rivers in Bangladesh have clear flood and dry seasons every year. Streamflow varies widely throughout the year, and the difference between the flood season and the dry season can be enormous. The flood season runs mainly from June to October, with streamflow increasing significantly and remaining high from July to September. Many flash floods occurred during the flood season, with streamflow rising rapidly from a few hundred to tens of thousands of cubic meters per second within a few days and then quickly receding. Flash floods are the most destructive disasters in Bangladesh because they happen with little or no advance warning for preparation and evacuation, and they carry more and coarser sediment than normal floods. Figure 11 shows notably high streamflow around 1995, 1998, 2004, and 2005, which is consistent with the catastrophic flood records of Bangladesh (Wikipedia 2019). Global datasets are certainly not as accurate as observations. However, hydrologic data-poor regions like Bangladesh can use the service to quickly convert global datasets into streamflow estimates for their rivers, obtaining baseline hydrologic information that can assist flood management and research.
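The aggregation behind such a monthly boxplot can be sketched as follows, assuming the daily series is available as (date, flow) pairs. This is an illustrative reconstruction rather than the analysis code used for the study; the function names and the synthetic two-year series are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, median


def monthly_means(daily):
    """Average daily flows (m^3/s) into one value per (year, month)."""
    buckets = defaultdict(list)
    for day, flow in daily:
        buckets[(day.year, day.month)].append(flow)
    return {key: mean(values) for key, values in buckets.items()}


def by_calendar_month(monthly):
    """Group monthly means by calendar month: the samples behind each box."""
    groups = defaultdict(list)
    for (year, month), value in sorted(monthly.items()):
        groups[month].append(value)
    return dict(groups)


if __name__ == "__main__":
    # Synthetic two-year series, NOT model output: 100 m^3/s in the dry
    # season and 1000 m^3/s from June to October.
    start = date(2001, 1, 1)
    series = []
    for offset in range(730):
        day = start + timedelta(days=offset)
        series.append((day, 1000.0 if 6 <= day.month <= 10 else 100.0))
    groups = by_calendar_month(monthly_means(series))
    for month in sorted(groups):
        print(month, len(groups[month]), median(groups[month]))
```

With 20 years of daily output, each calendar month would contribute 20 monthly means, which form the sample summarized by one box.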

Figure 11. Simulated streamflow of selected reach segments.


Figure 12. Streamflow monthly distribution from 1995 to 2014.


4. Discussion and conclusion

Publishing environmental models as web services remains challenging because the models are heterogeneous in form, execution mode, programming language, and supported operating system. We developed an approach for publishing models as web services, regardless of their diversity, using OGMS-WS for service publishing and Docker for model isolation. OGMS-WS was used because of its cross-platform nature (support for Windows, macOS, and Linux operating systems), simple interfaces, and demonstrated success in working with earth system models. Docker was included to resolve deployment conflicts and keep models functional over the long term as the computing environment changes. This approach inherits all the benefits of OGMS-WS and Docker while offering more flexibility. To demonstrate the benefits of this approach, we presented the development and deployment of a test streamflow prediction service. The service was then used to analyze historical flood trends in Bangladesh by generating 20 years (1995–2014) of high-resolution streamflow data from the ERA-Interim/Land global land surface reanalysis dataset.

Based on the example application and our experience, the key outcomes of this approach for earth and environmental modeling include:

  1. Avoiding deployment conflicts: The OGMS-WS-Docker approach keeps each model isolated to avoid potential dependency or environment conflicts when deploying models. The approach enables multiple models that have conflicting dependencies, are developed in different programming languages, or support different operating systems to be deployed as web services on the same platform. It is especially beneficial for institutions that want to host multiple model repositories on a single machine.

  2. Lowering the barrier to publishing environmental models as long-term functional web services: The OGMS-WS-Docker approach allows users to set up a model by creating a Docker container from its Docker image, ensuring that the model can be executed successfully. This approach lowers the barrier to using OGMS-WS by reducing the encapsulation function to simple Docker command lines rather than complicated scripts that configure the environment and initiate the model. Moreover, this approach can save service maintenance effort because Docker keeps web services functional and unaffected by long-term changes in the computing environment and dependencies. As the benefits of Docker have become more widely recognized, more and more earth and environmental models have been dockerized and published on Docker Hub, such as the Stormwater Management Model (SWMM), HEC-RAS, SUMMA (Clark et al. 2015), and WRF-Hydro (Gochis et al. 2018). The OGMS-WS-Docker approach provides a way to deploy these models as web services, which is exceptionally challenging with traditional methods.

  3. Facilitating model sharing: Docker has been used for sharing models by capturing a model and all necessary dependencies in a lightweight, portable, and isolated way. The OGMS-WS-Docker approach enables dockerized models to communicate with each other more easily and seamlessly exchange data over the Internet in a standard way. Once a dockerized model is published as a web service, it can be accessed and executed via a URL from any location or device without being downloaded to local computers. It also enables dockerized models to be more easily chained into workflows to address broader questions, facilitating complex earth and environmental modeling. In addition, compared with using Docker in isolation, the OGMS-WS-Docker approach provides service management with a friendly GUI that allows users to interact with web services and manage execution-related data, lowering the technical barrier for model users.

  4. Providing cross-platform support: This approach supports major operating systems such as Windows, macOS, and Linux because both OGMS-WS and Docker are cross-platform software.
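To make the "simple Docker command lines" mentioned in outcome 2 concrete, the sketch below assembles a `docker run` invocation that mounts a host folder into the container and launches a model script. The image name, paths, and model arguments are hypothetical placeholders, and this is not the encapsulation code used by OGMS-WS; only the standard `docker run` options `--rm` and `-v` are relied upon.

```python
import shlex


def docker_run_command(image, host_dir, container_dir, model_args=()):
    """Build a `docker run` argument list with one bind-mounted data folder."""
    return [
        "docker", "run",
        "--rm",                                # remove the container on exit
        "-v", f"{host_dir}:{container_dir}",   # bind mount for inputs/outputs
        image,
        *model_args,
    ]


if __name__ == "__main__":
    # Hypothetical image name, paths, and arguments, for illustration only.
    cmd = docker_run_command(
        "example/rapid-model:latest",
        "/srv/ogms/run_001",
        "/data",
        ["python", "/data/run_rapid.py"],
    )
    print(shlex.join(cmd))
    # To execute for real: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces, and the bind mount gives the isolated model a single folder through which input and output files are exchanged with the host.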

In addition, this work demonstrates the benefits of using containers for model isolation to avoid potential conflicts in the service publishing process. It can also serve as guidance for integrating container technology with other service publishing tools, and as a demonstration of using containers for model isolation on servers. We expect that this work and its outcomes can motivate further contributions to Digital Earth web services and facilitate environmental modeling research. We have provided the streamflow prediction service as a case study to demonstrate the approach. The next step of this work is to implement web services for more models, in order to demonstrate the utility of the approach more fully and to understand its limitations across models of varying complexity.

Software Availability

Acknowledgements

This work was supported by NASA SERVIR [Grant Number NNX16AN45G] and the National Science Fund for Excellent Young Scholars of China [Grant Number 41622108]. Author contributions are as follows: Xiaohui Qiao conducted the bulk of the work and is the primary contributor. Zhiyu Li provided computational technical support and contributed to the paper design. Fengyuan Zhang was the primary developer of OGMS-WS and the service client SDK. Min Chen was the primary designer of the OpenGMS platform and related tools. Daniel P. Ames was the primary research advisor and contributed to the writing. E. James Nelson provided financial support.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by National Science Fund for Distinguished Young Scholars, China: [grant number 41622108]; NASA ROSES SERVIR Applied Research: [grant number NNX16AN45G].

References

  • 52°North. 2010. “WPS - Standardized Web-based Geo-processing.” https://52north.org/software/software-projects/wps/.
  • Albergel, Clement, Emanuel Dutra, Simon Munier, Jean-Christophe Calvet, Joaquin Munoz-Sabater, Patricia de Rosnay, and Gianpaolo Balsamo. 2018. “ERA-5 and ERA-Interim Driven ISBA Land Surface Model Simulations: Which one Performs Better?” Hydrology & Earth System Sciences 22 (6): 18.
  • Allen, David W. 2011. Getting to Know ArcGIS ModelBuilder. Redlands: Esri Press.
  • Balsamo, G., C. Albergel, A. Beljaars, S. Boussetta, E. Brun, H. Cloke, D. Dee, et al. 2015. “ERA-Interim/Land: A Global Land Surface Reanalysis Data Set.” Hydrology Earth System Sciences 19: 18. doi:10.5194/hess-19-389-2015.
  • Balsamo, Gianpaolo, Anton Beljaars, Klaus Scipal, Pedro Viterbo, Bart van den Hurk, Martin Hirschi, and Alan K. Betts. 2009. “A Revised Hydrology for the ECMWF Model: Verification from Field Site to Terrestrial Water Storage and Impact in the Integrated Forecast System.” Journal of Hydrometeorology 10 (3): 623–643. doi:10.1175/2008jhm1068.1.
  • Bandaragoda, C., A. Castronova, E. Istanbulluoglu, R. Strauch, S. S. Nudurupati, J. Phuong, J. M. Adams, et al. 2019. “Enabling Collaborative Numerical Modeling in Earth Sciences Using Knowledge Infrastructure.” Environmental Modelling & Software 120, 104424. doi:10.1016/j.envsoft.2019.03.020.
  • Cepický, Jáchym, and Lorenzo Becchi. 2007. “Geospatial Processing via Internet on Remote Servers - PyWPS.” OSGeo Journal 1: 5.
  • Boettiger, Carl. 2014. “An Introduction to Docker for Reproducible Research, with Examples from the R Environment.” ACM SIGOPS Operating Systems Review - Special Issue on Repeatability and Sharing of Experimental Artifacts 49 (1): 9.
  • Castronova, Anthony M, Jonathan L Goodall, and Mostafa M Elag. 2013. “Models as web Services Using the Open Geospatial Consortium (ogc) web Processing Service (wps) Standard.” Environmental Modelling & Software 41: 72–83.
  • Chen, Min, Alexey Voinov, Daniel P. Ames, Albert J. Kettner, Jonathan L. Goodall, Anthony J. Jakeman, Michael C. Barton, et al. 2020. “Position Paper: Open Web-Distributed Integrated Geographic Modelling and Simulation to Enable Broader Participation and Applications.” Earth-Science Reviews 207: 103223. doi:10.1016/j.earscirev.2020.103223.
  • Chen, Min, Songshan Yue, Guonian Lü, Hui Lin, Chaowei Yang, Yongning Wen, Tao Hou, Dawei Xiao, and Hao Jiang. 2019. “Teamwork-oriented Integrated Modeling Method for Geo-Problem Solving.” Environmental Modelling & Software 119: 111–123. doi:10.1016/j.envsoft.2019.05.015.
  • Choi, Young-Don, Jonathan L. Goodall, Jeffrey M. Sadler, Anthony M. Castronova, Andrew Bennett, Zhiyu Li, Bart Nijssen, et al. 2021. “Toward Open and Reproducible Environmental Modeling by Integrating Online Data Repositories, Computational Environments, and Model Application Programming Interfaces.” Environmental Modelling & Software 135, 104888. doi:10.1016/j.envsoft.2020.104888.
  • Clark, Martyn P., Bart Nijssen, Jessica D. Lundquist, Dmitri Kavetski, David E. Rupp, Ross A. Woods, Jim E. Freer, et al. 2015. “A Unified Approach for Process-Based Hydrologic Modeling: 1. Modeling Concept.” Water Resources Research 51 (4): 2498–2514. doi:10.1002/2015WR017198.
  • Collberg, C., T. Proebsting, G. Moraila, A. Shankaran, Z. Shi, A. M. Warren. 2014. “Measuring Reproducibility in Computer Systems Research.” Tech. Rep., Dep. Comput. Sci., Univ. Ariz., Tucson http://reproducibility.cs.arizona.edu/tr.pdf.
  • David, Cédric H., David R. Maidment, Guo-Yue Niu, Zong-Liang Yang, Florence Habets, and Victor Eijkhout. 2011. “River Network Routing on the NHDPlus Dataset.” Journal of Hydrometeorology 12 (5): 913–934. doi:10.1175/2011jhm1345.1.
  • Dee, D. P., S. M. Uppala, A. J. Simmons, P. Berrisford, P. Poli, S. Kobayashi, U. Andrae, et al. 2011. “The ERA-Interim Reanalysis: Configuration and Performance of the Data Assimilation System.” Quarterly Journal of the Royal Meteorological Society 137 (656): 553–597. doi:10.1002/qj.828.
  • Docker, Inc. 2020. “Use Bind Mounts | Docker Documentation.” https://docs.docker.com/storage/bind-mounts/.
  • Dubois, Grégoire, Michael Schulz, Jon Skøien, Lucy Bastin, and Stephen Peedell. 2013. “eHabitat, a Multi-Purpose Web Processing Service for Ecological Modeling.” Environmental Modelling & Software 41: 123–133.
  • Gao, Fan, Peng Yue, Chenxiao Zhang, and Mi Wang. 2019. “Coupling Components and Services for Integrated Environmental Modelling.” Environmental Modelling & Software 118: 14–22. doi:10.1016/j.envsoft.2019.04.003.
  • Gatlin, Patrick N., Jonathan L. Case, Jordan Bell, Walter A. Petersen, and Dan Cecil. 2018. Monitoring Intense Thunderstorms in the Hindu-Kush Himalayan Region. Huntsville, AL, US: NASA Marshall Space Flight Center.
  • GeoServer. 2013. “GeoServer Web Processing Service Home Page.” http://docs.geoserver.org/2.3.5/user/extensions/wps/index.html.
  • Gochis, D. J., M. Barlage, A. Dugger, K. FitzGerald, L. Karsten, M. McAllister, J. McCreight, et al. 2018. “The NCAR WRF-Hydro Modeling System Technical Description, Version 5.” NCAR Technical Note:107.
  • Goodall, Jonathan L., Bella F. Robinson, and Anthony M. Castronova. 2011. “Modeling Water Resource Systems Using a Service-Oriented Computing Paradigm.” Environmental Modelling & Software 26 (5): 573–582. doi:10.1016/j.envsoft.2010.11.013.
  • Goodall, Jonathan L., Kathleen D. Saint, Mehmet B. Ercan, Laura J. Briley, Sylvia Murphy, Haihang You, Cecelia DeLuca, and Richard B. Rood. 2013. “Coupling Climate and Hydrological Models: Interoperability Through Web Services.” Environmental Modelling & Software 46: 250–259. doi:10.1016/j.envsoft.2013.03.019.
  • Gore, Al. 1998. “The Digital Earth.” Australian Surveyor 43 (2): 89–91. doi:10.1080/00050348.1998.10558728.
  • Gregersen, J. B., P. J. A. Gijsbers, and S. J. P. Westen. 2007. “OpenMI: Open Modelling Interface.” Journal of Hydroinformatics 9 (3): 175–191.
  • Grossner, Karl E., Michael F Goodchild, and Keith C. Clarke. 2008. “Defining a Digital Earth System.” 12 (1): 145–160. doi:10.1111/j.1467-9671.2008.01090.x.
  • Guo, Huadong. 2020. “Manual of Digital Earth – A Milestone Book in Digital Earth History.” International Journal of Digital Earth 13 (1): 1–1. doi:10.1080/17538947.2019.1700631.
  • Jiang, Peishi, Mostafa Elag, Praveen Kumar, Scott Dale Peckham, Luigi Marini, and Liu Rui. 2017. “A Service-Oriented Architecture for Coupling Web Service Models Using the Basic Model Interface (BMI).” Environmental Modelling & Software 92 (Supplement C): 107–118. doi:10.1016/j.envsoft.2017.01.021.
  • Jiang, Hao, John van Genderen, Paolo Mazzetti, Hyeongmo Koo, and Min Chen. 2019. “Current Status and Future Directions of Geoportals.” International Journal of Digital Earth, 1–22. doi:10.1080/17538947.2019.1603331.
  • Hennermann, Karl, and Paul Berrisford. 2018. “What are the Changes from ERA-Interim to ERA5?” European Centre for Medium-Range Weather Forecasts. https://confluence.ecmwf.int/pages/viewpage.action?pageId=74764925.
  • Li, Yu. 2020. “Towards Fast Prototyping of Cloud-Based Environmental Decision Support Systems for Environmental Scientists Using R Shiny and Docker.” Environmental Modelling & Software 132: 104797. doi:10.1016/j.envsoft.2020.104797.
  • Lorenz, Christof, Mohammad J. Tourian, Balaji Devaraju, Nico Sneeuw, and Harald Kunstmann. 2015. “Basin-scale Runoff Prediction: An Ensemble Kalman Filter Framework Based on Global Hydrometeorological Data Sets.” Water Resources Research 51 (10): 8450–8475. doi:10.1002/2014WR016794.
  • Luo, Wei, Kirk Duffin, Edit Peronja, Jay Stravers, and George Henry. 2004. “A web-Based Interactive Landform Simulation Model (WILSIM).” Computers & Geosciences - COMPUT GEOSCI 30: 215–220. doi:10.1016/j.cageo.2004.01.001.
  • Merkel, Dirk. 2014. “Docker: Lightweight Linux Containers for Consistent Development and Deployment.” Linux Journal 239 (2).
  • Michaelis, Christopher D, and Daniel P Ames. 2009. “Evaluation and Implementation of the OGC Web Processing Service for use in Client-Side GIS.” Geoinformatica 13 (1): 109–120.
  • Müller, Markus. 2007. “deegree – Building Blocks for Spatial Data.” OSGeo Journal 1: 4.
  • Omidipoor, Morteza, Mohammadreza Jelokhani-Niaraki, and Najmeh Neysani Samany. 2019. “A Web-Based Geo-Marketing Decision Support System for Land Selection: A Case Study of Tehran, Iran.” Annals of GIS 25 (2): 179–193. doi:10.1080/19475683.2019.1575905.
  • Peckham, Scott D., Eric W. H. Hutton, and Boyana Norris. 2013. “A Component-Based Approach to Integrated Modeling in the Geosciences: The Design of CSDMS.” Computers & Geosciences 53: 3–12. doi:10.1016/j.cageo.2012.04.002.
  • Qiao, Xiaohui, E. James Nelson, Daniel P. Ames, Zhiyu Li, Cédric H. David, Gustavious P. Williams, Wade Roberts, et al. 2019a. “A Systems Approach to Routing Global Gridded Runoff Through Local High-Resolution Stream Networks for Flood Early Warning Systems.” Environmental Modelling & Software 120: 104501. doi:10.1016/j.envsoft.2019.104501.
  • Qiao, Xiaohui, Zhiyu Li, Daniel P. Ames, E. James Nelson, and Nathan R. Swain. 2019b. “Simplifying the Deployment of OGC Web Processing Services (WPS) for Environmental Modelling – Introducing Tethys WPS Server.” Environmental Modelling & Software 115: 38–50. doi:10.1016/j.envsoft.2019.01.021.
  • Rajib, Mohammad Adnan, Venkatesh Merwade, I. Luk Kim, Lan Zhao, Carol Song, and Shandian Zhe. 2016. “SWATShare – A Web Platform for Collaborative Research and Education Through Online Sharing, Simulation and Visualization of SWAT Models.” Environmental Modelling & Software 75: 498–512. doi:10.1016/j.envsoft.2015.10.032.
  • Rockel, Burkhard, Andreas Will, and Andreas Hense. 2008. The Regional Climate Model COSMO-CLM (CCLM). Vol. 17.
  • Vitolo, Claudia, Yehia Elkhatib, Dominik Reusser, Christopher J. A. Macleod, and Wouter Buytaert. 2015. “Web Technologies for Environmental Big Data.” Environmental Modelling & Software 63: 185–198. doi:10.1016/j.envsoft.2014.10.007.
  • Wang, Jin, Min Chen, Guonian Lü, Songshan Yue, Kun Chen, and Yongning Wen. 2018. “A Study on Data Processing Services for the Operation of Geo-Analysis Models in the Open Web Environment.” 5 (12): 844–862. doi:10.1029/2018ea000459.
  • Wen, Yongning, Min Chen, Guonian Lu, Hui Lin, Li He, and Songshan Yue. 2013. “Prototyping an Open Environment for Sharing Geographical Analysis Models on Cloud Computing Platform.” International Journal of Digital Earth 6 (4): 356–382. doi:10.1080/17538947.2012.716861.
  • Wikipedia. 2019. “Floods in Bangladesh.” Wikipedia. The Free Encyclopedia. Accessed October 11. https://en.wikipedia.org/w/index.php?title=Floods_in_Bangladesh&oldid=920590097.
  • Wipfler, E. L., K. Metselaar, J. C. van Dam, R. A. Feddes, E. van Meijgaard, L. H. van Ulft, B. van den Hurk, S. J. Zwart, and W. G. M. Bastiaanssen. 2011. “Seasonal Evaluation of the Land Surface Scheme HTESSEL Against Remote Sensing Derived Energy Fluxes of the Transdanubian Region in Hungary.” Hydrology and Earth System Sciences 15: 15. doi:10.5194/hess-15-1257-2011.
  • Xia, Youlong, Kenneth Mitchell, Michael Ek, Justin Sheffield, Brian Cosgrove, Eric Wood, Lifeng Luo, et al. 2012. “Continental-scale Water and Energy Flux Analysis and Validation for the North American Land Data Assimilation System Project Phase 2 (NLDAS-2): 1. Intercomparison and Application of Model Products.” Journal of Geophysical Research: Atmospheres 117 (D3), doi:10.1029/2011jd016048.
  • Xiao, Dawei, Min Chen, Yuchen Lu, Songshan Yue, and Tao Hou. 2019. “Research on the Construction Method of the Service-Oriented Web-SWMM System.” ISPRS International Journal of Geo-Information 8 (6): 268. doi:10.3390/ijgi8060268.
  • Yue, Songshan, Min Chen, Yongning Wen, and Guonian Lu. 2016. “Service-oriented Model-Encapsulation Strategy for Sharing and Integrating Heterogeneous Geo-Analysis Models in an Open Web Environment.” ISPRS Journal of Photogrammetry and Remote Sensing 114: 258–273.
  • Zhang, Fengyuan, Min Chen, Daniel P. Ames, Chaoran Shen, Songshan Yue, Yongning Wen, and Guonian Lü. 2019. “Design and Development of a Service-Oriented Wrapper System for Sharing and Reusing Distributed Geoanalysis Models on the Web.” Environmental Modelling & Software 111: 498–509. doi:10.1016/j.envsoft.2018.11.002.
  • Zhang, F., M. Chen, S. Yue, Y. Wen, G. Lü, and F. Li. 2020. “Service-oriented Interface Design for Open Distributed Environmental Simulations.” Environmental Research 191: 110225. doi:10.1016/j.envres.2020.110225.
  • Zhao, J., J. M. Gomez-Perez, K. Belhajjame, G. Klyne, E. Garcia-Cuesta, A. Garrido, K. Hettne, M. Roos, D. De Roure, and C. Goble. 2012. “Why Workflows Break — Understanding and Combating Decay in Taverna Workflows.” 2012 IEEE 8th International conference on E-science, 8–12 October 2012.
