Generation of Vessel Track Characteristics Using a Conditional Generative Adversarial Network (CGAN)


ABSTRACT

Machine learning (ML) models often require large volumes of data to learn a given task. However, training data may not exist or may be difficult to access because of privacy laws and limited availability. A solution is to generate synthetic data that represents the real data. In the maritime environment, the ability to generate realistic vessel positional data is important for the development of ML models in ocean areas with scarce amounts of data, such as the Arctic, or for generating an abundance of anomalous or unique events needed for training detection models. This research explores the use of conditional generative adversarial networks (CGAN) to generate vessel displacement tracks over a 24-hour period in a constraint-free environment. The model is trained using Automatic Identification System (AIS) data that contains vessel tracking information. The results show that the CGAN is able to generate vessel displacement tracks for two different vessel types, cargo ships and pleasure crafts, for three months of the year (May, July, and September). To evaluate the usability of the generated data and the robustness of the CGAN model, three ML vessel classification models that use displacement track data are trained on generated data and tested with real data.

Introduction

The movement of vessels through both national and international waters is of interest to both security and defense personnel. In international waters, vessel monitoring takes place for the benefit of the environment, safety, and the defense and security of national jurisdictions. In national waters, the motivation behind vessel monitoring is expanded to include protection of natural habitats, enforcement of national shipping regulations, collision avoidance, etc. In general terms, more constraints are added to vessel movement within a nation’s water space.

Vessel movement constraints are environmental (e.g., coastline, ice, ship density), vessel-based (e.g., maximum speed, fuel capacity), and regulatory (e.g., speed regulations, shipping lanes). As well, depending on the specific constraint, violating the constraint will result in different consequences. The lower end of the consequence scale would include a warning from an authority to reduce vessel speed. The upper end of the consequence scale could involve the grounding or sinking of a vessel due to an impact with the coastline or an underwater obstacle.

Some national waters have an abundance of physical hazards that produce constraints. As well, the opening of new potential waterways adds to the complexity of vessel movement, as these waterways often have old or incomplete charts. An example is the Canadian Arctic, where the intricate physical characteristics of the land mass are combined with the moving constraint of ice.

As vessel traffic is introduced into the waterways of the Canadian Arctic, it is important to combine modern monitoring capabilities with modern vessel predictive techniques. For example, Campbell, Isenor, and Dais Ferreira (Citation2022) used machine learning (ML) to successfully identify a large group of fabricated vessels that were reported to be crossing the Atlantic. Although ML can be considered predictive in terms of classification, ML does require a sizable amount of high quality (i.e., clean) data. This means modern analysis approaches (e.g., ML) need to be supported with an information infrastructure that can provide high quality data (Syms et al. Citation2021).

Access to large, diverse, and annotated data setsFootnote¹ represents another challenge to using ML. Annotations are a problem as they often are assigned by humans which is a time-consuming process. However, generated data are automatically annotated when created, which avoids the labeling issue. In this paper, we use a form of generative Artificial Intelligence (AI), called a conditional generative adversarial network (CGAN) (Mirza and Osindero Citation2014) to generate vessel displacement tracks using Automatic Identification System (AIS) data.Footnote² The displacement track, which is the shortest path of a vessel between its first and last position of the day, is considered preliminary progress toward the generation of more complete and complex vessel tracks under constraints. The conditional model is used in this work as a method to isolate the statistical characteristics of a specific vessel type.

Generative Adversarial Networks and Related Work

In 2014, generative adversarial networks (GANs) were introduced by Goodfellow et al. (Citation2014) as a form of generative AI used to create synthetic data values. A GAN consists of two neural networks, the discriminator (D) and the generator (G), that compete in an adversarial setting (Campbell, Ferreira, and Isenor Citation2023). A GAN can produce a large volume of generated data with minimal effort. Generated data can be used to improve data quality, enhance scalability, help remove bias, up-sample rare events to improve diversity, and address privacy issues associated with granting access to data (Campbell, Ferreira, and Isenor Citation2023).

Since the introduction of GANs, there have been promising applications in the fields of image generation (thispersondoesnotexist.com Accessed: 2024) (Bao et al. Citation2017; Zhang et al. Citation2022); text-text generation (OpenAI:Chatgpt Accessed: Citation2024b) (Chen et al. Citation2020; Li et al. Citation2023); and text-image generation (OpenAI:Dalle Citation2024a) (Liao et al. Citation2022; Tao et al. Citation2022), which have fueled research toward developing and evolving the architectures and applications of GANs. As a result, many adaptations of the original GAN architecture have been developed, including Deep Convolutional GAN (Radford, Metz, and Chintala Citation2016), CGAN (Mirza and Osindero Citation2014), Wasserstein GAN (Arjovsky, Chintala, and Bottou Citation2017), and InfoGAN (Chen et al. Citation2016).

GANs have also been used for various tasks related to the generation and imputation of missing values in a wide variety of data sets. This includes application to medical records and historical air quality data (Yonghong et al. Citation2018), as well as traffic flow data, basketball player trajectories, and billiard ball trajectories (Lui, et al. Citation2019). GANs have been used to model and generate classical music (Mogren Citation2016), real-value medical data (Esteban, Hyland, and Rätsch Citation2017), and stock prices and energy data (Yoon, Jarrett, and van der Schaar Citation2019).

Considering spatio-temporal data, a review by Gao et al. (Nan et al. Citation2022) indicates the considerable depth of spatio-temporal GAN applications, including in the areas of time series, trajectories, spatio-temporal events, and spatio-temporal graphs. In topics related to trajectories of objects, GANs have been used to model and predict pedestrian trajectories and interactions (Gupta et al. Citation2018; Kosaraju et al. Citation2019; Liu et al. Citation2020; Sadeghian et al. Citation2019), taxi hot-spots (Yu et al. Citation2020), and crime related events for security and protection (Jin et al. Citation2019).

Specific to the movement of maritime vessels, GANs have been used to generate several minutes of missing positions in a vessel trajectory (Zhang et al. Citation2023). As well, the handling behavior of a vessel has been predicted using a GAN (Gao and Shi Citation2020) for the purpose of collision avoidance. In the current literature, the research focuses on augmenting the track of an existing vessel. To the best of our knowledge, generating vessel trajectories for non-existent ships using GANs is still an open area to be investigated and is explored in this piece of work.

In the application of constraints within a GAN, work has been done on pedestrian trajectory prediction while considering stationary and mobile obstacles (Sadeghian et al. Citation2019). For vessels, trajectory prediction within a channel has been considered using a repulsive potential field method (Lu et al. Citation2023). Specific to Liang et al. (Liang et al. Citation2024) is the introduction of spatial and temporal correction, which is relevant to the vessel positional data used in this study. Although spatial correlations have not been incorporated in the current work, such correlations do exist in vessel data. These correlations result from report-to-report relationships for a single vessel and vessel-to-vessel relationships in well-defined vessel traffic routes (McArthur and Isenor Citation2021).

Motivation and Contributions

The ultimate goal of this work is the generation of realistic vessel tracks for the Canadian Arctic that considers the constraints of the intricate coastline and ice conditions. The vessel tracks will be used for algorithm testing related to defense and security monitoring. This paper represents a step toward this goal, with the establishment of an operating GAN that generates realistic daily positions of vessels without accounting for environmental, vessel-based, and regulatory constraints.

The contributions of this work include:

the application of a GAN to geospatial vessel movement data in an area of complex shoreline;
investigation of the realism of generated vessel positions using sparse input data, such as would exist in the Arctic;
the generation of representative features for vessel displacement tracks for two different vessel types;
a comprehensive evaluation of the model performance using loss curves, column shapes, correlation measures, boundary adherence, and new row synthesis;
the evaluation of the proposed model using machine learning efficacy tests that show that the use of generated data is promising for the development of vessel models deployed in a real-world setting; and
a demonstration of the robustness of the model by generating data for different times of the year.

The paper is structured as follows. Section 2 describes how GANs and CGANs are trained to generate synthetic data. Section 3 discusses the data set and model architectures used to perform the generation of vessel displacement tracks. Section 4 covers the results of the generation process and discusses the outcome and evaluation of all the generated data. Lastly, Section 5 summarizes the conclusions and discusses the future work.

Conditional Generative Adversarial Networks

GANs were first introduced as a form of generative modeling used to create synthetic data samples that resemble or mimic real data. The CGAN is an adaptation of the original GAN framework that incorporates conditional information into the training process. This section will discuss the architecture and training process associated with the original GAN framework and then cover how the condition information is incorporated to create the CGAN.

Original GAN Framework

The original GAN framework consists of two neural networks that compete with one another in an adversarial setting with opposing objectives. The networks compete in a zero-sum game where the loss of one network is the other’s gain and vice versa (Saxena and Cao Citation2021). The two networks that make up the architecture of the GAN are the $G$ and $D$ . The goal of each network is as follows:

$D$ : the $D$ receives both real and synthetic data as input and estimates the probability that a data instance came from the real data class. The goal of the $D$ is to correctly classify a data point as real or generated.
$G$ : the $G$ creates synthetic data from random noise that captures and mimics the distribution of the real data. The goal of the $G$ is to create data that fools the $D$ into believing it is real.

Figure 1 illustrates the original GAN architecture (solid lines only). The training process starts with random noise being fed into the $G$, where synthetic data samples are created. These data are then fed into the $D$ alongside real data, where the $D$ must classify each instance as real or generated. The $D$ outputs a scalar value that represents the probability that the data point is from the real data distribution. The probabilities for the real and generated data are combined into a value function for the GAN, given below:

$$E_{X \sim p_{data}} [\log (D (X))] + E_{Z \sim p_{Z}} [\log (1 - D (G (Z)))] \tag{1}$$

Figure 1. The depiction of the GAN architecture is represented by the solid lines. When the purple and green dotted lines are taken into consideration the diagram represents the structure of a CGAN.

This can be generalized into more common probability theory notation as:

$$E_{X} [\log_{2} (D_{\theta} (X))] + E_{Z} [\log_{2} (1 - D_{\theta} (G (Z)))] \tag{2}$$

where $E_{X}$ and $E_{Z}$ are the expected values over the real samples $(X)$ and the random noise samples $(Z)$, respectively, $D_{\theta} (X)$ is the $D$’s probability that a real instance is real, $G (Z)$ is the $G$ output for a generated sample created from random noise, $D_{\theta} (G (Z))$ is the $D$’s probability that a generated instance is real, $\theta$ are the parameters that define the $D$, and $1 - D_{\theta} (G (Z))$ is the probability that a generated instance is generated.
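
As a rough illustration of Equation (2), the value function can be estimated from a batch of discriminator outputs by replacing each expectation with a batch mean. The sketch below uses NumPy; the function name `gan_value` and the example probabilities are hypothetical and are not taken from the study.

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Batch estimate of the value function in Equation (2).

    d_real: D's probabilities that real samples are real, D(X).
    d_fake: D's probabilities that generated samples are real, D(G(Z)).
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # Expectations are approximated by batch means.
    return np.mean(np.log2(d_real)) + np.mean(np.log2(1.0 - d_fake))

# A D that cannot tell real from generated outputs 0.5 everywhere:
# log2(0.5) + log2(0.5) = -2.
undecided = gan_value([0.5, 0.5], [0.5, 0.5])

# A confident and correct D pushes the value toward its maximum of 0.
confident = gan_value([0.99, 0.98], [0.01, 0.02])
```

The $D$ benefits from moving the value toward 0, while the $G$ benefits from pulling it down by raising $D (G (Z))$.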

The value function above is then used to define an objective function. The objective function is based on a game theory concept that involves the “simultaneous” optimization of both the $G$ and $D$ networks (Campbell, Ferreira, and Isenor Citation2023). As a result, each network competes to maximize its probability of “winning” this adversarial game. The goal of the $G$ is to minimize the value function as it wants to reduce the number of generated data points being assigned a low probability by the $D$ . In contrast, the $D$ ’s goal is to maximize the number of generated data points being assigned a low probability, and maximize the number of real data points being assigned a high probability. This will occur when the value function is maximized.

To train both networks, independent backpropagation procedures are performed (Nielsen Citation2015). Backpropagation depends on the loss calculation which is computed using binary cross entropy in this study. The loss compares the actual outcome to the predicted outcome to assess the error of the model output, which is then used to adjust the values of the model parameters. This training process continues until a balance is reached where the $G$ creates realistic data and the $D$ cannot easily identify the synthetic data from the real data. The training process aims to find the Nash equilibrium (Hazra and Anjaria Citation2022), where the $G$ and $D$ reach a state where neither can improve significantly without the other changing.
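
The loss calculation can be sketched in NumPy as follows. This is an illustrative sketch only: the natural logarithm is used because that is what Keras applies, the function name and the example probabilities are hypothetical, and the $G$ loss shown (generated rows scored against the label "real") is a commonly used formulation that is assumed here rather than confirmed by the text.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy using the natural logarithm."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

d_real = np.array([0.9, 0.8])  # D(X): hypothetical outputs on real rows
d_fake = np.array([0.3, 0.1])  # D(G(Z)): hypothetical outputs on generated rows

# D loss: real rows are labelled 1 and generated rows are labelled 0.
d_loss = binary_cross_entropy(np.ones(2), d_real) + binary_cross_entropy(np.zeros(2), d_fake)

# Assumed G loss: G is rewarded when D labels generated rows as real (1).
g_loss = binary_cross_entropy(np.ones(2), d_fake)
```

In this example the $D$ is winning, so its loss is small while the $G$ loss is large; near equilibrium both losses settle and fluctuate around stable values.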

Conditional GAN

The conditional GAN extends the capability of the original GAN framework by incorporating conditional information into the training process for both the $G$ and $D$. This extra information can be a label or data from other modalities, and it is fed into the $G$ and $D$ as part of the input (Mirza and Osindero Citation2014), as illustrated by the purple and green dotted lines in Figure 1. In the case of the condition being a class label, it informs the $G$ which class the synthetic data should mimic or resemble. For the $D$, the particular class preconditions the $D$ before it assesses whether the data point is real or synthetic.

In a CGAN the value function used to evaluate the samples is now defined as (Mirza and Osindero Citation2014):

$$E_{X} [\log_{2} (D_{\theta} (X | Y))] + E_{Z} [\log_{2} (1 - D_{\theta} (G (Z | Y)))] \tag{3}$$

where $Y$ is the predefined condition. The value function now incorporates both the adversarial aspect of detecting real from generated, as well as the predefined condition. The CGAN is trained in the same manner as the GAN described in Section 2.1.
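
One simple way to supply the condition $Y$ is to append a one-hot class label to the input of each network. The text does not state how the condition is encoded, so the sketch below is an assumption for illustration; the noise dimension, the label names, and the placeholder data rows are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VESSEL_TYPES = ["cargo", "pleasure_craft"]  # the two classes in the study
NOISE_DIM = 8     # hypothetical size of the noise vector Z
N_FEATURES = 5    # hypothetical number of features per data row

def one_hot(labels):
    """Encode vessel-type names as one-hot rows."""
    idx = np.array([VESSEL_TYPES.index(name) for name in labels])
    return np.eye(len(VESSEL_TYPES))[idx]

labels = ["cargo", "pleasure_craft", "cargo"]
y = one_hot(labels)

# G input: random noise Z joined with the condition Y.
z = rng.normal(size=(len(labels), NOISE_DIM))
g_input = np.concatenate([z, y], axis=1)

# D input: a data row (real or generated) joined with the same condition Y.
x = rng.normal(size=(len(labels), N_FEATURES))  # placeholder rows
d_input = np.concatenate([x, y], axis=1)
```

Because the label travels with every sample, a single trained $G$ can be asked for either vessel type by changing only `y`.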

The benefit to using a conditional model is that this allows control of the modes or classes that the $G$ creates. In the original GAN framework, the only method of incorporating a condition on the generated data is through the separate execution of the model for each vessel type. For this work, the choice of implementing a CGAN allows for an input condition, such as the vessel type, that limits the generated data to be applicable to that specific category of vessel. Therefore, the CGAN offers the flexibility of generating many vessel types or modalities within a single model.

The Experiment

This work investigates the use of CGANs for generating data that is representative of vessel displacement tracks in a constraint-free environment; i.e., the data generated are not required to adhere to any ship movement, environmental, or regulatory constraints.

The choice to use a GAN based architecture was made due to the model’s ability to generate diverse instances when very little data is available. Furthermore, GAN based architectures have recently produced convincing results in other data generation applications (thispersondoesnotexist.com Accessed: Citation2024; OpenAI:Chatgpt Accessed: Citation2024b; OpenAI:Dalle Citation2024a). For this work, the CGAN was chosen so that different types of ship data could be generated using a single architecture.

The Data

The Automatic Identification System (AIS)Footnote³ data set used in this experiment was obtained from MarineCadastre.gov,Footnote⁴ which is an open and available source that provides ocean information, tools, and vessel traffic data. These AIS data are collected by the United States Coast Guard in both US and international waters. The data sets used for this work cover the months of May, July, and September of 2022 and are available at (NOAA Citation2024). The data files contain the following features: day, time, a vessel identifier called the Maritime Mobile Service Identity (MMSI), vessel latitude position, vessel longitude position, the vessel’s Speed Over Ground (SOG), the vessel’s Course Over Ground (COG), and vessel type. The month of July was empirically chosen as the data set used to train and select the CGAN architecture presented in this work. The remaining May and September data sets are used to test the robustness of the selected architecture by generating data for months where seasonal changes occur.

The cleaning process involved the removal of records that contained invalid entries, such as SOG $<$ 0 knots, COG $<$ 0° T (degrees true), COG $\geq$ 360° T, $|$ latitude $|$ $>$ 90°, $|$ longitude $|$ $>$ 180°. Duplicate records that have the same entries for MMSI, day, and time were also removed. In addition, records with missing values were removed as these pose an issue in the development of the machine learning models. Lastly, records with SOG = 0 knots had the COG set to −1, since a stationary vessel does not have a valid COG.
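
The cleaning rules above could be expressed with pandas roughly as follows. This is a sketch under stated assumptions: the column names (`mmsi`, `day`, `time`, `lat`, `lon`, `sog`, `cog`) and the sample records are hypothetical and do not reflect the actual file layout.

```python
import pandas as pd

def clean_ais(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning rules described above to raw AIS records."""
    df = df.dropna()  # remove records with missing values
    valid = (
        (df["sog"] >= 0)
        & (df["cog"] >= 0) & (df["cog"] < 360)
        & (df["lat"].abs() <= 90)
        & (df["lon"].abs() <= 180)
    )
    df = df[valid]
    # Keep one record per vessel and timestamp.
    df = df.drop_duplicates(subset=["mmsi", "day", "time"]).copy()
    # A stationary vessel does not have a valid course.
    df.loc[df["sog"] == 0, "cog"] = -1
    return df

raw = pd.DataFrame({
    "mmsi": [1, 1, 2, 3, 4],
    "day":  [1, 1, 1, 1, 1],
    "time": ["00:00", "00:00", "00:05", "00:10", "00:15"],
    "lat":  [48.3, 48.3, 48.4, 95.0, 48.2],
    "lon":  [-123.5, -123.5, -123.6, -123.0, -124.0],
    "sog":  [5.0, 5.0, 0.0, 3.0, 2.0],
    "cog":  [90.0, 90.0, 45.0, 10.0, 360.0],
})
cleaned = clean_ais(raw)
```

The range checks run before the stationary-vessel rule so that the −1 placeholder is not itself rejected as an invalid COG.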

The final data set used to train the models consisted of features that were aggregated representations of the original data features. This aggregation was based on day and MMSI. The features are as follows:

avg_sog: average speed over ground.
start_lat: the first latitude point of the day.
end_lat: the last latitude point of the day.
start_lon: the first longitude point of the day.
end_lon: the last longitude point of the day.
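
The aggregation by day and MMSI can be sketched with a pandas group-by. The output feature names are those listed above; the input column names and the sample records are hypothetical.

```python
import pandas as pd

def aggregate_tracks(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse AIS records into one displacement-track row per vessel per day."""
    # Sort by time so that "first" and "last" refer to the start and end of the day.
    df = df.sort_values(["mmsi", "day", "time"])
    return (
        df.groupby(["day", "mmsi"])
        .agg(
            avg_sog=("sog", "mean"),
            start_lat=("lat", "first"),
            end_lat=("lat", "last"),
            start_lon=("lon", "first"),
            end_lon=("lon", "last"),
        )
        .reset_index()
    )

records = pd.DataFrame({
    "mmsi": [7, 7, 7],
    "day":  [1, 1, 1],
    "time": ["12:00", "00:00", "23:00"],  # deliberately out of order
    "lat":  [48.3, 48.2, 48.5],
    "lon":  [-123.6, -123.5, -123.9],
    "sog":  [4.0, 2.0, 6.0],
})
tracks = aggregate_tracks(records)
```

Each output row is one vessel-day, which is why the number of aggregated entries can exceed the number of distinct vessels.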

These particular features were chosen as they represent relevant characteristics used to describe a vessel track. They were selected as a way to simplify the modeling process by removing the temporal component associated with a full vessel track. This data set was filtered to obtain only pleasure crafts and cargo vessels in and near the Juan de Fuca Strait, which is located on the western coast of Canada.

The final aggregated data sets had:

July: 3763 aggregated entries for pleasure crafts and 1374 aggregated entries for cargo vessels;
May: 2332 aggregated entries for pleasure crafts and 1432 aggregated entries for cargo vessels; and
September: 3480 aggregated entries for pleasure crafts and 1664 aggregated entries for cargo vessels.

The total numbers of vessels in each class were as follows:

July: 2258 pleasure crafts and 824 cargo ships.
May: 1399 pleasure crafts and 859 cargo ships.
September: 2088 pleasure crafts and 998 cargo ships.

Note, all input data were standardized before the training process.

The distribution for each feature was inspected prior to model development. This showed that many of the feature distributions were highly skewedFootnote⁵ which can pose a problem for artificial neural networks. To address this, a power transform (Scikit-learn developers Accessed: Citation2024) was applied to the data to make the data features more Gaussian-like and reduce the skewness.
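
A minimal sketch of this preprocessing step with scikit-learn’s `PowerTransformer` is shown below. The text does not state which power transform method was used; Yeo-Johnson (the scikit-learn default) is assumed here, and the skewed sample is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Hypothetical right-skewed feature standing in for a real data column.
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

def skewness(values):
    """Sample skewness (third standardized moment)."""
    centered = values - values.mean()
    return float((centered ** 3).mean() / values.std() ** 3)

# Assumed method: Yeo-Johnson. standardize=True also rescales the output
# to zero mean and unit variance.
transformer = PowerTransformer(method="yeo-johnson", standardize=True)
transformed = transformer.fit_transform(skewed)

# The fitted transform can be inverted to map rows back to the original units.
restored = transformer.inverse_transform(transformed)
```

Keeping the fitted transformer is what allows generated rows, which live in the transformed space, to be converted back to latitudes, longitudes, and speeds.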

CGAN Architecture

For this work, the CGAN $D$ and $G$ were fully connected feed-forward neural networks. An in-depth description of feed-forward neural networks, the components of the architecture, and the hyperparameters for this type of model is given in SR1. The following architecture was selected to create the (Campbell Citation2021) model:

• $D$ first hidden layer: 38

• $D$ second hidden layer: 114

• $D$ third hidden layer: 128

• $D$ batch normalization: True

• $D$ activation function: Leaky ReLU (threshold of 0.2)

• $D$ learning rate: 0.001

• $G$ first hidden layer: 28

• $G$ second hidden layer: 140

• $G$ third hidden layer: 26

• $G$ batch normalization: False

• $G$ activation function: ReLU

• $G$ learning rate: 0.0006

• $G$ optimizer function: AdamFootnote⁶

This architecture was trained using a batch sizeFootnote⁷ of 16 with the number of training epochsFootnote⁸ set to 128. Through the training process, the loss is monitored with respect to the number of epochs to evaluate the stability and convergence of the model. Note, the parameters and hyperparameters listed above were selected using a Bayesian optimization search.
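
To make the layer widths concrete, the sketch below pushes one batch through untrained fully connected networks of the listed sizes using NumPy. Several details are assumptions for illustration only: the noise dimension, the one-hot conditioning, a linear $G$ output, a sigmoid $D$ output, and the reading of the Leaky ReLU value 0.2 as the negative slope. Batch normalization and training are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

G_HIDDEN = [28, 140, 26]     # G hidden-layer widths listed above
D_HIDDEN = [38, 114, 128]    # D hidden-layer widths listed above
NOISE_DIM, N_CLASSES, N_FEATURES = 8, 2, 5  # NOISE_DIM is an assumption

def init_layers(sizes):
    """Random (untrained) weights and zero biases for consecutive layer sizes."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, hidden_act, out_act):
    """Pass a batch through a stack of fully connected layers."""
    for w, b in layers[:-1]:
        x = hidden_act(x @ w + b)
    w, b = layers[-1]
    return out_act(x @ w + b)

def relu(v):
    return np.maximum(v, 0.0)

def leaky_relu(v):
    return np.where(v > 0, v, 0.2 * v)  # 0.2 assumed to be the negative slope

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def identity(v):
    return v

g_layers = init_layers([NOISE_DIM + N_CLASSES] + G_HIDDEN + [N_FEATURES])
d_layers = init_layers([N_FEATURES + N_CLASSES] + D_HIDDEN + [1])

batch = 16  # batch size used in the study
y = np.eye(N_CLASSES)[rng.integers(0, N_CLASSES, size=batch)]
z = rng.normal(size=(batch, NOISE_DIM))

fake_rows = forward(np.concatenate([z, y], axis=1), g_layers, relu, identity)
p_real = forward(np.concatenate([fake_rows, y], axis=1), d_layers, leaky_relu, sigmoid)
```

Each generated row has five values, one per aggregated feature, and the $D$ returns one probability per row.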

Results and Discussion

This section analyzes the performance of the CGAN architecture developed using the July 2022 data set. The loss curves for both the $G$ and $D$ are assessed, and the column shapes, correlations, boundary adherence, and synthesis of the generated data are examined. Visualizations compare the real and synthetic data for both vessel types, a machine learning efficacy test assesses the usability of the synthetic data, and the generated and real displacement tracks are plotted for comparison. Lastly, the CGAN model is applied to two additional data sets to test the model’s robustness in generating data from different months of that year when seasonal changes occur (spring, summer, and fall).

Loss

The loss function evaluates how well the model predicts the “ground truth” result. Monitoring this loss throughout the training process can provide insight on the stability and progression of the model. The goal of the combined $D$ and $G$ is to find a balance or equilibrium from this adversarial training process. GANs in general are difficult to train due to stability issues, resulting in convergence issues. It is important to note that small fluctuations in the loss value are expected between training batches.

The loss curves for both the $D$ and $G$ are shown in Figure 2. This plot shows the change in loss with respect to the number of epochs for the $D$ and $G$ architectures described in Section 3.2. The loss curves for both networks appear stable at 150 training epochs while still exhibiting small fluctuations as expected in the training.

Figure 2. Loss curves for the $D$ and $G$ of the CGAN. The values indicate the $D$’s classification on both real and generated data for the July data set. Note, the Keras package in Python was used for the training process and calculation of the binary cross entropy loss. This package uses $\log_{e}$ instead of $\log_{2}$.

The $D$’s classification performance on the real and generated data can be examined by interpreting the confusion matrix of the final classifications made by the $D$ (Table 1). This table shows that the true negatives are higher than the false positives, which implies that the $D$ can detect the generated data more easily than it can detect the real data. The true positives and false negatives are more evenly split, implying that the $D$ has more difficulty classifying the real data as real.

Table 1. Confusion matrix outputs for the CGAN using the July data set.


Metrics: Column Shapes, Correlations, Boundary Adherence, and Synthesis for the July Data Set

To assess the generated data, the SDMetrics package in Python (SDMetrics Citation2024e) was used to evaluate the column shapes, correlations, boundary adherence, and synthesis of the data with respect to the real data distribution.

Column Shapes

The column shape of each feature in the generated data set is analyzed using the Kolmogorov-Smirnov (KS) statistic. This statistic converts each numerical distribution into its cumulative distribution function. From here, the maximum difference between the two cumulative distribution functions is determined (SDMetrics Citation2024c). This metric inverts the statistic and returns $1 - \text{KS statistic}$.Footnote⁹ Therefore, the higher the score, the more similar the real and generated data are to one another.
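
The calculation described above can be sketched directly with NumPy. This is an illustrative re-implementation of the idea, not the SDMetrics code; the function name and the sample values are hypothetical.

```python
import numpy as np

def ks_complement(real, generated):
    """Return 1 - KS statistic for two one-dimensional samples."""
    real = np.sort(np.asarray(real, dtype=float))
    generated = np.sort(np.asarray(generated, dtype=float))
    # Evaluate both empirical cumulative distribution functions
    # at every observed value.
    grid = np.concatenate([real, generated])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_gen = np.searchsorted(generated, grid, side="right") / generated.size
    ks_statistic = np.max(np.abs(cdf_real - cdf_gen))
    return 1.0 - ks_statistic

same = ks_complement([1, 2, 3, 4], [1, 2, 3, 4])     # identical samples
shifted = ks_complement([1, 2, 3, 4], [3, 4, 5, 6])  # partial overlap
disjoint = ks_complement([1, 2, 3], [10, 11, 12])    # no overlap
```

Identical samples score 1, fully separated samples score 0, and partially overlapping samples fall in between.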

Table 2 reports the column shape metrics of each feature for both vessel types along with the overall average. This table shows that all features report moderate to high column shape values that are $>$ 0.70. The overall averages of the column shapes for each vessel type are high, with values $\geq$ 0.86.

Table 2. Column Shape Scores for the generated pleasure craft and cargo vessel data using the July data set.


Figures 3 and 4 help illustrate what the column shape metric is reporting. To visually compare the real and generated data, scatter plots of the features and the individual feature distributions are presented in these figures. The data sets are color coded, where orange represents the real data and blue the generated data. The plots along the diagonal of the figures are the data distributions for each individual feature, highlighting the similarities and differences between the real and generated distributions. The distribution plots of the generated data features model the real data features well, as confirmed by the moderate to high column shape values ($> 0.70$).

Figure 3. Feature plots and distributions for the pleasure craft vessel type using the July data set. The axes are based on the standardized values for the corresponding feature.

Figure 4. Feature plots and distributions for cargo vessel type using the July data set. The axes are based on the standardized values for the corresponding feature.

Figures 3 and 4 show that the real data features are multimodal in nature. It is clear that the generated data distributions learned by the $G$ do not model the different modes perfectly, but do capture the larger scale shapes of the real distributions. The $G$ seems to struggle with modeling the features that have more distinct modes closer together (e.g., start_lat and end_lat). However, the $G$ appears to better model multimodal features with two modes when one of the modes does not have a distinct maximum (e.g., start_lon and avg_sog). Nevertheless, the individual modes may be better modeled by developing a network with deeper layers and more nodes that can capture the more complex distributions.

The scatter plots were included as a way to illustrate the density of the data from a different perspective and provide insight to the feature distribution plots. The real data distributions for both vessel types still exhibit a skewness after applying the power transform. It is likely that the cargo vessel type column shape results for the latitudes and longitudes were adversely impacted by the generator’s inability to generate data that mimicked the long right tails seen in the real data distributions. When the scatter plot is taken into consideration for these features, it shows that there are very few points that fall far outside of the distribution. As a result, these points could be considered as outliers that are affecting the overall fit of the distribution.

It is important to note that a design choice of this study was to use a simple architecture for the $G$ and $D$ as the data set contained a small set of features. Given the results of the column shape scores shown in Table 2 and the feature distributions in Figures 3 and 4, it is possible that using a deeper and more complex architecture may better capture the nuances within the distributions.

Pairwise Correlations

Column pair trends compare the correlation of each pair of features in the real data with the correlation of the same pair in the generated data and report the overall average (SDMetrics Citation2024b). The output ranges from 0 to 1, where a higher value represents a higher similarity between the real and generated data. The column pair trends for the vessel types using the July data set were

• cargo vessels: 0.93

• pleasure craft vessels: 0.94

The column pair trends for both vessel types are high as both are $>$ 0.90. This shows that generated and real data features within the data set for both vessel types are similar.
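
A hedged sketch of a pairwise correlation score of this kind is given below. It assumes the score for a feature pair is one minus half the absolute difference between the real and generated Pearson correlations, which is our understanding of the SDMetrics correlation similarity metric; the function name and the synthetic tables are hypothetical.

```python
import numpy as np

def correlation_similarity(real, generated):
    """Average pairwise correlation similarity between two tables.

    real, generated: 2-D arrays with one column per feature.
    """
    corr_real = np.corrcoef(real, rowvar=False)
    corr_gen = np.corrcoef(generated, rowvar=False)
    # Use each feature pair once (upper triangle, diagonal excluded).
    rows, cols = np.triu_indices(corr_real.shape[0], k=1)
    # Correlations lie in [-1, 1], so halving the gap keeps scores in [0, 1].
    scores = 1.0 - np.abs(corr_real[rows, cols] - corr_gen[rows, cols]) / 2.0
    return float(scores.mean())

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 3))
real[:, 1] += real[:, 0]  # make two features correlated

# Shuffling each column independently keeps the column shapes
# but destroys the relationships between features.
shuffled = np.column_stack([rng.permutation(col) for col in real.T])

score_same = correlation_similarity(real, real)
score_shuffled = correlation_similarity(real, shuffled)
```

The shuffled table illustrates why this metric complements column shapes: every marginal distribution is unchanged, yet the pairwise score drops.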

Figures 5 and 6 each show two visualizations. The left side of each figure shows the correlation plot of the features from the real data set, while the right side shows the correlation plot of the features from the generated data. Comparing the correlation plots of the real and generated data for a specific vessel type can show how similar the correlation relationships are between both data sets.

Figure 5. Correlation plots for real data (left) and generated data (right) for the cargo vessel type using the July data set. The scale indicates the strength of the relationship between the two features, where 1 is perfectly correlated.

Figure 6. Correlation plots for real data (left) and generated data (right) for the pleasure craft vessel type using the July data set. The scale indicates the strength of the relationship between the two features, where 1 is perfectly correlated.

The correlation plots of the real and generated data appear visually similar for both vessel types. This implies that the correlations between the features in the generated data closely mimic those of the real data. However, it is noted that for the cargo vessel type the real and generated correlation plots have greater visual intensity variations as compared to the pleasure crafts, showing that the correlations for the cargo vessel type are not modeled as well as those for the pleasure crafts. Such comparisons are important when training certain types of ML models as they use the relationships within the feature space to make predictions and classifications.

Boundary Adherence

Boundary adherence measures whether the generated data for each feature respects the minimum and maximum values of the real data (SDMetrics Citation2024a). The boundary metric returns the proportion of generated rows that fall within the real data boundaries. The average feature boundary adherence values for the July data set were as follows:

• Cargo Vessels: 0.98

• Pleasure Crafts: 0.99

This shows that the generated data respects the boundaries of the real data.
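
The boundary check can be sketched in a few lines of NumPy. This is an illustrative re-implementation of the idea rather than the SDMetrics code, and the small arrays are hypothetical.

```python
import numpy as np

def boundary_adherence(real, generated):
    """Fraction of generated values inside the real [min, max] range, per column."""
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    low, high = real.min(axis=0), real.max(axis=0)
    inside = (generated >= low) & (generated <= high)
    return inside.mean(axis=0)

real = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
generated = np.array([[0.5, 15.0], [1.5, 35.0], [2.5, 25.0], [1.0, 12.0]])

per_feature = boundary_adherence(real, generated)  # one score per column
average = per_feature.mean()                       # averaged over features
```

In this toy example one of four generated values falls outside the real range in each column, so both per-feature scores and their average are 0.75.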

Synthesis

New row synthesis reports the number of new rows generated and if there were any copies of the real data (SDMetrics Citation2024d). For the generated July data set, both vessel types reported that all of the synthetic data points were new and there were no copies of the real data.
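
A simple version of this check is sketched below. It treats a generated row as a copy when every feature is within an absolute tolerance of some real row; the tolerance rule, the function name, and the sample arrays are assumptions for illustration and may differ from the matching rule SDMetrics applies.

```python
import numpy as np

def new_row_fraction(real, generated, tol=0.0):
    """Fraction of generated rows that do not match any real row.

    Two rows match when every feature differs by at most `tol`
    (a hypothetical absolute tolerance).
    """
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    # Compare every generated row with every real row via broadcasting.
    diffs = np.abs(generated[:, None, :] - real[None, :, :])
    is_copy = (diffs <= tol).all(axis=2).any(axis=1)
    return 1.0 - is_copy.mean()

real = np.array([[1.0, 2.0], [3.0, 4.0]])
generated = np.array([[1.0, 2.0], [1.5, 2.5], [3.0, 4.1], [9.0, 9.0]])

exact = new_row_fraction(real, generated)            # only the first row is a copy
loose = new_row_fraction(real, generated, tol=0.2)   # the third row now also matches
```

A score of 1 corresponds to the result reported above, where no generated row duplicated a real one.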

Machine Learning Efficacy Evaluation

Machine learning efficacy is a common approach used to evaluate the performance of synthetic data (Brenninkmeijer Citation2019). There are a number of ways to perform an efficacy test. The methods differ in how the training and testing data sets are selected. For this study, the machine learning efficacy test was conducted by training classification models on the generated data and then testing each model on real data. The models are trained to classify the data by vessel type: pleasure crafts and cargo ships. The three machine learning methods used were decision tree, random forest, and multi-layer perceptron (MLP). The result of this efficacy test shows how well the method can classify vessel types from the real data based on the model that was developed using the generated data from the CGAN.
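The train-on-generated, test-on-real protocol can be sketched as follows. To keep the example self-contained, a simple nearest-centroid classifier stands in for the decision tree, random forest, and MLP used in the study. The two features (displacement in km and mean speed in knots) and all values are hypothetical.

```python
def fit_centroids(rows, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        accumulator = sums.setdefault(label, [0.0] * len(row))
        for index, value in enumerate(row):
            accumulator[index] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [total / counts[label] for total in accumulator]
        for label, accumulator in sums.items()
    }


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Train on generated rows only (hypothetical values).
generated_rows = [(310.0, 12.0), (280.0, 11.0), (15.0, 5.0), (22.0, 6.5)]
generated_labels = ["cargo", "cargo", "pleasure", "pleasure"]

# Test on real rows only (hypothetical values).
real_rows = [(295.0, 12.5), (18.0, 4.0), (30.0, 7.0)]
real_labels = ["cargo", "pleasure", "pleasure"]

model = fit_centroids(generated_rows, generated_labels)
predictions = [predict(model, row) for row in real_rows]
accuracy = sum(p == t for p, t in zip(predictions, real_labels)) / len(real_labels)
```

The key design point is that no real row is seen during fitting, so the test score reflects only what the classifier learned from the generated data.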

As a baseline, the models were first trained and tested on the real data in order to see how well this classification task can be performed. The results for the July data set are shown in Table 3, and the calculated F1 scores¹⁰ are used to measure the success of the model. All three classification models performed with F1 scores $\geq 0.84$ with an overall performance average of 0.89. The overall average of the F1 scores for the individual vessel types were 0.83 and 0.89 for cargo ships and pleasure crafts, respectively. These results show that the classification models had more success at classifying pleasure crafts than cargo ships.
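For reference, the per-class F1 score is the harmonic mean of precision and recall. The sketch below computes it from lists of true and predicted labels. The labels are toy values chosen for illustration and are not results from the study.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 score for one class, treating `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): one cargo ship is misclassified as a pleasure craft.
true_labels = ["cargo", "cargo", "cargo", "pleasure", "pleasure", "pleasure"]
predicted = ["cargo", "cargo", "pleasure", "pleasure", "pleasure", "pleasure"]

f1_cargo = f1_score(true_labels, predicted, "cargo")
f1_pleasure = f1_score(true_labels, predicted, "pleasure")
```

In this toy case the cargo class has perfect precision but lower recall, while the pleasure craft class has perfect recall but lower precision, which is why the two per-class scores differ.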

Table 3. Results from machine learning efficacy test using real data for training and testing. This test was performed using the July data set.

Table 4 displays the classification test performance using the generated data for the training process and the real data points for the testing of the models. It is important to note that the classification model architectures for the results in Tables 3 and 4 are identical. All models trained using the generated data were able to classify the real data with an F1 score $\geq$ 0.77 with an overall average of 0.80.

Table 4. Results from machine learning efficacy test using generated data to train the model and real data for testing. This test was performed using the July data set.

When examining the classification performance of each vessel type for the models trained on the real and generated data, the overall averaged F1 scores show a 12% and 7% difference in performance for cargo ships and pleasure crafts, respectively. However, the results from the classification models trained with generated data follow the same pattern as the models trained on real data, where the success of classifying the pleasure crafts is higher than for cargo ships.

Overall, the results from this efficacy test show promise toward the use of generated data for developing models of vessel tracking characteristics that are deployed in a real-world setting. In order to improve the classification results of models trained with generated data, a more complex CGAN architecture could be explored to improve the modeling of the real data.

Comparison of Real and Generated Displacement Tracks

To visually compare the real and generated data for both vessel types, the starting and ending coordinates were used to create displacement tracks. These displacement tracks are illustrated in Figures 7 and 8 for the pleasure crafts and cargo vessels, respectively. Note that outlier tracks were removed so that a closer view of the generated displacement tracks could be displayed.
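Because a displacement track is defined only by its start and end coordinates, its length can be estimated with the haversine great-circle formula. The sketch below is illustrative only: the study does not state that it computed track lengths this way, and the function name and the mean Earth radius of 6371 km are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def displacement_km(start_lat, start_lon, end_lat, end_lon):
    """Great-circle distance in km between a track's start and end points."""
    phi_start = math.radians(start_lat)
    phi_end = math.radians(end_lat)
    delta_phi = phi_end - phi_start
    delta_lambda = math.radians(end_lon - start_lon)
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi_start) * math.cos(phi_end) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator is roughly 111.19 km.
one_degree = displacement_km(0.0, 0.0, 0.0, 1.0)
```

A length threshold derived from such distances is one possible way to flag outlier tracks, although the criterion used to remove outliers in the study is not specified here.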

Figure 7. Plots illustrating the displacement tracks created by the starting and ending coordinates for the pleasure crafts. The left plot and right plot show the displacement tracks for the real data and generated data, respectively. The star represents the end of the displacement track. These plots represent the displacement tracks generated for the month of July.

Figure 8. Plots illustrating the displacement tracks created by the starting and ending coordinates for the cargo vessels. The left plot and right plot show the displacement tracks for the real data and generated data, respectively. The star represents the end of the displacement track. These plots represent the displacement tracks generated for the month of July.

These figures show that the generated displacement tracks are localized to the same regions as the real displacement tracks. However, it is clear that the generated displacement tracks are generalized within the region and not as tightly bound to the traveled paths in the real data. For example, some of the starting and ending positions of the generated displacement tracks appear on land masses. This is likely a result of the real-world relationship between latitude and longitude being broken in the generated data. The validity of a vessel's latitude and longitude position is dictated by it being over water. A possible approach to rectifying this issue is to incorporate a feature into the model that represents the relationship between the latitude and longitude values, or to incorporate land constraints.

In addition, the start and end points of the real data tend to be concentrated in similar areas. For example, the end points of the displacement tracks, represented by the star symbol, appear to be concentrated near the coastline, likely near locations suitable for docking. As a result, the real data displacement tracks follow more consistent route directions than the generated data. The start and end points of the generated displacement tracks are generalized and are not concentrated in specific regions. This is another feature that could be incorporated into the model so that route directions can be captured within the modeled regions.

Overall, such results are promising and the generalization of generated data provided by the CGAN can help train models and make them more robust.

Testing the CGAN Model’s Robustness to Seasonal Changes

In this section, additional experiments are conducted using the CGAN model defined in Section 3.2 to generate data from different months of the year. The months of May and September in 2022 were chosen as they cover the changes from spring-to-summer and summer-to-fall, with July representing the summer period. Testing the CGAN architecture on data sets that vary in season will test the robustness of the model. Both data sets are fed into the CGAN to generate data, and the results are evaluated using the metrics from Section 4.2 and tested with the vessel classification models reported in Section 4.3. Note that winter months that contain seasonal changes were not used in this study.

Assessing Model’s Generation Performance for May and September Data Sets

Tables 5 and 6 highlight the results of the evaluation metrics used to assess the generated data with respect to the real data for May and September, respectively. These results will be compared to the July metrics in order to assess the model's robustness to seasonal changes.

Table 5. Average of Feature Column Shapes, Correlations, Boundary Adherence, and Synthesis for May 2022 data.
