ABSTRACT
The goal of this work was to evaluate whether routinely collected but seldom used airborne lidar metadata – ‘point attribute data’ (PAD) – analyzed using machine learning (ML)/artificial intelligence can improve extraction of shallow-water (less than 20 m) bathymetry from lidar point clouds. Extreme gradient boosting (XGB) models relating PAD to an existing bathymetry/not bathymetry (Bathy/NotBathy) classification were fitted and evaluated for four areas near the Florida Keys. The PAD examined include ‘pulse specific’ information such as the return intensity, as well as PAD describing flight path consistency. The R2 values for the XGB models were between 0.34 and 0.74. Global classification accuracies were above 80%, although this reflected a sometimes extreme Bathy/NotBathy imbalance that inflated global accuracy. This imbalance was mitigated by employing a probability decision threshold (PDT) that equalizes the true positive (Bathy) and true negative (NotBathy) rates. It was concluded that 1) the strength of the bathymetric signal in the PAD should be sufficient to increase the accuracy of density-based lidar point cloud bathymetry extraction methods and 2) ML can successfully model the relationship between the PAD and the Bathy/NotBathy classification. A method is also presented to examine the spatial and feature-space distribution of errors, which will facilitate quality assurance and continuous improvement.
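The idea of a probability decision threshold that equalizes the true positive and true negative rates can be illustrated with a minimal sketch. It assumes that a fitted classifier has already produced a predicted Bathy probability for each lidar point. The function name `balanced_threshold`, its arguments, and the simple grid search are hypothetical choices made for illustration; this is not the authors' implementation.

```python
import numpy as np


def balanced_threshold(y_true, p_bathy, n_grid=1001):
    """Search a grid of thresholds for the one where TPR and TNR are closest.

    y_true  : 1 (Bathy) / 0 (NotBathy) reference labels
    p_bathy : predicted probability of Bathy for each point
    n_grid  : number of candidate thresholds in [0, 1] (illustrative default)

    Returns (threshold, tpr, tnr) for the best candidate found.
    """
    y_true = np.asarray(y_true).astype(bool)
    p_bathy = np.asarray(p_bathy, dtype=float)
    if y_true.all() or not y_true.any():
        raise ValueError("Both Bathy and NotBathy points are required.")

    best = None
    for t in np.linspace(0.0, 1.0, n_grid):
        pred = p_bathy >= t                 # classify as Bathy at this threshold
        tpr = np.mean(pred[y_true])         # fraction of Bathy points recovered
        tnr = np.mean(~pred[~y_true])       # fraction of NotBathy points recovered
        gap = abs(tpr - tnr)
        if best is None or gap < best[0]:
            best = (gap, t, tpr, tnr)

    _, threshold, tpr, tnr = best
    return float(threshold), float(tpr), float(tnr)
```

With a strongly imbalanced sample, the default threshold of 0.5 tends to favor the majority class, whereas a threshold chosen this way trades some global accuracy for comparable performance on both classes.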
Disclosure statement
The authors have no potential competing or financial interests in the work presented.
Data and codes availability statement
The codes that support the findings of this study are available at the link https://doi.org/10.6084/m9.figshare.12597419. SBET data in the required format are provided at the figshare link. Though the .las data used are available to the public, the authors are not authorized to make them directly available. A small sample of the data for a single data tile is provided at the figshare link to demonstrate how the codes function. Complete data sets (2016_420500e_2728500n.laz, 2016_426000e_2708000n.laz, 2016_428000e_2719500n.laz, and 2016_430000e_2707500n.laz) can be downloaded from https://coast.noaa.gov/htdata/lidar2_z/geoid12b/data/6246 as compressed .laz files. These can be decompressed using the LASzip tool, which can be downloaded from laszip.org.
Notes
1. Global Navigation Satellite System.
2. Inertial Navigation System.
3. Post-processing Kinematic.
4. McFadden’s pseudo R2 cannot be tested for statistical significance. This statement is based on the authors’ experience with conventional R2 values with large sample sizes – i.e., greater than 500,000.
Additional information
Funding
Notes on contributors
Kim Lowell
Kim Lowell is a Research Data Scientist and holds an MSc and PhD in Forest Biometrics and an MSc in Data Analytics. He has considerable experience in the analysis of geospatial information and imagery to address land management issues while also accounting for uncertainties inherent in those data. His current focus is the application of machine learning and deep learning to improve the accuracy of shallow-water bathymetric charts.
Brian Calder
Brian Calder is a Research Professor at CCOM and holds a PhD in Electrical and Electronic Engineering. He has worked on a number of signal processing problems, including real-time grain size analysis, seismic processing, and wave-field modeling for shallow seismic applications. His current research focus is on statistically robust automated data cleaning approaches and tracing uncertainty in hydrographic data.
Anthony Lyons
Anthony Lyons is a Research Professor who holds a PhD in Oceanography. He conducts research in the field of underwater acoustics and acoustical oceanography. His current areas of interest include high-frequency acoustic propagation and scattering in the ocean environment, acoustic characterization of the seafloor, and quantitative studies using synthetic aperture sonar.