Track: Modeling and Simulation
Abstract
The Automatic Identification System (AIS) is a vessel tracking system. It provides rich information on vessel particulars in addition to dynamic navigational and voyage details. AIS data have contributed significantly to the digitization of the shipping industry, but they remain prone to measurement and collection errors. Because poor data quality leads to inaccurate analysis and affects decision making, AIS data must be thoroughly preprocessed before any use. In this paper, we present the main quality issues encountered when dealing with AIS data: noise, outliers, duplicates, inconsistent data, and out-of-range values. We also provide examples of these errors and describe how to overcome them. As an application, we address the problem of outlier detection in an unsupervised way using clustering and anomaly detection techniques, which assign an anomaly score to each observation. The case study shows promising results for spatial outlier detection, and the approach can be further explored for other anomaly detection tasks.