3rd South American International Conference on Industrial Engineering and Operations Management

Feature Selection within Time Series Clustering

Track: Artificial Intelligence
Abstract

In recent decades, the world has experienced a health crisis driven by rising cases of infectious diseases such as COVID-19, dengue, and Zika. Dengue is a neglected tropical disease transmitted by mosquito vectors, mainly Aedes aegypti. This work focuses on Paraguay, where dengue has surpassed 16,000 notifications so far in 2022. The disease occurs throughout the country, which yields a large amount of available data. Time series clustering can uncover the underlying structure in such data, simplifying its analysis and interpretation. This article contrasts two clustering methods (shape-based and feature-based), followed by a feature selection procedure. Both methods are tested first, and the feature-based approach achieves the highest silhouette scores. Subsequently, one feature is removed in each experiment and the results are ranked; eliminating the least important feature yields higher silhouette scores. The results show that adequate feature selection through this ranking procedure produces better clustering.
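The leave-one-feature-out ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four summary features, the plain k-means routine, the choice of k, and the synthetic series are all assumptions made for illustration, and the paper's actual feature set and clustering algorithms may differ. Each feature is removed in turn, the data are re-clustered, and the silhouette scores are ranked; the feature whose removal gives the highest score is treated as the least important.

```python
import math
import random
import statistics

# Hypothetical feature names; the paper's actual feature set is not given here.
FEATURE_NAMES = ["mean", "std", "peak", "net_change"]


def extract_features(series):
    """Summarise one time series as a feature vector (illustrative features)."""
    return [
        statistics.fmean(series),    # average level
        statistics.pstdev(series),   # variability
        max(series),                 # peak value
        series[-1] - series[0],      # net change over the window
    ]


def standardise(rows):
    """Z-score each column so no feature dominates the Euclidean distance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, seed=0, iters=50):
    """Plain Lloyd's k-means with a seeded random initialisation."""
    centers = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append([statistics.fmean(col) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; returns 0.0 if there is only one cluster."""
    clusters = set(labels)
    if len(clusters) < 2:
        return 0.0  # undefined in this case; treated as 0 in this sketch
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q)
                 for j, (q, lab) in enumerate(zip(points, labels))
                 if lab == c and j != i]
            mean_dist[c] = statistics.fmean(d) if d else None
        a = mean_dist[labels[i]]
        if a is None:  # singleton cluster contributes 0 by convention
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return statistics.fmean(scores)


def rank_by_removal(X, names, k):
    """Drop each feature in turn, re-cluster, and rank by silhouette score."""
    baseline = silhouette(X, kmeans(X, k))
    results = []
    for idx, name in enumerate(names):
        reduced = [[v for j, v in enumerate(row) if j != idx] for row in X]
        results.append((name, silhouette(reduced, kmeans(reduced, k))))
    results.sort(key=lambda item: item[1], reverse=True)
    return baseline, results


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic toy series, NOT dengue data: two groups with different levels.
    low = [[rng.gauss(5, 1) for _ in range(20)] for _ in range(6)]
    high = [[rng.gauss(20, 1) for _ in range(20)] for _ in range(6)]
    X = standardise([extract_features(s) for s in low + high])
    baseline, ranking = rank_by_removal(X, FEATURE_NAMES, k=2)
    print("all features:", round(baseline, 3))
    for name, score in ranking:
        print("without", name, "->", round(score, 3))
```

In this sketch, the first entry of the ranking is the feature whose removal improves the silhouette score most, so it would be the first candidate to discard. The same loop could wrap any clustering routine that returns one label per series.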

Keywords
Clustering, Time Series, Epidemiology, DBSCAN, K-means, Hierarchical.

Published in: 3rd South American International Conference on Industrial Engineering and Operations Management

Publisher: IEOM Society International
Date of Conference: May 10-12, 2022

ISBN: 978-1-7923-9159-0
ISSN/E-ISSN: 2169-8767