Track: COVID-19 Analytics Competition
Abstract
This work uses machine learning methods to predict parameters key to an effective COVID-19 response. The numbers of patients affected, hospitalizations, and deaths in the state of Florida are considered in order to provide a blueprint for regional pandemic response. To predict these variables, twenty inputs in four categories (type of tests performed, gender, race, and age group) were drawn from official data published by the Florida State Department of Health. These data were submitted to a linear regression model, a fuzzy logic model, and a long short-term memory (LSTM) deep learning model with the intent of producing predictions as close as possible to the actual numbers. The mean absolute percentage error (MAPE) was calculated to measure the deviation of the predicted results from the actual values. In addition, a one-way analysis of variance (ANOVA) model was developed for each output parameter to statistically assess the results of these models. For the ‘number of Florida residents affected’, the LSTM deep learning model outperformed the fuzzy model, which in turn outperformed linear regression, in terms of the MAPE. For the ‘number of patients hospitalized’, the LSTM deep learning model again outperformed the regression model, while the regression model and the fuzzy model were not significantly different. For the ‘number of patient deaths’, however, no model significantly outperformed any other. These findings suggest that, for regional data, the fuzzy model outperforms linear regression, and the LSTM deep learning model outperforms both the fuzzy model and the linear regression model. Implications of this work include a better understanding of opportunities and appropriate tools for short-term prediction of future trends when variability is high, as well as replacement strategies for datasets with missing values.
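As a minimal illustration of the error metric named above, the following Python sketch computes the MAPE using its standard definition (the mean of the absolute errors relative to the actual values, expressed as a percentage). The function name `mape` and the sample counts are hypothetical and are not taken from the study's data or code.

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical daily counts, for illustration only (not study data).
actual = [100, 200, 400]
predicted = [110, 180, 400]
error = mape(actual, predicted)
print(f"MAPE: {error:.2f}%")
```

A lower MAPE indicates predictions closer to the observed values, which is the basis on which the three models are compared here.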