Track: Healthcare Systems
Patients arriving at an intensive care unit (ICU) are commonly classified by gender, age, and health record. However, additional data can be used to propose alternative patient segmentations that might help allocate the existing infrastructure, supplies, and medical staff more efficiently. In this investigation, several segmentations of ICU patients were implemented and compared. Unlike a supervised task such as classification, where datasets must be labelled a priori to train and test prediction models, clustering algorithms require no labelling. Instead, data are grouped according to their degree of similarity.
The research followed a four-phase methodology: analysis, design, development, and validation. During the analysis phase, a large database with records of the medical care received by patients at the ICU of a public hospital located in the south of Chile was preprocessed and analyzed. During the design phase, several datasets were prepared for the experiments. At this point, the advantages and disadvantages of different clustering algorithms were analyzed and compared, and the Simple K-Means Algorithm (SKMA) and Expectation-Maximization Clustering (EMC) were selected for the investigation. Whereas SKMA creates clusters of equal variance, EMC assumes a Gaussian distribution of the data. The development phase was carried out using the data mining software WEKA 3.9.6.
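The contrast between the two algorithms can be illustrated with a minimal sketch of their assignment steps. This is not WEKA's implementation: the function names, the one-dimensional feature, and all parameter values below are hypothetical and chosen only to show how a nearest-center rule and a Gaussian mixture can disagree about the same record.

```python
import math


def hard_assign(x, centers):
    """K-means style step: return the index of the nearest center."""
    return min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)


def soft_assign(x, means, variances, weights):
    """EM E-step: posterior probability of each Gaussian component for x."""
    densities = [
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for m, v, w in zip(means, variances, weights)
    ]
    total = sum(densities)
    return [d / total for d in densities]


# Hypothetical 1-D feature (for example, a scaled severity score)
# and two hypothetical clusters centered at 0 and 10.
centers = [0.0, 10.0]
x = 4.0

# The nearest-center rule ignores cluster spread.
label = hard_assign(x, centers)

# The Gaussian mixture accounts for spread: the second cluster is assumed
# to be much wider (variance 25 versus 1), so it can claim the same point.
resp = soft_assign(x, centers, variances=[1.0, 25.0], weights=[0.5, 0.5])

print("hard assignment:", label)
print("soft responsibilities:", [round(r, 4) for r in resp])
```

In this sketch the nearest-center rule places the record in the first cluster, while the mixture model gives most of the probability to the second, wider cluster. Differences of this kind are one reason the two algorithms can produce different segmentations of the same data.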
To complete the investigation, four datasets of five, ten, fifteen, and twenty thousand ICU records were used. Since no target class was defined, the clusters resulted solely from applying the selected algorithms, EMC and SKMA. In both cases, different numbers of clusters (k) were used to establish a comparison.
Results revealed clear differences in the outputs generated by each clustering algorithm. For instance, with 5 clusters (k=5), EMC distributed the data in the following proportions: 19%, 15%, 10%, 43%, and 28%. With SKMA, instead, the proportions were 44%, 13%, 10%, 9%, and 24%.
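Proportions of this kind can be derived from the cluster label assigned to each record. The following is a minimal sketch, assuming a hypothetical list of 100 labels constructed only to mirror the SKMA percentages reported above; it is not the study's data, and `cluster_proportions` is an illustrative helper name.

```python
from collections import Counter


def cluster_proportions(labels):
    """Return the percentage of records assigned to each cluster label."""
    counts = Counter(labels)
    n = len(labels)
    return {c: 100.0 * counts[c] / n for c in sorted(counts)}


# Hypothetical labels for 100 records (not the study's data),
# built to mirror the SKMA proportions for k=5.
labels = [0] * 44 + [1] * 13 + [2] * 10 + [3] * 9 + [4] * 24

props = cluster_proportions(labels)
for cluster, pct in props.items():
    print(f"cluster {cluster}: {pct:.0f}%")
```

Because each record is assigned to exactly one cluster, the percentages computed this way should sum to 100% up to rounding.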
In conclusion, the investigation showed that popular clustering algorithms such as EMC and SKMA can be used to segment not only consumers but also ICU patients, according to criteria that are not easy to visualize with classical tools and techniques. An adequate segmentation can provide valuable information to help estimate the requirements for medical staff, supplies, and infrastructure, and to define specific healthcare services.