Track: Data Analytics
Abstract
Lifestyle changes during the pandemic have affected people's purchasing power and buying habits for baby food, especially dairy products, so increasing profit and competitiveness has become important for companies that supply powdered milk for toddlers. One strategy for increasing profitability and competitiveness is to improve customer retention by predicting customer loyalty and classifying customer profiles within Customer Relationship Management (CRM), often also known as the Customer Loyalty Program. This study classifies customer loyalty from customer profiles (demographics, transactions, etc.) using data mining techniques. To define the loyal and non-loyal classes that the customer loyalty program targets, the study first applied cluster analysis to group customers by the behavior recorded in their transactions and the information they provided. The study addresses two questions: how accurately does the model predict loyal customers, and which variables are most important in classifying them? It also aims to show how data mining methods are used and how their results are interpreted. The dataset contains 655,625 observations and 20 variables; the cluster analysis used only 7 variables and yielded an optimal solution of 3 clusters. The cluster labels form the new target variable: labels 0 and 1 are treated as non-loyal customers and label 3 as loyal customers. The three labels account for 76%, 15.18%, and 8.82% of customers, so non-loyal and loyal customers make up 91.18% and 8.82%, respectively. A logistic regression model fitted to classify loyal and non-loyal customers achieved an accuracy of about 96.6%, that is, a misclassification error of 3.4%. 
Given this accuracy, the model can be recommended for classifying customers in the customer loyalty program, based on the dataset and the variables used to build it.
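
The figures reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the study's code: the function names and the dictionary layout are hypothetical, and only the percentages and the label-to-class mapping come from the abstract.

```python
# Cluster shares reported in the abstract, in percent.
# Keys are the cluster labels exactly as reported there.
cluster_share = {0: 76.00, 1: 15.18, 3: 8.82}

# Mapping stated in the abstract: labels 0 and 1 are non-loyal, label 3 is loyal.
NON_LOYAL_LABELS = {0, 1}


def collapse_to_binary(shares, non_loyal_labels):
    """Merge per-cluster shares into (non-loyal, loyal) shares in percent."""
    non_loyal = sum(v for k, v in shares.items() if k in non_loyal_labels)
    loyal = sum(v for k, v in shares.items() if k not in non_loyal_labels)
    return round(non_loyal, 2), round(loyal, 2)


def misclassification_error(accuracy_pct):
    """Misclassification error is the complement of accuracy (both in percent)."""
    return round(100.0 - accuracy_pct, 2)


non_loyal_pct, loyal_pct = collapse_to_binary(cluster_share, NON_LOYAL_LABELS)
error_pct = misclassification_error(96.6)

print(f"non-loyal: {non_loyal_pct}%, loyal: {loyal_pct}%")
print(f"misclassification error: {error_pct}%")
```

The sketch only reproduces the class proportions and the error rate from the reported values; it does not refit the clustering or the logistic regression model.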