The term Big Data refers to data sets too extensive to be managed by traditional methods. This research applies data mining and analytics techniques to build predictive models that describe how stochastic and deterministic variables interact with in-store stock-outs. These variables take the form of different types of information, from demographic data such as age and customers’ perception to operational features such as shelf capacity and inventory. While previous studies analysed these variables in isolation, this pioneering project combines them to provide an integrated analytical solution.
The research applies logistic regression, together with other techniques such as deviation analysis, clustering and Sigma, to select the relevant product families and sub-families on which the models focus. The study also recommends specific actions to reduce in-store stock-outs, based on the insights that emerged from these models.