Track: Data Analytics
Abstract
Training machine learning models often requires substantial computational capacity, even more so when Big Data is involved. The Patient Rule Induction Method (PRIM), introduced as a bump hunting algorithm and a subgroup discovery method, already consumes considerable computing resources when searching for rules. In this research area, several methods have been developed to handle the problem of computing over large datasets: parallel learning, distributed learning, and federated learning. In this paper, we review the literature on these three methods to define each one and the differences between them, and then determine which of them is best suited for implementing PRIM on large datasets in our future work on Big Data.