Track: Data Analytics and Big Data
Abstract
Careful analysis of data from patients with a specific disease can reveal patterns and knowledge about that disease, or even characteristics specific to the patients. Medical studies usually start from a hypothesis and gather data prospectively to confirm or reject it, but in many cases patient data may contain relationships that have never been examined and for which no hypothesis has been proposed. The purpose of this study is therefore to discover hidden patterns in datasets of tuberculosis (TB) patients. Based on various TB assessment indicators, the Shannon entropy method was first used to identify the most important features. The association rules present in the data were then discovered using the Apriori algorithm. Both techniques were implemented in R on the records of 548 referred TB patients. The Shannon entropy method identified 18 factors with values greater than 0.0300. The Apriori algorithm then discovered 9 association rules with the highest lift values, with minimum support and confidence set to 0.5 and 0.9, respectively. The discovered rules could serve as primary hypotheses for further studies, particularly clinical trials. In addition, practitioners might apply these rules when analyzing patients' clinical status.
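To make the two steps concrete, the following is a minimal Python sketch, not the R implementation used in the study. It assumes the common entropy weight formulation (column-normalized proportions, entropy scaled by 1/ln n, weights proportional to 1 minus entropy), which may differ from the exact variant applied here. It also computes only the support, confidence, and lift of a single given rule, not the full Apriori candidate search. The function names and the toy inputs are hypothetical.

```python
import math

def entropy_weights(rows):
    """Entropy weight per feature (column) for a table of non-negative numbers."""
    n, m = len(rows), len(rows[0])
    k = 1.0 / math.log(n)  # scales each column's entropy into [0, 1]
    diversities = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        entropy = 0.0
        for x in column:
            if x > 0:  # treat 0 * log(0) as 0
                p = x / total
                entropy -= k * p * math.log(p)
        diversities.append(1.0 - entropy)  # higher means more informative
    s = sum(diversities)
    if s == 0:  # every column is uniform: fall back to equal weights
        return [1.0 / m] * m
    return [d / s for d in diversities]

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    Assumes the antecedent and consequent each occur in at least one transaction.
    """
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if (a | c) <= t)
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)
    return support, confidence, lift

# Toy, made-up inputs for illustration only (not patient data).
toy_table = [[1, 1], [1, 3]]
toy_transactions = [{"A", "B"}, {"A", "B"}, {"A"}, {"B"}]
weights = entropy_weights(toy_table)
support, confidence, lift = rule_metrics(toy_transactions, {"A"}, {"B"})
```

Under these assumptions, features whose weight exceeds a chosen cutoff would be retained, and a rule would be kept only if its support and confidence meet the minimum thresholds, with the surviving rules ranked by lift.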