Track: Modeling and Simulation
Abstract
This paper describes an efficient method for classifying the four main hepatitis virus types, Hepatitis-B, Hepatitis-C,
Hepatitis-D, and Hepatitis-E, using Genomic Signal Processing and Machine Learning. First, we gather the
deoxyribonucleic acid sequences for the various strains of the Hepatitis virus. Next, we convert these sequences from
characters to numbers using a variety of numerical encoding schemes. Subsequently, we apply several well-known signal
processing methods, along with modified versions that we developed, to extract features from the transformed
sequences, and then use Singular Value Decomposition for dimensionality reduction. Finally, two machine learning
models, Decision Tree and Light Gradient Boosting Machine, are trained
for classification. Our approach achieves an accuracy of 99\% when atomic-number normalisation is combined
with a customised Haar wavelet.
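
The encoding and wavelet steps summarised above can be illustrated with a minimal sketch. It is not the implementation used in this work. The atomic-number values (A=70, C=58, G=78, T=66) are the ones commonly quoted in the genomic signal processing literature and are assumed here. Dividing by the largest value is one possible normalisation. The standard orthonormal Haar transform stands in for the customised variant, whose details are not given in this abstract. The function names and the choice of mean detail energy as a feature are hypothetical.

```python
import math

# Assumed mapping: total atomic number of each nucleotide, as commonly
# quoted in genomic signal processing; the paper's values may differ.
ATOMIC_NUMBER = {"A": 70, "C": 58, "G": 78, "T": 66}


def encode(sequence):
    """Map a nucleotide string to atomic numbers scaled into (0, 1]."""
    top = max(ATOMIC_NUMBER.values())
    # Ambiguous symbols such as N are skipped (an assumption of this sketch).
    return [ATOMIC_NUMBER[b] / top for b in sequence.upper() if b in ATOMIC_NUMBER]


def haar_step(signal):
    """One level of the standard orthonormal Haar transform."""
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    approx = [(a + b) / root2 for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / root2 for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail


def haar_features(sequence, levels=3):
    """Mean detail energy per level, giving a fixed-length feature vector."""
    approx = encode(sequence)
    if not approx:
        raise ValueError("sequence contains no A, C, G or T symbols")
    features = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    return features


print(haar_features("ATGCGTACGTTAGC"))
```

Feature vectors of this kind, one per sequence, would then be stacked into a matrix, reduced with Singular Value Decomposition, and passed to the classifiers. Those stages are omitted here to keep the sketch dependency-free.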