Adaptive Hybrid Genetic Algorithm Trained Bayesian Network Framework for Spam Filtering Using Text Normalization

  • O SUNDAY
Keywords: Memetic algorithm, spam filters, semantic approach, text normalization, heuristic-processing, strings

Abstract

The popularity of short messaging services (SMS) has created a propitious
environment for spamming to thrive. Spam includes unsolicited advertising,
adult-themed or inappropriate content, premium fraud, smishing and malware, and
it is a constant reminder of the need for an effective spam filter. However, the
SMS limits of 160 characters and 140 bytes, together with messages being riddled
with slang, emoticons and abbreviations, further inhibit effective training of
models for accurate classification. The study proposes a Genetic Algorithm
Trained Bayesian Network solution that normalizes noisy features and expands
text using lexicographic and semantic dictionaries, applying a word sense
disambiguation technique, to train the underlying learning heuristics and, in
turn, help classify SMS into spam and legitimate classes. The hybrid model
comprises text preprocessing, feature selection, and training and classification
sections. Within this hybrid, the Genetic Algorithm (GA) performs feature
selection, while the Bayesian algorithm serves as the classifier.
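
The pipeline outlined in the abstract (normalize text, select features with a GA, classify with a Bayesian model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the `SLANG` dictionary, the toy messages, the GA settings, and the function names are hypothetical, and a multinomial naive Bayes classifier stands in for the Bayesian network as a simplification. Dictionary-based expansion with word sense disambiguation is not shown.

```python
import math
import random
from collections import Counter

# Hypothetical abbreviation dictionary (illustrative only, not from the study).
SLANG = {"u": "you", "ur": "your", "txt": "text", "pls": "please", "2": "to", "4": "for"}

def normalize(message):
    """Lowercase, strip edge punctuation, and expand known abbreviations."""
    tokens = [t.strip(".,!?") for t in message.lower().split()]
    return [SLANG.get(t, t) for t in tokens if t]

def train_nb(docs, labels, vocab):
    """Fit multinomial naive Bayes counts, restricted to the words in vocab."""
    counts = {c: Counter() for c in set(labels)}
    priors = Counter(labels)
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocab)
    return counts, priors, vocab

def predict_nb(model, tokens):
    """Return the class with the highest Laplace-smoothed log posterior."""
    counts, priors, vocab = model
    total = sum(priors.values())
    best, best_score = None, -math.inf
    for c in sorted(priors):
        denom = sum(counts[c].values()) + len(vocab)
        score = math.log(priors[c] / total)
        for t in tokens:
            if t in vocab:
                score += math.log((counts[c][t] + 1) / denom)
        if score > best_score:
            best, best_score = c, score
    return best

def fitness(mask, words, train, valid):
    """Validation accuracy of a classifier trained on the masked vocabulary."""
    vocab = {w for w, keep in zip(words, mask) if keep}
    model = train_nb([d for d, _ in train], [y for _, y in train], vocab)
    return sum(predict_nb(model, d) == y for d, y in valid) / len(valid)

def ga_select(words, train, valid, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Evolve boolean feature masks; return the best vocabulary found."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in words] for _ in range(pop_size)]
    score = lambda m: fitness(m, words, train, valid)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection (keeps elites)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(words))   # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability p_mut.
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = parents + children
    best = max(pop, key=score)
    return {w for w, keep in zip(words, best) if keep}

# Toy data, invented for this sketch.
train_raw = [
    ("WIN free prize now, txt 2 claim", "spam"),
    ("free cash 4 u, claim now", "spam"),
    ("are u coming home for dinner", "ham"),
    ("pls call me when ur free", "ham"),
]
train = [(normalize(m), y) for m, y in train_raw]
words = sorted({t for d, _ in train for t in d})
selected = ga_select(words, train, train)  # toy shortcut: training set reused for validation
model = train_nb([d for d, _ in train], [y for _, y in train], selected)
print(sorted(selected))
print(predict_nb(model, normalize("claim ur free prize now")))
```

Each GA individual is a boolean mask over the vocabulary, and its fitness is the accuracy of the classifier trained on the words that mask keeps. A realistic setup would evaluate fitness on a held-out validation set rather than on the training messages.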

Published
2023-04-29