TEXT CLUSTERING BASED ON THE N-GRAMS BY BIO INSPIRED METHOD (IMMUNE SYSTEMS)

Authors

  • Hamou Reda Mohamed, University Dr Tahar MOULAY of Saïda, EEDIS Lab, Computer Science Department, Algeria. hamoureda@yahoo.fr
  • Lokbani Ahmed Chaouki, University Dr Tahar MOULAY of Saïda, EEDIS Lab, Computer Science Department, Algeria
  • Ahmed Lehireche, University Djillali Liabes of Sidi Bel Abbes, EEDIS Lab, Computer Science Department, Algeria
  • Rahmani Mohamed, University Djillali Liabes of Sidi Bel Abbes, EEDIS Lab, Computer Science Department, Algeria

Keywords:

Data classification and clustering, immune systems, biomimetic methods, data mining, N-grams

Abstract

In this paper we present the results of unsupervised classification (clustering) of unstructured
data, in this case textual documents from the Reuters-21578 corpus, using a new biomimetic approach
based on immune systems. Before applying the immune system approach, we converted the documents
into a numerical representation using N-grams. The novelty lies in the hybridization of N-grams
and immune systems for classification. Section 1 gives an introduction and the state of the art,
Section 2 presents the N-gram-based representation of texts, Section 3 describes the immune system
approach to clustering, Section 4 presents the experiments and comparative results, and Section 5
gives conclusions and perspectives.
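
The N-gram representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of N, whether the N-grams are taken over characters or words, or how they are weighted, so the code below assumes character trigrams with raw counts. The function names `char_ngrams` and `to_vector` and the two sample strings are hypothetical and are not taken from the paper or the Reuters-21578 corpus.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumption: characters, not words)."""
    # Lowercase and collapse runs of whitespace so spacing does not create spurious n-grams.
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def to_vector(profile, vocabulary):
    """Map an n-gram profile onto a fixed vocabulary order, giving a numeric vector."""
    return [profile.get(gram, 0) for gram in vocabulary]


# Hypothetical sample documents, for illustration only.
docs = ["oil prices rise", "oil prices fall"]
profiles = [char_ngrams(d, n=3) for d in docs]

# Shared vocabulary: every n-gram seen in any document, in sorted order.
vocabulary = sorted(set().union(*profiles))
vectors = [to_vector(p, vocabulary) for p in profiles]
```

Each document becomes a vector of the same length, which is the kind of numeric input a clustering method, immune-inspired or otherwise, can operate on.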

Published

18-08-2021

How to Cite

Hamou Reda Mohamed, Lokbani Ahmed Chaouki, Ahmed Lehireche, & Rahmani Mohamed. (2021). TEXT CLUSTERING BASED ON THE N-GRAMS BY BIO INSPIRED METHOD (IMMUNE SYSTEMS). Researchers World - International Refereed Social Sciences Journal, 1(1), 56–70. Retrieved from https://www.researchersworld.com/index.php/rworld/article/view/122

Section

Articles