Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

Yannis Marinakis, Georgios Dounias, Jan Jantzen

Research output: Contribution to journal, Journal article, Research, peer-review

Abstract

The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed to classify the cells. Two databases are used, constructed at different times by expert MDs, consisting of 917 and 500 images of pap smear cells, respectively. Each cell is described by 20 numerical features, and the cells fall into 7 classes; a minimal requirement, however, is to separate normal from abnormal cells, which is a 2-class problem. To solve the feature subset selection problem, that is, to find the best-performing subset of features, an effective genetic algorithm scheme is proposed. This algorithmic scheme is combined with a number of nearest neighbor based classifiers. Results show that the classification accuracy of the proposed scheme generally exceeds that of other previously applied intelligent approaches.
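
The general idea of wrapping a nearest neighbor classifier inside a genetic search over feature subsets can be sketched as follows. This is only an illustrative sketch, not the authors' scheme: the abstract does not specify the operators, so truncation selection, one-point crossover, bit-flip mutation, elitism, and leave-one-out 1-NN accuracy as the fitness are all assumptions, as are the function names (`knn_accuracy`, `ga_select`, `make_toy_data`) and the synthetic two-class data with 20 features standing in for the pap-smear databases.

```python
import random


def knn_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of k-NN using only features whose mask bit is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance to every other sample on the selected features.
        dists = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), y[m])
            for m in range(len(X)) if m != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def ga_select(X, y, pop_size=12, generations=8, p_mut=0.05, seed=0):
    """Search for a binary feature mask that maximises k-NN accuracy (assumed operators)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = knn_accuracy(X, y, mask)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability p_mut.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)


def make_toy_data(n_samples=40, n_features=20, seed=1):
    """Synthetic 2-class data; only features 0 and 1 carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] += 4 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = ga_select(X, y)
    print("selected features:", [j for j, bit in enumerate(mask) if bit])
    print("leave-one-out accuracy:", acc)
```

The binary mask encoding makes each chromosome a candidate feature subset, so the classifier itself acts as the fitness function; a real study would evaluate on held-out folds rather than reuse the same samples for selection and reporting.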
Original language: English
Journal: Computers in Biology and Medicine
Volume: 39
Issue number: 1
Pages (from-to): 69-78
ISSN: 0010-4825
DOIs: 10.1016/j.compbiomed.2008.11.006
Publication status: Published - 2009

Keywords

  • Data mining
  • Nearest neighbor based classifiers
  • Feature selection problem
  • Genetic algorithms
  • Artificial intelligence and medical diagnosis
  • Pap-smear classification

Cite this

@article{51cb6e95d8b343bc84fa65ccfd0ab5a0,
title = "Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification",
abstract = "The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify the cells. Two databases are used, constructed in different times by expert MDs, consisting of 917 and 500 images of pap smear cells, respectively. Each cell is described by 20 numerical features, and the cells fall into 7 classes but a minimal requirement is to separate normal from abnormal cells, which is a 2 class problem. For finding the best possible performing feature subset selection problem, an effective genetic algorithm scheme is proposed. This algorithmic scheme is combined with a number of nearest neighbor based classifiers. Results show that classification accuracy generally outperforms other previously applied intelligent approaches.",
keywords = "Data mining, Nearest neighbor based classifiers, Feature selection problem, Genetic algorithms, Artificial intelligence and medical diagnosis, Pap-smear classification",
author = "Yannis Marinakis and Georgios Dounias and Jan Jantzen",
year = "2009",
doi = "10.1016/j.compbiomed.2008.11.006",
language = "English",
volume = "39",
pages = "69--78",
journal = "Computers in Biology and Medicine",
issn = "0010-4825",
publisher = "Pergamon Press",
number = "1",

}

Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification. / Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan.

In: Computers in Biology and Medicine, Vol. 39, No. 1, 2009, p. 69-78.

TY - JOUR

T1 - Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

AU - Marinakis, Yannis

AU - Dounias, Georgios

AU - Jantzen, Jan

PY - 2009

Y1 - 2009

AB - The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify the cells. Two databases are used, constructed in different times by expert MDs, consisting of 917 and 500 images of pap smear cells, respectively. Each cell is described by 20 numerical features, and the cells fall into 7 classes but a minimal requirement is to separate normal from abnormal cells, which is a 2 class problem. For finding the best possible performing feature subset selection problem, an effective genetic algorithm scheme is proposed. This algorithmic scheme is combined with a number of nearest neighbor based classifiers. Results show that classification accuracy generally outperforms other previously applied intelligent approaches.

KW - Data mining

KW - Nearest neighbor based classifiers

KW - Feature selection problem

KW - Genetic algorithms

KW - Artificial intelligence and medical diagnosis

KW - Pap-smear classification

U2 - 10.1016/j.compbiomed.2008.11.006

DO - 10.1016/j.compbiomed.2008.11.006

M3 - Journal article

VL - 39

SP - 69

EP - 78

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

SN - 0010-4825

IS - 1

ER -