Abstract
Classification is a major constituent of the data mining toolkit. Well-known classification methods are built either on the principles of logic or on statistical reasoning. For imbalanced and noisy data, however, classification may fail to deliver on a basic data mining goal, namely identifying statistical dependencies in data. In this article, we propose a novel strategy for data mining based on partitioning the feature space through Voronoi tessellation and a genetic algorithm, where the latter is applied to solve a combinatorial optimization problem. We apply the suggested methodology to a range of classification problems of varying imbalance and noise, and compare its performance with well-known classification methods such as SVM, KNN, and ANN. The results indicate that the proposed methodology is well suited for data mining tasks involving highly imbalanced classes and significant noise.
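The abstract does not say how the Voronoi partition and the genetic algorithm are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes that Voronoi seeds are chosen from the training points, that each cell takes the majority label of the points it contains, and that a toy genetic algorithm searches over seed subsets with balanced accuracy as fitness. All function names and parameters (`evolve_seeds`, `n_seeds`, `pop_size`, and so on) are hypothetical.

```python
import random

def nearest_seed(point, seeds):
    """Index of the Voronoi cell (nearest seed, squared Euclidean distance) holding point."""
    return min(range(len(seeds)),
               key=lambda i: sum((p - s) ** 2 for p, s in zip(point, seeds[i])))

def cell_labels(seeds, X, y):
    """Assumption: each cell is labelled with the majority class of its training points."""
    votes = [{} for _ in seeds]
    for point, label in zip(X, y):
        cell = votes[nearest_seed(point, seeds)]
        cell[label] = cell.get(label, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]

def predict(point, seeds, labels, default):
    """Class of the cell containing point; empty cells fall back to default."""
    label = labels[nearest_seed(point, seeds)]
    return default if label is None else label

def balanced_accuracy(seeds, X, y):
    """Fitness: mean per-class recall, so a rare class counts as much as a common one."""
    labels = cell_labels(seeds, X, y)
    default = max(set(y), key=y.count)
    hits, totals = {}, {}
    for point, label in zip(X, y):
        totals[label] = totals.get(label, 0) + 1
        if predict(point, seeds, labels, default) == label:
            hits[label] = hits.get(label, 0) + 1
    return sum(hits.get(c, 0) / totals[c] for c in totals) / len(totals)

def evolve_seeds(X, y, n_seeds=4, pop_size=20, generations=30, rng=None):
    """Toy GA over seed subsets; needs n_seeds >= 2 and pop_size >= 4."""
    rng = rng or random.Random(0)
    fitness = lambda ind: balanced_accuracy([X[i] for i in ind], X, y)
    population = [rng.sample(range(len(X)), n_seeds) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]       # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_seeds)
            child = a[:cut] + b[cut:]                # one-point crossover
            child[rng.randrange(n_seeds)] = rng.randrange(len(X))  # point mutation
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [X[i] for i in best]

# Synthetic imbalanced data: 40 majority points near (0, 0), 5 minority points near (3, 3).
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(40)] + \
    [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(5)]
y = [0] * 40 + [1] * 5
seeds = evolve_seeds(X, y, rng=random.Random(0))
print(balanced_accuracy(seeds, X, y))
```

Balanced accuracy is used as the fitness here because plain accuracy would reward a partition that ignores the minority class entirely; the paper may well use a different objective.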
Original language | English |
---|---|
Title of host publication | Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2018 Workshops, BDASC, BDM, ML4Cyber, PAISI, DaMEMO, Revised Selected Papers |
Publisher | Springer |
Publication date | 1 Jan 2018 |
Pages | 256-266 |
ISBN (Print) | 9783030045029 |
Publication status | Published - 1 Jan 2018 |
Event | 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, Melbourne, Australia (3 Jun 2018; conference number 22) |
Conference
Conference | 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining |
---|---|
Number | 22 |
Country/Territory | Australia |
City | Melbourne |
Period | 03/06/2018 → 03/06/2018 |
Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11154 |
ISSN | 0302-9743 |
Keywords
- Classification
- Data mining
- Genetic algorithm
- Imbalance
- Noisy data
- Voronoi tessellation