Design of multi-view based email classification for IoT systems via semi-supervised learning

Research output: Contribution to journal › Journal article – Annual report year: 2019 › Research › peer-review

Standard

Design of multi-view based email classification for IoT systems via semi-supervised learning. / Li, Wenjuan; Meng, Weizhi; Tan, Zhiyuan; Xiang, Yang.

In: Journal of Network and Computer Applications, Vol. 128, 15.02.2019, p. 56-63.

Bibtex

@article{39a79f2476884660b7ce392e13d13834,
title = "Design of multi-view based email classification for {IoT} systems via semi-supervised learning",
abstract = "Suspicious emails are one big threat for Internet of Things (IoT) security, which aim to induce users to click and then redirect them to a phishing webpage. To protect IoT systems, email classification is an essential mechanism to classify spam and legitimate emails. In the literature, most email classification approaches adopt supervised learning algorithms that require a large number of labeled data for classifier training. However, data labeling is very time consuming and expensive, making only a very small set of data available in practice, which would greatly degrade the effectiveness of email classification. To mitigate this problem, in this work, we develop an email classification approach based on multi-view disagreement-based semi-supervised learning. The idea behind is that multi-view method can offer richer information for classification, which is often ignored by the literature. The use of semi-supervised learning can help leverage both labeled and unlabeled data. In the evaluation, we investigate the performance of our proposed approach with two datasets and in a real network environment. Experimental results demonstrate that the use of multi-view data can achieve more accurate email classification than the use of single-view data, and that our approach is more effective as compared to several existing similar algorithms.",
keywords = "Disagreement-based learning, Email classification, IoT security, Multi-view data, Semi-supervised learning",
author = "Wenjuan Li and Weizhi Meng and Zhiyuan Tan and Yang Xiang",
year = "2019",
month = feb,
day = "15",
doi = "10.1016/j.jnca.2018.12.002",
language = "English",
volume = "128",
pages = "56--63",
journal = "Journal of Network and Computer Applications",
issn = "1084-8045",
publisher = "Academic Press",
}

RIS

TY - JOUR

T1 - Design of multi-view based email classification for IoT systems via semi-supervised learning

AU - Li, Wenjuan

AU - Meng, Weizhi

AU - Tan, Zhiyuan

AU - Xiang, Yang

PY - 2019/2/15

Y1 - 2019/2/15

N2 - Suspicious emails are one big threat for Internet of Things (IoT) security, which aim to induce users to click and then redirect them to a phishing webpage. To protect IoT systems, email classification is an essential mechanism to classify spam and legitimate emails. In the literature, most email classification approaches adopt supervised learning algorithms that require a large number of labeled data for classifier training. However, data labeling is very time consuming and expensive, making only a very small set of data available in practice, which would greatly degrade the effectiveness of email classification. To mitigate this problem, in this work, we develop an email classification approach based on multi-view disagreement-based semi-supervised learning. The idea behind is that multi-view method can offer richer information for classification, which is often ignored by the literature. The use of semi-supervised learning can help leverage both labeled and unlabeled data. In the evaluation, we investigate the performance of our proposed approach with two datasets and in a real network environment. Experimental results demonstrate that the use of multi-view data can achieve more accurate email classification than the use of single-view data, and that our approach is more effective as compared to several existing similar algorithms.

AB - Suspicious emails are one big threat for Internet of Things (IoT) security, which aim to induce users to click and then redirect them to a phishing webpage. To protect IoT systems, email classification is an essential mechanism to classify spam and legitimate emails. In the literature, most email classification approaches adopt supervised learning algorithms that require a large number of labeled data for classifier training. However, data labeling is very time consuming and expensive, making only a very small set of data available in practice, which would greatly degrade the effectiveness of email classification. To mitigate this problem, in this work, we develop an email classification approach based on multi-view disagreement-based semi-supervised learning. The idea behind is that multi-view method can offer richer information for classification, which is often ignored by the literature. The use of semi-supervised learning can help leverage both labeled and unlabeled data. In the evaluation, we investigate the performance of our proposed approach with two datasets and in a real network environment. Experimental results demonstrate that the use of multi-view data can achieve more accurate email classification than the use of single-view data, and that our approach is more effective as compared to several existing similar algorithms.

KW - Disagreement-based learning

KW - Email classification

KW - IoT security

KW - Multi-view data

KW - Semi-supervised learning

U2 - 10.1016/j.jnca.2018.12.002

DO - 10.1016/j.jnca.2018.12.002

M3 - Journal article

VL - 128

SP - 56

EP - 63

JO - Journal of Network and Computer Applications

JF - Journal of Network and Computer Applications

SN - 1084-8045

ER -