TY - JOUR
T1 - AiCareAir
T2 - Hybrid-Ensemble Internet-of-Things Sensing Unit Model for Air Pollutant Control
AU - Borah, Jintu
AU - Nadzir, Md Shahrul Md
AU - Cayetano, Mylene G.
AU - Majumdar, Shubhankar
AU - Ghayvat, Hemant
AU - Srivastava, Gautam
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - The detrimental effects of air pollution on human health make air-quality prediction a task of utmost significance. The application of artificial intelligence (AI) and the Internet of Things (IoT) is seen as promising in this domain. The prediction accuracy of state-of-the-art models varies across pollutants and is acceptable only for certain pollutants. This article uses machine learning (ML) and deep learning (DL) models to predict the concentrations of six major air pollutants. Data were collected over eight months, with 1400 daily instances, from sensors deployed in Kuala Lumpur, Malaysia. To obtain a robust system, this article proposes a hybrid-ensemble model that combines ML models, specifically random forest, K-nearest neighbor (KNN), and extreme gradient boosting (XGBoost), with neural network (NN) models, namely, long short-term memory (LSTM), gated recurrent units (GRUs), and convolutional NNs (CNNs). The hybrid-ensemble learning model is built from five different ML models as weak learners. Previous ensemble models use a homogeneous group of weak learners, whereas this work uses a heterogeneous group. Prediction accuracy is compared using the R2 score and the absolute, squared, and root-mean-squared errors (RMSEs).
KW - Adam optimizer
KW - convolutional neural networks (CNNs)
KW - gated recurrent units (GRUs)
KW - Keras API
KW - long short-term memory (LSTM)
KW - Scikit learn
U2 - 10.1109/JSEN.2024.3397735
DO - 10.1109/JSEN.2024.3397735
M3 - Journal article
AN - SCOPUS:85193227543
SN - 1530-437X
VL - 24
SP - 21558
EP - 21565
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
IS - 13
ER -