Data collected in Internet of Things (IoT) applications often contain unreliable and erroneous readings because the sensors are deployed in harsh or unattended environments. Such readings are considered anomalies because they deviate from the regular data. These anomalies must be identified correctly to support sound decision-making. Machine learning techniques have gained significant attention for this purpose because they can classify data as normal or abnormal (anomalous). Methods: This work proposes novel adaptations of supervised and semi-supervised machine learning algorithms that integrate the Mahalanobis Distance (MD) metric. The adapted algorithms are named Mahalanobis Binary Classification (M-BC) and Mahalanobis One Class Classification (M-OCC). Their performance was evaluated on well-known IoT sensor datasets using balanced accuracy, F1-Score, and AUC-ROC score.
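To make the role of the MD metric concrete, the following is a minimal sketch of a generic Mahalanobis-distance anomaly detector trained on normal data only. It is not the M-BC or M-OCC algorithm, whose details are not given here. The MD of a reading x from a distribution with mean mu and covariance Sigma is sqrt((x - mu)^T Sigma^-1 (x - mu)). The function names, the synthetic two-feature sensor data, and the 99th-percentile threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and inverse covariance from normal-only training data."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates a singular or near-singular covariance matrix.
    cov_inv = np.linalg.pinv(np.atleast_2d(cov))
    return mu, cov_inv

def mahalanobis(X, mu, cov_inv):
    """Mahalanobis distance of each row of X from the fitted distribution."""
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors

# Synthetic "normal" readings (assumption): two strongly correlated features.
rng = np.random.default_rng(0)
normal = rng.multivariate_normal([20.0, 50.0], [[1.0, 0.8], [0.8, 1.0]], size=500)

mu, cov_inv = fit_gaussian(normal)

# Hypothetical decision rule: flag anything beyond the 99th percentile
# of the training distances.
threshold = np.quantile(mahalanobis(normal, mu, cov_inv), 0.99)

# The second reading is not far from the mean in Euclidean terms,
# but it violates the correlation between the two features.
readings = np.array([[20.1, 50.2], [22.0, 48.0]])
distances = mahalanobis(readings, mu, cov_inv)
is_anomaly = distances > threshold
print(distances, is_anomaly)
```

Because the distance is scaled by the inverse covariance, a reading that breaks the usual relationship between features scores much higher than one that moves along it, which a plain Euclidean distance would not capture.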
The results show that the M-BC algorithm significantly outperforms conventional machine learning methods on several datasets considered in this study, including SHM4, MHM1, Occupancy, and Timeseries. M-BC achieved an average improvement of 13.03% in balanced accuracy, 10.29% in F1-Score, and 13.16% in AUC-ROC score. Similarly, the M-OCC algorithm demonstrated substantial gains in one-class classification over the One-Class Support Vector Machine (OCSVM), with an average improvement of 21.07% in balanced accuracy, 26.49% in F1-Score, and 26% in AUC-ROC score across datasets such as AnomIoT, IBRL, SHM4, MHM1, Occupancy, and Timeseries.
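For readers unfamiliar with the three reported metrics, the sketch below computes them from first principles on a small, invented label set. It assumes that label 1 marks an anomaly, that both classes are present, and that a score of 0.5 or more is predicted as anomalous. The numbers are illustrative only and are unrelated to the results above.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating label 1 as the anomaly class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the anomaly class and recall on the normal class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the anomaly class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def auc_roc(y_true, scores):
    """Probability that a random anomaly scores above a random normal point."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented, imbalanced example: 8 normal readings and 2 anomalies.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.7, 0.2, 0.9, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

bal_acc = balanced_accuracy(y_true, y_pred)  # 0.6875
f1 = f1_score(y_true, y_pred)                # 0.5
auc = auc_roc(y_true, scores)                # 0.9375
print(bal_acc, f1, auc)
```

Plain accuracy on this example would be 0.8 even though half of the anomalies are missed, which is why balanced accuracy is the more informative choice for imbalanced anomaly data.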
The results confirm that the proposed MD-based approaches are simple, effective, and more accurate than their base methods at detecting anomalies in IoT sensor data. Integrating the MD metric significantly enhanced the ability of the algorithms to identify anomalous data points across various IoT domains.
This work demonstrated that incorporating the Mahalanobis distance into binary and one-class classification algorithms improves anomaly detection performance. The M-BC and M-OCC algorithms offer a robust and efficient way to ensure data reliability in IoT sensor networks.