Publications

Related to the project

Anomaly Detection using Machine Learning to Discover Sensor Tampering in IoT Systems

Aditya Kumar Pathak, Saguna Saguna, Karan Mitra, Christer Åhlund

Abstract

With the rapid growth of Internet of Things (IoT) applications in smart regions and cities, such as smart healthcare and smart homes and offices, security threats and risks are increasing. IoT devices solve real-world problems by providing real-time connections, data, and information. However, attackers can tamper with sensors, or add or remove them, physically or remotely. In this study, we address the security issue of IoT sensor tampering in an office environment. We collect data from real-life settings and apply machine learning to detect sensor tampering using two methods. First, a real-time view of the traffic patterns is used to train our isolation forest-based unsupervised machine learning method for anomaly detection. Second, labels are created based on traffic patterns, and the supervised decision tree method is used within our novel Anomaly Detection using Machine Learning (AD-ML) system. The accuracy of the two proposed models is presented. The isolation forest achieved 84% accuracy as measured by the silhouette metric. Moreover, the supervised decision tree model, evaluated with 10-fold cross-validation, returned the highest classification accuracy of 91.62% with the lowest false positive rate.
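The supervised evaluation mentioned in the abstract (a decision tree scored by cross-validation over 10 folds, reporting accuracy and false positive rate) can be illustrated with a minimal sketch. This is not the authors' AD-ML system: the classifier is a depth-1 decision tree (a single-threshold "stump") swapped in for brevity, the single traffic feature and its values are synthetic, and the names `fit_stump` and `k_fold_scores` are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Depth-1 decision tree: choose the threshold t that minimises
    training errors for the rule "x > t means tampered (label 1)"."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def k_fold_scores(xs, ys, k=10, seed=0):
    """Return (accuracy, false positive rate) pooled over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    tp = tn = fp = fn = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        t = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:  # score only on samples the stump never saw
            pred = xs[i] > t
            if pred and ys[i]:
                tp += 1
            elif pred and not ys[i]:
                fp += 1
            elif not pred and ys[i]:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / len(xs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Synthetic, assumed data: one traffic feature (e.g. packets per minute).
rng = random.Random(42)
normal = [rng.gauss(100.0, 10.0) for _ in range(200)]    # label 0
tampered = [rng.gauss(160.0, 10.0) for _ in range(50)]   # label 1
xs = normal + tampered
ys = [0] * len(normal) + [1] * len(tampered)

accuracy, fpr = k_fold_scores(xs, ys, k=10)
print(f"accuracy={accuracy:.4f}  false_positive_rate={fpr:.4f}")
```

Pooling the confusion counts across folds, rather than averaging per-fold rates, keeps the false positive rate well defined even when a fold happens to contain few normal samples.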

An Automated Real-time Diagnosis Framework for Big Data Systems

Demirbaga U, Wen Z, Noor A, Mitra K, Alwasel K, Garg S, Zomaya A, Ranjan R.

Abstract

Big data processing systems, such as Hadoop and Spark, usually work in large-scale, highly concurrent, and multi-tenant environments that can easily cause hardware and software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect performance degradation in big data processing systems, perform root-cause analysis, and even overcome the issues causing such degradation. However, these solutions focus on specific problems such as stragglers and inefficient resource utilization. There is a lack of a generic and extensible framework to support the real-time diagnosis of big data systems. In this article, we propose, develop, and validate AutoDiagn, a generic and flexible framework that provides holistic monitoring of a big data system while detecting performance degradation and enabling root-cause analysis. We present an implementation and evaluation of AutoDiagn that interacts with a Hadoop cluster deployed on a public cloud and tested with real-world benchmark applications. Experimental results show that AutoDiagn can offer high-accuracy root-cause analysis while maintaining a small resource footprint, high throughput, and low latency.