https://github.com/sergio11/iot_network_malware_classifier

🛡️ The IoT Network Malware Classifier 🚀 is an advanced solution tackling security concerns in IoT, employing deep learning for precise malware detection in network traffic.


Host: GitHub
URL: https://github.com/sergio11/iot_network_malware_classifier
Owner: sergio11
License: mit
Created: 2024-04-29T19:02:26.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-08-26T13:32:47.000Z (10 months ago)
Last Synced: 2024-11-29T14:56:32.898Z (7 months ago)
Topics: deep-learning, iot, iot-security, keras-classification-models, keras-neural-networks, keras-tensorflow, neural-networks, tensorflow2
Language: Jupyter Notebook
Homepage:
Size: 5.38 MB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# 🛡️ IoT Network Malware Classifier with Deep Learning Neural Network Architecture 🚀

Welcome to the **IoT Network Malware Classifier**, an advanced solution crafted to tackle the evolving security threats within the realm of the Internet of Things (IoT). As the proliferation of interconnected devices continues to surge within IoT networks, the need for robust cybersecurity measures becomes increasingly paramount.

In today's digital landscape, where IoT devices permeate various aspects of our lives, safeguarding these interconnected ecosystems is no longer a choice but a necessity. Malicious actors are constantly devising new methods to exploit vulnerabilities within IoT networks, posing significant risks to data privacy, system integrity, and overall network security.

Built upon cutting-edge Deep Learning Neural Network architecture, this classifier leverages the power of artificial intelligence to analyze and categorize network traffic with unparalleled precision and efficiency. By harnessing the capabilities of machine learning algorithms, this solution adapts to the dynamic nature of malware threats, providing proactive defense mechanisms to mitigate potential risks.

The attached Jupyter Notebook walks through the full process of building and training the machine learning model with the **Keras framework**, from data preprocessing and model design to training and evaluation. 🛡️🔒

🙏 I would like to extend my heartfelt gratitude to [Santiago Hernández, an expert in Cybersecurity and Artificial Intelligence](https://www.udemy.com/user/shramos/). His incredible course on Deep Learning, available at Udemy, was instrumental in shaping the development of this project. The insights and techniques learned from his course were crucial in crafting the neural network architecture used in this classifier.

🙏🙏 I would like to extend my gratitude to **Stratosphere Laboratory** for providing the labeled dataset with malicious and benign IoT network traffic. This dataset was created as part of the Avast AIC laboratory with the funding of Avast Software.
> Sebastian Garcia, Agustin Parmisano, & Maria Jose Erquiaga. (2020). IoT-23: A labeled dataset with malicious and benign IoT network traffic (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4743746

[![GitHub](https://img.shields.io/badge/GitHub-View%20on%20GitHub-blue?style=flat-square)](https://github.com/sergio11/iot_network_malware_classifier)
[![PyPI](https://img.shields.io/pypi/v/IoTNetworkMalwareClassifier.svg?style=flat-square)](https://pypi.org/project/IoTNetworkMalwareClassifier/)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://github.com/sergio11/iot_network_malware_classifier/blob/main/LICENSE)

## 🌟 Key Features:
- **Precise Classification:** Utilizes Deep Learning models for accurate classification of malware in network traffic data.
- **Efficiency:** Implements optimized algorithms for fast and efficient processing of large volumes of data.
- **Scalability:** Designed to handle large data flows in high-demand IoT environments.
- **Ease of Use:** Offers an intuitive and straightforward interface for seamless integration and use across different applications and platforms.

With the **IoT Network Malware Classifier**, organizations can bolster their cybersecurity posture by identifying and mitigating malware threats in their IoT networks proactively and effectively. 🌐🔒

## Installation 🚀

You can easily install IoTNetworkMalwareClassifier using pip:

```bash
pip install IoTNetworkMalwareClassifier
```

## How the IoT Network Malware Classifier Works 🛠️

The **IoT Network Malware Classifier** preprocesses the data and then builds and trains the neural network model in the following steps:

1. **Data Cleaning** 🧹:
   - Features with a high percentage of missing values are discarded to ensure data integrity and prevent biased model training.
   - High-cardinality features that do not contribute significantly to generalized prediction are also removed to streamline the preprocessing pipeline.

2. **Data Encoding and Scaling** 📊:
   - Categorical features are encoded using techniques like label encoding to convert them into numerical representations suitable for model training.
   - Numerical features are scaled to a common range using techniques like standardization to ensure uniformity and improve convergence during model training.

3. **Neural Network Architecture** 🧠:
   - The classifier utilizes a deep neural network architecture comprising input, hidden, and output layers.
   - **Input Layer**: Accommodates the preprocessed features of the network traffic data.
   - **Hidden Layers**: Multiple dense layers with activation functions (e.g., ReLU) and dropout regularization capture intricate data patterns while preventing overfitting.
   - **Output Layer**: Produces probabilities for different malware classes using a softmax activation function, facilitating multi-class classification.

4. **Dropout Regularization** 🚫:
   - Dropout layers are strategically incorporated after each batch normalization layer in the model architecture.
   - **Dropout**: Randomly deactivates a fraction of neurons during training iterations, preventing overfitting and promoting the generalization capability of the model.

5. **Model Compilation** 📋:
   - The model is compiled with the Adam optimizer, categorical cross-entropy loss function, and evaluation metrics including accuracy and precision.
   - **Benefits**: The incorporation of dropout regularization aids in preventing overfitting and enhancing the generalization performance of the model on unseen data.
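
The cleaning, encoding, and scaling steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual pipeline: the function names, the two thresholds, and the `service` field are hypothetical, and the README does not say which values or libraries the notebook uses.

```python
from statistics import mean, pstdev

# Hypothetical thresholds; the README does not state the values actually used.
MAX_MISSING_RATIO = 0.5
MAX_CARDINALITY = 3


def clean(records, max_missing=MAX_MISSING_RATIO, max_cardinality=MAX_CARDINALITY):
    """Return the feature names that survive the data-cleaning step."""
    features = sorted({key for rec in records for key in rec})
    kept = []
    for name in features:
        present = [rec[name] for rec in records if rec.get(name) is not None]
        missing_ratio = 1 - len(present) / len(records)
        if missing_ratio > max_missing:
            continue  # mostly missing: discard
        is_categorical = any(isinstance(v, str) for v in present)
        if is_categorical and len(set(present)) > max_cardinality:
            continue  # high-cardinality categorical: discard
        kept.append(name)
    return kept


def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def standardize(values):
    """Scale values to zero mean and unit variance (population std)."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero for constant columns
    return [(v - mu) / sigma for v in values]


# Toy records; "service" is an illustrative, mostly-missing feature.
records = [
    {"id.orig_h": "192.168.1.195", "proto": "tcp", "orig_pkts": 10, "service": None},
    {"id.orig_h": "192.168.1.1", "proto": "tcp", "orig_pkts": 400, "service": None},
    {"id.orig_h": "192.168.1.196", "proto": "udp", "orig_pkts": 1, "service": "dns"},
    {"id.orig_h": "192.168.1.197", "proto": "udp", "orig_pkts": 2, "service": None},
]

kept = clean(records)
proto_codes, proto_mapping = label_encode([r["proto"] for r in records])
pkts_scaled = standardize([r["orig_pkts"] for r in records])
print(kept, proto_codes, pkts_scaled)
```

In this toy data the address column is dropped for having too many distinct values and `service` for being mostly missing, which leaves the packet count and the protocol as model inputs.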

By leveraging these preprocessing techniques and a carefully designed neural network architecture, the **IoT Network Malware Classifier** achieves precise and efficient malware classification, contributing to enhanced cybersecurity in IoT environments. 🛡️🔒
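
To make the roles of ReLU, dropout, softmax, and categorical cross-entropy concrete, here is a minimal forward pass written with the standard library only. It is a sketch under assumptions rather than the trained model: the layer sizes, the dropout rate of 0.2, and the random weights are hypothetical, batch normalization is left out for brevity, and the real project builds its network with Keras.

```python
import math
import random

CLASSES = [
    "Benign",
    "Malicious",
    "Malicious C&C",
    "Malicious DDoS",
    "Malicious PartOfAHorizontalPortScan",
]


def dense(vector, weights, biases):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def relu(vector):
    return [max(0.0, x) for x in vector]


def dropout(vector, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vector]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for a one-hot target against predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


rng = random.Random(42)
features = [0.5, -1.2, 0.3]  # stand-in for one preprocessed record

# Hypothetical sizes: 3 inputs -> 4 hidden units -> 5 classes.
hidden_w = [[rng.uniform(-1, 1) for _ in features] for _ in range(4)]
out_w = [[rng.uniform(-1, 1) for _ in range(4)] for _ in CLASSES]

hidden = dropout(relu(dense(features, hidden_w, [0.0] * 4)), rate=0.2, rng=rng)
probs = softmax(dense(hidden, out_w, [0.0] * len(CLASSES)))
loss = categorical_cross_entropy([1, 0, 0, 0, 0], probs)

print(dict(zip(CLASSES, probs)), loss)
```

Dropout is applied only while training; at inference time the layer passes its input through unchanged, which is what the `training=False` branch models.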

## Example to make predictions on IoT-related network data

The code demonstrates how to utilize the `MalwareClassifier` class to make predictions on IoT-related network data. Here's a breakdown of the steps:

- **Importing the classifier**: The `MalwareClassifier` class is imported from the `classifier` module within the `IoTNetworkMalwareClassifier` package.
- **Example data**: Example data is defined as a list of dictionaries. Each dictionary represents a network data record, with various features such as IP addresses, ports, protocols, etc.
- **Performing predictions**: A prediction is made using the `predict()` method of the classifier. This method takes the input data and returns a list of dictionaries containing the predicted labels and scores for each prediction.
- **Printing results**: The prediction results are printed to the console.

```python
from IoTNetworkMalwareClassifier.classifier import MalwareClassifier

# Create an instance of the IoT Network Malware Classifier
classifier = MalwareClassifier()

# Example data: one dictionary per network connection record
data = [
    {
        'id.orig_h': '192.168.1.195',
        'id.orig_p': 37120,
        'id.resp_h': '102.165.48.81',
        'id.resp_p': 17769,
        'proto': 'tcp',
        'conn_state': 'RSTR',
        'history': 'ShAdfDr',
        'orig_pkts': 10,
        'orig_ip_bytes': 1572,
        'resp_pkts': 8,
        'resp_ip_bytes': 540
    },
    {
        'id.orig_h': '192.168.1.1',
        'id.orig_p': 47805,
        'id.resp_h': '192.168.1.195',
        'id.resp_p': 22,
        'proto': 'tcp',
        'conn_state': 'SF',
        'history': 'DdAaFf',
        'orig_pkts': 400,
        'orig_ip_bytes': 26336,
        'resp_pkts': 268,
        'resp_ip_bytes': 36368
    },
    {
        'id.orig_h': '192.168.1.195',
        'id.orig_p': 123,
        'id.resp_h': '82.113.53.40',
        'id.resp_p': 123,
        'proto': 'udp',
        'conn_state': 'S0',
        'history': 'D',
        'orig_pkts': 1,
        'orig_ip_bytes': 76,
        'resp_pkts': 0,
        'resp_ip_bytes': 0
    },
    {
        'id.orig_h': '192.168.1.195',
        'id.orig_p': 37122,
        'id.resp_h': '102.165.48.81',
        'id.resp_p': 17769,
        'proto': 'tcp',
        'conn_state': 'RSTR',
        'history': 'ShAdfDr',
        'orig_pkts': 10,
        'orig_ip_bytes': 1572,
        'resp_pkts': 8,
        'resp_ip_bytes': 540
    },
    {
        'id.orig_h': '192.168.1.195',
        'id.orig_p': 123,
        'id.resp_h': '212.111.30.190',
        'id.resp_p': 123,
        'proto': 'udp',
        'conn_state': 'SF',
        'history': 'Dd',
        'orig_pkts': 2,
        'orig_ip_bytes': 152,
        'resp_pkts': 2,
        'resp_ip_bytes': 152
    }
]

# Perform prediction and print the results
predictions = classifier.predict(data)
print(predictions)
```

The prediction results are presented in the form of a list of dictionaries. Each dictionary contains the predicted label and associated scores for each class. Here's a detailed explanation of the fields in each results dictionary:

- **`result`**: The predicted label for the corresponding data record.
- **`scores`**: A dictionary containing the scores for each class. The keys are the class labels, and the values are the scores associated with those classes.

The scores represent the probability of the data record belonging to each class. The higher the score, the greater the classifier's confidence in the prediction for that class. Each score is returned as a string holding a decimal number with 10 decimal places.

For example, the first data record is predicted to be "Malicious C&C" with a score of 0.7948055267, indicating high confidence in the prediction.

This results format facilitates understanding of the predictions made by the malware classifier and allows for decision-making based on confidence in those predictions. 🧠

```
[
    {
        'result': 'Malicious C&C',
        'scores': {
            'Benign': '0.1896078140',
            'Malicious': '0.0008148123',
            'Malicious C&C': '0.7948055267',
            'Malicious DDoS': '0.0147715705',
            'Malicious PartOfAHorizontalPortScan': '0.0000003306'
        }
    },
    {
        'result': 'Malicious DDoS',
        'scores': {
            'Benign': '0.3036604226',
            'Malicious': '0.1889142990',
            'Malicious C&C': '0.0181397330',
            'Malicious DDoS': '0.4892515838',
            'Malicious PartOfAHorizontalPortScan': '0.0000339999'
        }
    },
    {
        'result': 'Benign',
        'scores': {
            'Benign': '0.9999802113',
            'Malicious': '0.0000042536',
            'Malicious C&C': '0.0000123474',
            'Malicious DDoS': '0.0000030778',
            'Malicious PartOfAHorizontalPortScan': '0.0000000000'
        }
    },
    {
        'result': 'Malicious C&C',
        'scores': {
            'Benign': '0.1895969808',
            'Malicious': '0.0008148027',
            'Malicious C&C': '0.7948169112',
            'Malicious DDoS': '0.0147710536',
            'Malicious PartOfAHorizontalPortScan': '0.0000003306'
        }
    },
    {
        'result': 'Benign',
        'scores': {
            'Benign': '0.9999989271',
            'Malicious': '0.0000000269',
            'Malicious C&C': '0.0000009888',
            'Malicious DDoS': '0.0000001118',
            'Malicious PartOfAHorizontalPortScan': '0.0000000000'
        }
    }
]
```
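
Because the scores arrive as strings, downstream code has to convert them before comparing them. The helper below is a small sketch of one way to do that and to flag low-confidence predictions. The `summarize` function and the `0.6` threshold are hypothetical and not part of the package; the two sample predictions are copied from the output above.

```python
def summarize(prediction, threshold=0.6):
    """Convert score strings to floats and flag whether the top score meets the threshold."""
    scores = {label: float(value) for label, value in prediction["scores"].items()}
    top_label = max(scores, key=scores.get)
    return {
        "label": top_label,
        "confidence": scores[top_label],
        "confident": scores[top_label] >= threshold,
    }


# First two predictions from the example output.
predictions = [
    {
        "result": "Malicious C&C",
        "scores": {
            "Benign": "0.1896078140",
            "Malicious": "0.0008148123",
            "Malicious C&C": "0.7948055267",
            "Malicious DDoS": "0.0147715705",
            "Malicious PartOfAHorizontalPortScan": "0.0000003306",
        },
    },
    {
        "result": "Malicious DDoS",
        "scores": {
            "Benign": "0.3036604226",
            "Malicious": "0.1889142990",
            "Malicious C&C": "0.0181397330",
            "Malicious DDoS": "0.4892515838",
            "Malicious PartOfAHorizontalPortScan": "0.0000339999",
        },
    },
]

summaries = [summarize(p) for p in predictions]
for summary in summaries:
    print(summary)
```

With this threshold the first record is accepted as a confident "Malicious C&C" prediction, while the second, whose top score is below 0.5, would be routed to manual review or a fallback rule.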

## License 📜

This project is licensed under the MIT License - see the [LICENSE](https://github.com/sergio11/iot_network_malware_classifier/blob/main/LICENSE) file for details.

## Acknowledgments:
I extend my sincere gratitude to the **Stratosphere Laboratory** for providing the labeled dataset with malicious and benign IoT network traffic. This dataset has served as a crucial starting point for developing the machine learning model. The dataset includes labels that explain the linkages between flows connected with harmful or possibly malicious activity, providing invaluable insights for network malware researchers and analysts. Special thanks to Agustin Parmisano, Sebastian Garcia, and Maria Jose Erquiaga for their contributions to the dataset, made available on January 22nd, and for their ongoing efforts to advance cybersecurity research.

The dataset used in this project is available [here](https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis) and can also be found on the [Stratosphere Laboratory website](https://www.stratosphereips.org/datasets-iot23).

> Sebastian Garcia, Agustin Parmisano, & Maria Jose Erquiaga. (2020). IoT-23: A labeled dataset with malicious and benign IoT network traffic (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4743746

## Contribution

Contributions to IoTNetworkMalwareClassifier are highly encouraged! If you're interested in adding new features, resolving bugs, or enhancing the project's functionality, please feel free to submit pull requests.

## Get in Touch 📬

IoTNetworkMalwareClassifier is developed and maintained by **Sergio Sánchez Sánchez** (Dream Software). Special thanks to the open-source community and the contributors who have made this project possible. If you have any questions, feedback, or suggestions, feel free to reach out at [[email protected]](mailto:[email protected]).

## Happy coding! 🚀

## Please Share & Star the repository to keep me motivated.

## License ⚖️

This project is licensed under the MIT License, an open-source software license that allows developers to freely use, copy, modify, and distribute the software. 🛠️ This includes use in both personal and commercial projects, with the only requirement being that the original copyright notice is retained. 📄

Please note the following limitations:

- The software is provided "as is", without any warranties, express or implied. 🚫🛡️
- If you distribute the software, whether in original or modified form, you must include the original copyright notice and license. 📑
- The license allows for commercial use, but you cannot claim ownership over the software itself. 🏷️

The goal of this license is to maximize freedom for developers while maintaining recognition for the original creators.

```
MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```
