Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tiarmdhnt/detect-botnets-in-network-traffic
Application of Deep Learning to Detect Botnets in Network Traffic Using CTU-13 Dataset
- Host: GitHub
- URL: https://github.com/tiarmdhnt/detect-botnets-in-network-traffic
- Owner: tiarmdhnt
- Created: 2024-12-25T02:28:42.000Z (22 days ago)
- Default Branch: main
- Last Pushed: 2024-12-25T02:47:10.000Z (22 days ago)
- Last Synced: 2024-12-25T03:20:20.043Z (22 days ago)
- Topics: botnet-detection, deep-learning, machine-learning, matplotlib, network-security, neural-networks, pandas, python, pytorch, scikit-learn, seaborn, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Detect-Botnets-in-Network-Traffic
Application of Deep Learning to Detect Botnets in Network Traffic Using CTU-13 Dataset

This repository contains code and resources for building a machine learning-based botnet detection system using network traffic data. The project leverages Python libraries for data preprocessing, visualization, and model building.
# **Features**
- Data Preprocessing: Handle missing values, feature scaling, and label encoding.
- Data Visualization: Analyze label distributions and other key features using Seaborn and Matplotlib.
- Deep Learning Model: A PyTorch implementation of a deep learning model for botnet detection.

# **Dataset**
The dataset used in this project is publicly available and can be downloaded directly:
- Source: CTU-Malware-Capture-Botnet-42
- Download Command:
```bash
# In a Jupyter or Colab notebook cell, prefix the command with "!"
wget https://mcfp.felk.cvut.cz/publicDatasets/CTU-Malware-Capture-Botnet-42/detailed-bidirectional-flow-labels/capture20110810.binetflow
```
# **Installation**
To use this project, ensure you have the following dependencies installed:
```bash
pip install numpy pandas scikit-learn tensorflow keras matplotlib seaborn torch
```
# **Project Workflow**
**1. Data Preprocessing**
- Load the dataset using Pandas.
- Handle missing values.
- Normalize numerical features using StandardScaler.
- Encode categorical labels using LabelEncoder.

**2. Data Visualization**
- Plot label distributions using Seaborn and Matplotlib to understand the data.

**3. Machine Learning Model**
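- Model Architecture:
  - Input layer for network traffic features.
  - Two hidden layers with ReLU activation, with dropout after the first.
  - Output layer producing one logit per class for multi-class classification; softmax is applied inside the loss function rather than as a model layer.
- Framework: PyTorch.

**4. Training and Evaluation**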
- Split dataset into training and testing sets.
- Train the model using the Adam optimizer and CrossEntropyLoss.
- Evaluate the model using metrics such as accuracy, precision, recall, and F1-score.

# **Code Snippets**
**Data Loading and Preprocessing**
```python
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Load the dataset
file_path = "/content/capture20110810.binetflow"
data = pd.read_csv(file_path, delimiter=',')
data = data.dropna()

# Feature scaling
numeric_cols = ['Dur', 'TotPkts', 'TotBytes', 'SrcBytes']
scaler = StandardScaler()
data[numeric_cols] = scaler.fit_transform(data[numeric_cols])

# Label encoding
le = LabelEncoder()
data['Label'] = le.fit_transform(data['Label'])
```
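
To make the two transformations above concrete, here is a minimal plain-Python sketch of what standardization and label encoding compute. The helper names `standardize` and `encode_labels`, the sample durations, and the short label strings are hypothetical and used only for illustration; the project itself relies on scikit-learn's `StandardScaler` and `LabelEncoder`.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (population standard deviation)."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def encode_labels(labels):
    """Map each distinct label to an integer, following the sorted order of the labels."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


# Hypothetical sample values, not taken from the dataset
durations = [1.0, 2.0, 3.0]
scaled = standardize(durations)

sample_labels = ["botnet", "background", "normal", "botnet"]
codes, classes = encode_labels(sample_labels)
```

The `classes` list plays the role of `le.classes_` in the snippet above: it is what lets an integer prediction be mapped back to its original label string.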
**Model Definition**
```python
import torch
import torch.nn as nn


class BotnetDetectionModel(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(BotnetDetectionModel, self).__init__()
        self.fc1 = nn.Linear(input_dim, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, num_classes)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)  # raw logits, one per class
        return x
```
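
The `forward` method returns raw scores (logits) rather than probabilities. PyTorch's `nn.CrossEntropyLoss` expects exactly that, because it applies log-softmax internally before taking the negative log-likelihood. As a rough illustration of that computation, the plain-Python sketch below uses hypothetical helper names (`softmax`, `cross_entropy`) and made-up logits; it is not the PyTorch implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class, computed directly from logits."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical scores for three classes, with class 0 as the true label
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss = cross_entropy(logits, target=0)
```

At prediction time, the index of the largest logit is the same as the index of the largest softmax probability, so an explicit softmax is only needed when probabilities are to be reported.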
**Training Loop**
```python
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from torch.utils.data import DataLoader, TensorDataset

# Data preparation: keep numeric feature columns only, because non-numeric
# columns (for example addresses and protocol names) cannot become float tensors
X = data.drop(columns=['Label']).select_dtypes(include='number')
y = data['Label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert to tensors
X_train_tensor = torch.tensor(X_train.values, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.long)

train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Model initialization
model = BotnetDetectionModel(input_dim=X_train.shape[1], num_classes=len(y.unique()))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0

    for batch_X, batch_y in train_loader:
        optimizer.zero_grad()
        outputs = model(batch_X)
        loss = criterion(outputs, batch_y)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Report the mean batch loss for this epoch
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader):.4f}")
```

# **Metrics**
The following metrics are used to evaluate the model during training and testing:
- Accuracy
- Precision
- Recall
- F1-Score

# **Technology Used**
- Python
- PyTorch
- Scikit-learn
- Pandas
- Matplotlib
- Seaborn
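
For reference, the metrics listed under **Metrics** can be computed from true and predicted labels as in the sketch below. It is plain Python and covers only the binary case, treating one class as "positive". The function name `binary_metrics` and the sample labels are hypothetical. In the project itself, the scikit-learn functions imported in the training snippet (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`) would serve the same purpose, including for multi-class labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = botnet flow, 0 = any other flow
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

On imbalanced traffic captures, accuracy alone can look high even when few botnet flows are caught, which is why precision, recall, and F1 are reported alongside it.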