https://github.com/iitis/dnsclass

Reference implementation of the DNS-Class algorithm in Python

Host: GitHub
URL: https://github.com/iitis/dnsclass
Owner: iitis
License: gpl-3.0
Created: 2014-01-24T12:32:55.000Z (over 12 years ago)
Default Branch: master
Last Pushed: 2014-01-24T12:34:33.000Z (over 12 years ago)
Last Synced: 2025-04-13T11:59:09.363Z (about 1 year ago)
Language: Python
Size: 484 KB
Stars: 5
Watchers: 5
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.markdown
- License: LICENSE

About
=====

**dnsclass**: open source, reference implementation of the DNS-Class algorithm in Python.

The classifier takes as input ARFF files generated with [the Flowcalc
program](http://mutrics.iitis.pl/flowcalc) (using the `dns` and `lpi` plugins). **dnsclass**
classifies the given network traffic flows based on their DNS context and outputs a classification
report.
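As a rough illustration of the input format, the sketch below reads a small ARFF document using only the standard library. It is not part of dnsclass: the attribute names (`fc_id`, `dns_name`, `lpi_proto`) and the sample rows are hypothetical, and real Flowcalc output may use different columns and quoting.

```python
import csv
import io


def read_arff(text):
    """Parse a simple ARFF document into (attribute_names, rows).

    Handles only unquoted attribute names and single-quoted string values,
    which is an assumption about the input, not a full ARFF parser.
    """
    names, rows, in_data = [], [], False
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = next(csv.reader([line], quotechar="'", skipinitialspace=True))
            rows.append(dict(zip(names, values)))
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # "@attribute <name> <type>"
        elif line.lower().startswith("@data"):
            in_data = True
    return names, rows


# Hypothetical sample; attribute names are illustrative only.
SAMPLE = """\
@relation flows
@attribute fc_id numeric
@attribute dns_name string
@attribute lpi_proto string
@data
1,'www.example.com',HTTP
2,'mail.example.org',SMTP
"""

names, rows = read_arff(SAMPLE)
for row in rows:
    print(row["lpi_proto"], row["dns_name"])
```

Each row pairs a flow's DNS name with a protocol label, which is the kind of information a DNS-based classifier would train on.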

The classification process is divided into several steps, each in a script file named `stepN_*`, e.g.
`step6_predict.py`. There are also scripts named `cvN_*` that support cross-validation.
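Because the step number is encoded in the file name, the run order can be recovered from a directory listing. The helper below is a hypothetical convenience, not part of the repository; it only sorts names and does not run anything, since the arguments each script expects are not documented here.

```python
import re

# Matches the "stepN_" prefix and captures N.
STEP_RE = re.compile(r"^step(\d+)_")


def order_steps(filenames):
    """Return the stepN_* script names sorted by their step number N."""
    steps = []
    for name in filenames:
        match = STEP_RE.match(name)
        if match:
            steps.append((int(match.group(1)), name))
    return [name for _, name in sorted(steps)]


# An unordered listing using the script names from this README.
listing = [
    "step6_predict.py", "step1_reformat.sh", "step4_train.sh",
    "step2_divide.sh", "step7_analyze.py", "step3_convert_train.py",
    "step5_convert_test.py", "README.markdown",
]
ordered = order_steps(listing)
for name in ordered:
    print(name)
```

In practice the listing could come from `os.listdir(".")` in a checkout of the repository.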

For scientific works, please cite the following paper:  

> Foremski P., Callegari C., Pagano M., *"DNS-Class: Immediate classification of IP flows using DNS"*

**Author**: Paweł Foremski   

**Copyright (C)** 2012-2013 [IITiS PAN Gliwice](http://www.iitis.pl/)  

**Licensed** under GNU GPL v3

This software package uses [libshorttext](http://www.csie.ntu.edu.tw/~cjlin/libshorttext/), which is
included in the dnsclass repository, but may be licensed differently.

Classifier steps
================

The purpose of the steps:

* `step1_reformat.sh`: reformat input ARFF files into the target text input format; skip all flows
   but those of selected protocols; some corrections may be required to match your ARFF files

* `step2_divide.sh`: divide the dataset into training and testing (may be skipped)

* `step3_convert_train.py`: convert the training dataset into the libsvm format (Vector Space Model (VSM))

* `step4_train.sh`: train the model

* `step5_convert_test.py`: as step 3, but for the testing dataset

* `step6_predict.py`: classify the testing dataset

* `step7_analyze.py`: show the confusion matrix and errors made in step 6
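To make the conversion in steps 3 and 5 more concrete, here is a minimal sketch of turning labelled domain names into libsvm-format lines with a bag-of-words Vector Space Model. It is an illustration under assumptions, not the repository's code: the tokenization rule, the `Vocabulary` class, and the sample data are hypothetical, and the actual scripts rely on libshorttext, whose preprocessing may differ.

```python
import re
from collections import Counter


def tokenize(domain):
    """Split a domain name into lowercase tokens (an assumed tokenization)."""
    return [t for t in re.split(r"[^a-z0-9]+", domain.lower()) if t]


class Vocabulary:
    """Assigns a stable 1-based index to each token or label it sees."""

    def __init__(self):
        self.index = {}

    def id_for(self, item, grow=True):
        if item not in self.index:
            if not grow:
                return None  # unknown item while the vocabulary is frozen
            self.index[item] = len(self.index) + 1
        return self.index[item]


def to_libsvm(label_id, tokens, vocab, grow=True):
    """Format one sample as '<label> <index>:<count> ...', indices ascending."""
    counts = Counter()
    for token in tokens:
        idx = vocab.id_for(token, grow)
        if idx is not None:
            counts[idx] += 1
    features = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{label_id} {features}".rstrip()


# Hypothetical training data: (protocol label, DNS name).
training = [("HTTP", "www.example.com"), ("SMTP", "mail.example.com")]
labels, features = Vocabulary(), Vocabulary()

# Training set: the vocabularies grow as new tokens and labels appear.
train_lines = [
    to_libsvm(labels.id_for(proto), tokenize(name), features)
    for proto, name in training
]

# Testing set: reuse the frozen vocabulary; unseen tokens are dropped.
test_line = to_libsvm(
    labels.id_for("HTTP"), tokenize("www.unknown.example.com"), features, grow=False
)

print("\n".join(train_lines))
print(test_line)
```

The testing data reuses the vocabulary built from the training data, so feature indices mean the same thing in both files. That is one plausible reason for having separate train and test conversion steps.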

Project information
================

This project was carried out at [The Institute of Theoretical and Applied Informatics of the Polish
Academy of Sciences](http://www.iitis.pl/), under grant no. 2011/01/N/ST6/07202 of the [Polish
National Science Centre](http://www.ncn.gov.pl/).

Project website: http://mutrics.iitis.pl/
