An open API service indexing awesome lists of open source software.

https://github.com/snehawk20/log_anomaly_detection

Detecting anomalous log entries
https://github.com/snehawk20/log_anomaly_detection

logistic-regression tfidf-vectorizer

Last synced: 9 months ago
JSON representation

Detecting anomalous log entries

Awesome Lists containing this project

README

          

# Log Anomaly Detection

This solution was submitted to round 1 of Convolve, an ML/AI hackathon jointly organized by 6 IITs.

In computing, logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations. These events may occur in the operating system or in other software. A message or log entry is recorded for each such event. Log Anomaly Detection is simply detecting anomalies in logs deposited by softwares using Machine Learning.

Anomaly is anything that is different from what is usually perceived as normal - an exception. In software engineering, anomaly can be defined as occurrence of rare or unexpected events that does not fit into the normal patterns and hence a suspicious one.

The train and test set contain logs generated by software. Our task is to train a ML model on the given training data that can predict whether a given log in testing data is an anomaly or normal.

Please find the Kaggle page for the above contest [here](https://www.kaggle.com/competitions/convolve-epoch1/data).

``log_detection.ipynb`` - final submission notebook. runs in `Python3`
``train.json`` - training data
``test.json`` - testing data
Train set is too large to be pushed. It can be found on Kaggle.