{"id":23892208,"url":"https://github.com/wapiti08/loganalyzer","last_synced_at":"2025-04-10T12:08:33.116Z","repository":{"id":49815719,"uuid":"209313171","full_name":"Wapiti08/LogAnalyzer","owner":"Wapiti08","description":"Ensemble framework of some log based anomaly detection work.","archived":false,"fork":false,"pushed_at":"2024-03-25T10:06:26.000Z","size":9937,"stargazers_count":35,"open_issues_count":0,"forks_count":16,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-05-15T10:04:44.118Z","etag":null,"topics":["lstm-neural-networks","pandas","python3","shell"],"latest_commit_sha":null,"homepage":"","language":"Roff","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Wapiti08.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"custom":["https://github.com/sponsors/Wapiti08?","https://paypal.me/wapiti08?country.x=GB\u0026locale.x=en_GB"]}},"created_at":"2019-09-18T13:17:40.000Z","updated_at":"2024-04-30T03:21:49.000Z","dependencies_parsed_at":"2024-03-25T11:27:42.913Z","dependency_job_id":"75e3b76b-01f8-4f38-9264-85dafb1cc291","html_url":"https://github.com/Wapiti08/LogAnalyzer","commit_stats":null,"previous_names":["wapiti08/loganalyzer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wapiti08%2FLogAnalyzer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wapiti08%2FLogAnalyzer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wapiti08%2FLogAnalyzer/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/Wapiti08%2FLogAnalyzer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Wapiti08","download_url":"https://codeload.github.com/Wapiti08/LogAnalyzer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232470841,"owners_count":18528594,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lstm-neural-networks","pandas","python3","shell"],"created_at":"2025-01-04T13:39:38.878Z","updated_at":"2025-01-04T13:39:39.451Z","avatar_url":"https://github.com/Wapiti08.png","language":"Roff","funding_links":["https://github.com/sponsors/Wapiti08?","https://paypal.me/wapiti08?country.x=GB\u0026locale.x=en_GB"],"categories":[],"sub_categories":[],"readme":"# LogAnalyzer\n\n![Author](https://img.shields.io/badge/Author-Wapiti08-blue.svg) \n![Python](https://img.shields.io/badge/Python-3.8-brightgreen.svg) \n![Classification](https://img.shields.io/badge/Multi-Class%20Classification-redgreen.svg)\n![LSTM](https://img.shields.io/badge/RNN-LSTM-redgreen.svg)\n![Analysis](https://img.shields.io/badge/Analysis-Anomaly%20logs-redgreen.svg)\n![License](https://img.shields.io/badge/license-MIT3.0-green.svg)\n[![DOI](https://zenodo.org/badge/209313171.svg)](https://doi.org/10.5281/zenodo.13881252)\n\n---\n\n- An ensemble framework combining several log-based anomaly detection approaches.\n\n- The basic idea is to apply feature engineering to raw logs and, after a series of processing steps, report potentially malicious logs.\n\n## Ongoing:\n- dvc experiments\n- dvc dags\n\n## 
Feature:\n\n- convert the logs into structured pandas DataFrames\n- extract the log keys from raw logs\n- analyse the log key execution path\n- analyse the parameters in each log key\n- analyse the time series data generated from window size and time interval by PCA\n- online learning from feedback\n\nFor the dataset, I have provided some examples, and you can put your own data into that folder.\n\n## pre-preparation:\n\n```\n# to match the library versions, build and run the project in a virtual environment\nvirtualenv env\n# activate the environment so the packages are installed into it\nsource env/bin/activate\npip3 install -r requirement.txt\n```\n\n## Instructions (In Deeplog_demo folder):\n\n###  1. Source data:\nWhen the data is in csv format, we need to translate it into txt files and split them into batches.\n```\npython3 csv_txt_trans.py \n```\nYou will be prompted for the source location and the output location.\n\n###  2. Data analysis:\nWe use the logparser tool to transform the source txt log files into structured csv files under a folder named after the start and end time. (Lenma_demo.py is located under logparser/logparser/demo.)\n\n**(use Lenma_demo.py with python2)** ---\u003e The python3 version is not provided here.\nYou need to set the locations first:\n```\ninput_dir = '../../Dataset/Linux/Clear/'   # set the location to yours\noutput_dir = '../../Dataset/Linux/Clear_Separate_Structured_Logs/'    # set the location to yours\n```\nThen you can execute the demo file with python 2.x:\n```\npython Lenma_demo.py \n```\n\nIn this stage, we calculate the EventTemplate for every log. \n\n###  3. Variable Selection:\nlog_value_vector.py generates the csv file that is later used for anomaly detection. \n\n![Parameter_vector.png](https://github.com/Wapiti08/DeepLog/blob/master/Deeplog_demo/Pic/Dataframe.png)\n\n\n\n**(This step is already integrated into the models in the demo.)**\n\n###  4. 
Model detection:\nBasically, DeepLog has two modules.\n\n- Before running the modules, we first check for obviously malicious logs and report them.\n\n- After that, we run execution path anomaly detection with Execution_Path_Anomaly.py\n\t\n- Finally, we run parameter value anomaly detection with Parameter_value_performance_anomaly.py\t\n\n- In addition, loglizer provides an ML model based on PCA.\n\n```\n# go to the folder of model\npython3 Execution_Path_Anomaly.py\n# go to the folder of model\npython3 Parameter_Value_Vector.py \n```\n## Statement:\n- The model works offline; online real-time detection is not available.\n- [loglizer](https://github.com/logpai/loglizer) and [logparser](https://github.com/logpai/logparser) are open source tools; all rights remain with their authors.\n- I extended both tools in this project, so note the differences from the original versions.\n\n## References:\n*1. Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis*\n\n*2. DeepLog: Anomaly Detection and Diagnosis from System Logs*\n\n*3. Incremental Construction of LSTM Recurrent Neural Network*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwapiti08%2Floganalyzer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwapiti08%2Floganalyzer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwapiti08%2Floganalyzer/lists"}