{"id":16779224,"url":"https://github.com/0xk1h0/weblog-based-anormaly-detection-using-linux-cli","last_synced_at":"2025-03-16T19:44:09.363Z","repository":{"id":109690571,"uuid":"422529483","full_name":"0xk1h0/WebLog-Based-Anormaly-Detection-Using-Linux-CLI","owner":"0xk1h0","description":"KISA","archived":false,"fork":false,"pushed_at":"2022-08-17T07:31:41.000Z","size":52,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-23T06:26:21.106Z","etag":null,"topics":["awk","command-line-tool","linux-shell","shell","ubuntu"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0xk1h0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-29T10:14:40.000Z","updated_at":"2023-06-05T10:40:20.000Z","dependencies_parsed_at":"2023-03-13T14:05:08.951Z","dependency_job_id":null,"html_url":"https://github.com/0xk1h0/WebLog-Based-Anormaly-Detection-Using-Linux-CLI","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xk1h0%2FWebLog-Based-Anormaly-Detection-Using-Linux-CLI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xk1h0%2FWebLog-Based-Anormaly-Detection-Using-Linux-CLI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xk1h0%2FWebLog-Based-Anormaly-Detection-Using-Linux-CLI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/0xk1h0%2FWebLog-Based-Anormaly-Detection-Using-Linux-CLI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0xk1h0","download_url":"https://codeload.github.com/0xk1h0/WebLog-Based-Anormaly-Detection-Using-Linux-CLI/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243923238,"owners_count":20369505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["awk","command-line-tool","linux-shell","shell","ubuntu"],"created_at":"2024-10-13T07:29:34.714Z","updated_at":"2025-03-16T19:44:09.357Z","avatar_url":"https://github.com/0xk1h0.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# WebLog-Based-Anormaly-Detection Using Linux CLI\n\nWeblog analysis using the Linux (Ubuntu 20.04) CLI\n\n### Data\n* Datetime / SIP (source IP) / Method / Payload / Version / ResponseCode / ResponseByte\n\n![image](https://user-images.githubusercontent.com/47383452/141668694-5991c6e0-7566-4828-a291-abfcffff3e0b.png)\n* About 154 MB, containing 60 million lines of session data\n\n### 1. 
Connection based Analysis\n\n![image](https://user-images.githubusercontent.com/47383452/141672002-7acd0782-50b1-4da8-b6be-506b44f55c1d.png)\n\n` cat srv1_access_daily.tsv | feedgnuplot --domain --timefmt \"%Y-%m-%d\" --with \"boxes lt -1\" --legend 0 \"daily HTTP Session\"`\n\n##### SIP connection\n![image](https://user-images.githubusercontent.com/47383452/141672273-7fddfc6b-9c45-4e5f-9ef0-437679151439.png)\n\n\n![image](https://user-images.githubusercontent.com/47383452/141672287-d6a13606-2c26-44a2-9ed2-a974a88a8d07.png)\n\n![image](https://user-images.githubusercontent.com/47383452/141672674-3cb289d2-6fb3-4851-81da-1441f5cfad89.png)\n\n![image](https://user-images.githubusercontent.com/47383452/141672741-1270a547-21c6-4872-bc84-44a46381944b.png)\n* Daily SESSION, SIPCNT, SESS/SIPCNT visualization\n\n![image](https://user-images.githubusercontent.com/47383452/141672840-f86d0f38-ac9a-4db7-b752-2ab8238d5ca0.png)\n* Boxplot\n\n* SIP COUNT \u003e 100, top SESSION / SIPCNT\n![image](https://user-images.githubusercontent.com/47383452/142240852-a9db5f73-7fab-424a-b7e0-b29f58967bfe.png)\n\n` cat sess_ovr_sipcnt.tsv | awk '$3 \u003e 100{print $0}' | awk '{print $1 \"\\t\" $4}' | feedgnuplot --domain --timefmt '%Y-%m-%d' --lines --points --legend 0 \"SESS/SIPCNT\"`\n ![image](https://user-images.githubusercontent.com/47383452/142242977-e84a1b2b-18b1-4aba-9747-dd95674d89ec.png)\n* SESS/SIPCNT above 400\n* Write the dates with SESS/SIPCNT \u003e 400 to ddos_event.tsv, then extract each day's records\n ```\n for d in $(cat ddos_event.tsv | awk '{print $1}')\n do\n zcat ../srv1_accesslog.gz | awk '$1==date{print $0}' date=$d \u003e $d\"_ddos_evt.tsv\"\n done\n ```\n![image](https://user-images.githubusercontent.com/47383452/142251515-550f6cdb-9d13-4a1c-97c4-3462a1057988.png)\n\n- ` zcat 2017-07-03_ddos_evt.tsv.gz | awk '{print $2}' | sort | uniq -c | awk '{print $2 \"\\t\" $1}' | feedgnuplot --domain --timefmt \"%H:%M:%S\" --with 'boxes lt -1' --legend 0 \"2017-07-03 sps\"`\n 
![image](https://user-images.githubusercontent.com/47383452/142252794-944c0a6c-fa02-4091-9331-896017b5ea25.png)\n- ` zcat 2017-07-03_ddos_evt.tsv.gz | awk '{print $2 \"\\t\" $3}' | sort -u | awk '{print $1}' | sort | uniq -c | awk '{print $2 \"\\t\" $1}' \u003e 2017-07-03.sipsec.tsv`\n- ` cat 2017-07-03.sipsec.tsv | feedgnuplot --domain --timefmt \"%H:%M:%S\" --with 'boxes lt 3' --legend 0 \"SIP per Second\"`\n  - SIP PER SECOND\n ![image](https://user-images.githubusercontent.com/47383452/142253954-1dcb617d-053d-4b0c-a6b8-bdf2834e407a.png)\n \n\n- ` cat spm.tsv | feedgnuplot --domain --timefmt \"%H:%M\" --lines --points --legend 0 \"SESSION PER MINUTE\"`\n  - SESSION PER MINUTE\n ![image](https://user-images.githubusercontent.com/47383452/142255708-0a1694c6-43e3-4283-9cd9-45a624e62672.png)\n \n ```\n   for m in $(cat min)\n   do\n   TS=$m\n   SPM=$(cat spm.tsv | awk '$1==min{print $2}' min=$m)\n   SIP=$(cat sipmain.tsv | awk '$1==min{print $2}' min=$m)\n   echo $TS $SPM $SIP\n   done | awk '{print $0 \"\\t\" ($2+1)/($3+1)}' \u003e ddos_min.tsv\n```\n- ` cat ddos_min.tsv  | feedgnuplot --domain --timefmt \"%H:%M\" --lines --points --y2 2 --legend 0 \"Session per Minute\" --legend 1 \"Nr. 
of SIP per Minute\" --legend 2 \"SPM / SIPMIN\"`\n ![image](https://user-images.githubusercontent.com/47383452/142257473-7b9c7ae5-e6d9-421b-b345-3bca484add68.png)\n  - 2017-07-03 15:00 ~ 19:00 / 20:30 ~ 21:00\n \n* Revisiting attack IP\n   ```\n   cat ts_min_sip | awk '{print $2}' | sort | uniq -c | while read line\n   do\n   IP=$(echo $line | awk '{print $2}')\n   FN=$(echo $line | awk '{print $1 \".revisit\"}')\n   echo $IP \u003e\u003e $FN\n   done\n   ```\n   ![image](https://user-images.githubusercontent.com/47383452/142261157-7d8275bc-34d3-48f1-8c67-701e82368c6f.png)\n   \n   ```\n   zcat 2017-07-03_ddos_evt.tsv.gz | awk '{print $3}' | sort | uniq -c | sort -rn | head\n   200465 IP0041058\n   118835 IP0040986\n    70076 IP0001113\n    68682 IP0040922\n    29853 IP0084544\n    22450 IP1062364\n    18371 IP0000782\n    17692 IP0001719\n    11529 IP0173656\n     7691 IP0001227\n   ```\n   * IP0041058\n   ![image](https://user-images.githubusercontent.com/47383452/142262931-3aeeec52-d08d-43c8-83f6-dda7d9f30473.png)\n   * IP0040986\n   ![image](https://user-images.githubusercontent.com/47383452/142263134-619abc5e-451f-4ca1-b7f8-de6b8b88c802.png)\n   * IP0001113\n   ![image](https://user-images.githubusercontent.com/47383452/142263268-c65bc249-48f0-42de-aeec-1bba10263fde.png)\n   * IP0040922\n   ![image](https://user-images.githubusercontent.com/47383452/142263473-11b65b5c-75b1-4a99-83c0-4fab9f6df664.png)\n   \n   #### DDoS IP discovered.\n\n### 2. 
Response Code Based Analysis\n\n##### Response code 4XX\n\n```\nzcat srv1_accesslog.gz | awk '$7~/^[12345]/{print $1 \"\\t\" $7}'|sort | uniq -c | awk '{print $2 \"\\t\" $3 \"\\t\" $1}' | \nawk '{if ($2 \u003e= 400 \u0026\u0026 $2 \u003c 500) print $0}' \u003e 400_rcode.tsv\n```\n```\ncat 400_rcode.tsv | sort -rnk 3 | head | awk '{print $3}'\n13851\n13574\n5228\n3583\n3166\n3109\n2976\n2760\n2698\n2665\n```\n* Response code 4XX record visualization\n ```\n cat 400_rcode.tsv | sort -rnk 3 | awk '{print $3}' | feedgnuplot --histogram 0 --ymax 5\n ```\n![image](https://user-images.githubusercontent.com/47383452/142265566-814b9174-acda-48f3-882b-74e1ec627a38.png)\n```\ncat 400_rcode.tsv | sort -rnk 3 | awk '{print $1 \"\\t\" $3}' | feedgnuplot --domain --timefmt '%Y-%m-%d' --points\n```\n![image](https://user-images.githubusercontent.com/47383452/142265627-e720ffc2-ca86-4874-acda-871dfa11d92e.png)\n\n* Response Code 4XX SIP\n```\nzcat srv1_accesslog.gz | awk '$7~/^[12345]/{print $3 \"\\t\" $7}'|sort | uniq -c | awk '{print $2 \"\\t\" $3 \"\\t\" $1}' | awk '{if ($2 \u003e=400 \u0026\u0026 $2 \u003c 500) print $0}' \u003e sip_rcode.tsv\n```\n![image](https://user-images.githubusercontent.com/47383452/142265796-82e872a3-abcf-42e9-9958-cd9834112a23.png)\n\n```\ncat sip_rcode.tsv | sort -rnk 3 | head\nIP1093735\t404\t17010\nIP0053005\t404\t11867\nIP0008180\t404\t11730\nIP0013767\t404\t11297\nIP0099074\t404\t10495\nIP0002194\t404\t10088\nIP1056836\t404\t10049\nIP1086971\t404\t9032\nIP1087023\t404\t7962\nIP1056822\t404\t6520\n```\n\n* Per-SIP logs for the top 100 4XX entries\n```\ncat sip_rcode.tsv | sort -rnk 3 | head -100 \u003e top_sip.ip\n```\n```\n# top_sip.ip has three columns (SIP, code, count); loop over the unique SIPs only\nfor ip in $(awk '{print $1}' top_sip.ip | sort -u)\ndo\nzcat srv1_accesslog.gz | awk '$3==sip{print $0}' sip=$ip \u003e $ip\".log\"\ndone\n```\n![image](https://user-images.githubusercontent.com/47383452/142265986-cdba40db-b08b-4920-a443-3adc10a6f699.png)\n```\ncat IP* | awk '$7!=\"\"{print $1 \"\\t\" $7}' | awk '{if (($2 \u003e= 200 \u0026\u0026 $2 \u003c 
300) || ($2 \u003e= 400 \u0026\u0026 $2 \u003c 500)) print $0}' \u003e 2XX_4XX.log\ncat 2XX_4XX.log |awk '{print $1 \"\\t\" (int($2/200)-int($2/400)) \"\\t\" int($2/400)}' | awk '$1==prv{r2+=$2;r4+=$3;next}{print prv \"\\t\" r2 \"\\t\" r4; prv=$1;r2=$2; r4=$3}' |\nfeedgnuplot --domain --points  --timefmt \"%Y-%m-%d\" --title \"2xx AND 4xx Response Frequency\" --legend 0 \"2XX\" --legend 1 \"4XX\"\n```\n![image](https://user-images.githubusercontent.com/47383452/142266167-57ee1443-425b-4df1-9526-67ca7c9b3049.png)\n\n```\ncat IP* | awk '$7!=\"\"{print $1 \"\\t\" $3 \"\\t\" $7}' |sort| awk '{print $1 \"_\" $2 \"_\" int($3/100)*100}' | uniq -c | sort -rn | head -20\n```\n![image](https://user-images.githubusercontent.com/47383452/142266234-73b5651e-0a3c-4955-adfd-db186669e42c.png)\n\n##### IP0008180 and IP0053005 400 Response code\n\n* IP0053005\n```\ncat IP0053005.log | awk '$7!=\"\"{print $1 \"\\t\" $7}' | sort | awk '{print $1 \"\\t\" int(($2)/100)*100}' |\nawk '{print $1 \"\\t\" int($2/200)-int($2/400) \"\\t\" int($2/300) \"\\t\" int($2/400) \"\\t\" int($2/500)}'|\nawk '$1==prv{r2+=$2;r3+=$3;r4+=$4;r5+=$5;next}{print prv \"\\t\" r2 \"\\t\" r3 \"\\t\" r4 \"\\t\" r5; prv=$1;r2=$2;r3=$3; r4=$4;r5=$5}'\n```\n\nEach column is cumulative: it counts that day's responses with a status code at or above the labeled value, so the 200 column also includes 3XX, 4XX and 5XX responses.\n\n```\n\t\t\t200\t300\t400\t500\n2017-07-04\t58\t3\t3\t0\n2017-08-01\t13664\t11944\t11837\t4\n2017-09-21\t63\t4\t4\t0\n2017-10-10\t44\t2\t2\t0\n2017-10-11\t318\t21\t20\t1\n2017-10-13\t3\t2\t2\t0\n2017-10-20\t36\t8\t3\t0\n2018-01-03\t80\t2\t1\t0\n```\n* IP0008180\n```\ncat IP0008180.log | awk '$7!=\"\"{print $1 \"\\t\" $7}' | sort | awk '{print $1 \"\\t\" int(($2)/100)*100}' |\nawk '{print $1 \"\\t\" int($2/200)-int($2/400) \"\\t\" int($2/300) \"\\t\" int($2/400) \"\\t\" int($2/500)}'|\nawk '$1==prv{r2+=$2;r3+=$3;r4+=$4;r5+=$5;next}{print prv \"\\t\" r2 \"\\t\" r3 \"\\t\" r4 \"\\t\" r5; prv=$1;r2=$2;r3=$3; 
r4=$4;r5=$5}'\n```\n```\n\t\t\t200\t300\t400\t500\t\n2017-03-13\t270\t74\t28\t0\n2017-03-14\t12688\t11837\t11680\t6\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xk1h0%2Fweblog-based-anormaly-detection-using-linux-cli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0xk1h0%2Fweblog-based-anormaly-detection-using-linux-cli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xk1h0%2Fweblog-based-anormaly-detection-using-linux-cli/lists"}