https://github.com/dongju93/sysmon-to-rocksdb

Query Elasticsearch to retrieve data, save it to CSV files, store it in RocksDB, and then use GraphQL to fetch the data.

csv elasticsearch graphql javascript nextjs postgresql rocksdb rust typescript


Host: GitHub
URL: https://github.com/dongju93/sysmon-to-rocksdb
Owner: dongju93
Created: 2023-08-07T00:37:48.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-03-09T13:07:02.000Z (over 2 years ago)
Last Synced: 2025-02-25T00:33:40.004Z (over 1 year ago)
Topics: csv, elasticsearch, graphql, javascript, nextjs, postgresql, rocksdb, rust, typescript
Language: Rust
Homepage:
Size: 33.9 MB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

## Ultimate goal diagram. Ver.1

```mermaid

flowchart TB

	subgraph Windows11

	  Winlogbeat--Read-->Sysmon

	end

	subgraph LogStorageServer

		Winlogbeat--Push-->Elasticsearch

	end

	subgraph PreprocessingServer

		Elasticsearch<--Request/Response-->DataFetchBatch:::foo

		DataFetchBatch:::foo--Write-->CSV

		DataStoreBatch:::foo--Read-->CSV

	end

	subgraph DatabaseServer

		DataStoreBatch:::foo--Store-->RocksDB

	end

	subgraph MiddlewareServer

		DataViewBinary:::foo<--Iter./Fetch-->RocksDB

		GraphQL:::bar<--Execute/Return-->DataViewBinary:::foo

	end

	subgraph ApplicationServer

		WebApplication:::foobar<--Query/Mutate-->GraphQL:::bar

	end

	Browser--Access-->WebApplication:::foobar

classDef foo stroke:#f00

classDef bar stroke:#0f0

classDef foobar stroke:#00f

```

# 1. Elasticsearch data to .csv file

First, collect [SYSMON](https://learn.microsoft.com/ko-kr/sysinternals/downloads/sysmon) data with [WINLOGBEAT](https://www.elastic.co/kr/beats/winlogbeat) and store it in [ELASTICSEARCH](https://www.elastic.co/kr/elasticsearch).

Second, this code extracts the data to CSV files that use "\t" as the delimiter.

It parses the "message" field together with the "agent.name" and "agent.id" fields.

You may need to raise the maximum result size of a search query; the default is 10000.

```

// replace with your Index name

PUT /.ds-winlogbeat-8.8.2-2023.08.06-000001/_settings

{

    "max_result_window": 1000000

}

```

Please refer to the comments in the code for a detailed explanation.
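The message parsing mentioned above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes each detail line of the Sysmon "message" has the form `Key: Value`, and the helper names `parse_message` and `to_tsv_row` are hypothetical.

```rust
/// Split a Sysmon "message" string into (field, value) pairs.
/// Assumption: each detail line looks like "Key: Value"; the first line
/// (for example "Process Create:") carries no value and is skipped.
fn parse_message(message: &str) -> Vec<(String, String)> {
    message
        .lines()
        .filter_map(|line| line.split_once(": "))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Build one tab-delimited row: agent name, agent id, then the message values.
fn to_tsv_row(agent_name: &str, agent_id: &str, message: &str) -> String {
    let mut columns = vec![agent_name.to_string(), agent_id.to_string()];
    columns.extend(parse_message(message).into_iter().map(|(_, value)| value));
    columns.join("\t")
}
```

A value that itself contains a tab would break the row layout, so real code would need to escape or replace such characters.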

## Quickstart

1. Create an "elastic.rs" file in "/src/envs"

- /src/envs/elastic.rs

```

pub const ES_URL_SECRET: &str = "YOUR ELASTICSEARCH URL";

pub const ID_SECRET: &str = "YOUR ELASTICSEARCH USERNAME (default is elastic)";

pub const PW_SECRET: &str = "YOUR ELASTICSEARCH PASSWORD";

```

2. Set your index name. The name may start with ".ds-winlogbeat" if you set up Winlogbeat to send data to Elasticsearch automatically.

If there are multiple indices, set the array length and list the index names in the array.

- /src/envs/env.rs

```

pub const INDICES: [&str; 1] = ["YOUR INDEX NAME"];

// If you have three indices:

// When a CSV file is saved and does not yet exist, a header line is written as the file is created; if the file already exists, the parsed rows are appended without the header line.

// In other words, with multiple indices the file is created while processing the first index, and rows from the second and later indices are appended to that file.

pub const INDICES: [&str; 3] = ["YOUR INDEX NAME 1", "YOUR INDEX NAME 2", "YOUR INDEX NAME 3"];

```
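The header-once behaviour described in the comments above can be sketched with the standard library alone. This is an assumption-laden illustration rather than the repository's implementation; `append_rows` is a hypothetical helper and the rows are expected to be tab-delimited already.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append rows to a CSV file, writing the header line only when the file
/// is being created. `header` and `rows` are already tab-delimited strings.
fn append_rows(path: &Path, header: &str, rows: &[String]) -> io::Result<()> {
    // Check before opening: `create(true)` would make the file exist.
    let is_new = !path.exists();
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    if is_new {
        writeln!(file, "{header}")?;
    }
    for row in rows {
        writeln!(file, "{row}")?;
    }
    Ok(())
}
```

Calling it once per index with the same path gives a single file with one header line followed by the rows of every index.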

3. Set timestamp, query size, save location

- /src/envs/env.rs

```

pub const TIMESTAMP_START: &str = "START TIMESTAMP";

pub const TIMESTAMP_END: &str = "END TIMESTAMP";

// QUERY SIZE (10000 is only an example value, matching the default search limit)
pub const SIZE: usize = 10000;

// The event code is inserted automatically between SAVELOCATION and CSVNAME

pub const SAVELOCATION: &str = "SAVE LOCATION";

pub const CSVNAME: &str = "FILENAME WITH FILE EXTENSION (extension is .csv)";

```
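As a rough illustration of how the output path might be assembled from these constants, the sketch below assumes the three parts are simply concatenated. The real naming scheme may differ; `csv_path` is a hypothetical helper that does not come from the repository.

```rust
/// Build the output path for one event code.
/// Assumption: the parts are concatenated as-is, so `save_location`
/// should end with a path separator.
fn csv_path(save_location: &str, event_code: u32, csv_name: &str) -> String {
    format!("{save_location}{event_code}{csv_name}")
}
```

Under that assumption, `csv_path("/data/csv/", 1, "_sysmon.csv")` yields `/data/csv/1_sysmon.csv`.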

4. Execute code

```

cargo build

cargo run --bin main

```

* Tip: check the field type when choosing a wildcard type

```

// replace with your Index name

// When checking the message field type

GET /.ds-winlogbeat-8.8.2-2023.08.06-000001/_mapping/field/message

```

# 2. Data (.csv files) to RocksDB

1. Place the CSV files in the expected location

2. Configure the RocksDB location and execute the code

```

cargo run --bin rocks

```
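The idea of loading tab-delimited rows into an ordered key-value store and reading them back by iteration can be sketched as below. The sketch uses the standard library's `BTreeMap` as a stand-in for RocksDB, which also keeps keys sorted by bytes. The key layout, `load_rows`, and `scan_prefix` are all hypothetical; the repository's actual schema is not described here.

```rust
use std::collections::BTreeMap;

/// Stand-in for an ordered key-value store such as RocksDB.
type Store = BTreeMap<Vec<u8>, Vec<u8>>;

/// Load tab-delimited text into the store, skipping the header line.
/// Hypothetical key layout: "<first column>\0<line number>"; value: the whole row.
fn load_rows(store: &mut Store, tsv: &str) -> usize {
    let mut stored = 0;
    for (line_no, line) in tsv.lines().enumerate().skip(1) {
        let Some(first) = line.split('\t').next().filter(|c| !c.is_empty()) else {
            continue; // skip blank lines
        };
        // Zero-padding keeps line numbers in numeric order under byte comparison.
        let key = format!("{first}\0{line_no:010}");
        store.insert(key.into_bytes(), line.as_bytes().to_vec());
        stored += 1;
    }
    stored
}

/// Iterate over every row whose key starts with `prefix`, in key order.
fn scan_prefix<'a>(store: &'a Store, prefix: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    store
        .range(prefix.to_vec()..)
        .take_while(move |(key, _)| key.starts_with(prefix))
        .map(|(_, value)| value.as_slice())
}
```

Because keys are ordered, all rows sharing a prefix sit next to each other, so a scan can stop at the first key that no longer matches.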

# 3. Data view on GraphQL (raw query)

1. Change directory

```

cd graphql

```

2. Run the GraphQL server

```

npm run dev

```

3. Access the Apollo GraphQL server on port 4000

```

http://localhost:4000

```

# 4. Data view on web (GUI)

1. Change directory

```

cd webapp

```

2. Run the Node server

```

npm run dev

```

3. Access Next.js on port 3000

```

http://localhost:3000

```

# 99. Todo

1. Automatically fetch Elasticsearch data every minute

2. If the Elasticsearch data exceeds the maximum query size, fetch the remainder

3. Automatically import data into RocksDB right after CSV parsing

4. Implement data fetching in the web application with react-query

5. Cursor-based pagination

6. Optimize the web application API

7. Add a union type in GraphQL for multiple data types

8. Fetch data from RocksDB using iteration (detach PostgreSQL) - speed test required - ✅

9. Apply lib.rs to main.rs for crate maintenance
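For the cursor-based pagination item, one possible shape is sketched below. It is only an illustration under stated assumptions: a `BTreeMap` stands in for the ordered store, the cursor is the last key already returned, and `Page` and `fetch_page` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// One page of results plus the cursor to pass for the next page.
struct Page {
    items: Vec<String>,
    next_cursor: Option<String>,
}

/// Return up to `limit` values whose keys sort strictly after `cursor`.
/// `next_cursor` is `None` once the last page has been reached.
fn fetch_page(store: &BTreeMap<String, String>, cursor: Option<&str>, limit: usize) -> Page {
    let lower = match cursor {
        Some(key) => Bound::Excluded(key),
        None => Bound::Unbounded,
    };
    let mut entries = store.range::<str, _>((lower, Bound::Unbounded));
    let mut items = Vec::new();
    let mut last_key = None;
    for (key, value) in entries.by_ref().take(limit) {
        items.push(value.clone());
        last_key = Some(key.clone());
    }
    // Only hand out a cursor when at least one more entry exists.
    let next_cursor = if entries.next().is_some() { last_key } else { None };
    Page { items, next_cursor }
}
```

Unlike offset-based paging, each request resumes from a key instead of skipping a count of rows, so the cost of a page does not grow with its position in the data set.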

## Ultimate goal diagram. Ver.2

```mermaid

flowchart TB

	subgraph Linux

	  Filebeat--Read-->Sysmon

		Filebeat--Read-->Suricata

		Filebeat--Read-->Zeek

		Filebeat--Read-->Netflow

	end

	subgraph LogSendingServer

		Filebeat--Push-->Logstash--Push-->Redis

	end

	subgraph PreprocessingServer

		Redis--Stream-->DataStorePipe:::foo

	end

	subgraph DatabaseServer

		PostgreSQL<--Replication-->Replica

		DataStorePipe:::foo--Store-->RocksDB

	end

	subgraph MiddlewareServer

		DataViewBinary:::foo<--Iter./Fetch-->RocksDB

		GraphQL:::bar<--Execute/Return-->DataViewBinary:::foo

		LargeDataViewBinary:::foo<--Iter./Fetch-->RocksDB

		GraphQL:::bar<--Execute/Return-->LargeDataViewBinary:::foo

		GraphQL:::bar<--SQL/Return-->PostgreSQL

	end

	subgraph ApplicationServer

		WebApplication1:::foobar<--Query/Mutate-->GraphQL:::bar

		WebApplication2:::foobar<--Query/Mutate-->GraphQL:::bar

		Nginx--Proxy-->WebApplication1:::foobar

		Nginx--Proxy-->WebApplication2:::foobar

	end

	Browser--Access-->Nginx

classDef foo stroke:#f00

classDef bar stroke:#0f0

classDef foobar stroke:#00f

```
