Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/marsupialtail/logcloud-experiments
- Host: GitHub
- URL: https://github.com/marsupialtail/logcloud-experiments
- Owner: marsupialtail
- Created: 2023-10-06T04:04:44.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2023-11-29T02:59:55.000Z (about 1 year ago)
- Last Synced: 2023-11-29T18:35:10.034Z (about 1 year ago)
- Language: Shell
- Size: 1 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# LogCloud Reproduction
## Dataset
The experiments use LogHub data: https://zenodo.org/records/8196385. The Spark, HDFS, Windows, Hadoop and Thunderbird datasets are used. Unzip each dataset's logs into text files under its own directory.
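The extraction step can be sketched as a small shell function. This is a minimal sketch, not part of the repository: `extract_logs` is a hypothetical helper, and it assumes the downloaded archives are `.zip` or `.tar.gz` files sitting in one directory. Check the actual archive names and formats on the Zenodo page.

```shell
# Hypothetical helper: extract every archive in a download directory
# into its own subdirectory, named after the archive.
extract_logs() {
    local archive_dir="$1"   # directory holding the downloaded archives
    local out_dir="$2"       # one subdirectory per dataset is created here
    local archive name

    for archive in "$archive_dir"/*; do
        [ -f "$archive" ] || continue
        name="$(basename "$archive")"
        case "$name" in
            *.tar.gz)
                name="${name%.tar.gz}"
                mkdir -p "$out_dir/$name" || return 1
                tar -xzf "$archive" -C "$out_dir/$name" || return 1
                ;;
            *.zip)
                name="${name%.zip}"
                mkdir -p "$out_dir/$name" || return 1
                unzip -q -o "$archive" -d "$out_dir/$name" || return 1
                ;;
            *)
                # Assumption: anything else is not a dataset archive.
                echo "skipping unrecognized file: $archive" >&2
                ;;
        esac
    done
}
```

For example, `extract_logs ./downloads ./logs` would leave each dataset under `./logs/<archive name>/`.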
## OpenSearch
Set up an AWS OpenSearch cluster as described here: https://aws.amazon.com/opensearch-service/, with UltraWarm and cold storage enabled.
Load the log datasets with opensearch/bulk-load.py.
Then run the searches with opensearch/run.sh. Note that this script migrates indices to clear the UltraWarm cache, which keeps the comparisons fair.
## LogGrep
For compression, follow the instructions here: https://github.com/THUBear-wjy/LogGrep-zstd. In particular, this script can be used to compress all the logs: https://github.com/THUBear-wjy/LogGrep-zstd/blob/master/compression/quickTest.py.
After the logs are compressed, upload the compressed logs to an S3 bucket.
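One way to do the upload is with the AWS CLI's `aws s3 sync` command. The sketch below assumes the AWS CLI is installed and configured. `upload_to_s3` and the `DRY_RUN` variable are hypothetical names introduced here, and the bucket path is a placeholder. The layout that the search script expects inside the bucket is not specified here, so check the script before uploading.

```shell
# Hypothetical helper: sync a local directory of compressed logs to S3.
# Set DRY_RUN=1 to print the command instead of running it.
upload_to_s3() {
    local local_dir="$1"   # directory holding the compressed output
    local s3_uri="$2"      # placeholder, e.g. s3://my-bucket/loggrep

    # Reject destinations that are not S3 URIs before calling the CLI.
    case "$s3_uri" in
        s3://?*) ;;
        *) echo "expected an s3:// URI, got: $s3_uri" >&2; return 1 ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "aws s3 sync $local_dir $s3_uri"
    else
        aws s3 sync "$local_dir" "$s3_uri"
    fi
}
```

Running `DRY_RUN=1 upload_to_s3 ./compressed s3://my-bucket/loggrep` first shows the exact command before any data is transferred.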
For search, just run https://github.com/marsupialtail/logcloud-experiments/blob/master/loggrep/search-loggrep.sh with the included binary thulr_cmdline. Modify the S3 bucket path accordingly.
## LogCloud
LogCloud is packaged under the name "rottnest" on PyPI. Install exactly one version, matching the variant you want to reproduce:
~~~
pip3 install rottnest==1.0.1 # Wavelet tree implementation, no early stopping
pip3 install rottnest==1.0.2 # Wavelet tree implementation, early stopping
pip3 install rottnest==1.0.3 # custom FM-index implementation, no early stopping
pip3 install rottnest==1.0.4 # custom FM-index implementation, early stopping
~~~
To compress the logs, use: https://github.com/marsupialtail/logcloud-experiments/blob/master/logcloud/compress.sh. Change the log directories in the script. Then upload the output indices to an S3 bucket.
To search the logs, run: https://github.com/marsupialtail/logcloud-experiments/blob/master/logcloud/search-logcloud.sh. Change the S3 bucket path accordingly.
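Because each rottnest release listed in this section corresponds to one combination of index implementation and early stopping, an experiment script can make the choice explicit. The helper below is a sketch: `rottnest_version` and its argument spellings (`wavelet`, `fm-index`, `yes`, `no`) are hypothetical, and only the version numbers come from the list of releases in this section.

```shell
# Hypothetical helper: map (implementation, early stopping) to the
# rottnest version listed in this README.
rottnest_version() {
    local impl="$1"         # "wavelet" or "fm-index"
    local early_stop="$2"   # "yes" or "no"

    case "$impl:$early_stop" in
        wavelet:no)   echo "1.0.1" ;;
        wavelet:yes)  echo "1.0.2" ;;
        fm-index:no)  echo "1.0.3" ;;
        fm-index:yes) echo "1.0.4" ;;
        *) echo "unknown variant: $impl / $early_stop" >&2; return 1 ;;
    esac
}
```

For example, `pip3 install "rottnest==$(rottnest_version fm-index yes)"` installs the custom FM-index build with early stopping.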