https://github.com/y-scope/log-archival-bench
- Host: GitHub
- URL: https://github.com/y-scope/log-archival-bench
- Owner: y-scope
- License: apache-2.0
- Created: 2025-07-21T21:34:12.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-09-22T06:14:20.000Z (9 months ago)
- Last Synced: 2025-09-22T08:24:53.487Z (9 months ago)
- Language: Python
- Size: 37.1 KB
- Stars: 4
- Watchers: 2
- Forks: 4
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# Log Archival Bench How To
## Setup
Initialize and update submodules:
```shell
git submodule update --init --recursive
```
Run the following commands to set up the virtual environment, add the Python files in `src` to
Python's import path, and activate the venv:
```shell
python3 -m venv venv
echo "$(pwd)" > $(find venv/lib -maxdepth 1 -mindepth 1 -type d)/site-packages/project_root.pth
. venv/bin/activate
pip3 install -r requirements.txt
```
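The `project_root.pth` line works because Python's `site` module reads every `.pth` file in `site-packages` and appends each listed directory that exists to `sys.path`. Below is a minimal sketch of that mechanism. It uses temporary directories as stand-ins for the venv's `site-packages` and the repository root; the directory names and the `add_project_root` helper are assumptions for illustration, not part of this repository.

```python
import os
import site
import sys
import tempfile


def add_project_root(site_dir: str, project_root: str) -> str:
    """Write a .pth file into site_dir that points at project_root."""
    pth_path = os.path.join(site_dir, "project_root.pth")
    with open(pth_path, "w", encoding="utf-8") as f:
        f.write(project_root + "\n")
    return pth_path


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for venv/lib/pythonX.Y/site-packages and the repo root.
    fake_site = os.path.join(tmp, "site-packages")
    fake_root = os.path.join(tmp, "repo")
    os.makedirs(fake_site)
    os.makedirs(fake_root)

    add_project_root(fake_site, fake_root)

    # site.addsitedir processes .pth files the same way interpreter startup
    # does for the real site-packages directory.
    site.addsitedir(fake_site)
    root_on_path = fake_root in sys.path
    print(root_on_path)
```

In the real venv no explicit `site.addsitedir` call is needed, since the interpreter processes `site-packages` automatically at startup.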
## Download Datasets
You can download all the datasets used in the benchmark with the [download\_all.py](/scripts/download_all.py) script.
The script downloads each dataset into the correct directory with the specified name, concatenates multi-file datasets into a single file, and generates any modified versions of a dataset needed for tools like Presto \+ CLP.
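To illustrate the concatenation step only, here is a minimal sketch of merging a multi-file dataset into one file. The `concatenate_files` helper and the `part-*.log` file names are hypothetical; this is not the code in `download_all.py`, which may order or process files differently.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate_files(parts: list[Path], output: Path) -> None:
    """Append each part, in the order given, to a single output file."""
    with output.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                # Stream in chunks so large log files are not loaded into memory.
                shutil.copyfileobj(src, out)


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    # Hypothetical two-part dataset.
    (tmp_dir / "part-0.log").write_bytes(b'{"msg": "a"}\n')
    (tmp_dir / "part-1.log").write_bytes(b'{"msg": "b"}\n')

    merged = tmp_dir / "merged.log"
    concatenate_files(sorted(tmp_dir.glob("part-*.log")), merged)
    merged_bytes = merged.read_bytes()
```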
## Run Everything
Follow the instructions above to set up your virtual environment.
From the [Log Archival Bench](/) root directory, run [scripts/benchall.py](/scripts/benchall.py). This script runs each tool and parameter combination listed in its `benchmarks` variable against every dataset under [data/](/data).
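The "every tool against every dataset" behavior can be pictured as a cross product. The sketch below assumes `benchmarks` is a list of `(tool, params)` pairs and that each dataset is a subdirectory of `data/`. Both are assumptions: the real structure in `scripts/benchall.py` may differ, and the tool and dataset names are placeholders.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the script's "benchmarks" variable.
benchmarks = [
    ("tool_a", {"mode": "default"}),
    ("tool_b", {}),
]


def plan_runs(benchmarks, data_dir: Path):
    """Pair every (tool, params) entry with every dataset directory."""
    datasets = sorted(p for p in data_dir.iterdir() if p.is_dir())
    return [(tool, params, ds) for tool, params in benchmarks for ds in datasets]


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder data/ directory holding two empty datasets.
    data_dir = Path(tmp) / "data"
    for name in ("dataset_x", "dataset_y"):
        (data_dir / name).mkdir(parents=True)

    runs = [(tool, ds.name) for tool, _, ds in plan_runs(benchmarks, data_dir)]
    print(runs)
```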
## Run One Tool
Execute `./assets/{tool name}/main.py {path to .log}` to run ingestion and search for that tool on the given dataset.
## Contributing
Follow the steps below to develop and contribute to the project.
### Requirements
* [Task] 3.40.0 or higher
### Linting
Before submitting a pull request, ensure you've run the linting commands below and have fixed all
violations and suppressed any benign warnings.
To run all linting checks:
```shell
task lint:check
```
To run all linting checks AND fix some violations:
```shell
task lint:fix
```
To see how to run a subset of linters for a specific file type:
```shell
task -a
```
Look for tasks under the `lint` namespace (identified by the `lint:` prefix).
[Task]: https://taskfile.dev