https://github.com/code402/batch-vs-index-warc

# batch-vs-index-warc

_See the blog post: [S3 Throughput: Scans vs Indexes](https://code402.com/blog/s3-scans-vs-index/)._

A benchmark to explore the speed of reading WARC entries in bulk vs individually.

```bash
mvn clean install assembly:single # Build the JAR
```

```bash
# Read WARC entries individually
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Single

# Read WARC entries in bulk
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Batch
```
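
This README does not show what the two entry points do internally. As a rough illustration of the two access patterns being compared, the sketch below contrasts reading one record through an index entry with scanning a whole archive front to back. It is not the repository's code: `AccessPatterns`, `IndexEntry`, and the length-prefixed in-memory layout are all hypothetical stand-ins, and no S3 or WARC parsing is involved.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model, not taken from the benchmark's source.
public class AccessPatterns {
    // Hypothetical index entry: where one record's body sits in the archive.
    static final class IndexEntry {
        final int offset;
        final int length;
        IndexEntry(int offset, int length) { this.offset = offset; this.length = length; }
    }

    // Build a toy archive: each record is a 4-byte big-endian length plus a UTF-8 body.
    // An index entry pointing at each body is appended to indexOut.
    static byte[] build(List<String> records, List<IndexEntry> indexOut) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String r : records) {
            byte[] body = r.getBytes(StandardCharsets.UTF_8);
            byte[] prefix = ByteBuffer.allocate(4).putInt(body.length).array();
            out.write(prefix, 0, prefix.length);
            indexOut.add(new IndexEntry(out.size(), body.length));
            out.write(body, 0, body.length);
        }
        return out.toByteArray();
    }

    // Individual read: touch only the bytes the index points at
    // (a stand-in for a ranged request against a remote object).
    static String readOne(byte[] archive, IndexEntry e) {
        return new String(archive, e.offset, e.length, StandardCharsets.UTF_8);
    }

    // Bulk read: walk the archive sequentially, no index needed.
    static List<String> readAll(byte[] archive) {
        List<String> out = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(archive);
        while (buf.remaining() >= 4) {
            byte[] body = new byte[buf.getInt()];
            buf.get(body);
            out.add(new String(body, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        List<IndexEntry> index = new ArrayList<>();
        byte[] archive = build(Arrays.asList("alpha", "beta", "gamma"), index);
        System.out.println(readOne(archive, index.get(1)));
        System.out.println(readAll(archive));
    }
}
```

In this toy model the individual read pays one lookup per record, while the bulk read pays for every byte once. Which pattern is faster against real storage is the question the benchmark sets out to measure.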