https://github.com/ttab/trawl

Trawl elasticsearch image database to build test data.

Host: GitHub
URL: https://github.com/ttab/trawl
Owner: ttab
Created: 2013-06-17T09:45:51.000Z (almost 12 years ago)
Default Branch: master
Last Pushed: 2013-10-11T09:32:25.000Z (over 11 years ago)
Last Synced: 2024-04-10T19:47:47.648Z (about 1 year ago)
Topics: archived-repository
Language: JavaScript
Size: 180 KB
Stars: 0
Watchers: 22
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

trawl & trickle
===============

## Make test data

```
spix-dev01$ ./trawl http://spix-bildbank01:9200/grafik/image
Dumping 50 most recent entries. After that every 500th.
Saving mapping
Trawling through posts
  trawling [=======================================] 100% (44002/44002)
Downloading assets
    assets [=======================================] 100% (720/720)
Compressing
  archiving [=======================================] 100% (722/722)
Written 265366585 bytes: /home/martin/trawl/target/grafik_image.zip
Done.
```

## Usage

`trawl` dumps all of the most recent records up to a given count, then
every nth record after that until the end of the index. This gives us
both plenty of recent data and a sample of historic data.

* `-r --recent` The number of recent records, defaults to 50.

* `-n --nth` After recent records, grab every nth, defaults to 500.

* `-m --max` The maximum number of records *to consider*. Defaults to 0, which means all records in the index.

### Example

`-r 100 -n 1000 -m 5000` means download the 100 most recent records, then
every 1000th up to the 5000th record in the index. I.e. if there are more
than 5000 records in the index, we will end up with 100 + 4 = 104 records in total.
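The arithmetic above can be sketched in a few lines of JavaScript. This is only an illustration, not code from the trawl source: `selectPositions` and its option names are hypothetical, and it assumes that the "every nth" sampling counts from the end of the recent block, which is one reading that reproduces the 100 + 4 = 104 total.

```javascript
// Hypothetical helper, not taken from the trawl source. Returns the 0-based
// positions (0 = most recent record) that a run would keep.
function selectPositions(total, { recent = 50, nth = 500, max = 0 } = {}) {
  // max = 0 means "consider every record in the index".
  const limit = max > 0 ? Math.min(max, total) : total;
  const positions = [];

  // Keep every one of the most recent records.
  for (let i = 0; i < Math.min(recent, limit); i++) {
    positions.push(i);
  }

  // Assumption: sampling starts nth records after the recent block.
  if (nth > 0) {
    for (let i = recent + nth; i < limit; i += nth) {
      positions.push(i);
    }
  }
  return positions;
}

// The example from the text: -r 100 -n 1000 -m 5000 on a 44002-record index.
const kept = selectPositions(44002, { recent: 100, nth: 1000, max: 5000 });
console.log(kept.length); // 104
```

Under this assumption the four sampled positions are 1100, 2100, 3100 and 4100. The next candidate, 5100, falls outside the 5000 records considered.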

## Upload test data to nexus

```
$ mvn deploy
...
Uploaded: http://repo.ad.tt.se/nexus/content/repositories/snapshots/se/prb/scanpix-trawl/1.0.0-SNAPSHOT/scanpix-trawl-1.0.0-20130618.064742-3-dist.zip (259148 KB at 7650.1 KB/sec)
```

*Note that pom.xml has hard-coded values for the .zip file and the classifier to deploy to nexus.*

```
princess$ grep grafik_image pom.xml
                target/grafik_image.zip
                grafik_image
```
