https://github.com/abbe98/soch-download-cli

Command line interface for batch downloading of Swedish Open Cultural Heritage (K-samsök) records.

# SOCH Download CLI

![screenshot](screenshot.gif)

SOCH Download CLI lets you do **multithreaded** batch downloads of Swedish Open Cultural Heritage (K-samsök) records for offline processing and analytics.

## Prerequisites

- Python >= 3.4 and pip

## Installing

```bash
pip install soch-download
```

## Usage Examples

**Heads up: This program might use all of the system's available CPUs.**

Download records based on a SOCH search query (text, CQL, indexes, etc.):

```bash
soch-download --action=query --query=thumbnailExists=j --outdir=path/to/target/directory
```

Download records from a specific institution:

```bash
soch-download --action=institution --institution=raa --outdir=path/to/target/directory
```

Download records using a predefined action/query:

```bash
soch-download --action=all --outdir=path/to/target/directory
soch-download --action=geodata-exists --outdir=path/to/target/directory
```

**Unpacking**

By default, the download actions save large XML files containing up to 1000 RDF records each. After such a download, you can use the `--unpack` argument to split those files into individual RDF files:

```bash
soch-download --unpack=path/to/xml/files --outdir=path/to/target/directory
```

**Misc**

List all available parameters and actions:

```bash
soch-download --help
```

Target a custom SOCH API endpoint:

```bash
soch-download --action=query --query=itemKeyWord=hus --outdir=path/to/target/directory --endpoint=http://lx-ra-ksam2.raa.se:8080/
```