https://github.com/genspectrum/nextclade-datasets

Host: GitHub
URL: https://github.com/genspectrum/nextclade-datasets
Owner: GenSpectrum
Created: 2024-07-14T17:19:08.000Z (10 months ago)
Default Branch: main
Last Pushed: 2025-04-03T22:32:40.000Z (about 2 months ago)
Last Synced: 2025-04-03T22:34:37.615Z (about 2 months ago)
Size: 4.04 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md

# nextclade-datasets

This repository hosts a GenSpectrum-maintained Nextclade dataset server, set up following the docs: https://github.com/nextstrain/nextclade_data/blob/master/docs/dataset-server-maintenance.md.

You can test the server in Nextclade Web by pasting

```
https://clades.nextstrain.org/?dataset-server=https://raw.githubusercontent.com/genspectrum/nextclade-datasets/main/data
```

into the address bar of an incognito browser window.
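
To try datasets from a branch before it is merged, the same URL can be built with a different branch name. This is a minimal sketch, assuming the usual `raw.githubusercontent.com/<owner>/<repo>/<branch>/<path>` layout; the `branch` variable is introduced here for illustration only:

```shell
# Branch to test; "main" is the one used in this README
branch="main"
dataset_server="https://raw.githubusercontent.com/genspectrum/nextclade-datasets/${branch}/data"
# Nextclade Web takes the server from the dataset-server query parameter
nextclade_web_url="https://clades.nextstrain.org/?dataset-server=${dataset_server}"
echo "$nextclade_web_url"
```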

## How to add new datasets

1. Create a dataset following [nextclade's instructions](https://github.com/nextstrain/nextclade_data/blob/master/docs/dataset-creation-guide.md).
2. Update `index.json`: it should include the details from each dataset's `pathogen.json`. In addition, `index.json` expects datasets to be versioned. For simplicity, set the version to `unreleased` and keep each dataset in a subdirectory called `unreleased`.
3. Zip the contents of the dataset into `dataset.zip`. This is the file that Nextclade downloads and unzips prior to use.

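```
# Run from the directory that contains seg1 ... seg8
for i in {1..8}; do
  cd "seg$i/unreleased" || exit 1
  # Remove any stale archive first so old entries do not linger in it
  rm -f dataset.zip
  zip -r dataset.zip *
  cd - > /dev/null
done
```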
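
Before pushing, it can help to confirm that each archive holds the dataset files at its top level, since zipping from the wrong directory would nest them under a subfolder. The sketch below is not part of this repository: `check_dataset_zip` is a hypothetical helper, and it only looks for `pathogen.json`, the one dataset file this README names. It assumes Info-ZIP `unzip` is installed.

```shell
# Hypothetical helper: succeeds only if pathogen.json sits at the top level
# of the given archive (a nested copy such as sub/pathogen.json does not count).
check_dataset_zip() {
  # unzip -Z1 prints the archive member names, one per line
  unzip -Z1 "$1" 2>/dev/null | grep -qxF 'pathogen.json'
}

# Check all eight segments; run from the directory that contains seg1 ... seg8
for i in {1..8}; do
  zip_path="seg$i/unreleased/dataset.zip"
  check_dataset_zip "$zip_path" || echo "WARNING: $zip_path is missing or has no top-level pathogen.json"
done
```

The loop only prints warnings, so it is safe to run repeatedly while iterating on a dataset.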

Note that steps 2 and 3 are performed automatically by the CI when you create an official nextclade dataset, using the [rebuild script](https://github.com/nextstrain/nextclade_data/blob/master/scripts/rebuild/).

Download H5N1 datasets as follows:

```
# The server URL is the same for all segments
nextclade_dataset_server=https://raw.githubusercontent.com/genspectrum/nextclade-datasets/main/data
for i in {1..8}; do
  nextclade_dataset_name="flu/h5n1/seg$i"
  nextclade3 dataset get --name "$nextclade_dataset_name" --server "$nextclade_dataset_server" --output-dir "output$i"
done
```
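
Once downloaded, each `output$i` directory can be passed to `nextclade3 run` as the dataset for that segment. The following is a rough sketch rather than a documented workflow for this repository: `run_segment` is a hypothetical helper, and the input file names `seg1.fasta` ... `seg8.fasta` and the `results$i` output directories are made up for illustration.

```shell
# Hypothetical helper: analyse one segment with the dataset downloaded above.
# $1 is the segment number (1 to 8).
run_segment() {
  # --input-dataset points at the downloaded dataset directory,
  # --output-all writes all result files into one directory.
  nextclade3 run \
    --input-dataset "output$1" \
    --output-all "results$1" \
    "seg$1.fasta"
}

# Usage (assumes the placeholder FASTA files exist):
# for i in {1..8}; do run_segment "$i"; done
```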
