https://github.com/neherlab/nextclade_data_workflows



## Checking new tree

1. Download the generated files into the nextclade data workflow repo:

```bash
scp -rC [email protected]:~/nextclade_data_workflows/sars-cov-2/output output
```

1. Plug them into the nextclade.org advanced view.
1. Filter to new nodes and check that:
   - clades are clean
   - there are no big outliers
1. Check that `tag.json` is up to date (ideally, update it in `profiles/tag.json` for posterity).
1. Check that `qc.json` does not regress (ideally, update it in `profiles/qc.json` for posterity). Beware: codons are 0-indexed.
1. Optionally run `scripts/common_stops.py` and `scripts/common_frameshifts.py` to add stops/frameshifts that have become more common to `qc.json`.
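
One way to check that `qc.json` does not regress is to diff the previous and the new file and list every entry that disappeared. The following is a minimal sketch, not part of this repo: the helper names `flatten` and `missing_entries` are hypothetical, and the key names in the usage example are assumptions about the `qc.json` layout.

```python
import json


def flatten(node, path=()):
    """Yield (path, serialized value) pairs for every leaf of a JSON document.

    Items of a list share one path, so reordering a list is not a change.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, path + (key,))
    elif isinstance(node, list):
        for item in node:
            # Serialize whole list items so they can be compared as set members.
            yield path + ("[]",), json.dumps(item, sort_keys=True)
    else:
        yield path, json.dumps(node)


def missing_entries(old, new):
    """Return entries present in `old` but absent (or changed) in `new`."""
    return sorted(set(flatten(old)) - set(flatten(new)))


def load(path):
    """Read a JSON file such as profiles/qc.json."""
    with open(path) as handle:
        return json.load(handle)
```

For example, `missing_entries(load("profiles/qc.json"), load(new_path))`, with `new_path` pointing at the freshly generated file, returns an empty list when nothing was dropped. A changed scalar value is reported as well, because its old value no longer appears in the new file.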

## Identifying most common frameshifts and stop codons

1. Download metadata to `data/metadata_raw.tsv`
1. Run the Snakemake workflow with the following targets:

```bash
snakemake --profile=profiles/clades pre-processed/frameshifts.tsv -R select_frameshifts
snakemake --profile=profiles/clades pre-processed/stops.tsv -R select_stops
```

1. Format the most common stops and frameshifts into the `qc.json` format using:

```bash
python3 scripts/common_stops.py
python3 scripts/common_frameshifts.py
```

1. Manually check the result for plausibility and add it to `qc.json`
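
To illustrate what the formatting step involves, here is a minimal sketch of tallying stop codons and emitting 0-indexed entries. It is not the code in `scripts/common_stops.py`: the `GENE:POSITION` label format with 1-based positions, the `min_count` threshold, and the output key names are all assumptions made for this example.

```python
import json
from collections import Counter


def common_stops(stop_labels, min_count=2):
    """Count labels like "ORF8:27" (hypothetical, 1-based) and return
    entries with 0-based codon positions, most frequent first."""
    counts = Counter(stop_labels)
    entries = []
    for label, count in counts.most_common():
        if count < min_count:
            break  # most_common() is sorted by count, so the rest are rarer
        gene, position = label.split(":")
        # Shift from 1-based to 0-based codon numbering.
        entries.append({"geneName": gene, "codon": int(position) - 1})
    return entries


# Made-up labels, purely for illustration.
labels = ["ORF8:27", "ORF8:27", "ORF7a:94", "ORF8:27", "ORF7a:94", "N:5"]
print(json.dumps(common_stops(labels), indent=2))
```

The off-by-one shift is the part most worth checking by hand: a stop shown at codon 27 in a 1-based view corresponds to `"codon": 26` in a 0-indexed file.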

## Committing to data repo

1. Go to the nextclade_data_workflow repo
1. Check out a branch and open a PR to master
1. Copy the output from the workflow repo to the data repo:

```bash
cp -r output/sars-cov-2/references/MN908947/versions/ ../../nextclade_data/data/datasets/sars-cov-2/references/MN908947/versions
```

1. Update `changelog.md`
1. Get Ivan to review
1. Merge into master

## Release process

Follow release guidelines as outlined here: https://github.com/nextstrain/nextclade_data#dataset-release-process