https://github.com/scilifelabdatacentre/genome-portal

This is the repository for the Swedish Reference Genome Portal, a platform facilitating access and discovery of genome data of non-model eukaryotic species studied in Sweden

Swedish Reference Genome Portal
========

This repository contains the source code for the [Swedish Reference Genome Portal](https://genomes.scilifelab.se/), which:

- Showcases genome research performed in Sweden on non-model eukaryotic species.
- Lowers the barrier of entry to access, visualise, and interpret genome data.
- Encourages sharing of genomic annotations, even the seldom-published kind.
- Strives to present FAIR data, available in public repositories.

## Table of Contents

1. [Overview](#overview)
2. [Cite this portal](#cite-this-portal)
3. [Contributing](#contributing)
4. [Funding](#funding)
5. [Contact us](#contact-us)
6. [Technical overview](#technical-overview)
   - [Repository layout](#repository-layout)
   - [Local development](#local-development)
7. [Credits](#credits)

## Overview

- The Swedish Reference Genome Portal website is built using the
[Hugo](https://gohugo.io/) static site generator.

- The [JBrowse2](https://jbrowse.org/jb2/) genome browser is
embedded within the website to visually explore genome datasets.

- Primary data files are hosted in public repositories
(such as [ENA](https://www.ebi.ac.uk/ena/browser/home)) and prepared
for display in JBrowse2 by our `Makefile` recipes (essentially
compression and indexing).

- The code for the Genome Portal is available under an MIT (open
source) license.

- The Genome Portal website is currently hosted by the [KTH Royal
Institute of Technology](https://www.kth.se/) in Stockholm.

## Cite this portal

See 'Cite this repository' in the "About" section at the top right of this page.

## Contributing

Two types of contributions are especially welcome:

- **Datasets for display in the portal**: Consult our
[requirements](https://genomes.scilifelab.se/contribute) for including a
genome dataset in the portal, and contact us if you have any questions.

- **Source code and documentation**: We welcome contributions, small and large,
to our codebase and documentation. They will be published after review and
approval by the Genome Portal team. Fork, open a PR, or contact us to discuss ideas!

## Funding

This service is supported by [SciLifeLab](https://www.scilifelab.se/)
and the [Knut and Alice Wallenberg
Foundation](https://kaw.wallenberg.org/en) through the [Data-Driven
Life Science (DDLS) program](https://www.scilifelab.se/data-driven/),
as well as by the [Swedish Foundation for Strategic Research
(SSF)](https://strategiska.se/en/).

## Contact us

We welcome all questions and suggestions (including feature requests or bug reports).

- Email us at [[email protected]](mailto:[email protected]).
- [Create an issue on GitHub](https://github.com/ScilifelabDataCentre/genome-portal/issues/new).

## Technical overview

This section contains high-level technical documentation about the
source code.

### Repository layout

- The `config/` directory contains information about data sources
  (tracks and assemblies) displayed in the genome browser.
  - Each species subdirectory includes:
    - `config.yml`: specifies the assembly and tracks to be displayed in JBrowse2.
    - `config.json`: starting point from which to generate a complete JBrowse2
      configuration, based on `config.yml`. A common use is to define
      default browsing sessions.

- Different `make` recipes prepare the material described in `config/`
for use by JBrowse2. The main operations are downloading data files,
compressing using `bgzip` and indexing with `samtools`.

- The website content resides in the `hugo` directory.
  - Most importantly, each species gets:
    1. A content subdirectory in `hugo/content/species/` (e.g. `hugo/content/species/clupea_harengus`).
    2. A data directory in `hugo/data/` (taxonomic information and statistics).
    3. An assets directory in `hugo/assets` (data inventory).

- The `scripts` folder contains executables to help:
  1. Build and serve the website using Docker.
  2. Add a new species to the website content.
  3. Add new datasets to the portal.

- The `tests` folder contains tests and fixtures, mainly covering the
data preparation scripts.

- The `docker` folder contains two Docker files:
  1. `docker/data.dockerfile` used for data preparation (everything that `make` needs).
  2. `docker/hugo.dockerfile` used to build and serve the website.
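
To make the layout above concrete, here is a minimal sketch of a consistency check that walks the species configurations and reports whether each one has a matching Hugo content directory. The function name `check_species` is hypothetical and not part of this repository, and the sketch assumes that species subdirectories sit directly under `config/` and share their name with the directory under `hugo/content/species/`.

```bash
# check_species ROOT
# Hypothetical helper (not shipped in scripts/): for every
# ROOT/config/<species>/config.yml, report whether the matching
# ROOT/hugo/content/species/<species> directory exists.
# Assumes both trees use the same species directory name.
check_species() {
  local root=$1 cfg species
  for cfg in "$root"/config/*/config.yml; do
    [ -e "$cfg" ] || continue   # the glob matched nothing
    species=$(basename "$(dirname "$cfg")")
    if [ -d "$root/hugo/content/species/$species" ]; then
      echo "ok $species"
    else
      echo "missing $species"
    fi
  done
}
```

Running `check_species .` from the repository root prints one line per species, which can help spot a species that was configured for JBrowse2 but not yet added to the website content.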

### Local development

The steps described below require
[`docker`](https://www.docker.com/) to be installed.

**1. Clone the repository**

```bash
git clone git@github.com:ScilifelabDataCentre/genome-portal.git
cd genome-portal
```

**2. Build and install the genomic data**

```bash
# Build local image from `docker/data.dockerfile`
./scripts/dockerbuild data

# Run the dockermake script to build the assets and install them locally.
./scripts/dockermake
```

You may need to be patient: some files are tens of gigabytes. If
only a subset of species is of interest, you can restrict the
scope of the build:

```bash
./scripts/dockermake SPECIES=clupea_harengus,linum_tenue
```
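
For orientation, the per-file work described under "Repository layout" (download, compress with `bgzip`, index with `samtools`) can be sketched roughly as follows. This is an illustration, not the project's actual `make` recipes: the function names `run` and `prepare_fasta` and the example URL are hypothetical, and the sketch assumes the remote file is an uncompressed FASTA file. Setting `DRY_RUN=1` prints the commands instead of executing them.

```bash
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# prepare_fasta URL OUTDIR
# Hypothetical helper: download one uncompressed FASTA file,
# compress it with bgzip and index it with samtools faidx.
prepare_fasta() {
  local url=$1 outdir=$2 name
  name=$(basename "$url")
  run mkdir -p "$outdir"
  run curl -fsSL -o "$outdir/$name" "$url"   # download the data file
  run bgzip -f "$outdir/$name"               # produces $name.gz
  run samtools faidx "$outdir/$name.gz"      # writes the .fai and .gzi indexes
}
```

For example, `DRY_RUN=1 prepare_fasta https://example.org/genome.fna data` lists the four commands without touching the network or requiring `bgzip` and `samtools` to be installed.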

**3. Run the web application container**

To run the website locally, you have several options:

#### Using the latest development image

```bash
docker pull ghcr.io/scilifelabdatacentre/swg-hugo-site:dev
./scripts/dockerserve -t dev
```

#### Using a local build

```bash
./scripts/dockerbuild -t local hugo
./scripts/dockerserve -t local
```

#### Using the Hugo development server

This last method is suitable when you want to see changes to the
source immediately reflected in the web browser.

It requires the additional step of installing the JBrowse static
bundle in `hugo/static/browser`:

```bash
./scripts/download_jbrowse v2.15.4 hugo/static/browser
./scripts/dockerserve -d
```

---

Any of these methods will serve the website at `http://localhost:8080/`.
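
Since the container can take a moment to start, a small polling loop can confirm that the site is answering before you open the browser. The sketch below is a convenience, not part of the repository: the function name `wait_for_site` is hypothetical, and it assumes `curl` is available on the host.

```bash
# wait_for_site URL [TRIES]
# Hypothetical helper: poll URL about once per second until it answers
# with a successful HTTP status, giving up after TRIES attempts (default 30).
wait_for_site() {
  local url=$1 tries=${2:-30} i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not reachable: $url" >&2
  return 1
}
```

A typical call would be `wait_for_site http://localhost:8080/`, which returns a non-zero status if the site never comes up.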

## Credits

The Swedish Reference Genome Portal is developed and maintained by the DDLS
Data Science Node in Evolution and Biodiversity (DSN-EB) team as part of
the [SciLifeLab Data Platform](https://data.scilifelab.se/), operated by the
SciLifeLab Data Centre. Members of the DSN-EB team are affiliated
with [SciLifeLab Data Centre](https://www.scilifelab.se/data/)
and the [National Bioinformatics Infrastructure Sweden (NBIS)](https://nbis.se/),
based at Uppsala University and the Swedish Museum of Natural History.