https://github.com/gagneurlab/scooby
- Host: GitHub
- URL: https://github.com/gagneurlab/scooby
- Owner: gagneurlab
- License: mit
- Created: 2024-09-18T11:29:53.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-10-25T10:12:52.000Z (7 months ago)
- Last Synced: 2024-10-25T11:17:11.548Z (7 months ago)
- Language: Python
- Homepage: https://scooby.readthedocs.io
- Size: 2.94 MB
- Stars: 6
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
Scooby
======

.. image:: https://readthedocs.org/projects/scooby/badge/?version=latest
   :target: https://scooby.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

Code for the scooby `manuscript `__. Scooby is the first model to predict
scRNA-seq coverage and scATAC-seq insertion profiles along the genome at
single-cell resolution. For this, it leverages the pre-trained
multi-omics profile predictor Borzoi as a foundation model, equips it
with a cell-specific decoder, and fine-tunes its sequence embeddings.
Specifically, the decoder is conditioned on the cell position in a
precomputed single-cell embedding.

This repository contains model and data loading code and a train script.
The reproducibility
`repository `__
contains notebooks to reproduce the results of the manuscript.

Hardware requirements
---------------------

- NVIDIA GPU (tested on A40), Linux, Python (tested with v3.9)
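
The requirements above can be checked before installation with a short script. This is a minimal sketch using only the Python standard library, not part of scooby. It makes two assumptions: that Python 3.9 (the tested version) is a lower bound, and that finding ``nvidia-smi`` on the ``PATH`` is a reasonable proxy for an NVIDIA GPU being present. The function name ``check_environment`` is hypothetical.

```python
import platform
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return basic environment checks as a dict of booleans (heuristics only)."""
    return {
        # The README lists Linux as the supported operating system.
        "is_linux": platform.system() == "Linux",
        # The README reports testing with Python 3.9; using it as a
        # minimum version is an assumption of this sketch.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # nvidia-smi on PATH only suggests that an NVIDIA driver is installed;
        # it does not confirm the GPU model or available memory.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


report = check_environment()
for name, passed in report.items():
    print(f"{name}: {'ok' if passed else 'not detected'}")
```

A failed check does not necessarily mean scooby cannot run; it only flags a difference from the tested setup.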
Installation instructions
-------------------------

Prerequisites
~~~~~~~~~~~~~

scooby uses a custom version of SnapATAC2, which can be installed with ``pip``. It is best installed in a separate environment, because its numpy version requirements conflict with those of scooby.

- ``pip install snapatac2-scooby``
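
One way to keep the custom SnapATAC2 build isolated is a dedicated virtual environment. The sketch below uses the standard-library ``venv`` module and assumes a Linux environment layout (``bin/python``). The helper names and the directory name ``snapatac2-env`` are hypothetical; a conda environment would serve equally well.

```python
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path to the interpreter inside a virtual environment (Linux layout)."""
    return Path(env_dir) / "bin" / "python"


def pip_install_command(env_dir, requirement):
    """Build the pip command that installs a package into the environment."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", requirement]


def install_in_new_env(env_dir, requirement):
    """Create the environment with pip available, then install the package."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(pip_install_command(env_dir, requirement), check=True)
```

Calling ``install_in_new_env("snapatac2-env", "snapatac2-scooby")`` would create the environment and install the package into it, leaving the environment that holds scooby untouched.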
Scooby package installation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ``pip install git+https://github.com/gagneurlab/scooby.git``
- Download file contents from the Zenodo `repo `__
- Use examples from the scooby reproducibility
  `repository `__

Training
--------

We offer a `train
script for modeling scRNA-seq only `__ and a `script for multiome modeling `__.
Both require SnapATAC2-preprocessed AnnData objects and embeddings. Training scooby
takes 1-2 days on 8 NVIDIA A40 GPUs with 128 GB RAM and 32 cores.

Model architecture
------------------

Currently, the model is only tested with a batch size of 1.
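
To illustrate what conditioning a decoder on the cell position in an embedding can mean, here is a toy sketch in plain Python. It is not scooby's implementation: it assumes a single linear layer that maps a cell embedding to a cell-specific filter, which is then applied to per-bin sequence embeddings. All names, dimensions, and numbers are hypothetical, and short lists stand in for Borzoi's sequence embeddings.

```python
import math


def linear(vec, weights, bias):
    """Dense layer; weights holds one row of coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def softplus(x):
    """Numerically stable softplus, keeping predicted coverage non-negative."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def decode_cell_profile(seq_embeddings, cell_embedding, weights, bias):
    """Predict one value per genomic bin for a single cell."""
    # Map the cell's position in embedding space to a cell-specific filter.
    cell_filter = linear(cell_embedding, weights, bias)
    # Apply that filter to the sequence embedding of every bin.
    return [softplus(sum(f * s for f, s in zip(cell_filter, bin_emb)))
            for bin_emb in seq_embeddings]


# Toy inputs: 3 genomic bins with 2-dimensional sequence embeddings.
seq_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy decoder parameters mapping a 2-dimensional cell embedding to a filter.
weights = [[0.5, -1.0], [1.5, 0.25]]
bias = [0.1, -0.2]

profile_a = decode_cell_profile(seq_embeddings, [2.0, -1.0], weights, bias)
profile_b = decode_cell_profile(seq_embeddings, [-1.0, 2.0], weights, bias)
print(profile_a)
print(profile_b)
```

The same sequence embeddings yield different profiles for the two cells because only the cell embedding changes, which is the essence of a cell-specific decoder.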