https://github.com/mvinyard/torch-adata
Create PyTorch Datasets from AnnData
adata anndata python pytorch single-cell
- Host: GitHub
- URL: https://github.com/mvinyard/torch-adata
- Owner: mvinyard
- License: agpl-3.0
- Created: 2022-06-21T23:32:14.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2025-03-06T19:33:20.000Z (3 months ago)
- Last Synced: 2025-03-06T19:47:23.160Z (3 months ago)
- Topics: adata, anndata, python, pytorch, single-cell
- Language: Jupyter Notebook
- Homepage: https://torch-adata.readthedocs.io/en/latest/index.html
- Size: 415 KB
- Stars: 20
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# torch-adata

Create [`PyTorch Datasets`](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) from [`AnnData`](https://anndata.readthedocs.io/en/latest/)

## Installation
Install from PyPI (current version: **[`0.0.24`](https://pypi.org/project/torch-adata/)**):
```BASH
pip install torch-adata
```

Install the developer version:
```BASH
git clone https://github.com/mvinyard/torch-adata.git; cd torch-adata;
pip install -e .
```

## The main API
The primary class is [`AnnDataset`](https://github.com/mvinyard/torch-adata/blob/main/torch_adata/_core/_AnnDataset.py), a subclass of the widely used [`torch.utils.data.Dataset`](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html). Because it follows the standard PyTorch `Dataset` interface, an `AnnDataset` can be passed to a `torch.utils.data.DataLoader` for batching, shuffling, and multi-process loading, which helps standardize workflows and keep them reproducible.

```python
import anndata as a
import torch_adata

# Load the AnnData object from disk, then wrap it as a PyTorch Dataset
adata = a.read_h5ad("/path/to/data.h5ad")
dataset = torch_adata.AnnDataset(adata, use_key="X_pca", groupby="time", obs_keys=["affinity"])
```
```
[ torch-adata ]: AnnDataset object with 7131 samples
----------------------------------------------------
Grouped by: 'time' with attributes:
- X (use_key = 'X_pca') torch.Size([3, 7131, 50])
- obs: affinity: torch.Size([3, 7131, 1])
```

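Since `AnnDataset` subclasses `torch.utils.data.Dataset`, it should be usable with a standard `DataLoader`. The sketch below illustrates that pattern with a hypothetical stand-in class, `ToyGroupedDataset`, filled with random tensors in the same `[groups, samples, features]` layout as the output above. It is not the torch-adata implementation: the per-item indexing (one sample taken from every group) and the smaller sample count are assumptions made for illustration only.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyGroupedDataset(Dataset):
    """Hypothetical stand-in for an AnnDataset (NOT the torch-adata code)."""

    def __init__(self, n_groups=3, n_samples=64, n_dim=50, seed=0):
        gen = torch.Generator().manual_seed(seed)
        # Mirror the printed layout: [groups, samples, features]
        self.X = torch.randn(n_groups, n_samples, n_dim, generator=gen)
        self.affinity = torch.rand(n_groups, n_samples, 1, generator=gen)

    def __len__(self):
        # Number of samples per group
        return self.X.shape[1]

    def __getitem__(self, idx):
        # Assumption: one item is the idx-th sample from every group
        return self.X[:, idx], self.affinity[:, idx]


dataset = ToyGroupedDataset()
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# The default collate function stacks items along a new leading batch axis
X_batch, affinity_batch = next(iter(loader))
print(X_batch.shape)         # torch.Size([16, 3, 50])
print(affinity_batch.shape)  # torch.Size([16, 3, 1])
```

With a real `AnnDataset`, the same `DataLoader(...)` call would presumably apply, though the exact structure of each batch depends on how the class implements `__getitem__`. Setting `num_workers` above zero in the `DataLoader` turns on multi-process loading.
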

An alternative approach is [`AnnLoader`](https://github.com/scverse/anndata/blob/master/anndata/experimental/pytorch/_annloader.py), described by [Sergei Rybakov](https://github.com/koncopd) in [Interfacing pytorch models with anndata](https://anndata-tutorials.readthedocs.io/en/latest/annloader.html).

**For more information, please visit the [documentation](https://torch-adata.readthedocs.io/en/latest/index.html)!**

**Problem?** Open an [issue](https://github.com/mvinyard/torch-adata/issues/new)