Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bnediction/scboolseq
scBoolSeq: scRNA-Seq data binarisation and synthetic generation from Boolean dynamics
bioinformatics boolean-networks computational-biology machine-learning pandas python3 scikit-learn scrna-seq single-cell-rna-seq
- Host: GitHub
- URL: https://github.com/bnediction/scboolseq
- Owner: bnediction
- Created: 2022-03-29T21:29:41.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-17T14:24:25.000Z (9 months ago)
- Last Synced: 2024-10-12T23:43:40.254Z (4 months ago)
- Topics: bioinformatics, boolean-networks, computational-biology, machine-learning, pandas, python3, scikit-learn, scrna-seq, single-cell-rna-seq
- Language: Python
- Homepage:
- Size: 215 KB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: changelog.md
README
# scBoolSeq
scRNA-Seq data binarisation and synthetic generation from Boolean dynamics.
## Installation
### Pip
```
pip install scboolseq
```

### Conda
```
conda install -c conda-forge -c colomoto scboolseq
```

### Docker
`scBoolSeq` is included in the [ColoMoTo Docker](http://colomoto.org/notebook) distribution.
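
After installing through any of the routes above, one quick way to confirm that the package is visible to the current interpreter is to look it up without importing it. The sketch below uses only the Python standard library; the helper name `is_installed` is hypothetical and is not part of scBoolSeq.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located by this interpreter.

    `is_installed` is an illustrative helper, not a scBoolSeq function.
    """
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "scboolseq" is the import name used in the Python API example.
    status = "found" if is_installed("scboolseq") else "not found"
    print(f"scboolseq: {status}")
```

If the module is reported as not found, the usual cause is that `pip` or `conda` installed into a different environment than the one running the check.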
## Usage
### Python API
The minimal example below uses the same dataset as the CLI usage guide.
For further information, please check the documentation.

```python
import pandas as pd
from scboolseq import scBoolSeq

# read in the normalized expression data
nestorowa = pd.read_csv("data_Nestorowa.tsv.gz", index_col=0, sep="\t")
nestorowa.iloc[1:5, 1:5]
#                HSPC_031  HSPC_037  LT-HSC_001  HSPC_001
# Kdm3a          6.877725  0.000000    0.000000  0.000000
# Coro2b         0.000000  6.913384    8.178374  9.475577
# 8430408G22Rik  0.000000  0.000000    0.000000  0.000000
# Clec9a         0.000000  0.000000    0.000000  0.000000
#
# NOTE : here, genes are rows and observations are columns

scbool_nestorowa = scBoolSeq()
##
## Binarization
##
# scBoolSeq expects genes to be columns, thus we transpose the DataFrame.
scbool_nestorowa.fit(nestorowa.T)  # compute binarization criteria

binarized = scbool_nestorowa.binarize(nestorowa.T)
binarized.iloc[1:5, 1:5]
#             Kdm3a  Coro2b  8430408G22Rik  Phf6
# HSPC_031      1.0     NaN            NaN   0.0
# HSPC_037      0.0     1.0            NaN   0.0
# LT-HSC_001    0.0     1.0            NaN   1.0
# HSPC_001      0.0     1.0            NaN   1.0

##
## Synthetic RNA-Seq generation from Boolean states
##
# We load in a boolean trace obtained from the simulation of a Boolean model
boolean_trace = pd.read_csv("boolean_dynamics.csv", index_col=0)
boolean_trace
#               Kdm3a  Coro2b  8430408G22Rik  Phf6
# init            1.0     0.0            1.0   0.0
# transient_1     0.0     1.0            1.0   0.0
# transient_2     0.0     1.0            0.0   1.0
# stable_state    0.0     1.0            1.0   1.0

synthetic_scrna_pseudocounts = scbool_nestorowa.sample_counts(boolean_trace)
```

## Contributors
* [Gustavo Magaña López](https://github.com/gmagannaDevelop)