https://github.com/hutaobo/cell-gps
Cell-GPS is the Python package and reference implementation for Cophenetic Spatial Topology Embedding (COSTE), a spatial topology analysis framework for spatial omics data.
bioinformatics data-visualization python scanpy single-cell spatial-analysis spatial-omics spatial-transcriptomics visium xenium
- Host: GitHub
- URL: https://github.com/hutaobo/cell-gps
- Owner: hutaobo
- License: other
- Created: 2025-01-04T16:42:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-06-03T07:41:42.000Z (13 days ago)
- Last Synced: 2026-06-03T09:23:03.658Z (13 days ago)
- Topics: bioinformatics, data-visualization, python, scanpy, single-cell, spatial-analysis, spatial-omics, spatial-transcriptomics, visium, xenium
- Language: Python
- Homepage: https://pypi.org/project/Cell-GPS/
- Size: 3.64 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.rst
- Authors: AUTHORS.rst
README
# Cell-GPS
[PyPI](https://pypi.org/project/Cell-GPS/)
[Documentation](https://cell-gps.readthedocs.io/en/latest/?badge=latest)
[conda-forge](https://anaconda.org/conda-forge/cell-gps)
[License](LICENSE)
[Publish workflow](https://github.com/hutaobo/Cell-GPS/actions/workflows/python-publish.yml)
`Cell-GPS` is the Python package and reference implementation for Cophenetic Spatial Topology Embedding (COSTE), a spatial topology analysis framework for spatial omics data.
This repository is maintained as both the installable Python package and the code companion for the Cell-GPS/COSTE bioRxiv preprint.
## Preprint and manuscript code
Cell-GPS/COSTE is described in the associated bioRxiv preprint:
> Long M, Hu T, Sountoulidis A, Samakovlis C, Nilsson M. Cophenetic Spatial Topology Embedding reveals multiscale tissue architecture in spatial omics. bioRxiv. 2026. doi: [10.64898/2026.05.26.727847](https://doi.org/10.64898/2026.05.26.727847)
Versioned bioRxiv page: https://www.biorxiv.org/content/10.64898/2026.05.26.727847v1
If you use the Python package, the Windows executable, the R companion package, or the manuscript figure/table code, please cite this preprint.
The code used to generate the preprint figures and supplementary tables is organized in [`Cell-GPS manuscript code/`](https://github.com/hutaobo/Cell-GPS/tree/main/Cell-GPS%20manuscript%20code):
- [`main_figures/`](https://github.com/hutaobo/Cell-GPS/tree/main/Cell-GPS%20manuscript%20code/main_figures): notebooks for main figure analyses.
- [`supplementary_figures/`](https://github.com/hutaobo/Cell-GPS/tree/main/Cell-GPS%20manuscript%20code/supplementary_figures): notebooks for supplementary figure analyses.
- [`supplementary_tables/`](https://github.com/hutaobo/Cell-GPS/tree/main/Cell-GPS%20manuscript%20code/supplementary_tables): notebooks for supplementary table analyses.
The notebooks are intentionally output-free and preserve the original manuscript data paths where those paths were required for reproduction. A detailed mapping from manuscript results to source code is available in [`docs/cellgps_science_manuscript_code_inventory.md`](https://github.com/hutaobo/Cell-GPS/blob/main/docs/cellgps_science_manuscript_code_inventory.md).
## Package Names
- Python distribution: `Cell-GPS`
- Conda-forge distribution: `cell-gps`
- Python import package: `cellgps`
- Legacy Python compatibility namespace: `sfplot` (not a separate distribution; retained for existing scripts)
- R package/repository: `cellgpsr`
- Windows executable: `cellgps.exe`
The Python package is hosted at `https://github.com/hutaobo/Cell-GPS`. The R package is hosted separately at `https://github.com/hutaobo/cellgpsr`. The Windows single-file executable is distributed through Zenodo; use the latest open v2 DOI or the version-series route.
## What Cell-GPS does
- Computes searcher-findee distance matrices from spatial coordinates and cell labels.
- Builds cophenetic distance matrices and StructureMap heatmaps from `AnnData` objects or plain `pandas` tables.
- Loads 10x Xenium outputs and prepares them for downstream spatial analysis.
- Provides Xenium loaders backed by `pyXenium.io`, including standard Xenium folders and table bundles with `cells.parquet`, official `*_cell_groups.csv`, and `cell_feature_matrix.h5`.
- Supports transcript-by-cell analysis for locating transcripts relative to cell types.
- Includes memory-optimized workflows for large datasets.
- Provides plotting utilities such as clustered heatmaps, circular dendrograms, and related summary figures.
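The first two items can be illustrated with a small, self-contained sketch. It does not reproduce the Cell-GPS implementation: the function names are hypothetical, and it assumes that a searcher-findee distance is the mean distance from each cell of the searcher type to its nearest cell of the findee type, and that cophenetic distances come from average-linkage clustering of the rows of that matrix. Only NumPy and SciPy are used.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import cdist, pdist, squareform


def searcher_findee_matrix(coords, labels):
    """Mean nearest-neighbour distance from each searcher type to each findee type.

    Assumed definition for illustration; not taken from the Cell-GPS source.
    """
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    types = sorted(set(labels.tolist()))
    matrix = np.zeros((len(types), len(types)))
    for i, searcher in enumerate(types):
        for j, findee in enumerate(types):
            if i == j:
                continue  # leave the diagonal at zero
            # Pairwise distances between all searcher cells and all findee cells.
            d = cdist(coords[labels == searcher], coords[labels == findee])
            # Nearest findee per searcher cell, averaged over searcher cells.
            matrix[i, j] = d.min(axis=1).mean()
    return types, matrix


def cophenetic_matrix(profiles):
    """Square cophenetic distance matrix between the rows of `profiles`.

    Average linkage on Euclidean distances is an assumption of this sketch.
    """
    tree = linkage(pdist(profiles), method="average")
    return squareform(cophenet(tree))


if __name__ == "__main__":
    coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1)]
    labels = ["A", "A", "B", "B", "C", "C"]
    types, sf = searcher_findee_matrix(coords, labels)
    print(types)
    print(sf)
    print(cophenetic_matrix(sf))
```

Applying `cophenetic_matrix` to the transpose of the searcher-findee matrix gives a findee-side version. Whether that matches the `row_coph` and `col_coph` pair returned by the package functions is an assumption, not something this README states.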
## Repository layout
- `src/cellgps/`: recommended Python import namespace.
- `src/cellgps/pp`, `src/cellgps/tl`, `src/cellgps/pl`: scverse-style aliases for preprocessing, analysis, and plotting APIs.
- `src/sfplot/`: legacy compatibility namespace that currently hosts implementation modules; new code should import through `cellgps`.
- `tests/`: package tests and smoke checks.
- `docs/`: Sphinx documentation.
- `docs/project/`: project notes, changelog, authors, and reviewer guide.
- `Cell-GPS manuscript code/`: curated preprint figure and table notebooks.
- `examples/`: compact usage examples and small example data files.
- `packaging/conda-recipe/`: legacy local conda recipe retained for reference.
- `packaging/pyinstaller/`: Windows executable build scripts and PyInstaller assets.
## Installation
Install from PyPI:
```bash
pip install Cell-GPS
```
Install from conda-forge:
```bash
conda install -c conda-forge cell-gps
```
Install directly from GitHub:
```bash
pip install git+https://github.com/hutaobo/Cell-GPS.git
```
For local development or reviewer inspection:
```bash
git clone https://github.com/hutaobo/Cell-GPS.git
cd Cell-GPS
pip install -e .
```
The package requires Python 3.9 or later.
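After installing, a quick environment check can confirm the interpreter version and which distributions are present. The sketch below uses only the standard library. The helper names are hypothetical, and the distribution names `Cell-GPS` and `pyXenium` are the ones mentioned elsewhere in this README.

```python
import sys
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.9+:", python_is_supported())
    for name in ("Cell-GPS", "pyXenium"):
        print(name, installed_version(name) or "not installed")
```

The check reads package metadata only, so it does not import `cellgps` and will not surface import-time errors.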
## Quick start from a coordinate table
The minimal input is a table with spatial coordinates and a cell-type column.
```python
import pandas as pd
from cellgps import compute_cophenetic_distances_from_df, plot_cophenetic_heatmap

df = pd.DataFrame(
    {
        "x": [0, 1, 5, 6],
        "y": [0, 1, 5, 6],
        "celltype": ["A", "A", "B", "B"],
    }
)

row_coph, col_coph = compute_cophenetic_distances_from_df(
    df=df,
    x_col="x",
    y_col="y",
    celltype_col="celltype",
)

plot_cophenetic_heatmap(
    row_coph,
    matrix_name="row_coph",
    output_dir="output",
    output_filename="StructureMap_example.pdf",
    sample="Example",
)
```
## Quick start from Xenium output
```python
from cellgps import (
    compute_cophenetic_distances_from_adata,
    load_xenium_data,
    load_xenium_table_bundle,
)

# Option 1: standard Xenium folder, loaded through pyXenium.io
adata = load_xenium_data("/path/to/xenium/run", normalize=False)

# Option 2: explicit table-bundle route used for the Atera Xenium benchmark.
# Use one loader or the other; calling both simply overwrites `adata`.
adata = load_xenium_table_bundle("/path/to/xenium/run", normalize=False)

row_coph, col_coph = compute_cophenetic_distances_from_adata(
    adata,
    cluster_col="Cluster",
    output_dir="output",
)
```
## Useful public entry points
- `load_xenium_data`: load and preprocess Xenium data.
- `load_xenium_table_bundle`: load Xenium data from `cells.parquet` + `*_cell_groups.csv` + `cell_feature_matrix.h5` through `pyXenium.io`.
- `compute_cophenetic_distances_from_df`: compute structure matrices from a coordinate table.
- `compute_weighted_searcher_findee_distance_matrix_from_df`: weighted searcher-findee kernel for entity, pathway, or ligand-receptor (LR) analysis.
- `compute_weighted_cophenetic_distances_from_df`: weighted StructureMap wrapper over the weighted kernel.
- `compute_cophenetic_distances_from_adata`: compute structure matrices from `AnnData`.
- `compute_entity_to_cell_topology`: generalize `t_and_c` from transcripts to arbitrary weighted entities.
- `compute_entity_structuremap`: build StructureMap-style topology among arbitrary weighted entities.
- `plot_cophenetic_heatmap`: generate StructureMap and related clustered heatmaps.
- `transcript_by_cell_analysis`: analyze transcript-to-cell spatial structure at scale.
- `ligand_receptor_topology_analysis`: score sender->receiver ligand-receptor candidates using topology, structure compatibility, and local contact.
- `ligand_receptor_target_consistency`: add a NicheNet-style downstream target-consistency layer.
- `compute_pathway_activity_matrix`: compute rank-based or weighted pathway activities per cell.
- `pathway_topology_analysis`: analyze pathway-to-cell and pathway-to-pathway spatial topology.
- `compute_cophenetic_distances_from_df_memory_opt`: memory-aware alternative for large tables.
- `plot_circular_dendrogram_pycirclize`: circular dendrogram visualization.
## Validation scope
The manuscript-validated scope is the COSTE/SSS workflow and the figure/table analyses mapped in `Cell-GPS manuscript code/` and `docs/cellgps_science_manuscript_code_inventory.md`. Ligand-receptor topology, pathway topology, Visium helpers, GUI entry points and other convenience APIs are included for reuse and development, but should be treated as optional or exploratory unless a manuscript notebook or documentation page explicitly maps them to a reported analysis.
## Notes for reviewers
- The curated figure and table notebooks for the bioRxiv preprint are kept in `Cell-GPS manuscript code/`.
- Raw experimental datasets are not bundled in this repository because of size and distribution constraints. The code expects standard spatial omics outputs such as Xenium folders or tabular coordinate inputs.
- Conda-forge packages the upstream Python distribution as `cell-gps`. The `sfplot` top-level namespace is bundled only as legacy compatibility inside the same distribution, not as a separate conda or PyPI package.
- When a `cellgps_tbc_formal_wta/results`-style directory is already available, the LR and pathway topology extensions are designed to reuse its `t_and_c_result_*.csv` and `StructureMap_table_*.csv` outputs as the preferred gene-level topology anchors before falling back to recomputation.
- Xenium loading depends on `pyXenium>=0.4.3`. Visium helpers remain optional through the separate `Cell-GPS[visium]` extra.
- A short repository walkthrough is available in [docs/project/REVIEWER_GUIDE.md](docs/project/REVIEWER_GUIDE.md).
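To check whether a results directory already holds the reusable topology tables mentioned above, a short standard-library sketch like the following can help. The function names are hypothetical. It assumes the CSV files sit directly inside the results directory and that both file families are needed for reuse; neither assumption is confirmed by this README.

```python
from pathlib import Path


def find_topology_anchors(results_dir):
    """Collect precomputed topology tables found directly in `results_dir`.

    The glob patterns follow the file names quoted in the notes above; the
    flat (non-recursive) directory layout is an assumption of this sketch.
    """
    root = Path(results_dir)
    patterns = {
        "t_and_c": "t_and_c_result_*.csv",
        "structuremap": "StructureMap_table_*.csv",
    }
    return {key: sorted(root.glob(pattern)) for key, pattern in patterns.items()}


def can_reuse(anchors):
    """True when every expected file family has at least one match."""
    return all(anchors.values())


if __name__ == "__main__":
    anchors = find_topology_anchors("cellgps_tbc_formal_wta/results")
    for key, paths in anchors.items():
        print(key, [p.name for p in paths])
    print("reuse possible:", can_reuse(anchors))
```

This only reports which files exist. It does not validate their contents or reproduce the fallback-to-recomputation logic of the LR and pathway extensions.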
## Documentation
Read the Docs documentation is available at https://cell-gps.readthedocs.io/en/latest/.
The manuscript-focused pages introduce the bioRxiv preprint, explain how to use the curated figure/table notebooks, and map each figure and supplementary table to its GitHub code location.
Sphinx documentation sources are available in `docs/`.
## Citation
If you use Cell-GPS or reuse the manuscript analysis code, please cite:
> Long M, Hu T, Sountoulidis A, Samakovlis C, Nilsson M. Cophenetic Spatial Topology Embedding reveals multiscale tissue architecture in spatial omics. bioRxiv. 2026. doi: [10.64898/2026.05.26.727847](https://doi.org/10.64898/2026.05.26.727847)
```bibtex
@article{long2026cellgps,
title = {Cophenetic Spatial Topology Embedding reveals multiscale tissue architecture in spatial omics},
author = {Long, Mengping and Hu, Taobo and Sountoulidis, Alexandros and Samakovlis, Christos and Nilsson, Mats},
journal = {bioRxiv},
year = {2026},
doi = {10.64898/2026.05.26.727847},
url = {https://www.biorxiv.org/content/10.64898/2026.05.26.727847v1}
}
```
## License
This project is released under the MIT License. See [LICENSE](LICENSE).