https://github.com/slowkow/harmonypy

🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.

bioinformatics data-integration data-science single-cell-analysis

Host: GitHub
URL: https://github.com/slowkow/harmonypy
Owner: slowkow
License: gpl-3.0
Created: 2019-12-19T17:25:59.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2024-07-08T16:55:48.000Z (about 1 year ago)
Last Synced: 2025-03-28T12:05:27.965Z (4 months ago)
Topics: bioinformatics, data-integration, data-science, single-cell-analysis
Language: Python
Homepage: https://portals.broadinstitute.org/harmony/
Size: 2.77 MB
Stars: 217
Watchers: 5
Forks: 22
Open Issues: 6
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

harmonypy
=========

[![Latest PyPI Version][pb]][pypi] [![PyPI Downloads][db]][pypi] [![tests][gb]][yml]  [![DOI](https://zenodo.org/badge/229105533.svg)](https://zenodo.org/badge/latestdoi/229105533)

[gb]: https://github.com/slowkow/harmonypy/actions/workflows/python-package.yml/badge.svg

[yml]: https://github.com/slowkow/harmonypy/actions/workflows/python-package.yml

[pb]: https://img.shields.io/pypi/v/harmonypy.svg

[pypi]: https://pypi.org/project/harmonypy/

[db]: https://img.shields.io/pypi/dm/harmonypy?label=pypi%20downloads

Harmony is an algorithm for integrating multiple high-dimensional datasets.

harmonypy is a port of the [harmony] R package by [Ilya Korsunsky].
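
To give a feel for the "fuzzy k-means" part of the description, here is a minimal sketch of a generic soft cluster assignment, where each point gets a membership weight for every cluster instead of a single hard label. This is a textbook-style illustration only, not Harmony's actual objective or harmonypy's code; the function name `soft_assign` and the `sigma` parameter are hypothetical names chosen for this example.

```python
import math

def soft_assign(point, centroids, sigma=0.1):
    """Return soft cluster memberships for one point (hypothetical helper).

    Memberships are proportional to exp(-squared_distance / sigma) and sum to 1.
    """
    # Squared Euclidean distance from the point to each centroid.
    d2 = [sum((p - c) ** 2 for p, c in zip(point, centroid))
          for centroid in centroids]
    # Shift by the smallest distance before exponentiating to avoid underflow.
    d_min = min(d2)
    weights = [math.exp(-(d - d_min) / sigma) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]

centroids = [[0.0, 0.0], [1.0, 1.0]]

# A point halfway between the two centroids belongs equally to both.
middle = soft_assign([0.5, 0.5], centroids)

# A point close to the first centroid belongs almost entirely to it.
near = soft_assign([0.1, 0.0], centroids)
```

A smaller `sigma` makes the assignment behave more like hard k-means, while a larger one spreads membership more evenly across clusters.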

Example
-------

This animation shows the Harmony alignment of three single-cell RNA-seq datasets from different donors.

[→ How to make this animation.](https://slowkow.com/notes/harmony-animation/)

Installation
------------

This package has been tested with Python 3.7.

Use [pip] to install:

```bash
pip install harmonypy
```
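
A quick way to confirm that the install worked is to ask Python whether the module can be found, without importing it. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name, not part of harmonypy.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None

print("harmonypy installed:", is_installed("harmonypy"))
```

If this prints `False`, check that `pip` installed into the same Python environment you are running.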

Usage
-----

Here is a brief example using the data that comes with the R package:

```python
# Load data
import numpy as np
import pandas as pd

meta_data = pd.read_csv("data/meta.tsv.gz", sep = "\t")
vars_use = ['dataset']

# meta_data
#
#                  cell_id dataset  nGene  percent_mito cell_type
# 0    half_TGAAATTGGTCTAG    half   3664      0.017722    jurkat
# 1    half_GCGATATGCTGATG    half   3858      0.029228      t293
# 2    half_ATTTCTCTCACTAG    half   4049      0.015966    jurkat
# 3    half_CGTAACGACGAGAG    half   3443      0.020379    jurkat
# 4    half_ACGCCTTGTTTACC    half   2813      0.024774      t293
# ..                   ...     ...    ...           ...       ...
# 295  t293_TTACGTACGACACT    t293   4152      0.033997      t293
# 296  t293_TAGAATTGTTGGTG    t293   3097      0.021769      t293
# 297  t293_CGGATAACACCACA    t293   3157      0.020411      t293
# 298  t293_GGTACTGAGTCGAT    t293   2685      0.027846      t293
# 299  t293_ACGCTGCTTCTTAC    t293   3513      0.021240      t293

# Read the principal components and convert them to a NumPy array
data_mat = pd.read_csv("data/pcs.tsv.gz", sep = "\t")
data_mat = np.array(data_mat)

# data_mat[:5,:5]
#
# array([[ 0.0071695 , -0.00552724, -0.0036281 , -0.00798025,  0.00028931],
#        [-0.011333  ,  0.00022233, -0.00073589, -0.00192452,  0.0032624 ],
#        [ 0.0091214 , -0.00940727, -0.00106816, -0.0042749 , -0.00029096],
#        [ 0.00866286, -0.00514987, -0.0008989 , -0.00821785, -0.00126997],
#        [-0.00953977,  0.00222714, -0.00374373, -0.00028554,  0.00063737]])

# meta_data.shape # 300 cells, 5 variables
# (300, 5)
#
# data_mat.shape  # 300 cells, 20 PCs
# (300, 20)

# Run Harmony
import harmonypy as hm

ho = hm.run_harmony(data_mat, meta_data, vars_use)

# Write the adjusted PCs to a new file.
res = pd.DataFrame(ho.Z_corr)
res.columns = ['X{}'.format(i + 1) for i in range(res.shape[1])]
res.to_csv("data/adj.tsv.gz", sep = "\t", index = False)
```
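
After running an integration like the one above, one rough sanity check is to compare how far apart the per-dataset centroids are before and after adjustment. The sketch below is not part of harmonypy: `centroids_by_label` and `max_centroid_gap` are hypothetical helpers, the toy coordinates are made up for illustration, and the code assumes one row per cell (transpose first if your array is stored the other way). It uses plain Python lists so it runs without extra dependencies.

```python
import math

def centroids_by_label(rows, labels):
    """Return {label: centroid} for rows grouped by label (hypothetical helper)."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        for j, value in enumerate(row):
            sums[label][j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def max_centroid_gap(rows, labels):
    """Largest Euclidean distance between any two group centroids."""
    cents = list(centroids_by_label(rows, labels).values())
    gaps = [math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))
            for i, c1 in enumerate(cents) for c2 in cents[i + 1:]]
    return max(gaps, default=0.0)

# Toy example: two datasets, "a" and "b", two cells each, two dimensions.
labels = ["a", "a", "b", "b"]
before = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
after = [[1.9, 0.0], [1.9, 2.0], [2.1, 0.0], [2.1, 2.0]]

gap_before = max_centroid_gap(before, labels)
gap_after = max_centroid_gap(after, labels)
print(gap_before, gap_after)
```

A smaller gap after adjustment suggests the datasets overlap more, but it is only a coarse signal: datasets with genuinely different cell-type compositions can legitimately keep distinct centroids.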

[harmony]: https://github.com/immunogenomics/harmony

[Ilya Korsunsky]: https://github.com/ilyakorsunsky

[pip]: https://pip.readthedocs.io/
