{"id":49973836,"url":"https://github.com/helicalAI/helical","last_synced_at":"2026-05-27T00:01:00.103Z","repository":{"id":237913837,"uuid":"792692027","full_name":"helicalAI/helical","owner":"helicalAI","description":"A framework for state-of-the-art pre-trained bio foundation models on genomics and transcriptomics modalities.","archived":false,"fork":false,"pushed_at":"2026-04-28T10:35:32.000Z","size":21923,"stargazers_count":209,"open_issues_count":4,"forks_count":37,"subscribers_count":4,"default_branch":"release","last_synced_at":"2026-04-28T12:17:38.628Z","etag":null,"topics":["artificial-intelligence","bioinformatics","biology","deep-learning","dna-sequences","evo2","foundation-models","gene-expression","geneformer","helixmrna","pre-trained-model","pre-training","rna","rna-seq","rnaseq","scgpt","transcriptformer","transformer","uce","vcf"],"latest_commit_sha":null,"homepage":"https://www.helical-ai.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/helicalAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.bib","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-04-27T09:46:06.000Z","updated_at":"2026-04-28T02:06:48.000Z","dependencies_parsed_at":"2024-05-30T21:30:14.835Z","dependency_job_id":"7fecacac-01da-4bf0-adef-72e1b0e8c706","html_url":"https://github.com/helicalAI/helical","commit_stats":null,"previous_names":["helicalai/helical"],"tags_count":68,"template":false,"template_full_name":null,"purl":"pkg:github/helicalAI/heli
cal","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helicalAI%2Fhelical","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helicalAI%2Fhelical/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helicalAI%2Fhelical/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helicalAI%2Fhelical/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/helicalAI","download_url":"https://codeload.github.com/helicalAI/helical/tar.gz/refs/heads/release","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helicalAI%2Fhelical/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33543973,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"ssl_error","status_checked_at":"2026-05-26T15:22:15.568Z","response_time":63,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","bioinformatics","biology","deep-learning","dna-sequences","evo2","foundation-models","gene-expression","geneformer","helixmrna","pre-trained-model","pre-training","rna","rna-seq","rnaseq","scgpt","transcriptformer","transformer","uce","vcf"],"created_at":"2026-05-18T10:00:22.018Z","updated_at":"2026-05-27T00:01:00.083Z","avatar_url":"https://github.com/helicalAI.png","language":"Python","funding_links":[],"categories":["🔬 Domain-Specific Applications","Domain Applications"],"sub_categories":["🧬 Biology \u0026 Medicine","Science, Medicine, and Quant"],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cp\u003e\u003ca href=\"https://helical.readthedocs.io/\"/\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"docs/assets/logo_and_text_v2_white.png\"\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"docs/assets/logo_and_text_v2.png\"\u003e\n    \u003cimg alt=\"Helical Logo\" src=\"docs/assets/logo_and_text_v2_white.png\" width=\"300\"\u003e\n  \u003c/picture\u003e\n  \u003c/a\u003e\u003c/p\u003e\n\u003c/div\u003e\n\n\n# What is Helical ?\n\nHelical builds the Virtual AI Lab for Biological Discovery.\nThis open framework provides access to state-of-the-art Bio Foundation Models across genomics, transcriptomics, and single-cell data modalities.\n\nHelical simplifies the entire lifecycle of applying Bio Foundation Models — from model access to fine-tuning and in-silico experimentation. 
With Helical's open-source framework, you can:\n\t•\tLeverage the latest Bio Foundation Models through a simple Python interface\n\t•\tRun example notebooks for key downstream tasks\n\t•\tCustomize models and workflows for your own datasets and experiments\n  \nThis repository is continuously updated with new models, benchmarks, and utilities.\nJoin us in shaping the next generation of AI-powered biology.\n\nLet’s build the most exciting AI-for-Bio community together!\n\u003cdiv align=\"center\"\u003e\n\n![Workflow](https://github.com/helicalAI/helical/actions/workflows/release.yml/badge.svg) \u0026nbsp;\n![Workflow](https://github.com/helicalAI/helical/actions/workflows/github-code-scanning/codeql/badge.svg) \u0026nbsp;\n[![Docs](https://img.shields.io/badge/docs-available-brightgreen)](https://helical.readthedocs.io/) \u0026nbsp;\n[![PyPI version](https://badge.fury.io/py/helical.svg)](https://badge.fury.io/py/helical) \u0026nbsp;\n![GitHub contributors](https://img.shields.io/github/contributors/helicalAI/helical) \u0026nbsp;\n\n\u003c/div\u003e\n\n## What's new?\n\n### Tahoe-x1\nWe have integrated the Tahoe-x1 foundation model for single-cell RNA-seq data. This transformer-based model can extract both cell and gene embeddings from raw count data and supports attention weight extraction for interpretability. Try it out with our [comprehensive tutorial notebook](./examples/notebooks/Tahoe-x1-Tutorial.ipynb)!\n\n### Cell2Sentence-Scale\nWe have integrated the new Cell2Sentence-Scale models which use cell sentences as input and are based on the Gemma language model architecture (2B and 27B models available in quantised versions too). You can use this model for embeddings and perturbation prediction. Follow our notebook tutorial [here](./examples/notebooks/Cell2Sen-Tutorial.ipynb). \n\n### New Larger Geneformer Models\nWe have integrated the new Geneformer models which are larger and have been trained on more data. 
Find out which models have been integrated into the Geneformer suite in the [model card](./helical/models/geneformer/README.md). Check out our notebook on drug perturbation prediction using different Geneformer scalings [here](./examples/notebooks/Geneformer-Series-Comparison.ipynb).\n\n\n### TranscriptFormer\nWe have integrated [TranscriptFormer](https://github.com/czi-ai/transcriptformer) into our helical package and have made a model card for it in our [Transcriptformer model folder](helical/models/transcriptformer/README.md). If you would like to test the model, take a look at our [example notebook](examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb)!\n\n### 🧬 Introducing Helix-mRNA-v0: Unlocking new frontiers \u0026 use cases in mRNA therapy 🧬\nWe’re thrilled to announce the release of our first-ever mRNA Bio Foundation Model, designed to:\n\n1) Be Efficient, handling long sequence lengths effortlessly\n2) Balance Diversity \u0026 Specificity, leveraging a 2-step pre-training approach\n3) Deliver High Resolution, working at single-nucleotide resolution\n\nCheck out our \u003ca href=\"https://www.helical-ai.com/blog/helix-mrna-v0\" target=\"_blank\"\u003eblog post\u003c/a\u003e to learn more about our approach and read the \u003ca href=\"https://helical.readthedocs.io/en/latest/model_cards/helix_mrna/\" target=\"_blank\"\u003emodel card\u003c/a\u003e to get started.\n\n## Installation\n\nWe recommend installing Helical within a conda environment with the commands below (run them in your terminal) - this step is optional:\n```\nconda create --name helical-package python=3.11.13\nconda activate helical-package\n```\n\nTo install the latest pip release of our Helical package, you can run the command below:\n```\npip install helical\n```\n\n***Note***\nSometimes Torch is not installed as the CUDA compiled version (e.g. 
on different architectures), so you may need to install Helical with GPU support manually. Run the command below, replacing XXX with your CUDA version (e.g. 128 for CUDA 12.8), or install PyTorch with CUDA first and then install Helical:\n```\npip install helical --extra-index-url https://download.pytorch.org/whl/cuXXX\n```\n\nTo install the latest version of Helical directly from the GitHub repository, you can run the command below:\n```\npip install --upgrade git+https://github.com/helicalAI/helical.git\n```\n\nAlternatively, clone the repo and install it:\n```\ngit clone https://github.com/helicalAI/helical.git\ncd helical\npip install .\n```\n\n\n### Flash Attention Support\nTo enable Flash Attention (required by some models), run the command below:\n```\npip install flash-attn --no-build-isolation\n```\n**Important** Make sure that your PyTorch CUDA version matches your system CUDA version, especially when using flash-attn.\n\n### Mamba-SSM Model Installation\n[Optional] To install mamba-ssm and causal-conv1d, use the command below:\n```\npip install helical[mamba-ssm]\n```\nor, if you are installing from a local clone of the Helical repo:\n```\npip install .[mamba-ssm]\n```\n\n### Evo2 Model Installation\nTo install Evo2 specifically, follow the instructions in the [evo-2 model card](helical/models/evo_2/README.md).\n\n### Tahoe-X1 Model Installation\nTo install Tahoe-X1, run the following after installing helical:\n```\npip install helical[tahoe]\n```\n\n## Notes on the installation:\n- Make sure your machine has GPU(s) and CUDA installed. Currently this is a requirement for the packages mamba-ssm and causal-conv1d.\n- The package `causal_conv1d` requires `torch` to be installed already. Installing `helical` first (without `[mamba-ssm]`) will install `torch` for you. 
A second installation (with `[mamba-ssm]`) then installs the packages correctly.\n- If you have problems installing `mamba-ssm`, you can install the package via the provided `.whl` files on their release page [here](https://github.com/state-spaces/mamba/releases/tag/v2.2.4). Choose the package according to your CUDA, torch and Python version:\n```\npip install https://github.com/state-spaces/mamba/releases/download/v2.2.4/mamba_ssm-2.2.4+cu12torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl\n```\n- Now continue with `pip install .[mamba-ssm]` to also install the remaining `causal-conv1d`.\n\n### Singularity (Optional)\nIf you want to run your code in a Singularity container, you can use the [singularity.def](./singularity.def) file and build an Apptainer sandbox with it:\n```\napptainer build --sandbox singularity/helical singularity.def\n```\n\nand then shell into the sandbox container (use the --nv flag if you have a GPU available):\n```\napptainer shell --nv --fakeroot singularity/helical/\n```\n\n### RNA models:\n- [Helix-mRNA](https://helical.readthedocs.io/en/latest/model_cards/helix_mrna/)\n- [Mamba2-mRNA](https://helical.readthedocs.io/en/latest/model_cards/mamba2_mrna/)\n- [Geneformer](https://helical.readthedocs.io/en/latest/model_cards/geneformer/)\n- [scGPT](https://helical.readthedocs.io/en/latest/model_cards/scgpt/)\n- [Universal Cell Embedding (UCE)](https://helical.readthedocs.io/en/latest/model_cards/uce/)\n- [TranscriptFormer](https://helical.readthedocs.io/en/latest/model_cards/transcriptformer/)\n- [Tahoe-x1](https://helical.readthedocs.io/en/latest/model_cards/tahoe/)\n\n### DNA models:\n- [HyenaDNA](https://helical.readthedocs.io/en/latest/model_cards/hyena_dna/)\n- [Caduceus](https://helical.readthedocs.io/en/latest/model_cards/caduceus/)\n- [Evo 2](https://helical.readthedocs.io/en/latest/model_cards/evo_2/)\n\n\n## Demo \u0026 Use Cases\n\nTo run the examples, make sure the Helical package is installed (see Installation) and up to date.\n\nYou 
can look directly into the example folder above and download the script of your choice, look into our [documentation](https://helical.readthedocs.io/) for step-by-step guides or directly clone the repository using:\n```\ngit clone https://github.com/helicalAI/helical.git\n```\nWithin the `examples/notebooks` folder, open the notebook of your choice. We recommend starting with `Quick-Start-Tutorial.ipynb`\n\n### Current Examples:\n\n| Example | Description | Colab |\n| ----------- | ----------- |----------- |                                                        \n|[Quick-Start-Tutorial.ipynb](./examples/notebooks/Quick-Start-Tutorial.ipynb)| A tutorial to quickly get used to the helical package and environment. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Quick-Start-Tutorial.ipynb)|\n|[Helix-mRNA.ipynb](./examples/notebooks/Helix-mRNA.ipynb)|An example of how to use the Helix-mRNA model.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Helix-mRNA.ipynb) |\n|[Geneformer-vs-TranscriptFormer.ipynb](./examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb) | Zero-Shot Reference Mapping with Geneformer \u0026 TranscriptFormer and compare the outcomes. 
| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-vs-TranscriptFormer.ipynb) |\n|[Hyena-DNA-Inference.ipynb](./examples/notebooks/Hyena-DNA-Inference.ipynb)|An example how to do probing with HyenaDNA by training a neural network on 18 downstream classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Hyena-Dna-Inference.ipynb) |\n|[Cell-Type-Annotation.ipynb](./examples/notebooks/Cell-Type-Annotation.ipynb)|An example how to do probing with scGPT by training a neural network to predict cell type annotations.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Annotation.ipynb) |\n|[Cell-Type-Classification-Fine-Tuning.ipynb](./examples/notebooks/Cell-Type-Classification-Fine-Tuning.ipynb)|An example how to fine-tune different models on classification tasks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Type-Classification-Fine-Tuning.ipynb) |\n|[HyenaDNA-Fine-Tuning.ipynb](./examples/notebooks/HyenaDNA-Fine-Tuning.ipynb)|An example of how to fine-tune the HyenaDNA model on downstream benchmarks.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/HyenaDNA-Fine-Tuning.ipynb) |\n|[Cell-Gene-Cls-embedding-generation.ipynb](./examples/notebooks/Cell-Gene-Cls-embedding-generation.ipynb)|A notebook explaining the different embedding modes of single cell RNA models.|[![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell-Gene-Cls-embedding-generation.ipynb) |\n|[Geneformer-Series-Comparison.ipynb](./examples/notebooks/Geneformer-Series-Comparison.ipynb)|A zero shot comparison between Geneformer model scaling on drug perturbation prediction|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Geneformer-Series-Comparison.ipynb) |\n|[Cell2Sen-Tutorial.ipynb](./examples/notebooks/Cell2Sen-Tutorial.ipynb)|An example tutorial of how to use cell2sen models for embeddings and perturbation predictions.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Cell2Sen-Tutorial.ipynb) |\n|[Tahoe-x1-Tutorial.ipynb](./examples/notebooks/Tahoe-x1-Tutorial.ipynb)|A comprehensive tutorial on using the Tahoe-x1 model for extracting cell and gene embeddings, with attention visualization.|[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/helicalAI/helical/blob/main/examples/notebooks/Tahoe-x1-Tutorial.ipynb) |\n\n\n## Stuck somewhere ? 
Other ideas ?\nWe are eager to help you and interact with you:\n- Join our [Slack channel](https://dk1sxv04.eu1.hubspotlinksfree.com/Ctc/L2+113/dk1sxv04/VWtlqj8M7nFNVf1vhw52bPfMW8wLjj95ptQw7N1k24YY3m2ndW8wLKSR6lZ3ldW7fZmPx5PxJ2lW8mYJtq5xWH5BVsxw821cWpdKW8CYXdj753XHSW8b5vG-7PTQ2LW1zs6x622rZxDW6930hX7RPKh3N5-trBXyRHkwVfJ3Zs3wRQV_N5NbYL3-lm47W1HvYX63pJp9cW6QXY-x6QsWMTW8G5jZh7T4vphN4Qtr7dMCxlJW8rM1-Y42pS-PW5sfJbh4FyRMhW5mHPkD4yCl56W36YW1_4GpPrGW7-sRYG1gXy8hMXqK6Sp5p69W8YTpvd3tC80SW2PTYtr6hP0dxW863B5F4KNCYkVFSWl390bSlQW78rxWn7JbS3LW14ZJ735n7SpFVSVlQr7lm7vwVlWslf6g9JRQf8mBL3b04) where you can discuss applications of bio foundation models.\n- You can also open GitHub issues [here](https://github.com/helicalAI/helical/issues).\n\n## Why should I use Helical \u0026 what to expect in the future?\nIf you are working (or plan to work) with bio foundation models such as Geneformer or UCE on RNA and DNA data, Helical will be your best buddy! We provide and improve on:\n- Up-to-date model library\n- A unified API for all models\n- User-facing abstractions tailored to computational biologists, researchers \u0026 AI developers\n- Innovative use case and application examples and ideas\n- Efficient data processing \u0026 code-base\n\nWe will continuously upload the latest models, publish benchmarks and make our code more efficient.\n\n## Contributing\n\nWe welcome all kinds of contributions, including code, documentation, bug reports, and feature suggestions. Please read our [Contributing Guidelines](CONTRIBUTING.md) to help us keep the project organized and collaborative.\n\n## Acknowledgements\n\nA lot of our models have been published by talented authors developing these exciting technologies. 
We sincerely thank the authors of the following open-source projects:\n\n- [scGPT](https://github.com/bowang-lab/scGPT/)\n- [Geneformer](https://huggingface.co/ctheodoris/Geneformer)\n- [UCE](https://github.com/snap-stanford/UCE)\n- [TranscriptFormer](https://github.com/czi-ai/transcriptformer)\n- [HyenaDNA](https://github.com/HazyResearch/hyena-dna)\n- [Cell2Sen](https://github.com/vandijklab/cell2sentence)\n- [Tahoe-X1](https://github.com/tahoebio/tahoe-x1)\n- [llm-foundry](https://github.com/mosaicml/llm-foundry)\n- [composer](https://github.com/mosaicml/composer)\n- [anndata](https://github.com/scverse/anndata)\n- [scanpy](https://github.com/scverse/scanpy)\n- [transformers](https://github.com/huggingface/transformers)\n- [scikit-learn](https://github.com/scikit-learn/scikit-learn)\n- [GenePT](https://github.com/yiqunchen/GenePT)\n- [Caduceus](https://github.com/kuleshov-group/caduceus)\n- [Evo2](https://github.com/ArcInstitute/evo2)\n- [torch](https://github.com/pytorch/pytorch/blob/main/LICENSE)\n- [torchvision](https://github.com/pytorch/vision/blob/release/0.21/LICENSE)\n\n### Licenses\n\nYou can find the Licenses for each model implementation in the model repositories:\n\n- [Helix-mRNA](https://github.com/helicalAI/helical/blob/release/helical/models/helix_mrna/LICENSE)\n- [Mamba2-mRNA](https://github.com/helicalAI/helical/blob/release/helical/models/mamba2_mrna/LICENSE)\n- [scGPT](https://github.com/helicalAI/helical/blob/release/helical/models/scgpt/LICENSE)\n- [Geneformer](https://github.com/helicalAI/helical/blob/release/helical/models/geneformer/LICENSE)\n- [UCE](https://github.com/helicalAI/helical/blob/release/helical/models/uce/LICENSE)\n- [TranscriptFormer](https://github.com/helicalAI/helical/blob/release/helical/models/transcriptformer/LICENSE.md)\n- [HyenaDNA](https://github.com/helicalAI/helical/blob/release/helical/models/hyena_dna/LICENSE)\n- [Evo2](https://github.com/helicalAI/helical/blob/release/helical/models/evo_2/LICENSE)\n- 
[Cell2Sen](https://github.com/helicalAI/helical/blob/release/helical/models/c2s/LICENSE)\n- [Tahoe-X1](https://github.com/helicalAI/helical/blob/release/helical/models/tahoe/LICENSE)\n\n## Citation\n\nPlease use this BibTeX to cite this repository in your publications:\n\n```bibtex\n@software{allard_2024_13135902,\n  author       = {Helical Team},\n  title        = {helicalAI/helical: v1.1.0},\n  month        = nov,\n  year         = 2024,\n  publisher    = {Zenodo},\n  version      = {1.1.0},\n  doi          = {10.5281/zenodo.13135902},\n  url          = {https://doi.org/10.5281/zenodo.13135902}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FhelicalAI%2Fhelical","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FhelicalAI%2Fhelical","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FhelicalAI%2Fhelical/lists"}