An open API service indexing awesome lists of open source software.

https://github.com/mikhaeelatefrizk/bindsight

RNA-seq counts to ranked de novo protein binder candidates, with full provenance back to the patient cohort.
https://github.com/mikhaeelatefrizk/bindsight

alphafold bioinformatics boltz computational-biology de-novo-binder-design protein-design proteinmpnn prov-o reproducibility rfdiffusion rna-seq ro-crate streamlit

Last synced: 4 days ago
JSON representation

RNA-seq counts to ranked de novo protein binder candidates, with full provenance back to the patient cohort.

Awesome Lists containing this project

README

          

# bindsight

> **Expression โ†’ Binder.** The first open-source pipeline that takes RNA-seq counts and outputs ranked de novo protein binder candidates, with full provenance back to the patient cohort.

[![HF Space](https://img.shields.io/badge/%F0%9F%A4%97%20HF%20Space-bindsight-yellow.svg)](https://huggingface.co/spaces/Mikhaeelatefrizk/bindsight)
[![Open in Streamlit](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://bindsight.streamlit.app/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20121496.svg)](https://doi.org/10.5281/zenodo.20121496)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](LICENSE)
[![Python: 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![CI](https://github.com/mikhaeelatefrizk/bindsight/actions/workflows/ci.yml/badge.svg)](https://github.com/mikhaeelatefrizk/bindsight/actions/workflows/ci.yml)
[![Workflow: Snakemake](https://img.shields.io/badge/workflow-Snakemake-brightgreen.svg)](https://snakemake.github.io/)

## ๐Ÿ‘‰ Try it live

**Primary** (Hugging Face Space, 16 GB CPU): **[huggingface.co/spaces/Mikhaeelatefrizk/bindsight](https://huggingface.co/spaces/Mikhaeelatefrizk/bindsight)**
**Mirror** (Streamlit Community Cloud, 1 GB CPU): [bindsight.streamlit.app](https://bindsight.streamlit.app/)

Zero install โ€” runs in your browser. Click the **Demo** tab and watch the **discovery half** surface antibody-tractable cell-surface antigens from a **real TCGA breast-cancer cohort** (NIH/GDC), with full provenance. (Binder *design* and *validation* are GPU-only โ€” you run those locally via Modal / Docker / Kaggle / Colab, so they don't execute in the browser.)

> Both hosts are free-tier and will sleep after several days without traffic; a GitHub Actions cron pings both URLs every 6 hours so the next visitor lands on a warm container. If you hit either link after a long quiet stretch, give the wake-up screen 30โ€“60 s and reload once.

> ๐Ÿš€ **v0.2.0** โ€” discovery half end-to-end on CPU (real TCGA data); design + validation now **proven** end-to-end on a **free GPU** โ€” bindsight's first real de novo binders (20 ERBB2 designs, best ipTM 0.84, 50% success@0.65, with the real Boltz-2-predicted complexes) ship in the [designer benchmark](benchmarks/designer_benchmark/RESULTS.md); web UI deployed on Streamlit Cloud.

**New here?** โ†’ [What is bindsight?](docs/what-is-bindsight.md) (5-min read) ยท [How to use it](docs/how-to-use.md) ยท [Use cases](docs/use-cases.md) ยท [Designing on Colab](docs/colab-design-howto.md)

---

## Three ways to try it

### 1. Web app โ€” [Hugging Face Space](https://huggingface.co/spaces/Mikhaeelatefrizk/bindsight) (zero install) ยท [Streamlit mirror](https://bindsight.streamlit.app/)

Anyone visiting either URL above gets:
- The Home page with what bindsight is
- A **Demo** button that runs the discovery half live and renders a report
- A **Run on my data** page (upload counts.tsv + design.tsv โ†’ get results)
- A **Browse a run** page to inspect any output directory

The Hugging Face Space is the primary mirror (16 GB CPU). The Streamlit Cloud deploy at `bindsight.streamlit.app` is the same app on smaller free-tier infrastructure (1 GB CPU). Both hosts sleep after several days of inactivity; a 6-hourly GitHub Actions ping keeps them warm, but the very first visit after a long quiet period can still take ~30โ€“120 s to wake.

### 2. Local web app (one command)

```bash
pip install -e ".[discover,report]"
bindsight ui
# โ†’ opens http://localhost:8501 with the same multi-page interface
```

### 3. CLI

```bash
bindsight demo
```

Runs the full discovery half on a **real TCGA-BRCA tumor-vs-adjacent-normal cohort** auto-downloaded from NIH/GDC, and produces a real HTML report you can open in a browser. The pipeline discovers antibody-tractable cell-surface antigens over-expressed in tumor โ€” entirely from RNA-seq counts, with full provenance (well-known targets such as ERBB2/HER2 surface among the candidates when their signal is present). First run needs internet (cohort + SURFY downloaded, then cached) and takes a few minutes of real DESeq2 + enrichment; CPU-only, no GPU.

```
$ bindsight demo
โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ Demo run โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
โ”‚ Real TCGA-BRCA tumor-vs-adjacent-normal โ”‚
โ”‚ cohort (NIH/GDC). Discovers antibody- โ”‚
โ”‚ tractable cell-surface antigens, with โ”‚
โ”‚ full provenance. โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
INFO GDC: downloading TCGA-BRCA cohort (20 tumor + 20 normal)โ€ฆ
INFO SURFY cache empty; populating the full surfaceome list (2886)
INFO DEGs: 17019 total, 4011 significant; enriching top 300 up-regulated
INFO surfaceome filter: 300 โ†’ 42
INFO wrote runs/demo/report.html
โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ bindsight demo โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
โ”‚ Demo complete! โ”‚
โ”‚ Report HTML: runs/demo/report.html โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
```

---

## Why this exists

Two ecosystems in computational biology operate side-by-side and barely talk to each other:

- **Genomics** (DESeq2, edgeR, Seurat, scanpy, TCGA, recount3) stops at *"here are the interesting genes."*
- **Protein design** (RFdiffusion, ProteinMPNN, BindCraft, BoltzGen, AlphaFold, Boltz-2) starts from *"given a target..."*

The bridge between them โ€” *"this gene is up in disease, low in healthy tissue, surface-exposed, has a known targetable site, here is a docked binder seed and a designed binder ranked by predicted affinity, with the receipts back to the patient cohort"* โ€” is missing. People build it ad-hoc, per project, never reproducibly. **bindsight ships that bridge as one tool.**

## What it does

```
RNA-seq counts (bulk or sc) Designed protein binders
โ”‚ โ–ฒ
โ”‚ โ”‚
โ–ผ โ”‚
Differential expression โ”€โ”€โ–บ Surface-exposed โ”€โ”€โ–บ De novo backbone
(pydeseq2 or DESeq2) (SURFY) (RFdiffusion / BindCraft / BoltzGen)
โ”‚ โ”‚
โ–ผ โ–ผ
Targetable sites Sequence design
(SURFACE-Bind, v0.2) (ProteinMPNN)
โ”‚ โ”‚
โ–ผ โ–ผ
AlphaFoldDB structure Affinity + structure
validation
(Boltz-2 / Chai-1r)
โ”‚
โ–ผ
Multi-objective ranking
โ”‚
โ–ผ
HTML report + RO-Crate (Zenodo)
with full PROV-O provenance
```

## Who it's for

- **Translational researchers** who want a free, reproducible "data โ†’ designed binder" pipeline.
- **Clinical biologists** who need an audit trail back from a binder to the patient cohort.
- **Method developers** who want a held-out evaluation harness (rediscovery of known antigens) to benchmark new designers/validators.
- **Pharma early-discovery teams** who want an open comparator they can extend with proprietary designers via the plugin interface.

## What's distinctive

| | Existing protein-design tools | bindsight |
|---|---|---|
| Input | Target structure | RNA-seq counts |
| Provenance | PDB + maybe a log | PROV-O JSON-LD + RO-Crate, audit trail to patient cohort |
| Hardware | HPC assumed | CPU laptop + offload to free Colab / Modal / Kaggle |
| Cost-awareness | None | `--dry-run` estimates GPU $ before running |
| Negative results | Discarded | Catalogued (`failure_taxonomy.parquet`) |
| Citability | Code dump | DOI per release, JSON-Schema-validated outputs, JOSS-style |

For the full landscape comparison, see [ARCHITECTURE.md](ARCHITECTURE.md#8-comparison-vs-existing-tools).

## What works today (v0.2.0)

| Capability | Status | How to try |
|---|---|---|
| **Web UI** โ€” multi-page Streamlit app (Home / Demo / Run on my data / Browse / About) | โœ… ready | `bindsight ui` *or* Streamlit Cloud |
| **`bindsight demo`** โ€” full discovery on shipped example + paper-style report | โœ… ready | `bindsight demo` |
| **`bindsight discover`** โ€” your own RNA-seq cohort โ†’ ranked targets | โœ… ready | `bindsight discover my.yaml --out runs/x` |
| **`bindsight rank`** โ€” multi-objective composite scoring of validated binders | โœ… ready | `bindsight rank runs/x` |
| **`bindsight report --format html`** โ€” paper-style HTML, embedded volcano + tables + provenance | โœ… ready | `bindsight report runs/x` |
| **`bindsight report --format streamlit`** โ€” interactive dashboard for one run | โœ… ready | `bindsight report runs/x --format streamlit` |
| **`bindsight run`** โ€” full pipeline orchestrator (discover โ†’ design โ†’ validate โ†’ rank โ†’ report โ†’ export) | โœ… ready | `bindsight run my.yaml --out runs/x` |
| **`bindsight export`** โ€” RO-Crate zip for Zenodo deposit | โœ… ready | `bindsight export runs/x --out runs/x.crate.zip` |
| **`bindsight design`** โ€” RFdiffusion + ProteinMPNN + Boltz-2 (and BindCraft / BoltzGen / Chai-1r / AF2-IG) run end-to-end on a GPU backend | โœ… ready | `bindsight design runs/x --backend modal` (or `local_docker` / `kaggle` / `colab`) |
| **`bindsight design --dry-run`** โ€” GPU cost estimate for any backend | โœ… ready | `bindsight design runs/x --backend modal --dry-run` |
| **`bindsight validate`** โ€” materialise structure/affinity metrics โ†’ `validated.parquet` | โœ… ready | `bindsight validate runs/x` |
| **`bindsight benchmark`** โ€” score rediscovery of the held-out known antigens (recall@k) | โœ… ready | `bindsight benchmark runs/x --known-antigens benchmarks/known.tsv` |
| **Snakemake front-end** โ€” same pipeline as the CLI, end-to-end | โœ… ready | `snakemake --configfile my.yaml --cores 4` (`pip install -e ".[workflow]"`) |
| **`bindsight doctor`** โ€” diagnose deps, caches, vendored data | โœ… ready | `bindsight doctor` |
| **`bindsight verify-licenses`** โ€” per-component license inventory | โœ… ready | `bindsight verify-licenses` |

> **Note on GPU stages.** The design/validation models require CUDA, so they
> run on the GPU backend you choose (Modal / local Docker / Kaggle, or a
> generated Colab notebook), not on the CPU host. The held-out evaluation set
> lives in [`benchmarks/`](benchmarks/) with full provenance.

> **Discovery quality filters (opt-in).** Beyond the core
> DE โ†’ surfaceome โ†’ structure path, discovery can apply real-data refinements via
> `target_discovery` config flags: an AlphaFold-pLDDT disorder gate
> (`min_mean_plddt`), UniProt extracellular-domain / topology restriction
> (`use_uniprot_topology`, `require_extracellular_domain`), and GTEx normal-tissue
> safety (`use_gtex_safety`) โ€” each adds a negative-result disposition and a
> per-candidate column. Binder developability scoring (Biopython ProtParam) is a
> ranking component; an ESM-2 โ†’ PCA embedding visualizer (`pip install -e ".[embed]"`)
> shows the designed-binder sequence space before any GPU spend; and the report
> carries a Limitations section (mRNA โ‰  surface protein, bulk-purity confounding).
> All are documented in the [CHANGELOG](CHANGELOG.md).

## Status & roadmap

- โœ… **v0.2.0** (current) โ€” everything in v0.1.0 (discovery on real TCGA data; full design half โ€” RFdiffusion + ProteinMPNN + Boltz-2, plus BindCraft / BoltzGen / Chai-1r / AF2-IG โ€” on Modal / local Docker / Kaggle / Colab; rank + report + export; benchmark + held-out eval set; CLI **and** Snakemake front-ends; web UI) **plus** the first real de novo binders, the free Kaggle split-environment backend, the negative-result taxonomy, SURFACE-Bind targetable-site lookup, opt-in discovery-quality filters (AlphaFold-pLDDT disorder gate, UniProt extracellular-domain/topology restriction, GTEx normal-tissue safety), binder developability scoring, an ESM-2 pre-GPU embedding visualizer, and surfaced discovery caveats (mRNA โ‰  surface protein, bulk-purity confounding).
- โœ… **Rediscovery validation** โ€” the discovery half, run on six real indication-matched TCGA cohorts, resurfaces **ERBB2 at rank 4** in HER2-enriched breast cancer (via PAM50 subtype stratification โ€” versus rank 25 in the unsplit BRCA cohort, where averaging across subtypes dilutes the HER2 signal) and is specific (non-over-expressed antigens such as EGFR/CEA are correctly not surfaced). Reproducible artifacts in [`benchmarks/validation/`](benchmarks/validation/RESULTS.md); write-up in [`paper/validation/`](paper/validation/manuscript.md).
- โœ… **De novo binder design validated** โ€” the design half (RFdiffusion โ†’ ProteinMPNN โ†’ Boltz-2) run on a **free Kaggle Tesla P100** produced **20 real binders** against the ERBB2 extracellular **domain IV** (the clinically validated trastuzumab epitope): mean **ipTM 0.59**, best **0.84**, **50 %** of designs pass the ipTM โ‰ฅ 0.65 success bar (mean PAE-interaction 13.7 ร…) โ€” at **$0**, no local GPU. The real Boltz-2-predicted **complexes** (CIF) + FASTAs + per-design metrics are in [`benchmarks/designer_benchmark/RESULTS.md`](benchmarks/designer_benchmark/RESULTS.md); reproduce on a free GPU via [`RUN_FREE_GPU.md`](benchmarks/designer_benchmark/RUN_FREE_GPU.md).
- โณ **v0.3.0** โ€” single-cell RNA-seq input, async (non-blocking) Modal job submission, and extending the [designer benchmark](benchmarks/designer_benchmark/DESIGNER_BENCHMARK.md) from the committed `rfdiff_mpnn` arm to the full three-way comparison (BindCraft / BoltzGen need โ‰ฅ24โ€“32 GB GPUs, so those arms run on paid backends).
- โณ **v1.0.0** โ€” JOSS submission; multi-modal tumor-selectivity scoring (single-cell + co-expression + immunopeptidomics) to extend discovery beyond bulk differential expression.

See [ARCHITECTURE.md ยง Phased Roadmap](ARCHITECTURE.md#11-phased-roadmap) for details.

## Install

`bindsight` is not yet on PyPI. Install from source (Windows / macOS / Linux,
Python 3.11+):

```bash
git clone bindsight
cd bindsight
python -m venv .venv

# Windows
.venv\Scripts\activate

# macOS / Linux
source .venv/bin/activate

pip install -e ".[dev,discover,report]"
bindsight --version
bindsight doctor # confirm install is clean
bindsight demo # run the 60-second demo
```

For Conda users, `envs/discover.yaml` provides the same set of dependencies:

```bash
mamba env create -f envs/discover.yaml
mamba activate bindsight-discover
pip install -e ".[dev,report]"
```

## Quickstart

```bash
# 1. Discover targets from a TCGA cohort (CPU only, ~10 minutes on a laptop)
bindsight discover examples/tcga_luad.yaml --out runs/luad_v01

# 2. Inspect the discovered targets
bindsight report runs/luad_v01 --format html
open runs/luad_v01/report.html

# 3. (v0.1+) Design binders for the top 5 targets via Colab GPU
bindsight design runs/luad_v01 --backend colab --trajectories 50

# 4. (v0.1+) Validate with Boltz-2
bindsight validate runs/luad_v01 --backend colab --validator boltz2

# 5. (v0.1+) Rank, report, export as RO-Crate
bindsight rank runs/luad_v01
bindsight report runs/luad_v01 --format html --include-binders
bindsight export runs/luad_v01 --format ro-crate --out runs/luad_v01.crate.zip
```

## Repository layout

```
bindsight/ # Python package
โ”œโ”€โ”€ io/ # Parquet, FASTA, PDB, mmCIF, manifest readers
โ”œโ”€โ”€ deg/ # pydeseq2 wrapper (+ optional R bridge)
โ”œโ”€โ”€ targets/ # Open Targets client + ENSGโ†’UniProt fallback + GTEx safety
โ”œโ”€โ”€ surfaceome/ # SURFY filter + SURFACE-Bind client
โ”œโ”€โ”€ structures/ # AlphaFoldDB + RCSB/PDBe fetch; pLDDT + UniProt topology
โ”œโ”€โ”€ epitopes/ # SURFACE-Bind site lookup; fpocket fallback (v0.2)
โ”œโ”€โ”€ design/ # Designer plugin interface; developability + ESM-2 embeddings
โ”œโ”€โ”€ runners/ # Colab / Modal / Kaggle / local-Docker adapters
โ”œโ”€โ”€ validate/ # Boltz-2 default; Chai-1r, AF2-IG opt-in
โ”œโ”€โ”€ rank/ # Multi-objective scoring
โ”œโ”€โ”€ benchmark/ # Rediscovery + designer-benchmark scoring harness
โ”œโ”€โ”€ pipelines/ # Discovery orchestrator (discover.py) + honesty caveats
โ”œโ”€โ”€ provenance/ # PROV-O JSON-LD schema + RO-Crate emitter
โ”œโ”€โ”€ report/ # HTML report template + Streamlit app
โ”œโ”€โ”€ config.py # Pydantic run-configuration models
โ””โ”€โ”€ cli.py # Click entrypoint

envs/ # Conda environment files (one per stage)
examples/ # Example pipeline configs (TCGA-LUAD, etc.)
benchmarks/ # Held-out known-antigen eval set + validation & designer-benchmark harnesses
paper/ # JOSS + bioRxiv manuscripts and the validation write-up
data/ # Local cache for auto-downloaded TCGA cohorts (gitignored)
tests/ # Pytest smoke + integration tests + fixtures
docs/ # mkdocs-material site source
.github/workflows/ # CI + Zenodo deposit on tag

ARCHITECTURE.md # Architectural source of truth
LICENSING.md # Per-dependency license inventory
CONTRIBUTING.md # How to contribute
CHANGELOG.md # Per-version changes
CITATION.cff # Zenodo / GitHub citation metadata
Snakefile # Snakemake DAG
pyproject.toml # Python packaging
```

## Documentation

- [ARCHITECTURE.md](ARCHITECTURE.md) โ€” system design, module contracts, design rationale
- [LICENSING.md](LICENSING.md) โ€” per-dependency license inventory and commercial-use guidance
- [CONTRIBUTING.md](CONTRIBUTING.md) โ€” dev setup, testing, commit conventions
- [CHANGELOG.md](CHANGELOG.md) โ€” per-version changes
- `docs/` โ€” long-form docs (built with `mkdocs build`)

## Acknowledgments

`bindsight` is an opinionated wrapper. Real intellectual credit belongs to the upstream tool authors. See [LICENSING.md](LICENSING.md) for the full inventory; the work this builds on most directly:

- [SURFACE-Bind](https://github.com/hamedkhakzad/SURFACE-Bind) (Khakzad et al., PNAS 2025) โ€” the targetable-sites catalog that makes the bridge tractable
- [pydeseq2](https://github.com/owkin/PyDESeq2) (Muzellec et al., Bioinformatics 2023) โ€” Python DESeq2 implementation
- [RFdiffusion](https://github.com/RosettaCommons/RFdiffusion) (Watson et al., Nature 2023) โ€” backbone generation
- [ProteinMPNN](https://github.com/dauparas/ProteinMPNN) (Dauparas et al., Science 2022) โ€” sequence design
- [Boltz-2](https://github.com/jwohlwend/boltz) (Wohlwend et al., 2025) โ€” structure + affinity prediction
- [BindCraft](https://github.com/martinpacesa/BindCraft) (Pacesa et al., Nature 2025) โ€” one-shot binder design
- [Snakemake](https://github.com/snakemake/snakemake) (Mรถlder et al., F1000Research 2021) โ€” workflow orchestration

## Citation

If you use `bindsight` in your work, please cite it via the Zenodo DOI:

> Wahba, M. A. R. (2026). *bindsight: a reproducible bridge from RNA-seq to de novo protein binder design* (v0.2.0). Zenodo. https://doi.org/10.5281/zenodo.20121496

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20121496.svg)](https://doi.org/10.5281/zenodo.20121496)

BibTeX:

```bibtex
@software{wahba_bindsight_2026,
author = {Wahba, Mikhaeel Atef Rizk},
title = {bindsight: a reproducible bridge from RNA-seq to de novo protein binder design},
year = {2026},
publisher = {Zenodo},
version = {v0.2.0},
doi = {10.5281/zenodo.20121496},
url = {https://doi.org/10.5281/zenodo.20121496},
orcid = {https://orcid.org/0009-0006-1069-9558}
}
```

GitHub also exposes a "Cite this repository" button on the right sidebar of the [repo page](https://github.com/mikhaeelatefrizk/bindsight) that auto-generates citations in BibTeX, APA, and other formats from [CITATION.cff](CITATION.cff). Please also cite the upstream tools you used (the per-run manifest emits a `software.bib` to make this easy).

## About the author

`bindsight` is built and maintained by **Mikhaeel Atef Rizk Wahba** โ€” PharmD graduate of the German University in Cairo (GUC), currently finishing the Egyptian post-PharmD applied-pharmacy term (Imtiyaz). Earlier in 2026 he had a research rotation at the German International University in Berlin (GIU Berlin) where he picked up R / RStudio.

- ORCID: [0009-0006-1069-9558](https://orcid.org/0009-0006-1069-9558)
- GitHub: [@mikhaeelatefrizk](https://github.com/mikhaeelatefrizk)
- Email: `mikhaeelatefrizk@proton.me`
- Languages: Arabic (native), English (full professional), German (professional working โ‰ˆ B2), French, Russian

### Sister projects on GitHub

`bindsight` sits at the deep end of an ongoing bioinformatics portfolio:

- **[bioinformatics-portfolio](https://github.com/mikhaeelatefrizk/bioinformatics-portfolio)** โ€” an end-to-end bioinformatics portfolio with three subprojects, each fully reproducible from raw data to figures:
- [`01-rnaseq-fox-domestication`](https://github.com/mikhaeelatefrizk/bioinformatics-portfolio/tree/main/01-rnaseq-fox-domestication) โ€” RNA-seq differential expression on GEO GSE76517, replicating the Kukekova et al. *PNAS* 2018 silver-fox domestication study
- [`02-tcga-survival-kidney-cancer`](https://github.com/mikhaeelatefrizk/bioinformatics-portfolio/tree/main/02-tcga-survival-kidney-cancer) โ€” TCGA-KIRC clinical survival analysis identifying EPAS1 / HIF-2ฮฑ as a prognostic biomarker (target of FDA-approved belzutifan)
- [`03-scrnaseq-pbmc-seurat`](https://github.com/mikhaeelatefrizk/bioinformatics-portfolio/tree/main/03-scrnaseq-pbmc-seurat) โ€” Seurat v5 single-cell RNA-seq workflow on the 10x PBMC 3k dataset, recovering 8 immune populations
- **[affect-labeling-review](https://github.com/mikhaeelatefrizk/affect-labeling-review)** โ€” a pre-registered systematic review + meta-analysis of affect labeling (Lieberman et al. 2007 paradigm). Real random-effects meta-analysis (k=9), PRISMA 2020, RoB 2 / ROBINS-I, ~14,000-word manuscript, open data + open code, `.zenodo.json` for citable archival
- **[awesome-protein-design-software](https://github.com/mikhaeelatefrizk/awesome-protein-design-software)** โ€” curated list of protein-design / structure-prediction software (RFdiffusion, ProteinMPNN, Boltz, AlphaFold, ESMFold, etc.)
- **[Awesome-Bioinformatics](https://github.com/mikhaeelatefrizk/Awesome-Bioinformatics)** โ€” curated list of bioinformatics libraries and tools

## License

- **Code:** [GNU AGPL-3.0-or-later](LICENSE). You may use, study, modify, and
redistribute bindsight freely; if you distribute a modified version **or run it
as a network service**, you must make your source available under the same
license, with attribution preserved. See [LICENSING.md](LICENSING.md) for
component-level details (bindsight orchestrates external tools that keep their
own licenses).
- **Documentation, manuscripts, figures, and generated results** (e.g. `paper/`):
[CC BY 4.0](paper/LICENSE) โ€” reuse freely with attribution.

ยฉ 2026 Mikhaeel Atef Rizk Wahba. Commercial licensing on other terms is available
from the author on request.