https://github.com/vic-cheung/rpy-bridge
Python-R interoperability layer for environment management, type-safe conversions, data normalization, safe function execution, and supporting recursive R object unpacking.
https://github.com/vic-cheung/rpy-bridge
bioinformatics bridge interoperability interoperability-layer python python-bridge r rpy2 statistics
Last synced: 5 months ago
JSON representation
Python-R interoperability layer for environment management, type-safe conversions, data normalization, safe function execution, and supporting recursive R object unpacking.
- Host: GitHub
- URL: https://github.com/vic-cheung/rpy-bridge
- Owner: vic-cheung
- License: other
- Created: 2025-11-20T03:25:19.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-12-15T21:58:43.000Z (6 months ago)
- Last Synced: 2025-12-16T10:27:02.813Z (6 months ago)
- Topics: bioinformatics, bridge, interoperability, interoperability-layer, python, python-bridge, r, rpy2, statistics
- Language: Python
- Homepage: https://rpy-bridge.readthedocs.io/en/stable/usage.html
- Size: 5.67 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# rpy-bridge
**rpy-bridge** is a Python-controlled **R execution orchestrator** (not a thin
rpy2 wrapper). It delivers deterministic, headless-safe R startup; project-root
inference; out-of-tree `renv` activation; isolated script namespaces; and robust
Python↔R conversions with dtype/NA normalization. Use it when you need
reproducible R execution from Python in production pipelines and CI.
**Latest release:** [`rpy-bridge` on PyPI](https://pypi.org/project/rpy-bridge/)
---
## What this is (and is not)
rpy-bridge **is not a thin rpy2 wrapper**. Key differences:
- Infers R project roots via markers (`.git`, `.Rproj`, `renv.lock`, `DESCRIPTION`, `.here`)
- Activates `renv` even when it lives outside the calling directory
- Executes from the inferred project root so relative paths behave as R expects
- Runs headless by default (no GUI probing), isolates scripts from `globalenv()`
- Normalizes return values for Python (NAs, dtypes, data.frames) and offers comparison helpers
---
## Quickstart
Call a package function (no scripts):
```python
from rpy_bridge import RFunctionCaller
rfc = RFunctionCaller()
samples = rfc.call("stats::rnorm", 5, mean=0, sd=1)
median_val = rfc.call("stats::median", samples)
```
Call a function from a local script with `renv` (out-of-tree allowed):
```python
from pathlib import Path
from rpy_bridge import RFunctionCaller
project_dir = Path("/path/to/your-r-project")
script = project_dir / "scripts" / "example.R"
rfc = RFunctionCaller(path_to_renv=project_dir, scripts=script)
result = rfc.call("some_function", 42, named_arg="value")
```
## Core capabilities
### 1. R execution orchestration
- Embeds R via `rpy2` with deterministic startup behavior
- Disables interactive and GUI-dependent hooks for headless execution
- Loads R scripts into isolated namespaces (not `globalenv()`)
### 2. Project root inference and path stability
- Infers R project roots using markers such as:
`.git`, `.Rproj`, `renv.lock`, `DESCRIPTION`, `.here`
- Executes R code from the inferred project root regardless of Python CWD
- Preserves relative-path behavior expected by R scripts
- Supports R code using `here::here()` or project-local data
### 3. Out-of-tree `renv` activation
- Activates `renv` projects located **outside** the calling Python directory
- Sources `.Rprofile` and `.Renviron` to reproduce R startup semantics
- Does not require R scripts and `renv` to live in the same directory
### 4. Python ↔ R data conversion
- Converts Python scalars, lists, dicts, and pandas objects into R equivalents
- Converts R vectors, lists, and data.frames back into Python-native types
- Handles nested structures, missing values, and mixed types robustly
### 5. Data normalization and diagnostics
- Post-processes R data.frames to fix dtype, timezone, and NA semantics
- Normalizes column types for reliable Python-side comparison
- Supports structured mismatch diagnostics between Python and R data
### 6. Function invocation across scripts and packages
- Calls functions defined in sourced R scripts, base R, or installed packages
- Supports qualified function names (e.g. `stats::median`)
- Executes functions within the active project and library context
---
## Calling base R functions and managing packages
In addition to sourcing local R scripts, rpy-bridge supports calling functions
from base R and installed packages directly from Python.
Current support includes:
- Calling base R functions without a local R script
- Executing functions from installed R packages within the active environment
Planned extensions (roadmap):
- Programmatic installation of R packages into the active `renv` or system
environment when explicitly enabled
- Declarative package requirements at the Python call site
- Safe, opt-in package installation for CI and ephemeral environments
Package installation is intentionally **not automatic by default** to preserve
reproducibility and avoid side effects during execution.
---
## Installation
### Prerequisites
- System R installed and available on `PATH`
- Python 3.11+ (tested on 3.11–3.12)
### From PyPI
Install rpy-bridge with rpy2 for full R support:
```bash
python3 -m pip install rpy-bridge rpy2
```
Using `uv`:
```bash
uv add rpy-bridge rpy2
```
### Development install
```bash
python3 -m pip install -e .
```
or:
```bash
uv sync
```
### Required Python dependencies
- `rpy2`
- `pandas`
- `numpy`
---
## Usage
See Quickstart above and examples in `examples/basic_usage.py`.
---
## Round-trip Python ↔ R behavior
rpy-bridge attempts to convert Python objects to R and back. Most objects used in
scientific and ML pipelines round-trip cleanly, but some heterogeneous Python
structures may be wrapped or slightly altered due to differences in R’s type
system.
| Python type | Round-trip fidelity | Notes |
| ---------------------------------------------- | ------------------- | --------------------------------------------------------------------- |
| `int`, `float`, `bool`, `str` | High | Scalars convert directly |
| Homogeneous `list` of numbers/strings | High | Converted to atomic R vectors |
| Nested homogeneous lists | High | Converted to nested R lists |
| `pandas.DataFrame` / `pd.Series` | High | Converted to `data.frame` and normalized on return |
| Mixed-type `list` or `dict` | Partial | May be wrapped in single-element vectors |
| `None` / `pd.NA` | High | Converted to R `NULL` |
---
## R setup helpers
Helper scripts are provided in `examples/r-deps/` to prepare R environments.
- Install system R dependencies (macOS / Homebrew):
```bash
bash examples/r-deps/install_r_dev_deps_homebrew.sh
```
- Initialize an `renv` project:
```r
source("examples/r-deps/setup_env.R")
```
- Restore the environment on a new machine:
```r
renv::restore()
```
---
## Who this is for
rpy-bridge is designed for:
- Python-first pipelines that rely on mature R code
- Teams where R logic must remain authoritative
- CI or production systems that cannot rely on interactive R sessions
- Multi-repo or multi-directory projects with non-trivial filesystem layouts
It is **not** intended as a convenience wrapper for exploratory R usage.
---
## Licensing
- rpy-bridge is released under the MIT License © 2025 Victoria Cheung
- Depends on [`rpy2`](https://rpy2.github.io), licensed under the GNU GPL (v2 or later)
---
## Acknowledgements
This package was spun out of internal tooling I wrote at Revolution Medicines.
Thanks to the team there for supporting its open-source release.