https://github.com/tidymodels/tabpfn
Foundation Model for Tabular Data via reticulate
- Host: GitHub
- URL: https://github.com/tidymodels/tabpfn
- Owner: tidymodels
- License: apache-2.0
- Created: 2025-01-27T16:59:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-12-04T14:34:18.000Z (6 months ago)
- Last Synced: 2026-01-14T21:39:20.853Z (5 months ago)
- Language: R
- Homepage: http://tabpfn.tidymodels.org/
- Size: 1.7 MB
- Stars: 19
- Watchers: 3
- Forks: 3
- Open Issues: 4
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# tabpfn
[CRAN status](https://CRAN.R-project.org/package=tabpfn)
[R-CMD-check](https://github.com/tidymodels/tabpfn/actions/workflows/R-CMD-check.yaml)
[Codecov test coverage](https://app.codecov.io/gh/tidymodels/tabpfn?branch=main)
TabPFN, short for tabular prior-data fitted network, is a deep-learning model for tabular data. See:
- [_Transformers Can Do Bayesian Inference_](https://arxiv.org/abs/2112.10510) (arXiv, 2021)
- [_TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second_](https://arxiv.org/abs/2207.01848) (arXiv, 2022)
- [_Accurate predictions on small data with a tabular foundation model_](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C7&q=%22Accurate+predictions+on+small+data+with+a+tabular+foundation+model%22) (Nature, 2025)
This R package wraps the [Python library](https://github.com/PriorLabs/tabpfn) via reticulate and exposes an idiomatic R interface built on standard S3 methods.
## Installation
You can install the development version of tabpfn like so:
```{r}
#| eval: false
# install.packages("pak")
pak::pak("tidymodels/tabpfn", ask = FALSE)
```
You'll need a Python virtual environment to access the underlying library. After you install the R package, tabpfn installs the required Python dependencies the first time you fit a model:
```
> library(tabpfn)
>
> predictors <- mtcars[, -1]
> outcome <- mtcars[, 1]
>
> # XY interface
> mod <- tab_pfn(predictors, outcome)
Downloading uv...Done!
Downloading cpython-3.12.12 (download) (15.9MiB)
Downloading cpython-3.12.12 (download)
Downloading setuptools (1.1MiB)
Downloading scikit-learn (8.2MiB)
Downloading numpy (4.9MiB)
Downloading llvmlite
Downloading torch
Installed 58 packages in 350ms
> mod
tabpfn Regression Model
Training set
i 32 data points
i 10 predictors
```
## Example
```{r}
#| label: tab-start-up
library(tabpfn)
```
To fit a model:
```{r}
#| label: mtcars
set.seed(364)
reg_mod <- tab_pfn(mtcars[1:25, -1], mtcars$mpg[1:25])
reg_mod
```
In addition to the x/y interface shown above, tabpfn also has formula and recipe interfaces.
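As a rough illustration of how the formula interface relates to the x/y call above, the sketch below uses only base R to show which columns `mpg ~ .` selects. The commented-out `tab_pfn()` call mirrors the formula usage shown later in this README. The object names (`train`, `f`, `reg_mod_f`) are made up for this example.

```r
# Training rows used in the x/y example above
train <- mtcars[1:25, ]

# The outcome goes on the left-hand side; `.` means
# "every other column in `data`"
f <- mpg ~ .

# Base R view of what the formula selects (no model is fit here)
mf <- model.frame(f, data = train)
outcome_name <- names(mf)[1]
predictor_names <- names(mf)[-1]

# Same split as the x/y call: mtcars[1:25, -1] and mtcars$mpg[1:25]
stopifnot(identical(predictor_names, names(train)[-1]))

# Fitting through the formula interface (needs tabpfn and its Python backend):
# set.seed(364)
# reg_mod_f <- tab_pfn(f, data = train)
```

The formula form is convenient when the outcome and predictors already live in one data frame; the x/y form avoids building a formula when they are stored separately.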
Prediction uses the usual S3 `predict()` method:
```{r}
#| label: mtcars-pred
predict(reg_mod, mtcars[26:32, -1])
```
tabpfn follows the tidymodels prediction convention: a data frame is always returned with a standard set of column names.
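Because the output is a plain data frame, downstream metrics are easy to compute. The helper below is a minimal sketch, not part of tabpfn: it assumes the regression predictions sit in a numeric column named `.pred`, which is the usual tidymodels name for numeric predictions. The function name `rmse_from_pred` is invented for this example.

```r
# Root mean squared error from a tidymodels-style prediction data frame.
# Assumes `pred` has a numeric `.pred` column with one row per element
# of `truth`.
rmse_from_pred <- function(pred, truth) {
  stopifnot(
    is.data.frame(pred),
    ".pred" %in% names(pred),
    nrow(pred) == length(truth)
  )
  sqrt(mean((truth - pred$.pred)^2))
}

# Intended use with the model fitted above (needs `reg_mod`):
# holdout <- mtcars[26:32, ]
# rmse_from_pred(predict(reg_mod, holdout[, -1]), holdout$mpg)
```

Packages such as yardstick provide the same metric; the hand-rolled version is only meant to show how the `.pred` column is consumed.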
For a classification model, the outcome should always be a factor vector. For example, using these data from the modeldata package:
```{r}
#| label: cls
#| results: none
library(modeldata)
library(ggplot2)
two_cls_train <- parabolic[1:400, ]
two_cls_val <- parabolic[401:500, ]
grid <- expand.grid(
  X1 = seq(-5.1, 5.0, length.out = 25),
  X2 = seq(-5.5, 4.0, length.out = 25)
)
set.seed(3824)
cls_mod <- tab_pfn(class ~ ., data = two_cls_train)
grid_pred <- predict(cls_mod, grid)
grid_pred
```
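The formula call above works because `class` is already stored as a factor. When an outcome arrives as character strings instead, it can be converted before fitting. The data frame below is a small made-up example (its values are not from `parabolic`). The level order passed to `factor()` is chosen explicitly so it does not depend on alphabetical sorting.

```r
# Hypothetical raw data whose outcome is stored as character
raw <- data.frame(
  X1 = c(-1.2, 0.4, 2.1, -0.3),
  X2 = c(0.5, -1.1, 0.9, 1.7),
  class = c("Class1", "Class2", "Class1", "Class2"),
  stringsAsFactors = FALSE
)

# Convert the outcome to a factor with an explicit level order
raw$class <- factor(raw$class, levels = c("Class1", "Class2"))

stopifnot(is.factor(raw$class))
levels(raw$class)
```

The probability column used in the plot, `.pred_Class1`, carries a level name, so keeping the levels explicit presumably makes the prediction columns easier to anticipate.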
The fitted class boundary (the contour where the predicted probability of the first class is 1/2) looks fairly good when shown with out-of-sample data:
```{r}
#| label: boundaries
#| fig.width: 5
#| fig.height: 4
#| fig.align: "center"
#| out.width: 50%
cbind(grid, grid_pred) |>
  ggplot(aes(X1, X2)) +
  geom_point(
    data = two_cls_val,
    aes(col = class, pch = class),
    alpha = 3 / 4,
    cex = 3
  ) +
  geom_contour(aes(z = .pred_Class1), breaks = 1 / 2, col = "black", linewidth = 1) +
  coord_equal(ratio = 1)
```
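The contour in the plot marks where `.pred_Class1` equals 1/2. The same cutoff can turn the probabilities into hard class predictions and a validation accuracy. The helpers below are a sketch under assumptions: `hard_class()` and `accuracy()` are hypothetical names, not tabpfn functions, and the commented usage assumes `predict()` accepts the validation data frame as-is.

```r
# Convert first-class probabilities into hard class predictions.
# `prob` : numeric vector, e.g. the `.pred_Class1` column
# `lvls` : the two outcome levels, first level first
# `cut`  : probability cutoff (0.5 matches the plotted contour)
hard_class <- function(prob, lvls, cut = 0.5) {
  stopifnot(length(lvls) == 2, all(prob >= 0 & prob <= 1))
  factor(ifelse(prob >= cut, lvls[1], lvls[2]), levels = lvls)
}

# Proportion of predictions that match the observed classes
accuracy <- function(truth, estimate) {
  stopifnot(length(truth) == length(estimate))
  mean(as.character(truth) == as.character(estimate))
}

# Intended use with the objects above (needs `cls_mod`):
# val_pred <- predict(cls_mod, two_cls_val)
# est <- hard_class(val_pred$.pred_Class1, levels(two_cls_val$class))
# accuracy(two_cls_val$class, est)
```

A cutoff other than 0.5 shifts the boundary toward one class, which can be useful when the two kinds of misclassification have different costs.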
## Code of Conduct
Please note that the tabpfn project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.