{"id":32201507,"url":"https://github.com/praktiskt/featuretoolsr","last_synced_at":"2026-03-11T01:01:36.143Z","repository":{"id":56936389,"uuid":"145317535","full_name":"praktiskt/featuretoolsR","owner":"praktiskt","description":"An R interface to the Python module Featuretools","archived":false,"fork":false,"pushed_at":"2020-04-25T10:06:39.000Z","size":67,"stargazers_count":50,"open_issues_count":3,"forks_count":8,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-09-08T16:16:44.338Z","etag":null,"topics":["feature-engineering","featuretools","machine-learning","r-package","rstats"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/praktiskt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-19T16:02:21.000Z","updated_at":"2025-04-24T07:52:24.000Z","dependencies_parsed_at":"2022-08-21T06:21:02.252Z","dependency_job_id":null,"html_url":"https://github.com/praktiskt/featuretoolsR","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/praktiskt/featuretoolsR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praktiskt%2FfeaturetoolsR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praktiskt%2FfeaturetoolsR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praktiskt%2FfeaturetoolsR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praktiskt%2FfeaturetoolsR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/praktiskt","download_url":"https://co
deload.github.com/praktiskt/featuretoolsR/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praktiskt%2FfeaturetoolsR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280376527,"owners_count":26320275,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["feature-engineering","featuretools","machine-learning","r-package","rstats"],"created_at":"2025-10-22T04:00:51.994Z","updated_at":"2025-10-22T04:01:57.067Z","avatar_url":"https://github.com/praktiskt.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# featuretoolsR\nAn R interface to the Python module Featuretools.\n\n# General\n`featuretoolsR` provides functionality from the Python module `featuretools`, which aims to automate feature engineering. This package is very much a work in progress as Featuretools offers a lot of functionality. 
Any PRs are much appreciated.\n\n# Installing\n\n## Package\n### CRAN\nThe latest stable release is found on [CRAN](https://cran.r-project.org/package=featuretoolsR).\n\n### GitHub\nYou can get the latest version of `featuretoolsR` by installing it straight from GitHub: `devtools::install_github(\"magnusfurugard/featuretoolsR\")`.\n\n## Featuretools\nYou'll need a working Python environment with `featuretools` installed. The recommended way is to use the built-in function `install_featuretools()`, which automatically sets up a virtual environment for the package and installs `featuretools`.\n\n# Usage\nAll functions in `featuretoolsR` come with documentation, but it's advised to briefly browse through the [Featuretools Python documentation](https://docs.featuretools.com/). It covers concepts like `entities`, `relationships` and `dfs`. \n\n## Creating an entityset\nAn entityset is the set that contains all your entities. To create a set and add an entity straight away, you can use `as_entityset`. \n```\n# Libs\nlibrary(featuretoolsR)\nlibrary(magrittr)\n\n# Create some mock data\nset_1 \u003c- data.frame(key = 1:100, value = sample(letters, 100, replace = TRUE), a = rep(Sys.Date(), 100))\nset_2 \u003c- data.frame(key = 1:100, value = sample(LETTERS, 100, replace = TRUE), b = rep(Sys.time(), 100))\n\n# Create entityset\nes \u003c- as_entityset(\n  set_1, \n  index = \"key\", \n  entity_id = \"set_1\", \n  id = \"demo\", \n  time_index = \"a\"\n)\n```\n\n## Adding entities\nTo add more entities (i.e., if you have relational data across multiple `data.frame`s), use `add_entity`. This function is pipe friendly. For this demo case, we'll use `set_2`.\n```\nes \u003c- es %\u003e%\n  add_entity(\n    df = set_2, \n    entity_id = \"set_2\", \n    index = \"key\", \n    time_index = \"b\"\n  )\n```\n\n## Defining relationships\nWith relational data, it's useful to define a relationship between two or more entities. 
This can be done with `add_relationship`.\n```\nes \u003c- es %\u003e%\n  add_relationship(\n    parent_set = \"set_1\", \n    child_set = \"set_2\", \n    parent_idx = \"key\", \n    child_idx = \"key\"\n  )\n```\n\n## Deep feature synthesis\nThe bread and butter of Featuretools is the `dfs` function (official docs [here](https://docs.featuretools.com/en/stable/automated_feature_engineering/afe.html)). It will attempt to create features based on the `*_primitives` you provide (more on primitives below).\n```\nft_matrix \u003c- es %\u003e%\n  dfs(\n    target_entity = \"set_1\", \n    trans_primitives = c(\"and\", \"cum_sum\")\n  )\n```\n\n## Tidying up\nTo use the new data.frame/features created by `dfs`, use `tidy_feature_matrix`, a function unique to `featuretoolsR`. A few \"nice-to-have\" arguments can be passed to clean the new data, such as removing near-zero-variance variables and replacing `NaN` with `NA`.\n```\ntidy \u003c- tidy_feature_matrix(ft_matrix, remove_nzv = TRUE, nan_is_na = TRUE, clean_names = TRUE)\n```\n\n# Primitives\nFeaturetools supports a lot of primitives. These are accessible with the function `list_primitives()`, which returns a data.frame containing the type (aggregation (`agg_primitives`) or transform (`trans_primitives`)), the name (in the example above, \"and\" and \"cum_sum\"), and a brief description of the primitive itself.\n\n# Credits\n[reticulate](https://github.com/rstudio/reticulate) - an R interface to Python.\n\n[Featuretools](https://github.com/Featuretools/featuretools)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpraktiskt%2Ffeaturetoolsr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpraktiskt%2Ffeaturetoolsr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpraktiskt%2Ffeaturetoolsr/lists"}