{"id":26412788,"url":"https://github.com/alteryx/autonormalize","last_synced_at":"2025-08-21T12:46:49.548Z","repository":{"id":41864964,"uuid":"193964369","full_name":"alteryx/autonormalize","owner":"alteryx","description":"python library for automated dataset normalization","archived":false,"fork":false,"pushed_at":"2023-07-20T13:19:29.000Z","size":19735,"stargazers_count":116,"open_issues_count":21,"forks_count":16,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-07-22T18:40:39.141Z","etag":null,"topics":["automatic","automatic-normalization","normalization"],"latest_commit_sha":null,"homepage":"https://blog.featurelabs.com/automatic-dataset-normalization-for-feature-engineering-in-python/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alteryx.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"contributing.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-26T19:18:25.000Z","updated_at":"2025-07-01T01:04:30.000Z","dependencies_parsed_at":"2024-06-19T11:14:14.157Z","dependency_job_id":"2674dd06-7787-4aec-aed8-bd6c30e9f940","html_url":"https://github.com/alteryx/autonormalize","commit_stats":{"total_commits":110,"total_committers":13,"mean_commits":8.461538461538462,"dds":"0.44545454545454544","last_synced_commit":"a484d634489af8c78ea49241f388e64ee028276c"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/alteryx/autonormalize","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alteryx%2Fautonormalize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/alteryx%2Fautonormalize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alteryx%2Fautonormalize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alteryx%2Fautonormalize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alteryx","download_url":"https://codeload.github.com/alteryx/autonormalize/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alteryx%2Fautonormalize/sbom","scorecard":{"id":187143,"data":{"date":"2025-08-11","repo":{"name":"github.com/alteryx/autonormalize","commit":"a484d634489af8c78ea49241f388e64ee028276c"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.2,"checks":[{"name":"Code-Review","score":8,"reason":"Found 20/23 approved changesets -- score normalized to 8","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":0,"reason":"dangerous workflow 
patterns detected","details":["Warn: script injection with untrusted input ' github.event.pull_request.head.ref ': .github/workflows/release_notes_updated.yml:15"],"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/build_docs.yml:1","Warn: no topLevel permission defined: .github/workflows/entry_point_test.yml:1","Warn: no topLevel permission defined: .github/workflows/lint_check.yml:1","Warn: no topLevel permission defined: .github/workflows/release.yml:1","Warn: no topLevel permission defined: .github/workflows/release_notes_updated.yml:1","Warn: no topLevel permission defined: .github/workflows/unit_tests_with_latest_deps.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices 
Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build_docs.yml:15: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/build_docs.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build_docs.yml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/build_docs.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/entry_point_test.yml:15: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/entry_point_test.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/entry_point_test.yml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/entry_point_test.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/lint_check.yml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/lint_check.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/lint_check.yml:22: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/lint_check.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/release.yml:11: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/release.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/release.yml:13: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/release.yml/main?enable=pin","Warn: 
GitHub-owned GitHubAction not pinned by hash: .github/workflows/release_notes_updated.yml:29: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/release_notes_updated.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/unit_tests_with_latest_deps.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:22: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/unit_tests_with_latest_deps.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:48: update your workflow using https://app.stepsecurity.io/secureworkflow/alteryx/autonormalize/unit_tests_with_latest_deps.yml/main?enable=pin","Warn: pipCommand not pinned by hash: release/upload.sh:13","Warn: pipCommand not pinned by hash: .github/workflows/build_docs.yml:30","Warn: pipCommand not pinned by hash: .github/workflows/build_docs.yml:31","Warn: pipCommand not pinned by hash: .github/workflows/build_docs.yml:32","Warn: pipCommand not pinned by hash: .github/workflows/entry_point_test.yml:28","Warn: pipCommand not pinned by hash: .github/workflows/entry_point_test.yml:29","Warn: pipCommand not pinned by hash: .github/workflows/lint_check.yml:29","Warn: pipCommand not pinned by hash: .github/workflows/lint_check.yml:30","Warn: pipCommand not pinned by hash: .github/workflows/lint_check.yml:31","Warn: pipCommand not pinned by hash: .github/workflows/lint_check.yml:32","Warn: pipCommand not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:31","Warn: pipCommand not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:33","Warn: pipCommand not pinned by hash: 
.github/workflows/unit_tests_with_latest_deps.yml:34","Warn: pipCommand not pinned by hash: .github/workflows/unit_tests_with_latest_deps.yml:38","Info:   0 out of  10 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 third-party GitHubAction dependencies pinned","Info:   0 out of  14 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: BSD 3-Clause \"New\" or \"Revised\" License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":0,"reason":"10 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-29gw-9793-fvw7","Warn: Project is vulnerable to: GHSA-9jmq-rx5f-8jwq","Warn: Project is vulnerable to: PYSEC-2023-117 / GHSA-mrwq-x4v8-fh7p","Warn: Project is vulnerable to: PYSEC-2021-856 / GHSA-5545-2q6w-2gh6","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: PYSEC-2019-108 / GHSA-9fq2-x9r6-wfmf","Warn: Project is vulnerable to: PYSEC-2021-857 / GHSA-f7c7-j99h-c22f","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: GHSA-g7vv-2v7x-gj9p"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 27 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-16T19:59:09.248Z","repository_id":41864964,"created_at":"2025-08-16T19:59:09.248Z","updated_at":"2025-08-16T19:59:09.248Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271484535,"owners_count":24767765,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automatic","automatic-normalization","normalization"],"created_at":"2025-03-17T22:09:18.496Z","updated_at":"2025-08-21T12:46:49.500Z","avatar_url":"https://github.com/alteryx.png","language":"Python","readme":"# AutoNormalize\n\n![Tests](https://github.com/FeatureLabs/autonormalize/workflows/Tests/badge.svg)\n\nAutoNormalize is a Python library for automated datatable normalization. 
It allows you to build an `EntitySet` from a single denormalized table and generate features for machine learning using [Featuretools](https://github.com/FeatureLabs/featuretools).\n\n\u003cimg src=https://github.com/FeatureLabs/autonormalize/blob/main/gif.gif\u003e\n\n## Getting Started\n\n- [Install](#install)\n- [Demos](#demos)\n- [API Reference](#api-reference)\n\n## Install\n\n```shell\npip install featuretools[autonormalize]\n```\n\n#### Uninstall\n\n```shell\npip uninstall autonormalize\n```\n\n## Demos\n\n- [Blog Post](https://blog.featurelabs.com/automatic-dataset-normalization-for-feature-engineering-in-python/)\n- [Machine Learning Demo with Featuretools](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/AutoNormalize%20%2B%20FeatureTools%20Demo.ipynb)\n- [Kaggle Liquor Sales Dataset Demo](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Kaggle%20Liquor%20Sales%20Dataset%20Demo.ipynb)\n- [Demo with Editing Dependencies](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Editing%20Dependnecies%20Demo.ipynb)\n- [Kaggle Food Production Dataset Demo](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Kaggle%20Food%20%20Dataset%20Demo.ipynb)\n\n## API Reference\n\n### `auto_entityset`\n\n```python\nauto_entityset(df, accuracy=0.98, index=None, name=None, time_index=None)\n```\n\nCreates a normalized `EntitySet` from a dataframe.\n\n**Arguments:**\n\n- `df` (pd.DataFrame) : the dataframe containing data\n\n- `accuracy` (0 \u003c float \u003c= 1.00; default = 0.98) : the accuracy threshold required in order to conclude a dependency (i.e. 
with accuracy = 0.98, the dependency LHS --\u003e RHS must hold for at least 98% of the rows)\n\n- `index` (str, optional) : name of the column intended as the index of df\n\n- `name` (str, optional) : the name of the created EntitySet\n\n- `time_index` (str, optional) : name of the time column in the dataframe.\n\n**Returns:**\n\n- `entityset` (ft.EntitySet) : the created EntitySet\n\n### `find_dependencies`\n\n```python\nfind_dependencies(df, accuracy=0.98, index=None)\n```\n\nFinds dependencies within the dataframe using the DFD search algorithm.\n\n**Returns:**\n\n- `dependencies` (Dependencies) : the dependencies found in the data within the constraints provided\n\n### `normalize_dataframe`\n\n```python\nnormalize_dataframe(df, dependencies)\n```\n\nNormalizes the dataframe based on the given dependencies. Keys for the newly created DataFrames can only be columns that are strings, ints, or categories. Keys are chosen according to the following priority:\n\n1. shortest length\n2. has \"id\" in some form in the name of an attribute\n3. has the attribute furthest to the left in the table\n\n**Returns:**\n\n- `new_dfs` (list[pd.DataFrame]) : list of new dataframes\n\n\u003cbr /\u003e\n\n### `make_entityset`\n\n```python\nmake_entityset(df, dependencies, name=None, time_index=None)\n```\n\nCreates a normalized EntitySet from a dataframe based on the given dependencies. 
Keys are chosen in the same fashion as for `normalize_dataframe`, and a new index is created if any key has more than a single attribute.\n\n**Returns:**\n\n- `entityset` (ft.EntitySet) : the created EntitySet\n\n\u003cbr /\u003e\n\n### `normalize_entityset`\n\n```python\nnormalize_entityset(es, accuracy=0.98)\n```\n\nReturns a new normalized `EntitySet` from an `EntitySet` with a single entity.\n\n**Arguments:**\n\n- `es` (ft.EntitySet) : EntitySet with a single entity to normalize\n\n**Returns:**\n\n- `new_es` (ft.EntitySet) : new normalized EntitySet\n\n\u003cbr /\u003e\n\n## Built at Alteryx Innovation Labs\n\n\u003ca href=\"https://www.alteryx.com/innovation-labs\"\u003e\n    \u003cimg src=\"https://evalml-web-images.s3.amazonaws.com/alteryx_innovation_labs.png\" alt=\"Alteryx Innovation Labs\" /\u003e\n\u003c/a\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falteryx%2Fautonormalize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falteryx%2Fautonormalize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falteryx%2Fautonormalize/lists"}