{"id":34714612,"url":"https://github.com/srlearn/relational-datasets","last_synced_at":"2026-03-15T11:32:50.179Z","repository":{"id":37246973,"uuid":"385275404","full_name":"srlearn/relational-datasets","owner":"srlearn","description":"Python package for fetching and using srlearn-compatible relational datasets.","archived":false,"fork":false,"pushed_at":"2022-11-09T15:40:51.000Z","size":2389,"stargazers_count":4,"open_issues_count":8,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-22T07:44:49.720Z","etag":null,"topics":["datasets","inductive-logic-programming","relational-learning","statistical-relational-learning"],"latest_commit_sha":null,"homepage":"https://srlearn.github.io/relational-datasets/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srlearn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-12T14:26:49.000Z","updated_at":"2023-05-09T15:20:28.000Z","dependencies_parsed_at":"2023-01-21T04:02:44.399Z","dependency_job_id":null,"html_url":"https://github.com/srlearn/relational-datasets","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/srlearn/relational-datasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srlearn%2Frelational-datasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srlearn%2Frelational-datasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srlearn%2Frelational-datasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srlearn%2Frela
tional-datasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srlearn","download_url":"https://codeload.github.com/srlearn/relational-datasets/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srlearn%2Frelational-datasets/sbom","scorecard":{"id":844084,"data":{"date":"2025-08-11","repo":{"name":"github.com/srlearn/relational-datasets","commit":"893c45197616cb55f43fd2fd1aa9baa98f1a974e"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4.1,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/14 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source 
repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:27: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/codeql.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:30: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/codeql.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:36: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/codeql.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:39: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/codeql.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy-docs.yml:16: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/deploy-docs.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy-docs.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/deploy-docs.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/python-package.yml:26: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/python-package.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/python-package.yml:28: update your workflow using 
https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/python-package.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/python-package.yml:46: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/python-package.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pythonpublish.yml:12: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/pythonpublish.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pythonpublish.yml:14: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/pythonpublish.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pythonpublish.yml:25: update your workflow using https://app.stepsecurity.io/secureworkflow/srlearn/relational-datasets/pythonpublish.yml/main?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/deploy-docs.yml:27","Warn: pipCommand not pinned by hash: .github/workflows/deploy-docs.yml:28","Warn: pipCommand not pinned by hash: .github/workflows/python-package.yml:33","Warn: pipCommand not pinned by hash: .github/workflows/python-package.yml:34","Warn: pipCommand not pinned by hash: .github/workflows/python-package.yml:35","Warn: pipCommand not pinned by hash: .github/workflows/pythonpublish.yml:19","Warn: pipCommand not pinned by hash: .github/workflows/pythonpublish.yml:20","Info:   0 out of  10 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 third-party GitHubAction dependencies pinned","Info:   0 out of   7 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Info: jobLevel 'actions' permission set to 'read': .github/workflows/codeql.yml:16","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:17","Warn: no topLevel permission defined: .github/workflows/codeql.yml:1","Warn: no topLevel permission defined: .github/workflows/deploy-docs.yml:1","Warn: no topLevel permission defined: .github/workflows/python-package.yml:1","Warn: no topLevel permission defined: .github/workflows/pythonpublish.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":8,"reason":"2 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm"],"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":7,"reason":"SAST tool detected but not run on all commits","details":["Info: SAST configuration detected: CodeQL","Warn: 0 commits out of 25 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-23T21:03:42.490Z","repository_id":37246973,"created_at":"2025-08-23T21:03:42.490Z","updated_at":"2025-08-23T21:03:42.490Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30540946,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-15T07:17:37.589Z","status":"ssl_error","status_checked_at":"2026-03-15T07:17:31.738Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datasets","inductive-logic-programming","relational-learning","statistical-relational-learning"],"created_at":"2025-12-25T00:55:01.269Z","updated_at":"2026-03-15T11:32:50.174Z","avatar_url":"https://github.com/srlearn.png","language":"Python","readme":"# relational-datasets\n\n*A small library for loading and downloading relational datasets.*\n\n```bash\npip install 
relational-datasets\n```\n\n[![PyPI Version](https://img.shields.io/pypi/v/relational-datasets)](https://pypi.org/project/relational-datasets/)\n[![License](https://img.shields.io/github/license/srlearn/relational-datasets)](https://github.com/srlearn/relational-datasets/blob/main/LICENSE)\n[![Total alerts](https://img.shields.io/lgtm/alerts/g/srlearn/relational-datasets.svg?logo=lgtm\u0026logoWidth=18)](https://lgtm.com/projects/g/srlearn/relational-datasets/alerts/)\n[![codecov](https://codecov.io/gh/srlearn/relational-datasets/branch/main/graph/badge.svg?token=lutvcUSBRF)](https://codecov.io/gh/srlearn/relational-datasets)\n[![Python Package Builds](https://github.com/srlearn/relational-datasets/actions/workflows/python-package.yml/badge.svg)](https://github.com/srlearn/relational-datasets/actions/workflows/python-package.yml)\n[![Documentation Deploy](https://github.com/srlearn/relational-datasets/actions/workflows/deploy-docs.yml/badge.svg)](https://github.com/srlearn/relational-datasets/actions/workflows/deploy-docs.yml)\n\n## Beta Release\n\nThis API and the datasets at\n[https://github.com/srlearn/datasets/](https://github.com/srlearn/datasets/)\nare still experimental.\n\n- \u003cimg src='https://avatars.githubusercontent.com/u/743164?s=200\u0026v=4' height='20' width='20'/\u003e Prefer *Julia*? 
Check out [**RelationalDatasets.jl**](https://github.com/srlearn/RelationalDatasets.jl).\n\nOpen enhancements and bugs are tracked here:\n\n- [Issues: relational-datasets package](https://github.com/srlearn/relational-datasets/issues)\n- [Issues: datasets](https://github.com/srlearn/datasets/issues)\n\nThe short-term roadmap:\n\n- [ ] Modes: [srlearn/datasets: Issue 11](https://github.com/srlearn/datasets/issues/11)\n- [ ] Converting propositional-\u003erelational\n  - [ ] Problem Settings\n    - [X] Binary Classification\n      - [X] Classification: (0, 1)\n      - [ ] Classification: (-1, 1)\n      - [ ] Classification: maybe recommend [`sklearn.preprocessing.LabelBinarizer`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelBinarizer.html)\n    - [X] Regression\n      - [X] Regression: y ∈ `float`\n    - [ ] Multiclass Classification: When target is `int` and in `[0, 1, 2, ...]`\n  - [ ] Categorical datatype support in `X` matrix.\n  - [ ] Dataframes: `pandas`\n\n## Use Case 1: Fetching Zipfiles\n\n**Running** the `fetch` method downloads a version of a dataset to your local cache:\n\n```python\nimport relational_datasets\n\nrelational_datasets.fetch(\"toy_cancer\")\nrelational_datasets.fetch(\"toy_father\", \"v0.0.3\")\nrelational_datasets.fetch(\"cora\")\n```\n\n**Resulting in**:\n\n```console\n~/relational_datasets/\n├── toy_cancer_v0.0.4.zip   \u003c--- latest\n├── toy_father_v0.0.3.zip   \u003c--- specific version\n└── cora_v0.0.4.zip         \u003c--- latest\n```\n\n## Use Case 2: Loading Data\n\nThe `load` method returns train and test folds, each with `pos`, `neg`, and\n`facts`. 
Internally it uses `fetch`, so it will automatically download a\ndataset if it is not available.\n\nFor example: \"*Load fold-2 of webkb*\"\n\n```python\nfrom relational_datasets import load\n\ntrain, test = load(\"webkb\", \"v0.0.4\", fold=2)\n\nlen(train.facts)\n# 1344\n```\n\n## Use Case 3: Working with Standard (Vector-based) Machine Learning Datasets\n\nThe `relational_datasets.convert` module has functions for\nconverting vector-based datasets into relational/ILP-style\ndatasets:\n\n### Binary Classification\n\n*When `y` is a vector of 0/1*\n\n```python\nfrom relational_datasets.convert import from_numpy\nimport numpy as np\n\ndata, modes = from_numpy(\n  np.array([[0, 1, 1], [0, 1, 2], [1, 2, 2]]),\n  np.array([0, 0, 1]),\n)\n\ndata, modes\n```\n\n```console\n(RelationalDataset(pos=['v4(id3).'], neg=['v4(id1).', 'v4(id2).'], facts=['v1(id1,0).', 'v1(id2,0).', 'v1(id3,1).', 'v2(id1,1).', 'v2(id2,1).', 'v2(id3,2).', 'v3(id1,1).', 'v3(id2,2).', 'v3(id3,2).']),\n['v1(+id,#varv1).', 'v2(+id,#varv2).', 'v3(+id,#varv3).', 'v4(+id).'])\n```\n\n### Regression\n\n*When `y` is a vector of floats*\n\n```python\nfrom relational_datasets.convert import from_numpy\nimport numpy as np\n\ndata, modes = from_numpy(\n  np.array([[0, 1, 1], [0, 1, 2], [1, 2, 2]]),\n  np.array([1.1, 0.9, 2.5]),\n)\n\ndata, modes\n```\n\n```console\n(RelationalDataset(pos=['regressionExample(v4(id1),1.1).', 'regressionExample(v4(id2),0.9).', 'regressionExample(v4(id3),2.5).'], neg=[], facts=['v1(id1,0).', 'v1(id2,0).', 'v1(id3,1).', 'v2(id1,1).', 'v2(id2,1).', 'v2(id3,2).', 'v3(id1,1).', 'v3(id2,2).', 'v3(id3,2).']),\n['v1(+id,#varv1).', 'v2(+id,#varv2).', 'v3(+id,#varv3).', 'v4(+id).'])\n```\n\n### Preprocessing scikit-learn's `load_breast_cancer`\n\n[`load_breast_cancer`](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html)\nis based on the\n[Breast Cancer Wisconsin 
dataset](https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)).\n\nHere we (**1**) load the data and class labels,\n(**2**) split into training and test sets, (**3**) bin the continuous\nfeatures into discrete values, and (**4**) convert to the relational format.\n\n```python\nfrom sklearn.datasets import load_breast_cancer\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.preprocessing import KBinsDiscretizer\n\n# (1) Load\nX, y = load_breast_cancer(return_X_y=True)\n\n# (2) Split\nX_train, X_test, y_train, y_test = train_test_split(X, y)\n\n# (3) Discretize\ndisc = KBinsDiscretizer(n_bins=5, encode=\"ordinal\")\nX_train = disc.fit_transform(X_train)\nX_test = disc.transform(X_test)\nX_train = X_train.astype(int)\nX_test = X_test.astype(int)\n\n# (4) Convert\nfrom relational_datasets.convert import from_numpy\n\ntrain, modes = from_numpy(X_train, y_train)\ntest, _ = from_numpy(X_test, y_test)\n```\n\n## Install\n\n### From PyPI\n\n```bash\npip install relational-datasets\n```\n\n### From GitHub Source\n\n```bash\ngit clone https://github.com/srlearn/relational-datasets.git\ncd relational-datasets\npip install -e .\n```\n\n## Contributions\n\n- [Alexander Hayes](https://hayesall.com) - *Indiana University, Bloomington*\n\nThis package was partially based on datasets from the\n[Starling Lab Datasets Collection](https://starling.utdallas.edu/datasets/),\nwhich included specific contributions by\n[Harsha Kokel](https://harshakokel.com/) and\n[Devendra Singh Dhami](https://sites.google.com/view/devendradhami).\n[Tushar Khot](https://allenai.org/team/tushark) converted many to the ILP\nformat from Alchemy 2 format, but that occurred before versions were tracked.\nSome inspiration was drawn from the\n\"[RelationalDatasets](https://github.com/joschout/RelationalDatasets)\" list that\n[Jonas Schouterden](https://people.cs.kuleuven.be/~jonas.schouterden/) 
collected.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrlearn%2Frelational-datasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrlearn%2Frelational-datasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrlearn%2Frelational-datasets/lists"}