{"id":26053576,"url":"https://github.com/datasets-org/datasets","last_synced_at":"2025-07-31T08:07:47.678Z","repository":{"id":151900013,"uuid":"77234815","full_name":"datasets-org/datasets","owner":"datasets-org","description":"Library and CLI tools for datasets toolkit","archived":false,"fork":false,"pushed_at":"2018-01-11T20:23:07.000Z","size":23,"stargazers_count":1,"open_issues_count":4,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-08T07:43:18.108Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datasets-org.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-12-23T15:40:32.000Z","updated_at":"2017-12-21T22:02:27.000Z","dependencies_parsed_at":"2023-05-29T02:45:54.566Z","dependency_job_id":null,"html_url":"https://github.com/datasets-org/datasets","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/datasets-org/datasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasets-org%2Fdatasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasets-org%2Fdatasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasets-org%2Fdatasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasets-org%2Fdatasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datasets-org","download_url":"https:
//codeload.github.com/datasets-org/datasets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasets-org%2Fdatasets/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268010097,"owners_count":24180459,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-31T02:00:08.723Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-08T07:43:14.562Z","updated_at":"2025-07-31T08:07:47.668Z","avatar_url":"https://github.com/datasets-org.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Datasets\nThis project is part of the **Datasets** toolkit.\n\nA running server (https://github.com/tivvit/datasets_server) is needed for this\n project to be useful.\n\n## Install\n```sh\npip install datasetstools\n```\n\n## CLI\nThe `datasets` command is available after installation.\n### Usage\nIt is recommended to configure the server address first. You can always \nprovide it explicitly with `-s`.\n\n#### config\n```sh\ndatasets config\nServer address (example.com): localhost\nPort: 8000\n```\nThe configuration is saved to `~/.datasets` and is also used by the\n Python library.\n\n#### new\nGenerates a new UID for the data set and creates a `dataset.yaml` file with a \nprefilled structure.
\n\n#### scan\nRescans the data sets.\n\n#### info, usages, changelog\n`info` shows all the information about the data set. The data set is \nidentified by its `dataset.yaml`, which is searched for bottom-up.\n\n`usages` shows only the usages and `changelog` shows only the changelog.\n\n## Lib\nPython library for interacting with **Datasets**.\n\n### Init\n```python\nfrom datasets import Datasets\n\nds = Datasets()\n# Without arguments, the address saved in ~/.datasets is used; alternatively, {\"addres\": \"http://localhost:5000\"} may be used\n```\n\n### Info\nReturns information about the data set identified by the UID. The second parameter is the optional \n[usage](#usage-log) record.\n```python\nds.info(\"8b88a424-dbd8-4032-8be7-a930a415b9a5\", {\"user\": \"tivvit\"})\n```\n\n### Paths\nReturns a list of paths where the data set may be found. The second parameter is the optional \n[usage](#usage-log) record.\n```python\nds.paths(\"8b88a424-dbd8-4032-8be7-a930a415b9a5\", {\"user\": \"tivvit\"})\n# [\"/data/a\", \"/data/b\"]\n```\n\n### Create\nCreates a data set in the database. Useful for programmatic data set creation.\n\n- `data` - dict with the data set attributes\n- `path` - path where the `dataset.yaml` should be created (optional).\n\nReturns the data set UID.\n```python\nds.create(data={\"name\": \"Best DS\", ...}, path=\"\")\n# \"8b88a424-dbd8-4032-8be7-a930a415b9a5\"\n```\n\n## Usage log\nActions are \nlogged to the usage log. The optional second parameter is stored \nwith the log entry.\n\n## Development\n\nFeel free to contribute.\n\n## Copyright and License\n\u0026copy; 2016 [Vít Listík](http://tivvit.cz)\n\nReleased under [MIT license](https://github.com/tivvit/datasets/blob/master/LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatasets-org%2Fdatasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatasets-org%2Fdatasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatasets-org%2Fdatasets/lists"}