{"id":23029753,"url":"https://github.com/antononcube/raku-data-exampledatasets","last_synced_at":"2026-06-17T20:32:08.201Z","repository":{"id":61719239,"uuid":"436819705","full_name":"antononcube/Raku-Data-ExampleDatasets","owner":"antononcube","description":"Raku package for (obtaining) example datasets.","archived":false,"fork":false,"pushed_at":"2024-04-17T01:39:41.000Z","size":247,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-25T13:19:56.133Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Raku","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"artistic-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antononcube.png","metadata":{"files":{"readme":"README-work.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-10T02:01:18.000Z","updated_at":"2023-03-30T17:29:33.000Z","dependencies_parsed_at":"2024-04-17T02:47:41.147Z","dependency_job_id":"9c2d6951-2783-4295-9d33-c30bb5a80b15","html_url":"https://github.com/antononcube/Raku-Data-ExampleDatasets","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/antononcube/Raku-Data-ExampleDatasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Data-ExampleDatasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Data-ExampleDatasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Data-ExampleDatasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/antononcube%2FRaku-Data-ExampleDatasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antononcube","download_url":"https://codeload.github.com/antononcube/Raku-Data-ExampleDatasets/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Data-ExampleDatasets/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34465319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-15T14:16:59.675Z","updated_at":"2026-06-17T20:32:08.196Z","avatar_url":"https://github.com/antononcube.png","language":"Raku","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data::ExampleDatasets Raku package\n\n[![Actions Status](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions/workflows/linux.yml/badge.svg)](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions)\n[![Actions Status](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions/workflows/macos.yml/badge.svg)](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions)\n[![Actions 
Status](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions/workflows/windows.yml/badge.svg)](https://github.com/antononcube/Raku-Data-ExampleDatasets/actions)\n[![SparkyCI](https://ci.sparrowhub.io/project/gh-antononcube-Raku-Data-ExampleDatasets/badge)](https://ci.sparrowhub.io)\n\n[![](https://raku.land/zef:antononcube/Data::ExampleDatasets/badges/version)](https://raku.land/zef:antononcube/Data::ExampleDatasets)\n[![License: Artistic-2.0](https://img.shields.io/badge/License-Artistic%202.0-0298c3.svg)](https://opensource.org/licenses/Artistic-2.0)\n\n[![SparrowCI](https://ci.sparrowhub.io/project/gh-antononcube-Raku-Data-ExampleDatasets/badge)](https://ci.sparrowhub.io)\n\nRaku package for (obtaining) example datasets.\n\nCurrently, this repository contains only [datasets metadata](./resources/dfRdatasets.csv).\nThe datasets are downloaded from the repository \n[Rdatasets](https://github.com/vincentarelbundock/Rdatasets/),\n[VAB1].\n\n------\n\n## Usage examples\n\n### Setup\n\nHere we load the Raku modules\n[`Data::Reshapers`](https://github.com/antononcube/Raku-Data-Reshapers),\n[`Data::Summarizers`](https://github.com/antononcube/Raku-Data-Summarizers),\nand this module,\n[`Data::ExampleDatasets`](https://github.com/antononcube/Raku-Data-ExampleDatasets):\n\n```perl6\nuse Data::Reshapers;\nuse Data::Summarizers;\nuse Data::ExampleDatasets;\n```\n\n### Get a dataset by using an identifier\n\nHere we get a dataset by using an identifier and display part of the obtained dataset:\n\n```perl6\nmy @tbl = example-dataset('Baumann', :headers);\nsay to-pretty-table(@tbl[^6]);\n```\n\nHere we summarize the dataset obtained above:\n\n```perl6\nrecords-summary(@tbl)\n```\n\n**Remark**:  The values for the first argument of `example-dataset` correspond to the values \nof the columns \"Item\" and \"Package\", respectively, in the\n[metadata dataset](https://vincentarelbundock.github.io/Rdatasets/articles/data.html) \nfrom the GitHub repository 
\"Rdatasets\", [VAB1]. \nSee the datasets metadata sub-section below.\n\nThe first argument of `example-dataset` can take as values:\n\n\n- Strings that correspond to the column \"Item\" of the metadata dataset\n\n  - E.g. `example-dataset(\"mtcars\")`\n\n- Strings that correspond to the columns \"Package\" and \"Item\" of the metadata dataset\n    \n  - E.g. `example-dataset(\"COUNT::titanic\")`\n\n- Regexes\n\n  - E.g. `example-dataset(/ .* mann $ /)`\n\n- `Whatever` or `WhateverCode`\n\n### Get a dataset by using a URL\n\nHere we get a dataset by using a URL and display a summary of the obtained dataset:\n\n```perl6\nmy $url = 'https://raw.githubusercontent.com/antononcube/Raku-Data-Reshapers/main/resources/dfTitanic.csv';\nmy @tbl2 = example-dataset($url, :headers);\nrecords-summary(@tbl2, field-names =\u003e \u003cid passengerSex passengerClass passengerAge passengerSurvival\u003e);\n```\n\n### Datasets metadata\n\nHere we:\n1. Get the dataset of the datasets metadata\n2. Filter it to have only datasets with 13 rows\n3. Keep only the columns \"Item\", \"Title\", \"Rows\", and \"Cols\"\n4. 
Display it in \"pretty table\" format\n\n```perl6\nmy @tblMeta = get-datasets-metadata();\n@tblMeta = @tblMeta.grep({ $_\u003cRows\u003e == 13}).map({ $_.grep({ $_.key (elem) \u003cItem Title Rows Cols\u003e}).Hash });\nsay to-pretty-table(@tblMeta, field-names =\u003e \u003cItem Title Rows Cols\u003e)\n```\n\n### Keeping downloaded data\n\nBy default the data is obtained over the web from\n[Rdatasets](https://github.com/vincentarelbundock/Rdatasets/),\nbut `example-dataset` has an option to keep the data \"locally.\"\n(The data is saved in `XDG_DATA_HOME`, see \n[[JS1](https://modules.raku.org/dist/XDG::BaseDirectory:cpan:JSTOWE)].)\n\nThis can be demonstrated with the following timings of a dataset with ~1300 rows:\n\n```raku\nmy $startTime = now;\nmy $data = example-dataset( / 'COUNT::titanic' $ / ):keep;\nmy $endTime = now;\nsay \"Getting the data first time took { $endTime - $startTime } seconds\";\n```\n\n```raku\n$startTime = now;\n$data = example-dataset( / 'COUNT::titanic' $/ ):keep;\n$endTime = now;\nsay \"Getting the data second time took { $endTime - $startTime } seconds\";\n```\n\n------\n\n## References\n\n### Functions, packages, repositories\n\n[AAf1] Anton Antonov,\n[`ExampleDataset`](https://resources.wolframcloud.com/FunctionRepository/resources/ExampleDataset),\n(2020),\n[Wolfram Function Repository](https://resources.wolframcloud.com/FunctionRepository).\n\n[VAB1] Vincent Arel-Bundock,\n[Rdatasets](https://github.com/vincentarelbundock/Rdatasets/),\n(2020),\n[GitHub/vincentarelbundock](https://github.com/vincentarelbundock).\n\n[JS1] Jonathan Stowe,\n[`XDG::BaseDirectory`](https://modules.raku.org/dist/XDG::BaseDirectory:cpan:JSTOWE),\n(last updated on 2021-03-31),\n[Raku Modules](https://modules.raku.org/).\n\n\n### Interactive interfaces\n\n[AAi1] Anton Antonov,\n[Example datasets recommender 
interface](https://antononcube.shinyapps.io/ExampleDatasetsRecommenderInterface/),\n(2021),\n[Shinyapps.io](https://antononcube.shinyapps.io/).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fraku-data-exampledatasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantononcube%2Fraku-data-exampledatasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fraku-data-exampledatasets/lists"}