{"id":18652918,"url":"https://github.com/elixir-nx/scidata","last_synced_at":"2025-04-05T11:12:41.760Z","repository":{"id":40375332,"uuid":"358652780","full_name":"elixir-nx/scidata","owner":"elixir-nx","description":"Download and normalize datasets related to science","archived":false,"fork":false,"pushed_at":"2023-08-11T16:59:02.000Z","size":88,"stargazers_count":163,"open_issues_count":3,"forks_count":12,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-03-29T10:10:32.714Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elixir-nx.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-16T16:05:49.000Z","updated_at":"2025-03-04T15:45:32.000Z","dependencies_parsed_at":"2025-01-12T05:07:05.582Z","dependency_job_id":"97ee54a8-0165-4ff0-8b22-2b00091b799b","html_url":"https://github.com/elixir-nx/scidata","commit_stats":{"total_commits":46,"total_committers":10,"mean_commits":4.6,"dds":0.4782608695652174,"last_synced_commit":"ea7a4883464584530b6a49f95b0eea2f5b954b80"},"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elixir-nx%2Fscidata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elixir-nx%2Fscidata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elixir-nx%2Fscidata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elixir-nx
%2Fscidata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elixir-nx","download_url":"https://codeload.github.com/elixir-nx/scidata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247325695,"owners_count":20920714,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T07:09:22.680Z","updated_at":"2025-04-05T11:12:41.720Z","avatar_url":"https://github.com/elixir-nx.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scidata\n\n## Usage\n\nScidata currently supports the following training and test datasets:\n\n- Caltech101\n- CIFAR10\n- CIFAR100\n- FashionMNIST\n- IMDb Reviews\n- Kuzushiji-MNIST (KMNIST)\n- MNIST\n- SQuAD\n- Yelp Reviews (Full and Polarity)\n- Iris\n- Wine\n\nDownload or fetch datasets locally:\n\n```elixir\n{train_images, train_labels} = Scidata.MNIST.download()\n{test_images, test_labels} = Scidata.MNIST.download_test()\n\n# Unpack train_images like...\n{images_binary, tensor_type, shape} = train_images\n```\n\nMost often you will convert those results to `Nx` tensors:\n\n```elixir\n{train_images, train_labels} = Scidata.MNIST.download()\n\n# Normalize and batch images\n{images_binary, images_type, images_shape} = train_images\n\nbatched_images =\n  images_binary\n  |\u003e Nx.from_binary(images_type)\n  |\u003e Nx.reshape(images_shape)\n  |\u003e Nx.divide(255)\n  |\u003e Nx.to_batched(32)\n\n# One-hot-encode and batch labels\n{labels_binary, labels_type, _shape} = train_labels\n\nbatched_labels =\n  
labels_binary\n  |\u003e Nx.from_binary(labels_type)\n  |\u003e Nx.new_axis(-1)\n  |\u003e Nx.equal(Nx.tensor(Enum.to_list(0..9)))\n  |\u003e Nx.to_batched(32)\n```\n\n## Installation\n\n```elixir\ndef deps do\n  [\n    {:scidata, \"~\u003e 0.1.11\"}\n  ]\nend\n```\n\n## Contributing\n\nPRs are encouraged! Consider using [utils](https://github.com/elixir-nx/scidata/blob/master/lib/scidata/utils.ex) to add your favorite dataset or one from [this list](https://github.com/elixir-nx/scidata/issues/16).\n\n## License\n\nCopyright (c) 2022 Tom Rutten\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felixir-nx%2Fscidata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felixir-nx%2Fscidata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felixir-nx%2Fscidata/lists"}