Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kaggledatasets/kaggledatasets
Collection of Kaggle Datasets ready to use for Everyone (Looking for contributors)
data-science datasets deep-learning kaggle keras machine-learning python pytorch scikit-learn tensorflow
- Host: GitHub
- URL: https://github.com/kaggledatasets/kaggledatasets
- Owner: kaggledatasets
- License: apache-2.0
- Created: 2019-11-16T08:59:19.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2019-12-18T10:55:34.000Z (almost 5 years ago)
- Last Synced: 2024-09-30T09:20:21.469Z (about 1 month ago)
- Topics: data-science, datasets, deep-learning, kaggle, keras, machine-learning, python, pytorch, scikit-learn, tensorflow
- Language: Python
- Homepage: https://kaggledatasets.github.io
- Size: 51.8 KB
- Stars: 49
- Watchers: 4
- Forks: 6
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# kaggledatasets
Collection of Kaggle Datasets, ready to use for everyone
![License](https://img.shields.io/github/license/kaggledatasets/kaggledatasets)
![Release](https://img.shields.io/github/v/release/kaggledatasets/kaggledatasets)
![Platform Support](https://img.shields.io/pypi/pyversions/kaggledatasets)

- [More about Kaggle Datasets](#more-about-kaggle-datasets)
- [Installation](#installation)
  - [Binaries](#binaries)
  - [From Source](#from-source)
- [Getting Started](#getting-started)
- [Communication](#communication)
- [Releases and Contributing](#releases-and-contributing)
- [License](#license)

| System | Python 3.5 | Python 3.6 | Python 3.7 |
|:-------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------:|
| Linux | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/1) | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/2) | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/3) |
| macOS | | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/4) | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/5) |
| Windows | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/6) | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/7) | ![Build Status](https://travis-matrix-badges.herokuapp.com/repos/kaggledatasets/kaggledatasets/branches/master/8) |

## More About Kaggle Datasets
```python
import kaggledatasets as kd

heart_disease = kd.structured.HeartDiseaseUCI(download=True)

# Returns the pandas data frame to be used in Scikit Learn or any other framework
df = heart_disease.data_frame()

# Returns the tensorflow dataset type compatible with TF 2.0
dataset = heart_disease.load()

for batch, label in dataset.take(1):
    for key, value in batch.items():
        ...

# Returns the data loader for PyTorch
# Work in progress for PyTorch support
```

## Installation
### Binaries
Commands to install from binaries via Conda or pip wheels are on our website: [https://kaggledatasets.github.io](https://kaggledatasets.github.io)
### From Source
#### Get the kaggledatasets Source
```
git clone --recursive https://github.com/kaggledatasets/kaggledatasets
cd kaggledatasets
```

#### Install Dependencies
```
pip install -r requirements.txt
```

#### Install kaggledatasets
```
python setup.py install
```

## Getting Started
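Before working through the resources in this section, it can help to confirm that the installation succeeded. The following is a minimal sketch and not part of kaggledatasets itself: `is_installed` is a hypothetical helper name, it relies only on the standard-library `importlib.util.find_spec`, and it assumes the package is importable under the name `kaggledatasets`, as in the usage example above.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located without importing it."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "kaggledatasets" is the import name assumed from the usage example.
    if is_installed("kaggledatasets"):
        print("kaggledatasets is importable")
    else:
        print("kaggledatasets was not found; re-run the installation steps")
```

Because the check only looks the package up and does not import it, it should not trigger any dataset download.
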
* [Tutorials: Get started with kaggledatasets](https://kaggledatasets.github.io/get-started)
* [Examples: Easy to understand kaggledatasets code](https://github.com/kaggledatasets/kaggledatasets/tree/master/examples)
* [API Reference](https://kaggledatasets.readthedocs.io)

## Communication
* **GitHub Issues**: bug reports, feature requests, dataset requests, install issues, help wanted, thoughts, etc.
* **Slack**: The [Kaggle Datasets Slack](https://kaggledatasets.slack.com/) is aimed primarily at intermediate and experienced Kaggle Datasets users and developers, for general chat, online discussion, collaboration, etc. If you need a Slack invite, please visit: [http://bit.ly/kdslack](http://bit.ly/kdslack)

## Releases and Contributing
kaggledatasets expects to follow a 30-day release cycle for major releases. Please let us know if you encounter a bug by [filing an issue](https://github.com/kaggledatasets/kaggledatasets/issues).

We appreciate all contributions. Please read our [Contributing Guide](CONTRIBUTING.md) before you start. If you plan to contribute bug fixes, please do so without any further discussion.

If you plan to contribute new features, new datasets, utility functions, or extensions to the core, please open an issue first and discuss the feature with us.
Sending a PR without discussion may result in a rejected PR, because we may be taking kaggledatasets in a direction you are not aware of.

## License
kaggledatasets is Apache-2.0 licensed, as found in the [LICENSE](LICENSE) file.