https://github.com/rmax/dask-avro
Avro reader for Dask.
- Host: GitHub
- URL: https://github.com/rmax/dask-avro
- Owner: rmax
- License: mit
- Created: 2017-02-02T19:53:17.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2018-06-16T05:07:12.000Z (almost 8 years ago)
- Last Synced: 2025-01-31T10:36:09.755Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 10.3 MB
- Stars: 4
- Watchers: 3
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE
=========
Dask-Avro
=========
.. image:: https://img.shields.io/pypi/v/dask-avro.svg
    :target: https://pypi.python.org/pypi/dask-avro

.. image:: https://img.shields.io/pypi/pyversions/dask-avro.svg
    :target: https://pypi.python.org/pypi/dask-avro

.. image:: https://readthedocs.org/projects/dask-avro/badge/?version=latest
    :target: https://readthedocs.org/projects/dask-avro/?badge=latest
    :alt: Documentation Status

.. image:: https://img.shields.io/travis/rmax/dask-avro.svg
    :target: https://travis-ci.org/rmax/dask-avro

.. image:: https://codecov.io/github/rmax/dask-avro/coverage.svg?branch=master
    :alt: Coverage Status
    :target: https://codecov.io/github/rmax/dask-avro

.. image:: https://landscape.io/github/rmax/dask-avro/master/landscape.svg?style=flat
    :target: https://landscape.io/github/rmax/dask-avro/master
    :alt: Code Quality Status

.. image:: https://requires.io/github/rmax/dask-avro/requirements.svg?branch=master
    :alt: Requirements Status
    :target: https://requires.io/github/rmax/dask-avro/requirements/?branch=master

Avro reader for Dask.
* Free software: MIT license
* Documentation: https://dask-avro.readthedocs.org.
* Python versions: 2.7, 3.5+
Features
--------
This project provides an Avro_ format reader for Dask_. It offers a convenient
function that reads one or more Avro files and partitions them arbitrarily.
Quickstart
----------
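Usage::

    import dask.bag
    import dask_avro

    # Read every file matching the glob, split into chunks set by blocksize.
    delayeds = dask_avro.read_avro("data-*.avro", blocksize=2**26)
    # Wrap the resulting delayed objects in a Dask bag.
    data = dask.bag.from_delayed(delayeds)
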
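To get a feel for what a given ``blocksize`` means, the sketch below splits a
file of a known size into ``(offset, length)`` byte ranges. It is only an
illustration and assumes ``blocksize`` is measured in bytes. ``byte_ranges`` is
a hypothetical helper, not part of ``dask_avro``, and this README does not
describe how the library aligns chunks with Avro block boundaries.

```python
def byte_ranges(file_size, blocksize):
    """Split ``file_size`` bytes into consecutive (offset, length) chunks.

    Hypothetical helper for illustration only; not part of dask_avro.
    """
    if blocksize <= 0:
        raise ValueError("blocksize must be positive")
    ranges = []
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than blocksize.
        length = min(blocksize, file_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


# An assumed 150 MiB file with the quickstart's blocksize (2**26 = 64 MiB).
chunks = byte_ranges(150 * 2**20, 2**26)
print(len(chunks))  # 3: two full 64 MiB chunks plus a 22 MiB remainder
```

Under these assumptions, a smaller ``blocksize`` gives more, smaller partitions
and therefore more parallel tasks, at the cost of more scheduling overhead.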
Credits
-------
This package was created with Cookiecutter_ and the `rmax/cookiecutter-pypackage`_ project template.
.. _Avro: https://avro.apache.org/docs/1.2.0/
.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _Dask: http://dask.pydata.org/en/latest/
.. _`rmax/cookiecutter-pypackage`: https://github.com/rmax/cookiecutter-pypackage