https://github.com/jezcope/pyrefine
Execute OpenRefine JSON scripts without OpenRefine (or Java)
- Host: GitHub
- URL: https://github.com/jezcope/pyrefine
- Owner: jezcope
- License: mit
- Created: 2017-02-25T10:12:55.000Z (over 8 years ago)
- Default Branch: develop
- Last Pushed: 2022-12-27T15:02:24.000Z (almost 3 years ago)
- Last Synced: 2025-03-22T12:24:29.091Z (7 months ago)
- Topics: data-science, data-wrangling, openrefine, python
- Language: Python
- Homepage:
- Size: 460 KB
- Stars: 30
- Watchers: 3
- Forks: 2
- Open Issues: 13
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE

===============================
PyRefine
===============================

.. image:: https://img.shields.io/pypi/v/pyrefine.svg
        :target: https://pypi.python.org/pypi/pyrefine

.. image:: https://img.shields.io/travis/jezcope/pyrefine.svg
        :target: https://travis-ci.org/jezcope/pyrefine

.. image:: https://readthedocs.org/projects/pyrefine/badge/?version=latest
        :target: https://pyrefine.readthedocs.io/en/latest/?badge=latest
        :alt: Documentation Status

.. image:: https://pyup.io/repos/github/jezcope/pyrefine/shield.svg
        :target: https://pyup.io/repos/github/jezcope/pyrefine/
        :alt: Updates

OpenRefine_ is a great tool for exploring and cleaning datasets prior
to analysing them. It also records an undo history of all actions that
you can export as a sort of script in JSON_ format. However, in order
to execute that script on a new dataset, you need to manually import
it through the graphical interface or set up a BatchRefine_ server,
neither of which is quick.

PyRefine allows you to execute OpenRefine JSON scripts against
datasets without firing up a full Java/OpenRefine server. It has a
commandline tool for quick use, or you can use it as a library to
integrate it into your pandas_-based data analysis pipeline.

More details in `this blog post`_.

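
To make the idea of "executing an exported script" concrete, here is a
minimal, dependency-free sketch. It is **not** PyRefine's API: the
``execute`` function, the ``HANDLERS`` table and the two handler
functions are hypothetical names invented for illustration, and the
dataset is a plain list of dicts rather than a pandas DataFrame. The
sketch assumes the script is a JSON list of operations, each identified
by an ``op`` field, and handles only ``core/column-rename`` and
``core/column-removal``.

.. code-block:: python

    import json

    # A tiny operation history in the shape OpenRefine exports:
    # a JSON list of operations, each identified by its "op" field.
    SCRIPT = json.loads("""
    [
      {"op": "core/column-rename",
       "oldColumnName": "name",
       "newColumnName": "full_name",
       "description": "Rename column name to full_name"},
      {"op": "core/column-removal",
       "columnName": "notes",
       "description": "Remove column notes"}
    ]
    """)


    def rename_column(rows, op):
        """Return new rows with one column renamed (hypothetical helper)."""
        old, new = op["oldColumnName"], op["newColumnName"]
        return [
            {(new if key == old else key): value for key, value in row.items()}
            for row in rows
        ]


    def remove_column(rows, op):
        """Return new rows with one column dropped (hypothetical helper)."""
        name = op["columnName"]
        return [
            {key: value for key, value in row.items() if key != name}
            for row in rows
        ]


    # Dispatch table mapping operation names to handlers.
    HANDLERS = {
        "core/column-rename": rename_column,
        "core/column-removal": remove_column,
    }


    def execute(script, rows):
        """Apply each operation in order; fail loudly on unknown ones."""
        for op in script:
            handler = HANDLERS.get(op["op"])
            if handler is None:
                raise NotImplementedError(f"Unsupported operation: {op['op']}")
            rows = handler(rows, op)
        return rows


    rows = [
        {"name": "Ada Lovelace", "born": "1815", "notes": "x"},
        {"name": "Alan Turing", "born": "1912", "notes": "y"},
    ]
    result = execute(SCRIPT, rows)

Raising on an unrecognised operation, rather than skipping it silently,
is a deliberate choice in this sketch: a partially applied script would
produce a dataset that looks cleaned but is not.
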

**Please note:** PyRefine is still very much alpha-quality. It probably
doesn't work exactly how you're expecting right now. That said, please
try it out, and consider :doc:`contributing`!

.. _OpenRefine: http://openrefine.org
.. _JSON: http://en.wikipedia.org/wiki/JSON
.. _BatchRefine: https://github.com/fusepoolP3/p3-batchrefine
.. _pandas: http://pandas.pydata.org/
.. _`this blog post`: https://erambler.co.uk/blog/introducing-pyrefine-openrefine-python/

* Free software: MIT license
* Documentation: https://pyrefine.readthedocs.io.

Features
--------

* Execute OpenRefine JSON against a dataset from the command line
* Execute OpenRefine JSON from a Python script

Credits
---------

This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.

.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage