https://github.com/fny/thecurator
- Host: GitHub
- URL: https://github.com/fny/thecurator
- Owner: fny
- License: mit
- Created: 2017-09-19T16:25:03.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2019-01-02T20:18:51.000Z (over 7 years ago)
- Last Synced: 2025-03-25T03:51:15.526Z (about 1 year ago)
- Language: Python
- Size: 42 KB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
README
The Curator 🖼
==============
.. image:: https://travis-ci.org/fny/thecurator.svg?branch=master
   :target: https://travis-ci.org/fny/thecurator
   :alt: Build Status

.. image:: https://badge.fury.io/py/thecurator.svg
   :target: https://pypi.org/project/thecurator
   :alt: The Curator on PyPI

The Curator helps you define pipelines for transforming dirty data into consumable databases.
Usage
-----
.. code:: python

    from thecurator import Curator

    # Paths to files describing different tables
    table_descriptions = ['patient.yml', 'lab.yml']

    # sqlalchemy_engine is an existing SQLAlchemy Engine
    curator = Curator(sqlalchemy_engine, table_descriptions)

    # Transform a pandas DataFrame according to the descriptions
    curator.transform_df('patient', patient_df)

    # Transform a list of dictionaries according to the descriptions
    curator.transform_dicts('patient', patient_dicts)

    # Transform and insert a list of dictionaries according to the descriptions
    curator.insert_dicts('lab', lab_dicts)

See the tests for more examples. More coming soon...
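
The usage snippet above assumes that ``sqlalchemy_engine``, ``patient_df`` and ``patient_dicts`` already exist. A minimal sketch of how those inputs might be prepared is shown here. The in-memory SQLite URL and the column names (``id``, ``name``, ``dob``) are assumptions made for illustration; they are not taken from this project.

.. code:: python

    import pandas as pd
    from sqlalchemy import create_engine

    # Assumption: an in-memory SQLite database stands in for the real target
    sqlalchemy_engine = create_engine("sqlite:///:memory:")

    # Hypothetical dirty records; the column names are invented for this sketch
    patient_dicts = [
        {"id": 1, "name": "  Ada ", "dob": "31/01/1990"},
        {"id": 2, "name": "GRACE", "dob": "1991-02-03"},
    ]

    # The same records as a pandas DataFrame
    patient_df = pd.DataFrame(patient_dicts)

    # A DataFrame converts back to a list of dictionaries, one per row
    records = patient_df.to_dict(orient="records")

Objects built this way could then be passed to ``Curator`` and its ``transform_df``, ``transform_dicts`` and ``insert_dicts`` methods as in the usage snippet, provided the table description files exist.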
Development
-----------
- Install the development requirements: ``pip install -r dev-requirements.txt``
- Make changes
- Run the tests: ``pytest tests``
- See the Makefile for other useful commands