https://github.com/lopez86/datatools
- Host: GitHub
- URL: https://github.com/lopez86/datatools
- Owner: lopez86
- License: MIT
- Created: 2018-07-02T00:28:12.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-07-08T18:37:15.000Z (over 6 years ago)
- Last Synced: 2024-11-06T20:50:30.883Z (2 months ago)
- Language: Python
- Size: 26.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# DataTools
This package collects tools for analyzing data and running machine learning models.
The current focus is faster prototyping of models.
Basic features I've incorporated so far include:
- Basic dataset and result classes
- Batch generation
- K-fold CV with out-of-fold predictions and test predictions for each fold
- TensorFlow training for fairly simple models
- TensorFlow feed-dict production
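
To make the batch generation and K-fold items above concrete, here is a minimal sketch in plain NumPy. It is not this package's API: the names `iterate_batches`, `fit_linear`, `predict_linear`, and `kfold_predictions` are hypothetical, and the least-squares model is only a stand-in for whatever model is being prototyped.

```python
import numpy as np


def iterate_batches(features, targets, batch_size, rng=None):
    """Yield (features, targets) mini-batches; shuffle only when rng is given."""
    indices = np.arange(len(features))
    if rng is not None:
        rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        yield features[batch], targets[batch]


def fit_linear(features, targets):
    """Stand-in model (assumption): least-squares fit with a bias column."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict_linear(weights, features):
    """Apply the fitted weights, adding the same bias column as in fit_linear."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ weights


def kfold_predictions(train_x, train_y, test_x, n_folds=5, seed=0):
    """Return out-of-fold train predictions and one test prediction per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(train_x)), n_folds)
    oof = np.empty(len(train_x))
    test_preds = np.empty((n_folds, len(test_x)))
    for i, valid_idx in enumerate(folds):
        # Fit on every fold except the held-out one.
        fit_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        weights = fit_linear(train_x[fit_idx], train_y[fit_idx])
        # Each training row is predicted by the one model that never saw it.
        oof[valid_idx] = predict_linear(weights, train_x[valid_idx])
        test_preds[i] = predict_linear(weights, test_x)
    return oof, test_preds
```

Under these assumptions, `oof` has one entry per training row and `test_preds` has shape `(n_folds, len(test_x))`; averaging `test_preds` over the first axis is one common way to combine the per-fold test predictions.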
I also plan to add some basic tools for other packages such as LightGBM, Keras, and XGBoost.
Some things still to do include:
- Add model serialization to the train & predict functions
- Deserialization & prediction with no training
- Basic model architectures
- Embedding dataset production