sciblox - Easier Data Science and Machine Learning
- Host: GitHub
- URL: https://github.com/danielhanchen/sciblox
- Owner: danielhanchen
- License: mit
- Created: 2017-07-19T05:51:40.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-07-28T13:14:08.000Z (over 7 years ago)
- Last Synced: 2024-04-26T10:42:10.481Z (9 months ago)
- Topics: boosting, data-analysis, data-mining, data-preprocessing, data-science, data-visualization, imputation, machine-learning, python, sklearn
- Language: HTML
- Homepage: https://danielhanchen.github.io/
- Size: 1.38 MB
- Stars: 48
- Watchers: 4
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.txt
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# sciblox
An all-in-one Python 3 data science package for easy visualisation, data mining, data preparation and machine learning. Please check the Jupyter Notebook for instructions on how to use it.
You can also check sciblox out at https://danielhanchen.github.io/ and https://pypi.python.org/pypi/sciblox
Install:
```sh
[sudo] pip install sciblox
```
NOTE: If you intend to remove linearly dependent rows or use KNN or SVD imputation:
```sh
[sudo] pip install fancyimpute sympy theano
```
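Before using the imputation features, it can help to confirm that the optional packages are importable. The snippet below is a minimal sketch using only the standard library; the helper name `missing_optional_deps` is hypothetical and is not part of sciblox.

```python
import importlib.util

# Optional packages named in the install note above.
OPTIONAL_DEPS = ("fancyimpute", "sympy", "theano")


def missing_optional_deps(names=OPTIONAL_DEPS):
    """Return the names that cannot be found on the current Python path.

    Hypothetical helper for illustration; not a sciblox function.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_deps()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("All optional packages are installed.")
```

This only checks that each package can be located; it does not verify that the package imports cleanly or that a compiler is available.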
If fancyimpute fails to install, please install a C++ compiler such as MinGW.

WHAT'S NEW?
1. FASTER (x10) BPCA fill
2. Better analyser
3. NEW modules - Machine Learning

Some features explained include:
1. MICE, BPCA missing data imputation with Random Forests, XGBoost and Linear Regression support
2. Automatic Data Plotting
3. Word extraction and frequency plots
4. Sequential text processing
5. CARET like processes including ZeroVarCheck, FreqRatios etc.
6. Discretization and Continuisation
7. Easy data structure changes like Hcat, Vcat, reversing etc.
8. Easy CARET like Machine Learning modules
9. Automatic Best Graphs Plotting

IN CONSTRUCTION:
1. Advanced text extraction methods
2. Automatic Machine Learning methods

For easier calling:
```python
from sciblox import *
# The next line is a Jupyter/IPython magic; omit it in plain Python scripts.
%matplotlib notebook
```
If you are using other methods, just copy and paste sciblox.py into your Python 3 main directory, then call it the same way as above.

Some screenshots:
![Analysing](/img/Analyse.jpg?raw=true "Auto analysing and 3d plots")
![Preprocessing](/img/Preprocess.jpg?raw=true "CARET like Preprocess")
![Analytics](/img/Analytics.jpg?raw=true "CARET like checking")
![Plotting](/img/Plot.jpg?raw=true "Cool easy plots")
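
The feature list mentions CARET-like checks such as ZeroVarCheck and FreqRatios. As a rough illustration of the idea behind them (not sciblox's implementation or API), the sketch below computes a frequency ratio the way CARET's near-zero-variance check defines it: the count of the most common value divided by the count of the second most common. The function names `freq_ratio` and `is_zero_variance` and the sample columns are assumptions made for this example.

```python
from collections import Counter


def freq_ratio(values):
    """Count of the most common value divided by the second most common.

    Returns float('inf') when the column has fewer than two distinct values.
    Hypothetical helper for illustration; not a sciblox function.
    """
    counts = [n for _, n in Counter(values).most_common(2)]
    if len(counts) < 2:
        return float("inf")
    return counts[0] / counts[1]


def is_zero_variance(values):
    """True when every entry is identical (or the column is empty)."""
    return len(set(values)) <= 1


# Made-up sample columns, purely for demonstration.
columns = {
    "constant": [1, 1, 1, 1, 1, 1],
    "skewed": ["a", "a", "a", "a", "a", "b"],
    "balanced": [0, 1, 0, 1, 0, 1],
}

for name, col in columns.items():
    print(name, is_zero_variance(col), freq_ratio(col))
```

A column with a very high frequency ratio carries little information and is a common candidate for removal during preprocessing; the cutoff to use is a judgment call for each dataset.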