Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/neurospin/pystatsml
Statistics and Machine Learning in Python
Last synced: 27 days ago
Statistics and Machine Learning in Python
- Host: GitHub
- URL: https://github.com/neurospin/pystatsml
- Owner: neurospin
- License: other
- Created: 2016-04-07T16:14:40.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2021-01-06T16:12:02.000Z (over 3 years ago)
- Last Synced: 2024-03-25T23:13:21.752Z (3 months ago)
- Language: Jupyter Notebook
- Size: 34.4 MB
- Stars: 64
- Watchers: 20
- Forks: 89
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: COPYING
Lists
- awesome-stars - neurospin/pystatsml - Statistics and Machine Learning in Python (Jupyter Notebook)
README
Statistics and Machine Learning in Python
=========================================

- [pdf](ftp://ftp.cea.fr/pub/unati/people/educhesnay/pystatml/StatisticsMachineLearningPython.pdf)
- [www](https://duchesnay.github.io/pystatsml)

Structure
---------

Courses are available in three formats:

1. Jupyter notebooks.
2. Python files using sphinx-gallery.
3. reStructuredText files.

All notebooks and Python files are converted into `rst` format and then assembled together using Sphinx.
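
The notebook-to-`rst` step can be pictured with a short sketch. This is not the repository's build code: it assumes that notebooks are converted with nbconvert's `RSTExporter` (the README does not say which tool is used) and that pandoc, listed under Dependencies, is available. The function names `rst_target` and `notebook_to_rst` are hypothetical.

```python
from pathlib import Path


def rst_target(notebook_path, out_dir=None):
    """Return the .rst path a notebook would be written to."""
    notebook_path = Path(notebook_path)
    out_dir = Path(out_dir) if out_dir is not None else notebook_path.parent
    return out_dir / (notebook_path.stem + ".rst")


def notebook_to_rst(notebook_path, out_dir=None):
    """Convert one notebook to rst and return the path of the written file.

    Assumption: nbconvert and pandoc are installed.
    """
    # Imported lazily so this module loads even without nbconvert.
    from nbconvert import RSTExporter

    # from_filename returns the rst text and a dict of resources;
    # extracted figures, if any, are stored under resources["outputs"].
    body, resources = RSTExporter().from_filename(str(notebook_path))

    target = rst_target(notebook_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body, encoding="utf-8")

    # Write extracted figures next to the rst file so its image links resolve.
    for name, data in resources.get("outputs", {}).items():
        (target.parent / name).write_bytes(data)
    return target
```

Under these assumptions, `notebook_to_rst("statistics/stat_univ.ipynb")` would write `statistics/stat_univ.rst`, which Sphinx could then pick up with the other `rst` sources.
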
Directories and main files:

    introduction/
    ├── machine_learning.rst
    └── python_ecosystem.rst

    python_lang/ # (Python language)
    ├── python_lang.py # (main file)
    └── python_lang_solutions.py

    scientific_python/
    ├── matplotlib.ipynb
    ├── scipy_numpy.py
    ├── scipy_numpy_solutions.py
    ├── scipy_pandas.py
    └── scipy_pandas_solutions.py

    statistics/ # (Statistics)
    ├── stat_multiv.ipynb # (multivariate statistics)
    ├── stat_univ.ipynb # (univariate statistics)
    ├── stat_univ_solutions.ipynb
    ├── stat_univ_lab01_brain-volume.py # (lab)
    ├── stat_univ_solutions.ipynb
    └── time_series.ipynb

    machine_learning/ # (Machine learning)
    ├── clustering.ipynb
    ├── decomposition.ipynb
    ├── decomposition_solutions.ipynb
    ├── linear_classification.ipynb
    ├── linear_regression.ipynb
    ├── non_linear_prediction.ipynb
    ├── resampling.ipynb
    ├── resampling_solution.py
    └── sklearn.ipynb

    optimization/
    ├── optim_gradient_descent.ipynb
    └── optim_gradient_descent_lab.ipynb

    deep_learning/
    ├── dl_backprop_numpy-pytorch-sklearn.ipynb
    ├── dl_cnn_cifar10_pytorch.ipynb
    ├── dl_mlp_mnist_pytorch.ipynb
    └── dl_transfer-learning_cifar10-ants-

Build
-----

After pulling the repository, execute the Jupyter notebooks (outputs are expected to be removed before git submission).
```
make exe
```

Build the pdf file (requires LaTeX):
```
make pdf
```

Build the html files:
```
make html
```

Clean everything and strip the output from the Jupyter notebooks (unnecessary if you installed the nbstripout hook):
```
make clean
```

Dependencies
------------

The easiest option is to install Anaconda at https://www.continuum.io with Python >= 3. Anaconda provides:

- python 3
- ipython
- Jupyter
- pandoc
- LaTeX to generate the pdf

Then install:

1. [sphinx-gallery](https://sphinx-gallery.readthedocs.io)
```
pip install sphinx-gallery
```

2. [nbstripout](https://github.com/kynan/nbstripout)
```
conda install -c conda-forge nbstripout
```

Configure your git repository with the nbstripout pre-commit hook if you don't want to track notebook output in VCS:
```
cd pystatsml
nbstripout --install
```

3. Git [LFS](https://git-lfs.github.com/) for datasets

a. Install Git LFS
```
git lfs install
```

b. Select the file types you'd like Git LFS to manage:
```
git lfs track "*.npz"
git lfs track "*.npy"
git lfs track "*.nii"
git lfs track "*.nii.gz"
git lfs track "*.csv"
```

c. Now make sure .gitattributes is tracked:
```
git add .gitattributes
```

4. LaTeX (optional, for pdf)

On Debian-like Linux distributions:
```
sudo apt-get install latexmk texlive-latex-extra
```

5. MS docx (optional)

[docxbuilder](https://docxbuilder.readthedocs.io/en/latest/docxbuilder.html)
a. Install
```
pip install docxbuilder
pip install docxbuilder[math]
```

b. Build
```
make docx
```
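
As a closing note on the output stripping mentioned in the Build and Dependencies sections: a notebook is a JSON document, and removing outputs mostly amounts to emptying the `outputs` list and resetting `execution_count` of every code cell. The sketch below illustrates that idea with the standard library only. It is a simplified illustration, not nbstripout's implementation (which handles more cases, such as metadata), and the function name `strip_outputs` is hypothetical.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # A JSON round trip is a cheap deep copy for plain JSON data,
    # so the caller's notebook is left untouched.
    stripped = json.loads(json.dumps(notebook))
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


def strip_file(path):
    """Strip outputs from the notebook at `path`, rewriting it in place."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(strip_outputs(notebook), handle, indent=1)
        handle.write("\n")
```

In practice the nbstripout hook or `make clean` is the route the README recommends; the sketch only shows what ends up changed in the `.ipynb` file.
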