[![Build Status](https://api.cirrus-ci.com/github/pyscaffold/pyscaffoldext-dsproject.svg?branch=master)](https://cirrus-ci.com/github/pyscaffold/pyscaffoldext-dsproject)
[![ReadTheDocs](https://readthedocs.org/projects/pyscaffold/badge/?version=latest)](https://pyscaffold.org/projects/dsproject/en/latest)
[![Coveralls](https://img.shields.io/coveralls/github/pyscaffold/pyscaffoldext-dsproject/master.svg)](https://coveralls.io/r/pyscaffold/pyscaffoldext-dsproject)
[![PyPI-Server](https://img.shields.io/pypi/v/pyscaffoldext-dsproject.svg)](https://pypi.org/project/pyscaffoldext-dsproject)
[![Conda-Forge](https://img.shields.io/conda/vn/conda-forge/pyscaffoldext-dsproject.svg)](https://anaconda.org/conda-forge/pyscaffoldext-dsproject)
[![Downloads](https://pepy.tech/badge/pyscaffoldext-dsproject/month)](https://pepy.tech/project/pyscaffoldext-dsproject)
[![Sponsor](https://img.shields.io/static/v1?label=Sponsor&message=%E2%9D%A4&logo=GitHub&color=ff69b4)](https://github.com/sponsors/FlorianWilhelm)

# pyscaffoldext-dsproject

[PyScaffold] extension tailored for *Data Science* projects. This extension is inspired by
[cookiecutter-data-science] and enhanced in many ways. The main differences are that it
1. advocates a proper Python package structure that can be shipped and distributed,
2. uses a [conda] environment instead of something [virtualenv]-based and is thus more suitable
for data science projects,
3. provides more default configurations for [Sphinx], [pytest], [pre-commit], etc. to foster
clean coding and best practices.

Also consider using [dvc] to version control and share your data within your team.
Read [this blogpost] to learn how to work with JupyterLab notebooks efficiently by using a
data science project structure like this.
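If you do adopt [dvc], a minimal sketch of wiring it into a freshly generated project could look
like the following (the data file path and remote URL are placeholders, not part of this extension):
```bash
# Initialize dvc inside the generated project (which is already a git repository).
dvc init

# Track a raw data file with dvc instead of git (placeholder path).
dvc add data/raw/measurements.csv
git add data/raw/measurements.csv.dvc data/raw/.gitignore
git commit -m "Track raw data with dvc"

# Configure a default remote for the team and push the data there (URL is an assumption).
dvc remote add -d storage s3://my-team-bucket/dvc-store
dvc push
```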

The final directory structure looks like:
```
├── AUTHORS.md              <- List of developers and maintainers.
├── CHANGELOG.md            <- Changelog to keep track of new features and fixes.
├── CONTRIBUTING.md         <- Guidelines for contributing to this project.
├── Dockerfile              <- Build a docker container with `docker build .`.
├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── configs                 <- Directory for configurations of model & application.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── docs                    <- Directory for Sphinx documentation in rst or md.
├── environment.yml         <- The conda environment file for reproducibility.
├── models                  <- Trained and serialized models, model predictions,
│                              or model summaries.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build configuration. Don't change! Use `pip install -e .`
│                              to install for development or `tox -e build` to build.
├── references              <- Data dictionaries, manuals, and all other materials.
├── reports                 <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures             <- Generated plots and figures for reports.
├── scripts                 <- Analysis and production scripts which import the
│                              actual PYTHON_PKG, e.g. train_model.
├── setup.cfg               <- Declarative configuration of your project.
├── setup.py                <- [DEPRECATED] Use `python setup.py develop` to install for
│                              development or `python setup.py bdist_wheel` to build.
├── src
│   └── PYTHON_PKG          <- Actual Python package where the main functionality goes.
├── tests                   <- Unit tests which can be run with `pytest`.
├── .coveragerc             <- Configuration for coverage reports of unit tests.
├── .isort.cfg              <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
```

See a demonstration of the initial project structure under [dsproject-demo] and also check out
the documentation of [PyScaffold] for more information.
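
For a rough idea of the development workflow inside such a project, the following sketch assumes
the generated `environment.yml` names the environment after the project (here `my_ds_project`);
adjust to whatever name your `environment.yml` actually declares:
```bash
# Create and activate the project's conda environment (name is an assumption,
# taken from the environment.yml of the generated project).
conda env create -f environment.yml
conda activate my_ds_project

# Install the package in editable mode for development, as noted in pyproject.toml.
pip install -e .

# Run the unit tests.
pytest
```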

## Usage

Just install this package with `conda install -c conda-forge pyscaffoldext-dsproject`
and note that `putup -h` shows a new option `--dsproject`.
Creating a data science project is then as easy as:
```bash
putup --dsproject my_ds_project
```

For convenience, the `--dsproject` flag also implies the flags `--markdown`, `--pre-commit`,
and `--no-skeleton`.
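
If you prefer pip over conda, the PyPI badge above indicates the extension is published there as
well, so the following should work equally (shown here as an untested alternative):
```bash
# Install the extension from PyPI instead of conda-forge.
pip install pyscaffoldext-dsproject

# Confirm that PyScaffold picked up the new flag.
putup -h | grep dsproject

# Generate the project as before.
putup --dsproject my_ds_project
```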

## Making Changes & Contributing

This project uses [pre-commit]; please make sure to install it before making any
changes:

```bash
conda install pre-commit
cd pyscaffoldext-dsproject
pre-commit install
```

It is a good idea to update the hooks to the latest version:

```bash
pre-commit autoupdate
```
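
Once installed, the hooks run automatically on every `git commit`; to check all files at once
(for example after updating the hooks), you can also trigger them manually:
```bash
# Run every configured hook against the entire repository.
pre-commit run --all-files
```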

Please also check PyScaffold's [contribution guidelines].

[PyScaffold]: https://pyscaffold.org/
[cookiecutter-data-science]: https://github.com/drivendata/cookiecutter-data-science
[Miniconda]: https://docs.conda.io/en/latest/miniconda.html
[Jupyter]: https://jupyter.org/
[dsproject-demo]: https://github.com/pyscaffold/dsproject-demo
[Sphinx]: https://www.sphinx-doc.org/
[pytest]: https://docs.pytest.org/
[conda]: https://docs.conda.io/
[Conda-Forge]: https://anaconda.org/conda-forge/pyscaffoldext-dsproject
[virtualenv]: https://virtualenv.pypa.io/
[pre-commit]: https://pre-commit.com/
[dvc]: https://dvc.org/
[this blogpost]: https://florianwilhelm.info/2018/11/working_efficiently_with_jupyter_lab/
[contribution guidelines]: https://pyscaffold.org/en/latest/contributing.html