{"id":13586387,"url":"https://github.com/crmne/cookiecutter-modern-datascience","last_synced_at":"2025-07-11T10:06:47.122Z","repository":{"id":38107984,"uuid":"277539788","full_name":"crmne/cookiecutter-modern-datascience","owner":"crmne","description":"Start a data science project with modern tools","archived":false,"fork":false,"pushed_at":"2023-08-10T08:03:15.000Z","size":102,"stargazers_count":164,"open_issues_count":5,"forks_count":33,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-02-14T21:27:11.792Z","etag":null,"topics":["cookiecutter","cookiecutter-data-science","cookiecutter-template","datascience","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/crmne.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-06T12:46:30.000Z","updated_at":"2024-08-01T16:32:29.700Z","dependencies_parsed_at":"2024-08-01T16:42:36.179Z","dependency_job_id":null,"html_url":"https://github.com/crmne/cookiecutter-modern-datascience","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/crmne/cookiecutter-modern-datascience","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crmne%2Fcookiecutter-modern-datascience","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crmne%2Fcookiecutter-modern-datascience/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crmne%2Fcookiecutter-modern-datascience/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crmne%2Fcookiecutter-modern-datascience/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/crmne","download_url":"https://codeload.github.com/crmne/cookiecutter-modern-datascience/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crmne%2Fcookiecutter-modern-datascience/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264780988,"owners_count":23662767,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cookiecutter","cookiecutter-data-science","cookiecutter-template","datascience","python"],"created_at":"2024-08-01T15:05:32.022Z","updated_at":"2025-07-11T10:06:47.101Z","avatar_url":"https://github.com/crmne.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Cookiecutter Modern Data Science\n[Cookiecutter] template for starting a Data Science project with modern, fast Python tools.\n\n## Features\n\n* [Pipenv] for managing packages and virtualenvs in a modern way.\n* [Prefect] for modern pipelines and data workflow.\n* [Weights and Biases] for experiment tracking.\n* [FastAPI] for self-documenting fast HTTP APIs - on par with NodeJS and Go - based on [asyncio], [ASGI], and [uvicorn].\n* Modern CLI with [Typer].\n* Batteries included: [Pandas], [numpy], [scipy], [seaborn], and [jupyterlab] already installed.\n* Consistent code quality: [black], [isort], [autoflake], and [pylint] already installed.\n* [Pytest] for testing.\n* [GitHub Pages] for the 
public website.\n\n## Quickstart\n\nInstall the latest Cookiecutter and Pipenv:\n\n    pip install -U pipenv cookiecutter\n\nGenerate the project:\n\n    cookiecutter gh:crmne/cookiecutter-modern-datascience\n\nGet inside the project:\n\n    cd \u003crepo_name\u003e\n    pipenv shell  # activates virtualenv\n\n(Optional) Start Weights \u0026 Biases locally, if you don't want to use the cloud/on-premise version:\n\n    wandb local\n\nStart working:\n\n    jupyter-lab\n\n## Directory structure\n\nThis is how your new project will look:\n\n    ├── .gitignore                \u003c- GitHub's excellent Python .gitignore customized for this project\n    ├── LICENSE                   \u003c- Your project's license.\n    ├── Pipfile                   \u003c- The Pipfile for reproducing the analysis environment\n    ├── README.md                 \u003c- The top-level README for developers using this project.\n    │\n    ├── data\n    │   ├── 0_raw                 \u003c- The original, immutable data dump.\n    │   ├── 0_external            \u003c- Data from third party sources.\n    │   ├── 1_interim             \u003c- Intermediate data that has been transformed.\n    │   └── 2_final               \u003c- The final, canonical data sets for modeling.\n    │\n    ├── docs                      \u003c- GitHub Pages website\n    │   ├── data_dictionaries     \u003c- Data dictionaries\n    │   └── references            \u003c- Papers, manuals, and all other explanatory materials.\n    │\n    ├── notebooks                 \u003c- Jupyter notebooks. 
Naming convention is a number (for ordering),\n    │                                the creator's initials, and a short `_` delimited description, e.g.\n    │                                `01_cp_exploratory_data_analysis.ipynb`.\n    │\n    ├── output\n    │   ├── features              \u003c- Fitted and serialized features\n    │   ├── models                \u003c- Trained and serialized models, model predictions, or model summaries\n    │   └── reports               \u003c- Generated analyses as HTML, PDF, LaTeX, etc.\n    │       └── figures           \u003c- Generated graphics and figures to be used in reporting\n    │\n    ├── pipelines                 \u003c- Pipelines and data workflows.\n    │   ├── Pipfile               \u003c- The Pipfile for reproducing the pipelines environment\n    │   ├── pipelines.py          \u003c- The CLI entry point for all the pipelines\n    │   ├── \u003crepo_name\u003e           \u003c- Code for the various steps of the pipelines\n    │   │   ├──  __init__.py\n    │   │   ├── etl.py            \u003c- Download, generate, and process data\n    │   │   ├── visualize.py      \u003c- Create exploratory and results oriented visualizations\n    │   │   ├── features.py       \u003c- Turn raw data into features for modeling\n    │   │   └── train.py          \u003c- Train and evaluate models\n    │   └── tests\n    │       ├── fixtures          \u003c- Where to put example inputs and outputs\n    │       │   ├── input.json    \u003c- Test input data\n    │       │   └── output.json   \u003c- Test output data\n    │       └── test_pipelines.py \u003c- Integration tests for the HTTP API\n    │\n    └── serve                     \u003c- HTTP API for serving predictions\n        ├── Dockerfile            \u003c- Dockerfile for HTTP API\n        ├── Pipfile               \u003c- The Pipfile for reproducing the serving environment\n        ├── app.py                \u003c- The entry point of the HTTP API\n        └── tests\n            
├── fixtures          \u003c- Where to put example inputs and outputs\n            │   ├── input.json    \u003c- Test input data\n            │   └── output.json   \u003c- Test output data\n            └── test_app.py       \u003c- Integration tests for the HTTP API\n\n\n\n\n[Cookiecutter]: https://github.com/audreyr/cookiecutter\n[Pipenv]: https://pipenv.pypa.io/en/latest/\n[Prefect]: https://docs.prefect.io/\n[Weights and Biases]: https://www.wandb.com/\n[MLFlow]: https://mlflow.org/\n[FastAPI]: https://fastapi.tiangolo.com/\n[asyncio]: https://docs.python.org/3/library/asyncio.html\n[ASGI]: https://asgi.readthedocs.io/en/latest/\n[uvicorn]: https://www.uvicorn.org/\n[Typer]: https://typer.tiangolo.com/\n[Pandas]: https://pandas.pydata.org/\n[numpy]: https://numpy.org/\n[scipy]: https://www.scipy.org/\n[seaborn]: https://seaborn.pydata.org/\n[jupyterlab]: https://jupyterlab.readthedocs.io/en/stable/\n[black]: https://github.com/psf/black\n[isort]: https://github.com/timothycrosley/isort\n[autoflake]: https://github.com/myint/autoflake\n[pylint]: https://www.pylint.org/\n[Pytest]: https://docs.pytest.org/en/latest/\n[GitHub Pages]: https://pages.github.com/\n[Git LFS]: https://git-lfs.github.com/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrmne%2Fcookiecutter-modern-datascience","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrmne%2Fcookiecutter-modern-datascience","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrmne%2Fcookiecutter-modern-datascience/lists"}