https://github.com/itrauco/experiments-test

Minimal sandbox for isolating and testing core machine learning workflow logic across Jupyter notebooks. Used to rebuild clean, reproducible foundations for future MLOps development.

computer-vision conda-environment experimental jupyter machine-learning minimal-example ml-workflows mlops model-training notebook-development notebook-isolation python reproducible-research sandbox workflow-prototyping

# Baseline ML Workflow Skeleton

**Repository** → [experiments-test](https://github.com/iTrauco/experiments-test)

A minimal engineering sandbox for isolating core machine learning workflow logic across notebooks.
Built to strip away noise and validate raw workflow mechanics.

---

## Table of Contents

* [Scope](#scope)
* [Upstream Integration](#upstream-integration)
* [Notebook Tools Installation](#notebook-tools-installation)
* [⚠️ Development Status](#️-development-status)
* [Reproducibility Framework](#reproducibility-framework)
  * [Environment Setup](#environment-setup)
  * [Environment Details](#environment-details)
  * [Environment Management](#environment-management)

---

## Scope

* Self-contained notebook logic
* Core workflow structure only
* No external orchestration
* No dependency conflicts between Python data science virtual environments

## Upstream Integration

* Primary development repo → [traffic-vision-v0.4](https://github.com/iTrauco/traffic-vision-v0.4)
* Current unstable work lives in → [feature/experiments-framework](https://github.com/iTrauco/traffic-vision-v0.4/tree/feature/experiments-framework) — a chaotic prototype branch being deprecated.

This repo will drive a clean rebuild of workflow logic in the next iteration of `traffic-vision-v0.4`.

---

## Notebook Tools Installation

```bash
cd /path/to/notebook_tools
pip install -e .
```

This installs the library in editable mode, so changes to the Python source take effect without reinstalling.
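
To confirm that the editable install is being picked up from the working tree, it can help to ask Python where it resolves the module from. The sketch below is only an illustration: it assumes the import name is `notebook_tools` (inferred from the directory name above, not confirmed by this repo), and `module_origin` is a hypothetical helper, not part of the library.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def module_origin(name: str) -> Optional[Path]:
    """Return the file a top-level module would load from, or None if unavailable."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # Built-in and frozen modules have no file on disk
    if spec.origin in ("built-in", "frozen"):
        return None
    return Path(spec.origin).resolve()


if __name__ == "__main__":
    # "notebook_tools" is an assumed import name
    origin = module_origin("notebook_tools")
    if origin is None:
        print("notebook_tools is not importable in this environment")
    else:
        print(f"notebook_tools resolves to {origin}")
```

If the printed path points inside the cloned source directory rather than `site-packages`, the editable install is most likely working as intended.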

---

## ⚠️ Development Status

All modules in `lib/` are early-stage prototypes. Functionality is still being worked out: some modules may be dead code, and others are spaghetti. I am splitting out modular packages as I identify what is eating my bandwidth.

---

## Reproducibility Framework

### Environment Setup

This project uses a Conda environment to manage dependencies for reproducible analysis. Follow these steps to set up the environment:

#### Prerequisites

* Anaconda or Miniconda installed on your system
* Git for cloning the repository

#### Setup Instructions

1. Clone the repository:

```bash
git clone https://github.com/iTrauco/experiments-test.git
cd experiments-test
```

2. Create the Conda environment:

```bash
conda create -n traffic-vision-env python=3.11 -y
```

3. Activate the environment:

```bash
conda activate traffic-vision-env
```

4. Install baseline packages:

```bash
conda install -c conda-forge jupyter numpy pandas matplotlib seaborn scikit-learn opencv -y
```

5. Install deep learning and computer vision packages:

```bash
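# PyTorch wheels built against CUDA 11.8 (cu118)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# YOLO models (Ultralytics) and detection utilities (Supervision)
pip install ultralytics supervision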
```

6. Launch Jupyter Notebook:

```bash
jupyter notebook
```

7. Access the notebook in your browser via the URL displayed in the terminal.
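
After completing the steps above, a quick sanity check from a notebook cell or terminal can confirm that the environment matches what was installed. This is a minimal sketch, not part of the repo: `EXPECTED_MODULES`, `missing_modules`, and `check_environment` are hypothetical names, and the check only looks modules up on the import path without importing them.

```python
import importlib.util
import sys

# Import names for the packages installed in steps 4 and 5.
# scikit-learn imports as "sklearn" and OpenCV as "cv2".
EXPECTED_MODULES = (
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn", "cv2",
    "torch", "torchvision", "torchaudio", "ultralytics", "supervision",
)


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(names=EXPECTED_MODULES, required_python=(3, 11)):
    """Summarize the interpreter version and any modules that are not installed."""
    current = tuple(sys.version_info[:2])
    return {
        "python": current,
        "python_ok": current == tuple(required_python),
        "missing": missing_modules(names),
    }


if __name__ == "__main__":
    report = check_environment()
    major, minor = report["python"]
    note = "" if report["python_ok"] else " (the setup steps pin 3.11)"
    print(f"Python {major}.{minor}{note}")
    if report["missing"]:
        print("Not found:", ", ".join(report["missing"]))
    else:
        print("All expected modules were found")
```

Any module listed as not found usually means the corresponding install step was run in a different environment than the one Jupyter is using.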

---

### Environment Details

The environment includes essential data science and computer vision packages:

* [Python 3.11](https://www.python.org/downloads/release/python-3110/)
* [Jupyter Notebook](https://jupyter.org/documentation)
* [pandas](https://pandas.pydata.org/docs/) & [numpy](https://numpy.org/doc/stable/) for data manipulation
* [matplotlib](https://matplotlib.org/stable/index.html) & [seaborn](https://seaborn.pydata.org/) for visualization
* [scikit-learn](https://scikit-learn.org/stable/documentation.html) for traditional ML algorithms
* [OpenCV](https://docs.opencv.org/4.x/) for image and video processing
* [PyTorch](https://pytorch.org/docs/stable/index.html) for deep learning model development
* [Ultralytics](https://docs.ultralytics.com/) for YOLO object detection
* [Supervision](https://supervision.roboflow.com/) for object tracking utilities
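
Because the setup installs the CUDA 11.8 build of PyTorch, it may be worth checking whether the installed build can actually see a GPU on the current machine. The sketch below assumes nothing beyond PyTorch itself; `torch_summary` is a hypothetical helper that returns `None` when PyTorch is not installed.

```python
def torch_summary():
    """Return PyTorch version and CUDA details, or None if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are present at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    summary = torch_summary()
    if summary is None:
        print("PyTorch is not installed in this environment")
    else:
        for key, value in summary.items():
            print(f"{key}: {value}")
```

A build reporting a CUDA version while `cuda_available` is `False` typically points to a missing or incompatible NVIDIA driver rather than a broken install.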

---

### Environment Management

For collaborators who enhance the environment with additional packages:

```bash
# Export the updated environment
conda activate traffic-vision-env
conda env export > environment.yml
```

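This records every installed package and its exact version, including pip-installed packages, so others can recreate the environment with `conda env create -f environment.yml`. The export also pins platform-specific build strings; if the file needs to work across operating systems, `conda env export --no-builds` omits them.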
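
When a collaborator updates `environment.yml`, it can be useful to see exactly which pins changed between two exports. The following is a rough sketch under stated assumptions: it reads only the `- name=version=build` and `- name==version` lines that `conda env export` writes, it is not a general YAML parser, and `parse_pins` and `diff_pins` are hypothetical helpers. The package versions in the usage example are illustrative sample data only.

```python
from typing import Dict, Tuple


def parse_pins(yaml_text: str) -> Dict[str, str]:
    """Map package name to version from `conda env export` style dependency lines."""
    pins = {}
    for raw in yaml_text.splitlines():
        line = raw.strip()
        if not line.startswith("- "):
            continue  # skip keys such as "name:", "channels:", "prefix:"
        item = line[2:].strip()
        if item.endswith(":"):
            continue  # "- pip:" opens the nested pip list
        if "==" in item:
            name, version = item.split("==", 1)  # pip style: name==version
        elif "=" in item:
            name, version = item.split("=", 2)[:2]  # conda style: name=version[=build]
        else:
            continue  # channel names and unpinned entries
        pins[name] = version
    return pins


def diff_pins(old: Dict[str, str], new: Dict[str, str]) -> Tuple[dict, dict, dict]:
    """Return (added, removed, changed) between two pin mappings."""
    added = {n: v for n, v in new.items() if n not in old}
    removed = {n: v for n, v in old.items() if n not in new}
    changed = {n: (old[n], new[n]) for n in old.keys() & new.keys() if old[n] != new[n]}
    return added, removed, changed


if __name__ == "__main__":
    # Illustrative sample exports, not taken from this repo
    before = "dependencies:\n  - numpy=1.0=build_a\n  - pip:\n      - examplepkg==0.1\n"
    after = "dependencies:\n  - numpy=1.1=build_b\n  - pip:\n      - examplepkg==0.1\n"
    added, removed, changed = diff_pins(parse_pins(before), parse_pins(after))
    print("added:", added, "removed:", removed, "changed:", changed)
```

In practice the two inputs would be the contents of the previous and current `environment.yml` files, read with `Path(...).read_text()`.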