https://github.com/satvikpraveen/numpymasterpro

A hands-on, production-ready toolkit to master NumPy, from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.
# 🧠 NumPyMasterPro

[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Python](https://img.shields.io/badge/Python-3.10%2B-darkgreen.svg)](https://www.python.org/)
[![CI/CD](https://img.shields.io/badge/CI%2FCD-GitHub%20Actions-brightgreen.svg)](https://github.com/SatvikPraveen/NumPyMasterPro/actions)
[![Tests](https://img.shields.io/badge/Tests-Pytest-blue.svg)](https://docs.pytest.org/)
[![Issues](https://img.shields.io/github/issues/SatvikPraveen/NumPyMasterPro?color=yellowgreen)](https://github.com/SatvikPraveen/NumPyMasterPro/issues)
[![Jupyter Notebooks](https://img.shields.io/badge/Jupyter-Notebook-orange.svg)](https://jupyter.org/)
[![Docker Ready](https://img.shields.io/badge/Docker-Ready-blueviolet.svg)](https://www.docker.com/)
[![NumPy Focused](https://img.shields.io/badge/NumPy-100%25-brightgreen.svg)](https://numpy.org/)
[![Real-World Use Cases](https://img.shields.io/badge/Use%20Cases-Included-ff69b4.svg)](#)
[![K-Means Project](https://img.shields.io/badge/Project-K--Means%20From%20Scratch-9cf.svg)](#)

**NumPyMasterPro** is a comprehensive, modular, and hands-on project designed to help you **master NumPy from first principles to real-world applications**.

This project isn't just a learning exercise: it's a **complete reference toolkit**, **interview-ready resource**, and a **portfolio-quality project** that showcases your fluency with one of Python's most essential libraries for scientific computing and data analysis.

---

## 🚀 Why This Project Matters

> Most learners stop at tutorials. This repository takes you further by combining theory, implementation, real-world use cases, and production practices in one place.

- ✅ Covers **NumPy's essential concepts**, from array basics to linear algebra and file I/O
- ✅ Demonstrates **clean project structure and modular code reuse**
- ✅ Includes **interview-ready topics** like broadcasting, vectorization, and matrix algebra
- ✅ Provides **Jupyter notebooks + Python utility scripts + cheat sheet**
- ✅ Ends with a **K-Means algorithm from scratch with Elbow Method**, great for resumes

---

## 📌 Project Objectives

- 🔍 **Master Core NumPy Syntax** through progressively organized notebooks
- 🔄 **Understand Memory Efficiency**: broadcasting, vectorization, views vs. copies
- ⚙️ **Practice Clean Coding** using reusable utility scripts in `/scripts`
- 🧠 **Explore Real-World Scenarios**: regression, simulations, image ops, clustering
- 📂 **Build a Reference Toolkit** for revision, projects, and technical interviews
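
The memory-efficiency ideas named above can be seen in a few lines. This is a minimal sketch with made-up arrays, not code taken from the notebooks; it only uses standard NumPy calls.

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
col = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = col * 10 + row  # no explicit Python loops

# Vectorization: center every column in one expression
data = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
centered = data - data.mean(axis=0)  # (3, 2) minus (2,) broadcasts across rows

# Views vs. copies: basic slicing gives a view, fancy indexing gives a copy
base = np.arange(6)
view = base[1:4]        # shares memory with base
copy = base[[1, 2, 3]]  # independent buffer
view[0] = 99            # also changes base[1]
copy[1] = -1            # base is unaffected

print(grid.shape)                    # (3, 4)
print(np.shares_memory(base, view))  # True
print(np.shares_memory(base, copy))  # False
```

`np.shares_memory` is a quick way to check whether an operation returned a view before mutating its result.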

---

## 🧱 Folder Structure

```bash
NumPyMasterPro/
├── notebooks/            # 📓 Themed Jupyter Notebooks (core + advanced topics)
├── scripts/              # 🛠️ Modular, reusable Python utilities
├── datasets/             # 📁 Data files used in notebooks
├── docs/                 # 📜 Cheat sheets and markdown-based quick notes
├── requirements.txt      # 📦 Minimal dependencies to run the project
├── requirements_dev.txt  # 📦 Full dev environment
├── .env.example          # 🛡️ Sample env file for Docker-based config (login-free setup)
├── docker-compose.yml    # 🐳 Multi-container orchestration for Jupyter Lab
├── Dockerfile            # 🐳 Docker image setup using Jupyter minimal notebook base
├── .gitignore            # ❌ Files to exclude from version control
└── README.md             # 📘 This file!
```

---

## 🧮 Topics Covered

| Notebook | Description |
| --------------------------------- | ------------------------------------------------------------ |
| `01_array_basics.ipynb` | Array creation, types, shapes, memory attributes |
| `02_indexing_slicing.ipynb` | Indexing, slicing, masking, `.take()`, `.put()` |
| `03_array_manipulation.ipynb` | Reshaping, stacking, splitting, tiling, padding |
| `04_math_operations.ipynb` | Element-wise ops, aggregation, rounding, broadcasting |
| `05_linear_algebra.ipynb` | Dot product, inverse, norms, eig/SVD, solving systems |
| `06_statistics_probability.ipynb` | Descriptive stats, histograms, correlations, sampling |
| `07_masking_conditions.ipynb` | `where`, `select`, logical ops, `nonzero`, `isfinite`, etc. |
| `08_file_io_memory.ipynb` | `save`, `load`, `memmap`, vectorize, views vs. copies |
| `09_real_world_cases.ipynb` | Regression, image ops, time-series scaling, simulations |
| `10_kmeans_from_scratch.ipynb` | 🎯 BONUS: K-Means Clustering + Elbow Method using NumPy only |
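
As one illustration of the topics in the table, the masking functions listed for `07_masking_conditions.ipynb` can be combined as follows. The `scores` array and the grade labels are invented for this sketch and do not come from the notebook.

```python
import numpy as np

scores = np.array([35.0, 72.0, np.nan, 88.0, 51.0])

# isfinite is False for NaN and inf, so invalid entries never count as a pass
valid = np.isfinite(scores)
passed = np.where(valid & (scores >= 50), "pass", "fail")

# np.select picks, per element, the choice for the first condition that is true
conditions = [scores >= 80, scores >= 50]
grades = np.select(conditions, ["A", "B"], default="C")

# nonzero returns a tuple of index arrays, one per dimension
high_idx = np.nonzero(scores >= 80)[0]

print(passed)
print(grades)
print(high_idx)
```

Comparisons against NaN evaluate to False, which is why the missing score falls through to the default in both `np.where` and `np.select`.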

---

## 🧰 Utility Scripts

| File | Purpose |
| ------------------------- | -------------------------------------------------------------- |
| `array_utils.py` | Inspect shapes, types, identities, and metadata |
| `linear_algebra_utils.py` | Matrix algebra: dot, inverse, SVD, eigenvalues |
| `math_utils.py` | Element-wise math: power, root, trig, rounding, logs, exponent |
| `aggregation_utils.py` | Sum, mean, std, var, min, max (global and axis-wise) |
| `stats_utils.py` | Z-score, normalization, correlation, histogram bins |
| `logical_utils.py` | Boolean logic, masking, conditionals (`any`, `all`, `where`) |
| `kmeans_utils.py` | K-Means from scratch, inertia calculation, and centroid init |

Example usage:

```python
# Direct module import
from scripts.kmeans_utils import kmeans, compute_inertia

# Or use convenient re-exports from __init__.py
from scripts import kmeans, describe_array, minmax_normalize
```
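
To give a feel for what utilities of this kind do, here is a rough sketch of z-score and min-max scaling in plain NumPy. The function names `zscore_sketch` and `minmax_normalize_sketch`, their `axis` parameter, and the zero-variance handling are assumptions made for illustration; the actual functions in `scripts/` may have different names, signatures, and behavior.

```python
import numpy as np

def zscore_sketch(x, axis=0):
    """Hypothetical helper: standardize to zero mean and unit variance along an axis."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    std = np.where(std == 0, 1.0, std)  # constant slices map to 0 instead of NaN
    return (x - mean) / std

def minmax_normalize_sketch(x, axis=0):
    """Hypothetical helper: rescale values to the [0, 1] range along an axis."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    span = np.where(hi == lo, 1.0, hi - lo)  # avoid dividing by zero
    return (x - lo) / span

X = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
print(zscore_sketch(X))
print(minmax_normalize_sketch(X))
```

Using `keepdims=True` keeps the statistics broadcastable against the input for any choice of `axis`.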

---

## 🎛️ Streamlit Frontend (Interactive Demo)

You can try the K-Means algorithm with different datasets or number of clusters using:

```bash
streamlit run kmeans_app.py
```

This allows you to upload `.csv` files, set cluster count, and visualize results in real time.
Great for experiments, education, and showcasing clustering interactively.
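
The clustering that such a demo visualizes can be sketched with NumPy alone. The code below is a generic Lloyd's-algorithm sketch under stated assumptions (random initialization from the data points, squared Euclidean distance, synthetic two-blob data); `kmeans_sketch` is a hypothetical name and is not the implementation in `scripts/kmeans_utils.py`.

```python
import numpy as np

def _sq_dists(X, centroids):
    # (n, 1, d) - (1, k, d) -> (n, k, d); sum over d gives squared distances (n, k)
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Hypothetical K-Means: returns (centroids, labels, inertia)."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize centroids as k distinct points drawn from the data
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = _sq_dists(X, centroids).argmin(axis=1)
        # Recompute each centroid as its cluster mean; keep the old one if empty
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    dists = _sq_dists(X, centroids)
    labels = dists.argmin(axis=1)
    inertia = dists.min(axis=1).sum()  # sum of squared distances to nearest centroid
    return centroids, labels, inertia

# Synthetic data: two well-separated blobs
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])

centroids, labels, _ = kmeans_sketch(X, 2)

# Elbow Method: inspect how inertia falls as k grows
inertias = [kmeans_sketch(X, k)[2] for k in range(1, 5)]
for k, value in zip(range(1, 5), inertias):
    print(k, round(float(value), 2))
```

On data like this, the inertia should drop sharply from k=1 to k=2 and flatten afterwards, which is the "elbow" used to choose the cluster count.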

---

## ๐Ÿณ Docker-Based Setup (Optional)

Prefer running in a **containerized Jupyter Lab** environment?

```bash
docker compose up --build
```

Then open the browser at:
👉 [http://localhost:8889](http://localhost:8889)

> You can also stop the container with:

```bash
docker compose down --volumes --remove-orphans
```

---

## 🔐 Authentication & Security

This project is configured for **login-free use** of Jupyter Lab: no password or token is required.

- ✅ `.env.example` is included with recommended settings.
- 🚫 `.env` is deliberately **excluded** from the repo (add your own if needed).
- 🛡️ You may modify the `docker-compose.yml` to add a token or hashed password later.

---

## 🧠 Recommended Use

- ✍️ Study each notebook sequentially and refer back as needed
- 🧪 Use `/scripts/` functions in other projects or interview tasks
- 🧵 Treat `docs/numpy_cheatsheet.md` as your quick review guide
- 🧠 Use `10_kmeans_from_scratch.ipynb` in your resume to show NumPy fluency
- 💡 Add your own notebooks (e.g., PCA from scratch, numerical integration, etc.)

---

## 🔧 Getting Started (Without Docker)

1. **Clone the repo**

```bash
git clone https://github.com/SatvikPraveen/NumPyMasterPro.git
cd NumPyMasterPro
```

2. **Create & activate a virtual environment, then install dependencies**

```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

3. **Launch the Jupyter Lab interface**

```bash
jupyter lab
```

---

## 🧪 Testing

**NumPyMasterPro** includes a comprehensive test suite with **80+ unit tests** covering all utility modules.

### Quick Testing

```bash
# Install test dependencies
pip install pytest pytest-cov

# Run all tests
pytest

# Run with coverage
pytest --cov=scripts --cov-report=term-missing
```
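
A unit test in this style might look like the sketch below. `describe_array_sketch` is a hypothetical stand-in defined inline so the example runs on its own; it is not the repository's `describe_array`, whose return value may differ. Pytest collects functions whose names start with `test_` and treats a failed `assert` as a test failure.

```python
import numpy as np

def describe_array_sketch(arr):
    """Hypothetical stand-in for a utility under test: summarize array metadata."""
    arr = np.asarray(arr)
    return {
        "shape": arr.shape,
        "ndim": arr.ndim,
        "dtype": str(arr.dtype),
        "size": arr.size,
    }

def test_describe_reports_shape_and_size():
    info = describe_array_sketch(np.zeros((2, 3)))
    assert info["shape"] == (2, 3)
    assert info["ndim"] == 2
    assert info["size"] == 6
    assert info["dtype"] == "float64"

def test_describe_accepts_plain_lists():
    info = describe_array_sketch([[1, 2], [3, 4]])
    assert info["shape"] == (2, 2)
    assert info["size"] == 4
```

Saved as a file whose name starts with `test_`, a module like this would be picked up by a plain `pytest` run.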

### Using Makefile Commands

```bash
make test # Run all tests
make test-coverage # Generate coverage report
make lint # Check code quality
make format # Auto-format code
make all # Run complete checks
```

### Test Coverage

- ✅ Array utilities (describe, compare, flags)
- ✅ Logical operations (any, all, where, masking)
- ✅ K-Means algorithm (clustering, inertia)
- ✅ Math operations (arithmetic, trig, rounding)
- ✅ Linear algebra (matrices, eigenvalues, SVD)
- ✅ Statistics (normalization, correlation)

📖 **Detailed testing guide:** [TESTING.md](docs/TESTING.md)

### CI/CD Pipeline

Automated testing runs on:
- 🔄 Every push to `main`/`develop`
- 🔄 All pull requests
- ✅ Multi-OS (Ubuntu, macOS, Windows)
- ✅ Python 3.10, 3.11, 3.12
- ✅ Code linting & formatting checks
- ✅ Notebook validation
- ✅ Docker build verification

---

## 📄 License

This project is licensed under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0). See the [LICENSE](./LICENSE) file for more details.

---

## 🌟 Showcase & Star

If this project helped you master NumPy, feel free to ⭐ it and share it with others!