https://github.com/satvikpraveen/numpymasterpro
A hands-on, production-ready toolkit to master NumPy, from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.
- Host: GitHub
- URL: https://github.com/satvikpraveen/numpymasterpro
- Owner: SatvikPraveen
- License: mit
- Created: 2025-07-21T03:18:59.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-07-21T03:50:24.000Z (10 months ago)
- Last Synced: 2025-07-21T05:33:38.294Z (10 months ago)
- Topics: broadcasting, data-analysis, data-science, data-source, data-visualization, jupyter-notebook, kmeans-clustering, linear-algebra, machine-learning, matrix-algebra, numerical-computation, numpy, numpy-broadcasting, numpy-examples, numpy-tutorial, open-source, python, scientific-computing, standardization, vectorization
- Language: Jupyter Notebook
- Homepage:
- Size: 78.1 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
# NumPyMasterPro
**NumPyMasterPro** is a comprehensive, modular, and hands-on project designed to help you **master NumPy from first principles to real-world applications**.
This project isn't just a learning exercise: it's a **complete reference toolkit**, **interview-ready resource**, and a **portfolio-quality project** that showcases your fluency with one of Python's most essential libraries for scientific computing and data analysis.
---
## Why This Project Matters
> Most learners stop at tutorials. This repository takes you further by combining theory, implementation, real-world use cases, and production practices in one place.
- ✅ Covers **100% of NumPy's essential concepts**
- ✅ Demonstrates **clean project structure and modular code reuse**
- ✅ Includes **interview-ready topics** like broadcasting, vectorization, and matrix algebra
- ✅ Provides **Jupyter notebooks + Python utility scripts + cheat sheet**
- ✅ Ends with a **K-Means algorithm from scratch with Elbow Method**, great for resumes
---
## Project Objectives
- **Master Core NumPy Syntax** through progressively organized notebooks
- **Understand Memory Efficiency**: broadcasting, vectorization, views vs. copies
- **Practice Clean Coding** using reusable utility scripts in `/scripts`
- **Explore Real-World Scenarios**: regression, simulations, image ops, clustering
- **Build a Reference Toolkit** for revision, projects, and technical interviews
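
The memory-efficiency ideas named in the objectives above can be illustrated with a minimal sketch. It is not taken from the project's notebooks; the array names and values are made up for illustration, and only standard NumPy calls are used.

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
# without manually repeating either operand.
col = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = col * 10 + row

# Vectorization: one array expression replaces an explicit Python loop.
values = np.array([1.0, 2.0, 3.0, 4.0])
squared_loop = np.array([v ** 2 for v in values])
squared_vec = values ** 2

# Views vs. copies: basic slicing returns a view that shares memory,
# while .copy() allocates independent storage.
base = np.arange(6)
view = base[2:5]
copy = base[2:5].copy()
view[0] = 99   # also changes base[2]
copy[1] = -1   # leaves base untouched

print(grid.shape)                    # (3, 4)
print(np.shares_memory(base, view))  # True
print(np.shares_memory(base, copy))  # False
```

Writing through `view` changes `base`, which is why it helps to know whether an operation returned a view or a copy before modifying the result in place.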
---
## Folder Structure
```bash
NumPyMasterPro/
├── notebooks/              # Themed Jupyter Notebooks (core + advanced topics)
├── scripts/                # Modular, reusable Python utilities
├── datasets/               # Data files used in notebooks
├── docs/                   # Cheat sheets and markdown-based quick notes
├── requirements.txt        # Minimal dependencies to run the project
├── requirements_dev.txt    # Full dev environment
├── .env.example            # Sample env file for Docker-based config (login-free setup)
├── docker-compose.yml      # Multi-container orchestration for Jupyter Lab
├── Dockerfile              # Docker image setup using Jupyter minimal notebook base
├── .gitignore              # Files to exclude from version control
└── README.md               # This file!
```
---
## Topics Covered
| Notebook | Description |
| --------------------------------- | ------------------------------------------------------------ |
| `01_array_basics.ipynb` | Array creation, types, shapes, memory attributes |
| `02_indexing_slicing.ipynb` | Indexing, slicing, masking, `.take()`, `.put()` |
| `03_array_manipulation.ipynb` | Reshaping, stacking, splitting, tiling, padding |
| `04_math_operations.ipynb` | Element-wise ops, aggregation, rounding, broadcasting |
| `05_linear_algebra.ipynb` | Dot product, inverse, norms, eig/SVD, solving systems |
| `06_statistics_probability.ipynb` | Descriptive stats, histograms, correlations, sampling |
| `07_masking_conditions.ipynb` | `where`, `select`, logical ops, `nonzero`, `isfinite`, etc. |
| `08_file_io_memory.ipynb` | `save`, `load`, `memmap`, vectorize, views vs. copies |
| `09_real_world_cases.ipynb` | Regression, image ops, time-series scaling, simulations |
| `10_kmeans_from_scratch.ipynb`    | BONUS: K-Means Clustering + Elbow Method using NumPy only    |
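
As a taste of the linear-algebra and regression topics listed above, here is a hedged sketch of a least-squares line fit with `np.linalg.lstsq`. The synthetic data, the seed, and the variable names are assumptions for illustration; this is not the content of `09_real_world_cases.ipynb`.

```python
import numpy as np

# Synthetic data (assumed for illustration): y = 2.5 * x + 1.0 plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.5 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix: one column for the slope, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# lstsq returns (solution, residuals, rank, singular values).
coef, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

print(f"slope={slope:.3f}, intercept={intercept:.3f}, rank={rank}")
```

Solving the system with `lstsq` rather than inverting `A.T @ A` by hand is generally the more numerically stable choice.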
---
## Utility Scripts
| File | Purpose |
| ------------------------- | -------------------------------------------------------------- |
| `array_utils.py` | Inspect shapes, types, identities, and metadata |
| `linear_algebra_utils.py` | Matrix algebra: dot, inverse, SVD, eigenvalues |
| `math_utils.py` | Element-wise math: power, root, trig, rounding, logs, exponent |
| `aggregation_utils.py`    | Sum, mean, std, var, min, max (global and axis-wise)           |
| `stats_utils.py` | Z-score, normalization, correlation, histogram bins |
| `logical_utils.py` | Boolean logic, masking, conditionals (`any`, `all`, `where`) |
| `kmeans_utils.py` | K-Means from scratch, inertia calculation, and centroid init |
Example usage:
```python
# Direct module import
from scripts.kmeans_utils import kmeans, compute_inertia
# Or use convenient re-exports from __init__.py
from scripts import kmeans, describe_array, minmax_normalize
```
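
To make the K-Means, inertia, and Elbow Method ideas concrete, the following is a minimal NumPy-only sketch. It is not the code in `kmeans_utils.py`: the function names `kmeans_sketch`, `assign_labels`, and `inertia_sketch`, their signatures, and the synthetic blobs are all assumptions made for illustration.

```python
import numpy as np

def assign_labels(X, centroids):
    """Index of the nearest centroid for every row of X."""
    # Broadcasting yields an (n_samples, k) matrix of squared distances.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(n_iters):
        labels = assign_labels(X, centroids)
        # Move each centroid to the mean of its points; an empty cluster stays put.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign_labels(X, centroids)

def inertia_sketch(X, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return float(((X - centroids[labels]) ** 2).sum())

# Hypothetical data: three well-separated 2-D blobs.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 2)) for c in centers])

# Elbow Method: record inertia for a range of k and look for the bend.
inertias = {}
for k in range(1, 6):
    centroids, labels = kmeans_sketch(X, k)
    inertias[k] = inertia_sketch(X, centroids, labels)

for k, value in inertias.items():
    print(k, round(value, 2))
```

Because the result depends on the random initialization, a fuller version would typically run several restarts per `k` and keep the lowest inertia.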
---
## Streamlit Frontend (Interactive Demo)
You can try the K-Means algorithm with different datasets or number of clusters using:
```bash
streamlit run kmeans_app.py
```
This allows you to upload `.csv` files, set cluster count, and visualize results in real time.
Great for experiments, education, and showcasing clustering interactively.
---
## Docker-Based Setup (Optional)
Prefer running in a **containerized Jupyter Lab** environment?
```bash
docker compose up --build
```
Then open the browser at:
[http://localhost:8889](http://localhost:8889)
> You can also stop the container with:
```bash
docker compose down --volumes --remove-orphans
```
---
## Authentication & Security
This project is configured for **login-free use** of Jupyter Lab: no password or token is required.
- `.env.example` is included with recommended settings.
- `.env` is deliberately **excluded** from the repo (add your own if needed).
- You may modify `docker-compose.yml` to add a token or hashed password later.
---
## Recommended Use
- Study each notebook sequentially and refer back as needed
- Use `/scripts/` functions in other projects or interview tasks
- Treat `docs/numpy_cheatsheet.md` as your quick review guide
- Use `10_kmeans_from_scratch.ipynb` in your resume to show NumPy fluency
- Add your own notebooks (e.g., PCA from scratch, numerical integration, etc.)
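
As a starting point for the "PCA from scratch" suggestion above, here is one possible sketch based on the singular value decomposition. The function name `pca_svd`, its return values, and the synthetic data are assumptions for illustration and are not part of this repository.

```python
import numpy as np

def pca_svd(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S[:n_components] ** 2) / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance

# Hypothetical data: points stretched along the direction (1, 1).
rng = np.random.default_rng(42)
t = rng.normal(size=200)
X = np.column_stack([t, t]) + rng.normal(scale=0.05, size=(200, 2))

projected, components, variance = pca_svd(X, n_components=1)
print(projected.shape)  # (200, 1)
print(components)       # close to +/-[0.707, 0.707]; the sign is arbitrary
```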
---
## Getting Started (Without Docker)
1. **Clone the repo**
```bash
git clone https://github.com/SatvikPraveen/NumPyMasterPro.git
cd NumPyMasterPro
```
2. **Create and activate a virtual environment, then install dependencies**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
3. **Launch the Jupyter Lab interface**
```bash
jupyter lab
```
---
## Testing
**NumPyMasterPro** includes a comprehensive test suite with **80+ unit tests** covering all utility modules.
### Quick Testing
```bash
# Install test dependencies
pip install pytest pytest-cov
# Run all tests
pytest
# Run with coverage
pytest --cov=scripts --cov-report=term-missing
```
### Using Makefile Commands
```bash
make test # Run all tests
make test-coverage # Generate coverage report
make lint # Check code quality
make format # Auto-format code
make all # Run complete checks
```
### Test Coverage
- ✅ Array utilities (describe, compare, flags)
- ✅ Logical operations (any, all, where, masking)
- ✅ K-Means algorithm (clustering, inertia)
- ✅ Math operations (arithmetic, trig, rounding)
- ✅ Linear algebra (matrices, eigenvalues, SVD)
- ✅ Statistics (normalization, correlation)
**Detailed testing guide:** [TESTING.md](docs/TESTING.md)
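
For readers new to pytest, a unit test for a normalization helper might look roughly like the sketch below. The function `minmax_normalize_sketch` is a hypothetical stand-in defined inline so the example is self-contained; it is not the project's `minmax_normalize`, and the real test suite may be organized differently.

```python
import numpy as np

def minmax_normalize_sketch(arr):
    """Scale values linearly into [0, 1]; a constant array maps to zeros."""
    arr = np.asarray(arr, dtype=float)
    span = arr.max() - arr.min()
    if span == 0:
        return np.zeros_like(arr)
    return (arr - arr.min()) / span

def test_minmax_range():
    # The smallest value maps to 0, the largest to 1.
    result = minmax_normalize_sketch([2.0, 4.0, 6.0])
    np.testing.assert_allclose(result, [0.0, 0.5, 1.0])

def test_minmax_constant_input():
    # Guard against division by zero when all values are equal.
    result = minmax_normalize_sketch([3.0, 3.0, 3.0])
    np.testing.assert_array_equal(result, np.zeros(3))
```

Saved in a file whose name starts with `test_`, these functions would be collected automatically by the `pytest` command shown earlier.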
### CI/CD Pipeline
Automated testing runs on:
- Every push to `main`/`develop`
- All pull requests
- Multi-OS (Ubuntu, macOS, Windows)
- Python 3.10, 3.11, 3.12
- Code linting & formatting checks
- Notebook validation
- Docker build verification
---
## License
This project is licensed under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0). See the [LICENSE](./LICENSE) file for more details.
---
## Showcase & Star
If this project helped you master NumPy, feel free to ⭐ it and share it with others!