
https://github.com/efischer19/hoopstat-haus

A GenAI-powered data lakehouse for NBA/WNBA stats. Ingests, processes, and provides insights for predictive analytics and semantic search. Built with Python, robust backend infra, and deployed via GH Actions. Your go-to for advanced hoops data!

ai-assisted-development aws basketball data-analytics data-engineering devops docker github-actions gplv3 machine-learning monorepo nba open-data open-source poetry python serverless software-craftsmanship sports-analytics terraform


# Hoopstat Haus 🏀

[![Status: WIP](https://img.shields.io/badge/status-work_in_progress-yellow.svg)](https://github.com/efischer19/hoopstat-haus)

A GenAI-powered data lakehouse for NBA/WNBA stats. Your go-to for advanced hoops data!

---

> **Note:** This project is currently under active development and is not yet functional. The infrastructure and core components are being built. Please check back for updates!

## 🚀 Quick Start: Access Basketball Analytics (Stateless JSON)

Per ADR-027, initial public access is provided via small, precomputed JSON artifacts served directly from S3. No auth required.

### What's available
- player_daily: per-player daily metrics
- team_daily: per-team daily metrics
- top_lists: curated top metrics (e.g., top_ts, top_per, top_efg, top_net)
- index/latest.json: pointer to the most recent available dates

All artifacts are versioned (v1) and capped at ~100 KB for fast, low-cost access.
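A producer could enforce the size cap before publishing an artifact. The following is a minimal sketch, not the project's actual implementation: it assumes "~100 KB" means 100 * 1024 bytes of compact UTF-8 JSON, and the names `MAX_ARTIFACT_BYTES`, `serialize_artifact`, `within_size_cap`, and the sample payload fields are all hypothetical.

```python
import json

# Assumption: "~100 KB" is read here as 100 * 1024 bytes; the project may define it differently.
MAX_ARTIFACT_BYTES = 100 * 1024


def serialize_artifact(payload: dict) -> bytes:
    """Serialize a payload as compact UTF-8 JSON (no extra whitespace, sorted keys)."""
    return json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")


def within_size_cap(payload: dict, max_bytes: int = MAX_ARTIFACT_BYTES) -> bool:
    """Return True when the serialized payload fits under the size cap."""
    return len(serialize_artifact(payload)) <= max_bytes


# Hypothetical payload shape, for illustration only; the real schema is not documented here.
sample = {
    "version": "v1",
    "date": "2024-01-15",
    "players": [{"player_id": 1, "ts_pct": 0.61}],
}
```

Calling `within_size_cap(sample)` before upload would let a pipeline step fail early rather than publish an oversized file.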

### 📊 Data Availability
- Coverage: 2023-24 NBA season onwards
- Updates: Daily, 2–4 hours after games complete
- Format: JSON artifacts under gold/served/
- Access: Public S3 with CORS (CDN optional)
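Because the artifacts are plain JSON on public S3, a client needs nothing beyond an HTTP GET. The sketch below uses only the Python standard library. The bucket address in `BASE_URL` is a placeholder, since this README does not state the real one, and the helper names `artifact_url`, `fetch_json`, and `main` are hypothetical. Only `index/latest.json` and the `gold/served/` prefix come from the text above.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder base URL: the real public bucket address is not given in this README.
BASE_URL = "https://example-bucket.s3.amazonaws.com/gold/served/"


def artifact_url(relative_key: str, base_url: str = BASE_URL) -> str:
    """Join a key relative to gold/served/ onto the base URL."""
    return urljoin(base_url, relative_key.lstrip("/"))


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Download and decode one JSON artifact; no auth headers are sent."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main() -> None:
    # The fields inside index/latest.json are not documented here, so just print them.
    latest = fetch_json(artifact_url("index/latest.json"))
    print(latest)
```

Running `main()` would only succeed once `BASE_URL` points at the project's actual bucket or CDN.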

Note: An MCP adapter may be added later as an optional layer. See meta/plans/v2-architecture-diagram.md.

## About The Project

Hoopstat Haus is an open-source project aimed at creating a comprehensive data lakehouse for basketball analytics. It ingests and processes NBA/WNBA statistics to provide deep insights for predictive modeling and powerful semantic search.

The core mission is to leverage modern data infrastructure and Generative AI to make advanced basketball analysis accessible and powerful.

## Tech Stack

This project is being built with a focus on robust, modern backend infrastructure:

* **Language:** Python
* **Core Functionality:** Data Ingestion, Processing, and Predictive Analytics
* **Deployment:** Fully automated via GitHub Actions

## Current Status

The repository has been seeded with foundational documents and architectural principles. The next phase of development will focus on building the core data ingestion pipelines.

The project is **not operational** at this time.

## Repository Structure

```
apps/ # Individual applications
libs/ # Shared Python libraries
infrastructure/ # Terraform AWS infrastructure (includes ECR)
docs-src/ # Documentation source (MkDocs with Material theme)
scripts/ # Utility scripts (ECR helper, etc.)
meta/ # Project metadata and ADRs
templates/ # Project templates
```

Key infrastructure components:
- **AWS ECR**: Container registry with automated CI/CD integration
- **GitHub Actions**: Automated testing, building, and deployment
- **Terraform**: Infrastructure as code for AWS resources

## Contributing

While the core infrastructure is being established, contributions are welcome in the form of ideas, feature requests, and bug reports. Please see our **[Contributing Guidelines](.github/CONTRIBUTING.md)** for more details on how you can help shape the future of Hoopstat Haus.

### Quality Assurance for Contributors

To maintain code quality and reduce review cycles, please run local quality checks before submitting pull requests:

```bash
# For Python projects (apps and libs)
./scripts/local-ci-check.sh apps/your-app
./scripts/local-ci-check.sh libs/your-lib
```

**Optional**: Set up pre-commit hooks to automatically run quality checks:
```bash
pip install pre-commit
pre-commit install
```

This ensures your code passes the same checks that CI runs, catching formatting and linting issues early.

### Documentation

This project uses [MkDocs with the Material theme](https://squidfunk.github.io/mkdocs-material/) for documentation. All documentation is authored in `docs-src/` and automatically published to GitHub Pages.

**Local Documentation Development:**
```bash
# Install documentation dependencies
pip install -r docs-requirements.txt

# Build documentation (includes API docs generation)
./scripts/build-docs.sh

# Serve documentation locally
mkdocs serve
```

The documentation site will be available at `http://localhost:8000` for local preview.

**Documentation Structure:**
- Library API documentation is automatically generated from docstrings
- Development guides and ADRs are manually authored in `docs-src/`
- Documentation is published to: https://efischer19.github.io/hoopstat-haus/

---