https://github.com/bjornmelin/stardex
Stardex: Explore GitHub Stars Intelligently. Stardex is a powerful web app that lets you search, filter, and cluster any GitHub user's starred repositories. Discover hidden patterns and find your next favorite project with intelligent, AI-powered exploration.
clustering ml nextjs search-engine shadcn-ui starred-repositories tailwindcss tensorflow
- Host: GitHub
- URL: https://github.com/bjornmelin/stardex
- Owner: BjornMelin
- License: mit
- Created: 2025-01-15T01:34:45.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-01-26T02:26:59.000Z (4 months ago)
- Last Synced: 2025-01-26T02:28:04.303Z (4 months ago)
- Topics: clustering, ml, nextjs, search-engine, shadcn-ui, starred-repositories, tailwindcss, tensorflow
- Language: TypeScript
- Homepage:
- Size: 242 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
# Stardex - Explore GitHub Stars Intelligently

> Discover patterns in your GitHub stars through machine learning

Stardex helps you explore and understand your GitHub starred repositories through advanced machine learning clustering and interactive visualizations.

## Table of Contents

- [Features](#features)
- [Technology Stack](#technology-stack)
- [Detailed Features](#detailed-features)
- [Architecture](#architecture)
- [Getting Started](#getting-started)
- [API Reference](#api-reference)
- [Development](#development)
- [Performance](#performance)
- [Author](#author)
- [How to Cite](#how-to-cite)
- [License](#license)

## Features

- **Smart Analysis**: Machine learning-based clustering of repositories
- **Interactive Visualization**: Dynamic D3.js visualization of repository clusters
- **Real-time Processing**: Fast data processing and clustering
- **Efficient Data Flow**: Optimized communication between services
- **Type Safety**: Full TypeScript and Python type coverage
- **Modern UI**: Clean, responsive interface with Tailwind CSS
- **Mobile Ready**: Fully responsive design for all devices

## Technology Stack

- **Frontend**
  - Next.js 13 with App Router
  - React 18 with TypeScript
  - TanStack Query for data management
  - D3.js for visualizations
  - Tailwind CSS for styling
  - shadcn/ui components
- **Backend**
  - FastAPI for REST API
  - scikit-learn for ML operations
  - Poetry for dependency management
  - Pydantic for data validation

## Detailed Features

### Search & Filtering
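
The filters in this section line up with fields that also appear in the `POST /api/cluster` request schema (`language`, `stargazers_count`, `topics`, `updated_at`). As a rough illustration only, and not the project's actual code, the Python sketch below combines them into a single predicate. `RepoFilter` and `matches` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RepoFilter:
    """Hypothetical filter settings; None means 'do not filter on this'."""
    query: Optional[str] = None
    language: Optional[str] = None
    min_stars: Optional[int] = None
    max_stars: Optional[int] = None
    topic: Optional[str] = None
    # Date bounds must be timezone-aware, because GitHub timestamps are UTC.
    updated_after: Optional[datetime] = None
    updated_before: Optional[datetime] = None


def matches(repo: dict, f: RepoFilter) -> bool:
    """Return True if a repository dict passes every active filter."""
    if f.query:
        haystack = f"{repo['full_name']} {repo.get('description') or ''}".lower()
        if f.query.lower() not in haystack:
            return False
    if f.language and (repo.get("language") or "").lower() != f.language.lower():
        return False
    stars = repo["stargazers_count"]
    if f.min_stars is not None and stars < f.min_stars:
        return False
    if f.max_stars is not None and stars > f.max_stars:
        return False
    if f.topic and f.topic.lower() not in (t.lower() for t in repo.get("topics", [])):
        return False
    # Timestamps look like "2025-01-26T02:26:59Z"; swap the "Z" for an explicit
    # offset so datetime.fromisoformat also accepts it on older Python versions.
    updated = datetime.fromisoformat(repo["updated_at"].replace("Z", "+00:00"))
    if f.updated_after and updated < f.updated_after:
        return False
    if f.updated_before and updated > f.updated_before:
        return False
    return True


# Example: TypeScript repositories with at least 10 stars, updated in 2025 or later.
example_filter = RepoFilter(
    language="TypeScript",
    min_stars=10,
    updated_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
```

Applying it to a list is then a one-liner such as `[r for r in repos if matches(r, example_filter)]`.
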
- Real-time repository search
- Language-based filtering
- Star count range filtering
- Topic-based filtering
- Date range filtering

### AI Clustering
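
To make the TF-IDF vectorization named in this section concrete, here is a minimal Python sketch of the vectorization step only. It uses just the standard library, whereas the project lists scikit-learn for its ML operations, so treat it as a stand-in for the idea rather than the backend's implementation. The helper names `tokenize`, `tfidf_vectors`, and `cosine` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(repo: dict) -> list[str]:
    """Lowercase words from the description plus the topic tags."""
    text = f"{repo.get('description') or ''} {' '.join(repo.get('topics', []))}"
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(repos: list[dict]) -> list[dict[str, float]]:
    """Build one sparse {term: weight} vector per repository."""
    docs = [Counter(tokenize(r)) for r in repos]
    n = len(docs)
    # Document frequency: in how many repositories each term appears.
    df = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values()) or 1
        vectors.append({
            # Smoothed IDF, so a term present in every document keeps a small weight.
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in doc.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0
```

Repositories that share vocabulary end up with a higher cosine similarity, which is the signal a clustering algorithm such as K-means or hierarchical clustering would then group on. In a scikit-learn based backend, `TfidfVectorizer` would normally take the place of `tfidf_vectors`.
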
- Multi-algorithm clustering approach:
  - K-means for broad repository grouping
  - Hierarchical clustering for detailed relationships
  - PCA + Hierarchical clustering for large datasets
- TF-IDF vectorization for text analysis
- Configurable clustering parameters
- Performance metrics tracking
- Efficient processing of large datasets

### Visualization
- Interactive D3.js force-directed graph
- Cluster-based coloring
- Zoom and pan capabilities
- Repository details on hover
- Smooth animations and transitions

## Architecture

The application is structured as a monorepo with two main services:
### Frontend Service (Next.js)
- Located in `/frontend`
- Built with Next.js, React, and TypeScript
- Uses TanStack Query for data fetching
- Implements a responsive UI with Tailwind CSS
- Visualizes repository clusters using D3.js

### Backend Service (FastAPI)
- Located in `/backend`
- Built with FastAPI and Python
- Implements advanced clustering using scikit-learn
- Provides RESTful API endpoints
- Efficient data processing with sparse matrices
- Parallel processing capabilities

## Getting Started

1. **Clone & Install:**

```bash
# Clone the repository
git clone https://github.com/BjornMelin/stardex.git
cd stardex

# Install root dependencies
npm install

# Install frontend dependencies
cd frontend
npm install

# Install backend dependencies
cd ../backend
poetry install
```

2. **Environment Setup:**

```bash
# Frontend (.env.local)
NEXT_PUBLIC_API_URL=http://localhost:8000
```

3. **Development:**

```bash
# Run both services
npm run dev

# Or run individually
npm run dev:frontend
npm run dev:backend
```

## API Reference

### POST /api/cluster
Clusters GitHub repositories based on their features.
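
A call to this endpoint could look roughly like the Python sketch below, which follows the request and response shapes documented for it. It assumes the backend is reachable at `http://localhost:8000`, the value given for `NEXT_PUBLIC_API_URL` in Getting Started. The function names `build_cluster_request` and `cluster_repositories` are hypothetical and not part of the project.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # assumed local backend address


def build_cluster_request(repositories: list[dict], base_url: str = API_URL) -> urllib.request.Request:
    """Prepare a POST /api/cluster request without sending it."""
    body = json.dumps({"repositories": repositories}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/cluster",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def cluster_repositories(repositories: list[dict], base_url: str = API_URL) -> dict[int, list[str]]:
    """Send the request and group repository names by cluster_id."""
    with urllib.request.urlopen(build_cluster_request(repositories, base_url)) as resp:
        results = json.load(resp)
    clusters: dict[int, list[str]] = {}
    for item in results:
        # Each response item carries the original repo plus its cluster assignment.
        clusters.setdefault(item["cluster_id"], []).append(item["repo"]["full_name"])
    return clusters
```

Splitting request construction from sending keeps the payload easy to inspect or test without a running backend.
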
Request Body
```json
{
  "repositories": [
    {
      "id": number,
      "name": string,
      "full_name": string,
      "description": string | null,
      "html_url": string,
      "stargazers_count": number,
      "forks_count": number,
      "open_issues_count": number,
      "size": number,
      "watchers_count": number,
      "language": string | null,
      "topics": string[],
      "owner": {
        "login": string,
        "avatar_url": string
      },
      "updated_at": string
    }
  ]
}
```

Response
```json
[
  {
    "repo": {
      // Repository data (same as input)
    },
    "cluster_id": number,
    "coordinates": [number, number]
  }
]
```

### GET /health
Health check endpoint.
```json
{
  "status": "healthy"
}
```

## Development

### Technical Implementation
The clustering process follows these steps:
1. **Feature Extraction**
   - TF-IDF vectorization for text data
   - Repository metadata processing
   - Language and topic encoding
2. **Dimensionality Reduction**
   - PCA for high-dimensional data
   - Configurable number of components
   - Efficient sparse matrix operations
3. **Clustering**
   - K-means for initial grouping
   - Hierarchical clustering with Ward linkage
   - PCA-enhanced hierarchical clustering for large datasets
4. **Visualization**
   - Interactive D3.js rendering
   - Cluster-based coloring
   - Smooth animations

### Code Quality

- **Style Guides**
  - Frontend: ESLint + Prettier
  - Backend: Black + isort
- **Testing**
  - Frontend: Jest + React Testing Library
  - Backend: pytest
- **Git Workflow**
  - Feature branches
  - Pull request reviews
  - Semantic versioning

## Performance

### Backend Optimizations
- Efficient sparse matrix operations
- Parallel processing capabilities
- Memory-optimized data structures
- Request validation & caching

### Frontend Optimizations
- Optimized D3.js rendering
- TanStack Query (React Query) data caching
- Component lazy loading

## Author

### Bjorn Melin
- GitHub: [@BjornMelin](https://github.com/BjornMelin)
- Website: [bjornmelin.io](https://bjornmelin.io)
- LinkedIn: [@bjorn-melin](https://www.linkedin.com/in/bjorn-melin/)

## How to Cite

If you use Stardex in your research or project, please cite it as follows:
```bibtex
@software{melin2024stardex,
  author = {Melin, Bjorn},
  title = {Stardex: GitHub Stars Explorer},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/BjornMelin/stardex},
  version = {1.0.0},
  description = {A machine learning-powered tool for exploring and understanding GitHub starred repositories through clustering and interactive visualizations}
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
Built with ❤️ by [Bjorn Melin](https://bjornmelin.io)