{"id":45321181,"url":"https://github.com/ascentia-sandbox/startinsight","last_synced_at":"2026-04-30T02:03:27.048Z","repository":{"id":346590226,"uuid":"1136356534","full_name":"Ascentia-Sandbox/StartInsight","owner":"Ascentia-Sandbox","description":"Daily automated startup intelligence: 6 scrapers (Reddit/HN/PH/Trends/X) → 8 AI agents (Gemini 2.0 Flash) → ranked insights with evidence. FastAPI + Next.js 16 + Supabase + Stripe. Live at ~$30/mo.","archived":false,"fork":false,"pushed_at":"2026-04-07T00:43:02.000Z","size":16570,"stargazers_count":0,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-07T02:25:55.792Z","etag":null,"topics":["ai","ai-agents","arq","business-intelligence","fastapi","gemini","market-research","nextjs","pydantic-ai","python","railway","saas","sentry","startup","startup-ideas","stripe","supabase","typescript"],"latest_commit_sha":null,"homepage":"https://start-insight-ascentias-projects.vercel.app","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ascentia-Sandbox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-01-17T14:45:56.000Z","updated_at":"2026-04-07T00:42:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/Ascentia-Sandbox/StartInsight","commit_stats":null,"previous_names":["ascentia-sandbox/startinsight"],"tags_count":111,"template":false,"template_full_name":null,"purl":"pkg:gi
thub/Ascentia-Sandbox/StartInsight","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ascentia-Sandbox%2FStartInsight","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ascentia-Sandbox%2FStartInsight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ascentia-Sandbox%2FStartInsight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ascentia-Sandbox%2FStartInsight/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ascentia-Sandbox","download_url":"https://codeload.github.com/Ascentia-Sandbox/StartInsight/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ascentia-Sandbox%2FStartInsight/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31687881,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T13:07:20.380Z","status":"ssl_error","status_checked_at":"2026-04-11T13:06:47.903Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-agents","arq","business-intelligence","fastapi","gemini","market-research","nextjs","pydantic-ai","python","railway","saas","sentry","startup","startup-ideas","stripe","supabase","typescript"],"created_at":"2026-02-21T08:02:46.232Z","updated_at":"2026-04-26T02:02:04.786Z","avatar_url":"https://github.com/Ascentia-Sandbox.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# StartInsight\n\n\u003e **AI-Powered Business Intelligence Engine for Startup Idea Discovery**\n\nStartInsight is a daily, automated intelligence platform that discovers, validates, and presents data-driven startup ideas by analyzing real-time market signals from social discussions, search trends, and product launches.\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-00C7B7.svg)](https://fastapi.tiangolo.com)\n[![Next.js 16+](https://img.shields.io/badge/Next.js-16+-black.svg)](https://nextjs.org)\n[![CI/CD](https://github.com/Ascentia-Sandbox/StartInsight/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/Ascentia-Sandbox/StartInsight/actions/workflows/ci-cd.yml)\n\n---\n\n## 🌐 Live Production\n\n| Environment | URL | Status |\n|-------------|-----|--------|\n| **Frontend** | [startinsight.co](https://startinsight.co) | ✅ Live (Vercel) |\n| 
**Backend API** | [api.startinsight.co](https://api.startinsight.co) | ✅ Live (Railway) |\n| **Staging Frontend** | [start-insight-staging-ascentias-projects.vercel.app](https://start-insight-staging-ascentias-projects.vercel.app) | ✅ Live |\n| **Staging Backend** | [backend-staging-fbd7.up.railway.app](https://backend-staging-fbd7.up.railway.app) | ✅ Live |\n| **API Docs** | `/docs` (Swagger) | [Available](https://api.startinsight.co/docs) |\n\n---\n\n## 📍 Current Phase (2026-04-18)\n\n**Engineering freeze: 2026-04-18 → 2026-06-17.** Phase 1-10 + GTM Phase 1-5 are complete and running on autopilot (6 scrapers, 8 agents, automated content pipeline, social posting, email nurture). Active work for the next 60 days is user discovery and distribution only — no feature work.\n\nSee `memory-bank/active-context.md` for the wedge test + distribution sprint plan, `memory-bank/gtm-automation-plan.md` for the 90-day operational playbook, and `memory-bank/daily-operations.md` for the daily checklist and metrics tracker.\n\n---\n\n## 🎯 What is StartInsight?\n\nUnlike traditional brainstorming tools, StartInsight relies on **real-time market signals** to identify genuine market gaps and consumer pain points. 
The system operates on an automated **\"Collect → Analyze → Present\"** loop, functioning as an analyst that never sleeps.\n\n### Core Philosophy\n\n- **Signal over Noise**: Surface problems real people are complaining about or searching for\n- **Data-Driven Intuition**: Every insight backed by source data (Reddit threads, search trends)\n- **Automated Intelligence**: AI agents handle market research, leaving users with high-level decision-making\n\n---\n\n## ✨ Features\n\n### Current (Phase 1-10, 12-14, A-L, Q1-Q9, Security, Sentry — Production Live)\n\n**Data Intelligence**\n- **Automated Data Collection**: 6 scrapers (Reddit, Product Hunt, Google Trends, Twitter/X, Hacker News, Firecrawl) — 150+ signals/day target\n- **AI-Powered Analysis**: Gemini 2.0 Flash with 8-dimension scoring (97% cost reduction vs Claude)\n- **8 AI Agents**: Enhanced analyzer, 40-step research agent, competitive intel, market intel, content generator, chat strategist, quality reviewer, weekly digest\n- **Evidence Visualizations**: Radar charts, KPI cards, confidence badges, trend verification, engagement metrics\n- **Content Quality Gates**: Post-LLM validation (300+ word minimum), SHA-256 deduplication, auto-approval at 0.85\n\n**Design System (Phase G) — \"Data Intelligence\" Aesthetic**\n- **Typography**: Instrument Serif (editorial headings), Satoshi (body), JetBrains Mono (data/scores)\n- **Color System**: Deep teal primary, warm amber accent — distinctive identity, not a generic SaaS clone\n- **Motion**: Framer Motion staggered reveals, counter animations, skeleton morphing\n- **Textures**: Dot grid backgrounds, gradient mesh heroes, card noise overlays\n\n**Content Pipeline (Phase H)**\n- **6 Active Scrapers**: Reddit (50/run), Product Hunt (30/run), Google Trends (6 regions), Twitter/X, Hacker News (50+ score filter), Firecrawl\n- **Real Trend Data**: No more synthetic charts — real Google Trends data with \"Search Interest\" badge fallback\n- **Multi-Region**: Google Trends 
scraped across US, UK, Germany, Japan, Sydney, Australia\n\n**Admin Portal Excellence (Phase I)**\n- **Dashboard Charts**: 4 Recharts visualizations (content volume, agent activity, user growth, quality trends)\n- **Cmd+K Command Palette**: Global keyboard shortcut, 16 commands, arrow key navigation, category groups\n- **Data Pagination**: Reusable pagination with URL state (page/per_page), integrated in 3+ admin pages\n- **Export**: CSV/JSON export endpoints with frontend download buttons\n- **Bulk Operations**: Row selection, bulk delete with confirmation, bulk export\n\n**Public Editorial Design (Phase J)**\n- **Story-Driven Homepage**: Hero gradient with serif titles, animated counters, latest insights grid, 8-dimension deep-dive\n- **InsightCard Redesign**: Teal score bar, platform-colored source badges, market size circles, relative dates\n- **Magazine Detail Pages**: Editorial hero, score dashboard, problem/solution columns, evidence section, sticky action bar\n- **Market Insights**: AI-generated badges, reading time estimates, enhanced author bios\n\n**Evidence \u0026 Social Proof (Phase K)**\n- **Confidence Badges**: High/Medium/Needs Verification on every insight card\n- **Public Stats API**: Real-time counters (total insights, signals, avg quality) on homepage\n- **Engagement Metrics**: View/save/share counts on insight detail pages\n- **Evidence Scoring**: Evidence Score badges, Google Trends verification, data point counts\n\n**Competitive Differentiators (Phase L)**\n- **5 Chat Strategist Modes**: General, Pressure Test, GTM Planning, Pricing Strategy, Competitive Analysis\n- **Competitive Landscape Map**: Recharts ScatterChart with Market Maturity × Innovation Score quadrants\n- **Enhanced Validator**: Hero gradient, radar chart results, free tier badge\n- **Weekly Email Digest**: Top 10 insights every Monday, scheduled via Arq worker\n\n**Enterprise Features (Phase 8-10)**\n- **Superadmin Control Center**: Content quality management, pipeline 
monitoring, AI agent prompt control, cost tracking\n- **User Engagement**: Preferences \u0026 email digests, AI idea chat, community voting/comments/polls, gamification\n- **Integration Ecosystem**: External integrations, webhooks with retry logic, OAuth connections\n\n**User Features**\n- **Visual Dashboard**: Next.js interface with insights, trend graphs, filters, dark mode\n- **Workspace Management**: Save insights, rate quality, claim for development\n- **Team Collaboration**: RBAC with owner/admin/member roles, shared insights\n- **Custom Research**: Submit research requests with tier-based approval\n\n**Reliability \u0026 Rate-Limit Handling**\n- **Gemini 429 Retry**: `quality_reviewer.py` uses `tenacity` with 4-attempt exponential backoff (5s → 10s → 20s → 40s) + 2s inter-call sleep — eliminates RESOURCE_EXHAUSTED cascades in 10/20-insight audit batches\n- **All AI Agents Protected**: `enhanced_analyzer.py` and `research_agent.py` also use tenacity retry — consistent pattern across all LLM-calling agents\n\n**Data Pipeline Resilience (Phase 6)**\n- **Circuit Breakers**: Per-scraper circuit breakers (2 failures → 15min cooldown) with stale-on-error fallback\n- **3-Tier Caching**: L1 in-memory (cachetools TTLCache, 30s) → L2 Redis (60-300s) → stale fallback with negative caching\n- **Source Health**: Real-time health dashboard (`source_health` table, GET /health/sources) with intelligence gap detection\n- **Anomaly Detection**: Welford's online algorithm for temporal baselines, z-score spike/drought detection\n- **Cross-Source Correlation**: TF-IDF + cosine similarity with union-find grouping across 5 sources\n- **AI Fallback Chain**: Gemini → Claude → rule-based extraction with source credibility weighting\n\n**Security \u0026 Observability**\n- **Security Headers**: HSTS (`max-age=31536000`), CSP, X-Frame-Options, X-Content-Type-Options via middleware\n- **JWT Authentication**: ES256 (ECDSA P-256) via Supabase JWKS endpoint — no shared secret needed\n- **XSS 
Prevention**: `bleach` sanitization on all user inputs, `markupsafe.escape()` for display\n- **Password Recovery**: `/auth/update-password` page with Supabase recovery token flow\n- **Sentry Monitoring**: Errors + performance traces + structured logs + AI spans on production + staging\n- **AI Agent Observability**: Manual `gen_ai.request` spans (model, token usage, latency) in Sentry Traces\n- **Session Replay**: Sentry Session Replay (maskAllText, blockAllMedia) on frontend errors\n\n**Developer Features**\n- **Public API**: 235+ REST endpoints with Swagger/OpenAPI documentation\n- **API Key Management**: Scoped keys with usage tracking, rate limiting\n- **Export Tools**: CSV/JSON exports with brand customization\n- **Row-Level Security**: Supabase RLS policies on all 70 tables\n- **Comprehensive Testing**: 398 backend tests (47% coverage), 47 E2E tests (8 suites, 5 browsers)\n- **CI/CD Pipeline**: GitHub Actions — Security Scan → Tests (fast + integration parallel) → Migrate → Build → Deploy\n- **Sentry Automation**: Daily triage workflow (Mon-Fri), auto-fix for 4 known error patterns\n\n---\n\n## 🏗️ Architecture\n\n```mermaid\ngraph LR\n    A[Reddit/PH/Trends/HN/Twitter] --\u003e|6 Scrapers| B[Arq Worker]\n    B --\u003e|Raw Signals| C[(Supabase PostgreSQL)]\n    C --\u003e|Unprocessed| D[Gemini 2.0 Flash]\n    D --\u003e|8-Dim Insights| C\n    C --\u003e|API| E[FastAPI]\n    E --\u003e|JSON/SSE| F[Next.js Dashboard]\n    G[Railway Redis] -.-\u003e|Queue| B\n```\n\n**Cloud Infrastructure:**\n- **Database**: Supabase Pro PostgreSQL (Sydney, ap-southeast-2), 200 connection pool limit\n- **Cache/Queue**: Railway Redis (native service, `redis.railway.internal:6379`)\n- **Backend**: Railway (port 8080, Docker, `railway.toml`)\n- **Frontend**: Vercel (Next.js App Router)\n\n### The Three Core Loops\n\n1. 
**Loop 1: Data Collection** (Every 6 hours)\n   - Scrapes content using Firecrawl (markdown format)\n   - Stores raw signals in Supabase PostgreSQL with metadata\n\n2. **Loop 2: Analysis** (After each collection)\n   - Gemini 2.0 Flash processes unprocessed signals\n   - Validates output with Pydantic schemas\n   - Scores relevance and market potential (8-dimension scoring)\n\n3. **Loop 3: Presentation** (On-demand)\n   - FastAPI serves ranked insights via REST\n   - Next.js dashboard displays top insights with visualizations\n\n---\n\n## 🛠️ Tech Stack\n\n### Backend\n- **Framework**: FastAPI (async-first)\n- **Language**: Python 3.11+\n- **Database**: Supabase Pro PostgreSQL (ap-southeast-2, Sydney)\n- **ORM**: SQLAlchemy 2.0 (async)\n- **Queue**: Redis + Arq (async task queue)\n- **AI**: PydanticAI + Gemini 2.0 Flash ($0.10/M tokens)\n- **Auth**: Supabase Auth (OAuth + email/password)\n\n### Frontend\n- **Framework**: Next.js 16.1.3 (App Router, React 19.2.3)\n- **Language**: TypeScript\n- **Styling**: Tailwind CSS 4.0, teal/amber \"Data Intelligence\" design system\n- **Typography**: Instrument Serif (headings), Satoshi (body), JetBrains Mono (data)\n- **Components**: shadcn/ui (25+ components), Cmd+K command palette\n- **Charts**: Recharts (radar, scatter, line, area, bar)\n- **Animation**: Framer Motion (stagger reveals, counters, skeleton morphing)\n- **State**: TanStack Query (React Query)\n- **Markdown**: react-markdown + remark-gfm + rehype-sanitize\n\n### Data Pipeline\n- **Scraping**: Firecrawl (web → markdown), Tweepy (Twitter/X)\n- **Reddit**: PRAW (Python Reddit API Wrapper)\n- **Trends**: pytrends (Google Trends API)\n- **RSS**: feedparser (custom feeds)\n\n### Services\n- **Payments**: Stripe (4-tier subscriptions, webhooks) — live mode, 3 products, 6 prices (monthly + yearly), webhook configured\n- **Email**: Resend (6 email templates)\n- **Rate Limiting**: SlowAPI + Redis (tier-based quotas)\n- **Error Tracking**: Sentry 
(`sentry-sdk[fastapi]\u003e=2.0.0`, `@sentry/nextjs@^10.38.0`) — errors + traces + logs\n\n### DevOps \u0026 CI/CD\n- **CI/CD**: GitHub Actions (`.github/workflows/ci-cd.yml`) — `main` → production, `develop` → staging\n- **Backend Hosting**: Railway (Dockerfile + `railway.toml`, port 8080)\n- **Frontend Hosting**: Vercel (App Router, Next.js 16+)\n- **IaC**: Railway MCP + Vercel MCP for environment variable management\n\n### DevOps\n- **Database**: Supabase Pro (PostgreSQL 15+, Row-Level Security, DB_SSL=True)\n- **Cache**: Redis 7\n- **Package Managers**: `uv` (Python), `npm` (Node.js)\n- **Migrations**: Alembic + Supabase migrations (25+ total)\n- **Linting**: Ruff (Python), ESLint + Prettier (TypeScript)\n- **Testing**: pytest (398 tests, 47% coverage), Playwright (E2E, 5 browsers), pytest-rerunfailures\n- **Monitoring**: Sentry (errors + traces + AI spans), daily triage + auto-fix GitHub Actions\n\n---\n\n## 🚀 Quick Start\n\n\u003e **Cloud-First Setup**: StartInsight uses Supabase Cloud PostgreSQL (production) and Railway Redis (production). Local dev uses a local Redis instance.\n\n### Prerequisites\n\n- **Python 3.11+**\n- **Node.js 18+**\n- **uv** (Python package manager): `curl -LsSf https://astral.sh/uv/install.sh | sh`\n- **Supabase Account**: [supabase.com](https://supabase.com) (PostgreSQL database + auth)\n- **Redis**: Local Redis for dev; Railway Redis auto-provisioned in production\n\n### 1. Clone the Repository\n\n```bash\ngit clone https://github.com/Ascentia-Sandbox/StartInsight.git\ncd StartInsight\n```\n\n### 2. Create Supabase Project\n\n1. Go to [supabase.com](https://supabase.com) and create a new project\n2. Choose **Asia Pacific (Sydney)** region\n3. Copy your connection string from **Project Settings \u003e Database \u003e Connection string** (Connection Pooling mode)\n4. Copy your API keys from **Project Settings \u003e API**\n\n### 3. 
Redis Setup\n\n**Production**: Handled automatically — Railway Redis service is provisioned in the project (`redis.railway.internal:6379`). No external account needed.\n\n**Local development**: Install Redis locally (`brew install redis` / `apt install redis`) and set `REDIS_URL=redis://localhost:6379`.\n\n### 4. Configure Backend\n\n```bash\ncd backend\ncp .env.example .env\n```\n\nEdit `backend/.env` with your cloud credentials:\n```bash\n# Database (Supabase Cloud)\nDATABASE_URL=postgresql+asyncpg://postgres.[PROJECT_REF]:[PASSWORD]@aws-0-ap-southeast-2.pooler.supabase.com:5432/postgres?pgbouncer=true\n\n# Supabase Auth\nSUPABASE_URL=https://[PROJECT_REF].supabase.co\nSUPABASE_ANON_KEY=your_supabase_anon_key\nSUPABASE_SERVICE_ROLE_KEY=your_supabase_service_role_key\nJWT_SECRET=your_jwt_secret_from_supabase\n\n# Redis (local dev; production uses Railway Redis automatically)\nREDIS_URL=redis://localhost:6379\n\n# AI (Gemini 2.0 Flash)\nGOOGLE_API_KEY=your_google_api_key\n\n# See .env.example for all required keys\n```\n\n### 5. Configure Frontend\n\n```bash\ncd ../frontend\ncp .env.example .env.local\n```\n\nEdit `frontend/.env.local`:\n```bash\nNEXT_PUBLIC_API_URL=https://api.startinsight.co  # production; use http://localhost:8000 for local dev\nNEXT_PUBLIC_SUPABASE_URL=https://[PROJECT_REF].supabase.co\nNEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key\n```\n\n### 6. Initialize Database\n\n```bash\ncd backend\n\n# Install dependencies\nuv sync\n\n# Run migrations\nalembic upgrade head\n```\n\n### 7. Start Backend\n\n```bash\n# From backend/ directory\nuvicorn app.main:app --reload\n```\n\nBackend runs at: **http://localhost:8000**\n- API docs: http://localhost:8000/docs\n- Health check: http://localhost:8000/health\n\n### 8. 
Start Frontend\n\n```bash\n# From frontend/ directory\nnpm install\nnpm run dev\n```\n\nFrontend runs at: **http://localhost:3000**\n\n---\n\n## 🚀 Deployment\n\n### Infrastructure Overview\n\n| Service | Purpose | Tier | Cost |\n|---------|---------|------|------|\n| **Supabase** | PostgreSQL + Auth | Pro | $25/mo |\n| **Gemini 2.0 Flash** | AI analysis | Pay-as-you-go | ~$5/mo |\n| **Railway** | Backend + Redis | Free (500h/mo + free Redis) | $0 |\n| **Vercel** | Frontend hosting | Hobby | $0 |\n| **Sentry** | Error tracking | Free (5K events) | $0 |\n| **Resend** | Transactional email | Free (3K emails) | $0 |\n| | | **Total** | **~$30/mo** |\n\n### Environment Templates\n\n| Environment | Backend | Frontend |\n|-------------|---------|----------|\n| **Staging** | [`backend/.env.staging.example`](backend/.env.staging.example) | [`frontend/.env.staging.example`](frontend/.env.staging.example) |\n| **Production** | [`backend/.env.production.example`](backend/.env.production.example) | [`frontend/.env.production.example`](frontend/.env.production.example) |\n| **Development** | [`backend/.env.example`](backend/.env.example) | [`frontend/.env.example`](frontend/.env.example) |\n\n### CI/CD Pipeline (Automated)\n\nDeployment is fully automated via GitHub Actions:\n\n```\nPush to main  → Security Scan → Backend Tests → Frontend Tests\n             → Migrate Production DB → Build Docker Image → Deploy to Production\n\nPush to develop → Security Scan → Backend Tests → Frontend Tests\n               → Migrate Staging DB → Deploy to Staging\n```\n\n**Deployed URLs:**\n- Production Backend: `https://api.startinsight.co`\n- Production Frontend: `https://startinsight.co`\n- Staging Backend: `https://backend-staging-fbd7.up.railway.app`\n\n### Manual Deployment (First Time)\n\n```bash\n# 1. Create accounts: Railway, Vercel, Sentry, Resend, Google AI Studio\n# 2. 
Run database migrations against Supabase\ncd backend \u0026\u0026 DATABASE_URL=\"postgresql+asyncpg://...\" alembic upgrade head\n\n# 3. Deploy backend to Railway (link GitHub repo, set root dir to repo root)\n#    Add all vars from backend/.env.production.example in Railway dashboard\n#    ⚠️ Set target port = 8080 in Railway domain settings\n\n# 4. Deploy frontend to Vercel (import repo, set root dir to frontend/)\n#    Add all vars from frontend/.env.production.example in Vercel dashboard\n\n# 5. Update CORS: set Railway CORS_ORIGINS to match Vercel URL\n# 6. Verify: curl https://[railway-url]/health → {\"status\":\"healthy\"}\n```\n\n### Gotchas\n\n- **Railway target port** — must be `8080` (Railway injects `PORT=8080`, not 8000)\n- **NEXT_PUBLIC_* vars are build-time** — changing them in Vercel requires a redeploy\n- **Railway Redis URL** — uses `redis://` (plain TCP on private network), no TLS needed\n- **Sentry env vars** — set via Railway MCP (backend) and GitHub Actions workflow (Vercel)\n- **Alembic migration c008** — `purge_seed_data` is irreversible, run on staging first\n- **CORS whitelist** — production origins must exactly match `CORS_ALLOWED_PRODUCTION_ORIGINS`\n- **Railway 512MB RAM** — Playwright+Chromium takes ~400MB. 
If the container runs out of memory, set `USE_CRAWL4AI=false`\n\n---\n\n## 🌏 Cloud-First Architecture\n\nStartInsight uses **cloud services by default** to ensure consistency between development and production:\n\n### Supabase Pro PostgreSQL (Sydney)\n\n- **Tier:** Supabase Pro ($25/mo) - sole database, no local PostgreSQL required\n- **Region:** ap-southeast-2 (Sydney) - Optimized for APAC market\n- **Latency:** \u003c50ms for Southeast Asia (vs 180ms US-based)\n- **Cost:** $25/mo (Supabase Pro) vs $69/mo (Neon) = 64% savings\n- **Features:** PostgreSQL 15+, Row-Level Security, connection pooling (200 limit), real-time subscriptions, SSL required\n\n### Railway Redis (Production)\n\n- **Location:** Same Railway project as backend (private network, minimal network latency)\n- **Hostname:** `redis.railway.internal:6379` (internal only, not publicly accessible)\n- **Cost:** Free (Railway free tier includes Redis)\n- **Use Cases:** Arq task queue, rate limiting (tier-based quotas)\n\n### Why Cloud-First?\n\n1. **No Infrastructure Setup**: Skip Docker and a local PostgreSQL installation (only local Redis needed for dev)\n2. **Production Parity**: Development environment matches production closely\n3. **Managed Backups**: Automatic backups and point-in-time recovery (Supabase Pro)\n4. **Global Accessibility**: Access your database from anywhere\n5. 
**RLS Testing**: Test Row-Level Security policies in real Supabase environment\n\n---\n\n## 📁 Project Structure\n\n```\nStartInsight/\n├── backend/                    # FastAPI application\n│   ├── app/\n│   │   ├── core/              # Config, errors, dependencies\n│   │   ├── db/                # Database session, base classes\n│   │   ├── models/            # SQLAlchemy models (70 tables)\n│   │   ├── schemas/           # Pydantic schemas\n│   │   ├── api/               # API routes (230 endpoints)\n│   │   │   ├── routes/        # Insight, user, admin, public content\n│   │   │   ├── tools.py       # Tools directory API (6 endpoints)\n│   │   │   ├── success_stories.py # Success stories API (6 endpoints)\n│   │   │   ├── trends.py      # Trends API (5 endpoints)\n│   │   │   └── market_insights.py # Blog API (6 endpoints)\n│   │   ├── agents/            # 8 AI agents (enhanced_analyzer, research, competitive_intel, market_intel, content_generator, chat_agent, quality_reviewer, market_insight_publisher)\n│   │   ├── scrapers/          # Data collection modules (4 sources)\n│   │   ├── scripts/           # Seed scripts (84 content items)\n│   │   └── main.py            # FastAPI entry point\n│   ├── alembic/               # Database migrations (25+ migrations)\n│   ├── tests/                 # Pytest test suite\n│   ├── pyproject.toml         # Python dependencies (uv)\n│   └── README.md              # Backend-specific docs\n│\n├── frontend/                   # Next.js application (Phase 3-14)\n│   ├── app/                   # Next.js 16+ App Router\n│   │   ├── (routes)           # 34 total routes\n│   │   ├── tools/             # Tools directory page\n│   │   ├── success-stories/   # Founder case studies\n│   │   ├── trends/            # Trending keywords\n│   │   ├── market-insights/   # Blog articles\n│   │   ├── admin/             # Admin content management\n│   │   └── sitemap.ts         # Dynamic sitemap generation\n│   ├── components/            # React 
components\n│   │   ├── navigation/        # Mega-menu, mobile drawer\n│   │   ├── ui/                # 25 shadcn components\n│   │   └── evidence/          # Charts, visualizations\n│   ├── lib/                   # Utilities \u0026 API client\n│   └── package.json           # Node dependencies\n│\n├── memory-bank/               # Project documentation\n│   ├── project-brief.md       # Executive summary\n│   ├── active-context.md      # Current phase \u0026 tasks\n│   ├── implementation-plan.md # 3-phase roadmap\n│   ├── architecture.md        # System design\n│   ├── tech-stack.md          # Technology decisions\n│   ├── progress.md            # Development log\n│   └── active-context.md      # Current state (includes Tier 1-3 growth roadmap)\n│\n├── research/                  # Competitive intelligence\n│   ├── ideabrowser-analysis.md          # Full IdeaBrowser teardown\n│   ├── ideabrowser-executive-summary.md # Key findings\n│   └── ideabrowser-competitive-analysis.json\n│\n├── .claude/                   # Claude Code configuration\n│   ├── agents/                # Custom Claude agents\n│   └── skills/                # Code quality standards\n│\n├── docker-compose.yml         # Redis setup (database is Supabase Pro)\n├── CLAUDE.md                  # Claude Code guidelines\n└── README.md                  # This file\n```\n\n---\n\n## 🔄 Development Workflow\n\n### Common Commands\n\n```bash\n# Backend Development\ncd backend \u0026\u0026 uvicorn app.main:app --reload\n\n# Frontend Development\ncd frontend \u0026\u0026 npm run dev\n\n# Database Migrations\ncd backend \u0026\u0026 alembic upgrade head\n\n# Backend Tests (398 tests, 47% coverage)\ncd backend \u0026\u0026 pytest tests/ -v --cov=app\n\n# Frontend E2E Tests (47 tests, 5 browsers)\ncd frontend \u0026\u0026 npx playwright test\n\n# Lint \u0026 Format\ncd backend \u0026\u0026 uv run ruff check . 
--fix\ncd frontend \u0026\u0026 npm run lint -- --fix\n```\n\n### Database Utilities\n\n```bash\n# Create new migration\ncd backend \u0026\u0026 uv run alembic revision --autogenerate -m \"description\"\n\n# View migration history\ncd backend \u0026\u0026 uv run alembic history\n\n# Rollback migration\ncd backend \u0026\u0026 uv run alembic downgrade -1\n```\n\n### Cloud Service Management\n\n```bash\n# Verify backend health\ncurl https://api.startinsight.co/health\n\n# View Supabase logs\n# Go to: https://supabase.com/dashboard/project/[PROJECT_REF]/logs/postgres-logs\n\n# View Railway Redis metrics\n# Railway dashboard → startInsight project → Redis service\n\n# Reset database (⚠️ destructive: reverts every migration, then reapplies them)\ncd backend \u0026\u0026 alembic downgrade base \u0026\u0026 alembic upgrade head\n```\n\n---\n\n## 📚 Documentation\n\nComprehensive documentation is maintained in the `memory-bank/` directory:\n\n| File | Purpose |\n|------|---------|\n| **[project-brief.md](memory-bank/project-brief.md)** | Executive summary, business objectives, core philosophy |\n| **[active-context.md](memory-bank/active-context.md)** | Current phase, immediate tasks, blockers |\n| **[implementation-plan.md](memory-bank/implementation-plan.md)** | Step-by-step 3-phase roadmap |\n| **[architecture.md](memory-bank/architecture.md)** | System design, data flows, database schema, API endpoints |\n| **[tech-stack.md](memory-bank/tech-stack.md)** | Technology decisions, dependencies, library versions |\n| **[progress.md](memory-bank/progress.md)** | Development log, completed tasks |\n\n---\n\n## 🧪 Testing\n\n### Backend Testing (pytest)\n\n**Stats**: 398 tests across 30+ files, 47% coverage\n\n```bash\n# Run all backend tests\ncd backend \u0026\u0026 pytest tests/ -v\n\n# Run with coverage report\npytest tests/ --cov=app --cov-report=html\n\n# Run specific test file\npytest tests/services/test_payment_service.py -v\n\n# Run specific test category\npytest tests/unit/ -v        # Unit tests only\npytest 
tests/services/ -v    # Service tests only\n```\n\n### Frontend Testing (Playwright)\n\n**Stats**: 47 E2E tests across 8 suites, 5 browser platforms (Chrome, Firefox, Safari, Mobile Chrome, Mobile Safari)\n\n```bash\n# Run all E2E tests\ncd frontend \u0026\u0026 npx playwright test\n\n# Run with browser UI\nnpx playwright test --headed\n\n# Run specific browser\nnpx playwright test --project=chromium\n\n# Run specific test file\nnpx playwright test tests/frontend/e2e/auth.spec.ts\n\n# Interactive mode\nnpx playwright test --ui\n\n# Generate test report\nnpx playwright show-report\n```\n\n---\n\n## 🤝 Contributing\n\nThis is a private development project. If you have access:\n\n1. **Read Documentation First**: Check `memory-bank/active-context.md` for current phase\n2. **Follow Coding Standards**: See `.claude/skills/` for quality guidelines\n3. **Update Progress**: Log changes to `memory-bank/progress.md`\n4. **Use Conventional Commits**: `feat:`, `fix:`, `docs:`, `chore:`\n\n### Code Quality Standards\n\nThe project enforces 4 core skills via Claude Code:\n\n- **async-alchemy**: Prevents blocking I/O in FastAPI/SQLAlchemy\n- **firecrawl-glue**: Enforces Firecrawl SDK over brittle scrapers\n- **pydantic-validator**: Ensures structured AI agent outputs\n- **vibe-protocol**: Automates documentation synchronization\n\n---\n\n## 🔑 API Keys Required\n\n| Service | Purpose | Get Key |\n|---------|---------|---------|\n| **Supabase** | Database + Auth (ap-southeast-2) | [supabase.com](https://supabase.com) |\n| **Google AI** | Gemini 2.0 Flash (AI analysis) | [aistudio.google.com](https://aistudio.google.com) |\n| **Firecrawl** | Web scraping (web → markdown) | [firecrawl.dev](https://firecrawl.dev) |\n| **Reddit** | Reddit API (PRAW) | [reddit.com/prefs/apps](https://reddit.com/prefs/apps) |\n| **Twitter** | Twitter/X API (Tweepy) | [developer.twitter.com](https://developer.twitter.com) |\n| **Stripe** | Payments (subscriptions) | [stripe.com](https://stripe.com) |\n| 
**Resend** | Email (transactional) | [resend.com](https://resend.com) |\n| **Sentry** | Error tracking + monitoring | [sentry.io](https://sentry.io) |\n\nStore keys in `backend/.env` and `frontend/.env.local` (never commit `.env` files).\n\n---\n\n## 📊 Current Status\n\n**Status**: ✅ **PRODUCTION LIVE + GTM ACTIVE** (Day 6 of 90-day GTM, 2026-04-12) — Custom domain `startinsight.co` live\n\n| Metric | Value |\n|--------|-------|\n| **Backend** | 235+ API endpoints, 70 database tables, 15+ services |\n| **Frontend** | 35+ routes (dashboard, workspace, research, admin, 10 public pages) |\n| **Database** | 16 Alembic migrations (c016 at head), Row-Level Security enabled |\n| **AI Agents** | 8 agents (enhanced_analyzer, research, competitive_intel, market_intel, content_generator, chat_agent, quality_reviewer, weekly_digest) |\n| **Testing** | 398 backend tests (30+ files, 47% coverage), 47 E2E tests (8 suites, 5 browsers) |\n| **Content** | 2,085+ insights, 180+ market articles, 54 tools, 12 success stories (auto-growing via pipeline) |\n| **Payments** | Stripe live mode — 3 products, 6 prices (monthly + yearly), webhook active |\n| **Monitoring** | Sentry 0 unresolved issues (confirmed 2026-04-12), ascentia-km org, backend + frontend projects |\n| **Security** | HSTS, CSP, JWT ES256 JWKS, XSS prevention (bleach), rate limiting |\n| **CI/CD** | GitHub Actions — main→production, develop→staging, all passing |\n| **Scheduler** | All background jobs running (every 6h scrapers + analyzer, daily social/email, weekly digest) |\n| **GTM** | Day 6/90 — Content pipeline GREEN (+240 insights/5 days), social posting fix in progress |\n\n**Phase Completion**:\n- ✅ Phase 1-3: MVP Foundation (scrapers, AI analysis, Next.js dashboard)\n- ✅ Phase 4: Authentication \u0026 Admin Portal (Supabase Auth, 8-dimension scoring)\n- ✅ Phase 5-7: Advanced Features (research, Stripe, teams, API keys, multi-tenancy)\n- ✅ Phase 8-10: Enterprise Features (superadmin, engagement, integrations)\n- ✅ 
Phase 12-14: Public Content \u0026 SEO (tools, success stories, trends, blog, sitemap)\n- ✅ Phase A-L: Professional Overhaul (design system, admin portal, competitive features)\n- ✅ Phase Q1-Q9: Quality Audit Fixes (Pulse, SEO, sanitization, rate limiting)\n- ✅ Phase S: Security Hardening (HSTS, CSP, JWT ES256, XSS prevention)\n- ✅ Phase M: Sentry Monitoring (errors + traces + logs + AI spans + Session Replay)\n- ✅ Phase P: Production Deployment (Railway + Vercel + CI/CD pipeline)\n- ✅ Phase R: Redis + Scheduler (Railway Redis provisioned, scheduler running clean)\n- ✅ QA Bug Fixes: 11 P0/P1/P2 bugs fixed (terms/privacy 404, CORS, Deep Research, `$$`, Google OAuth signup, context-aware CTAs, skeleton loaders)\n- ✅ 429 Rate-Limit Fix: tenacity retry + inter-call sleep in quality_reviewer.py (Gemini RESOURCE_EXHAUSTED eliminated)\n- ✅ Phase 6: Data Pipeline Resilience (circuit breakers, source health, 3-tier caching, anomaly detection, cross-source correlation, AI fallback)\n- ✅ API Fixes: `/api/validate` 500 fixed (invalid `RawSignal` kwarg); `research.py` FastAPI deprecation warning cleared\n\n**Business Metrics (Targets)**:\n- Signup Conversion: 4% target (2% pre-Phase 14 baseline)\n- PMF Validation Cost: ~$30/mo (Supabase Pro $25 + Gemini ~$5)\n- Revenue Target: $59K MRR at 10K users (10% paid conversion)\n\n**Competitive Position**:\n- 100% feature parity with IdeaBrowser\n- 11 unique competitive advantages\n- 50-70% lower pricing\n\n**Performance Optimizations (2026-03-04)**:\n- Home page converted to SSR with ISR (revalidate: 300s) — eliminates client-side LCP delay\n- Framer-motion (219 KB) lazy-loaded via dynamic import — TBT: 340ms → 140ms\n- Satoshi font preloaded; Instrument Serif uses font-display:optional — LCP improvement\n- ReactQueryDevtools excluded from production bundle\n- lucide-react, recharts, @radix-ui tree-shaken via optimizePackageImports\n\n**Codebase Cleanup (2026-03-05)**:\n- Deleted stale locale-unaware `/validate` route (duplicate of 
`[locale]/validate`)\n- Removed orphaned `trend-sparkline-lazy.tsx` component\n- Organized 16 root-level screenshots → `docs/screenshots/`\n- Deleted stale `docs/memory-bank-readme-cleanup-2026-02-25` branch\n\n**Recent Improvements (2026-02-25)**:\n- ✅ 37 Sentry issues resolved (422 errors, chat fixes, research pre-fill)\n- ✅ Chat agent prompts refactored; admin agents page rewritten\n- ✅ Trends backfill script added (`backend/scripts/backfill_trends_table.py`)\n- ✅ Uptime monitoring — UptimeRobot (free, 5-min interval); GitHub Actions schedule cron disabled\n- ✅ Scraper pipeline fixed (Crawl4AI timeout + duplicate APScheduler/Arq scheduling removed)\n- ✅ Domain sweep: all `startinsight.ai` → `startinsight.co` across codebase\n\n**Service Health Check (2026-04-12)**:\n- ✅ Railway: `api.startinsight.co` → `{\"status\":\"ready\",\"checks\":{\"database\":\"healthy\",\"redis\":\"healthy\"}}`\n- ✅ Supabase Pro: PostgreSQL accessible (70 tables, 16 migrations at c016)\n- ✅ Vercel: startinsight.co → HTTP 200\n- ✅ Sentry: 0 unresolved issues (live confirmed 2026-04-12), ascentia-km org\n- ✅ Google Gemini: gemini-2.0-flash API accessible\n- ✅ CI/CD: GitHub Actions passing, main→production pipeline active\n- ⚠️ Twitter social posting: X Premium purchased; pending token regeneration (developer.x.com → Read+Write → regenerate Access Token)\n\nSee `memory-bank/active-context.md` for current state and full growth roadmap.\n\n---\n\n## 📄 License\n\nMIT License - See [LICENSE](LICENSE) file for details.\n\n---\n\n## 🙏 Acknowledgments\n\n- **FastAPI**: For the excellent async Python framework\n- **Google**: For Gemini 2.0 Flash (97% cost reduction, primary LLM)\n- **Anthropic**: For Claude 3.5 Sonnet (fallback LLM, quality validation)\n- **Firecrawl**: For making web scraping sane again\n- **Next.js**: For the best React production framework\n- **Playwright**: For comprehensive cross-browser E2E testing\n\n---\n\n## 📞 Support\n\nFor questions or issues:\n- Check `memory-bank/` 
documentation\n- Review `backend/README.md` for backend-specific help\n- See `CLAUDE.md` for development guidelines\n\n---\n\n**Built with the \"Glue Coding\" philosophy: Don't reinvent, integrate.**\n\n---\n\n*v1.0.8 — GTM Day 6 status update (2026-04-12): 2,085+ insights live, Sentry clean (0 issues), X Premium purchased for Twitter posting fix, daily operations tracking active.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fascentia-sandbox%2Fstartinsight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fascentia-sandbox%2Fstartinsight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fascentia-sandbox%2Fstartinsight/lists"}