{"id":51439760,"url":"https://github.com/usv240/blueprint","last_synced_at":"2026-07-05T10:30:21.619Z","repository":{"id":362604743,"uuid":"1250562049","full_name":"usv240/blueprint","owner":"usv240","description":"AI property due diligence — 7-agent pipeline (Google ADK + Gemini 3 + Elastic) that turns public records into a debated Buyer Risk Score for any US address","archived":false,"fork":false,"pushed_at":"2026-06-05T04:03:09.000Z","size":219,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-06-05T04:21:39.873Z","etag":null,"topics":["elasticsearch","fastapi","gemini","google-adk","property-intelligence","real-estate"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/usv240.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T18:53:16.000Z","updated_at":"2026-06-05T04:03:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/usv240/blueprint","commit_stats":null,"previous_names":["usv240/blueprint"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/usv240/blueprint","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/usv240%2Fblueprint","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/usv240%2Fblueprint/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/usv240%2Fblueprint/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/usv240%2Fblueprint/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/usv240","download_url":"https://codeload.github.com/usv240/blueprint/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/usv240%2Fblueprint/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35151638,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-05T02:00:06.290Z","response_time":100,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elasticsearch","fastapi","gemini","google-adk","property-intelligence","real-estate"],"created_at":"2026-07-05T10:30:19.274Z","updated_at":"2026-07-05T10:30:21.591Z","avatar_url":"https://github.com/usv240.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BLUEPRINT: AI Property Due Diligence\n\nType any US address. 
BLUEPRINT reads the public record (deeds, building permits, flood maps, earthquake history, EPA environmental data) and has two AI agents argue the findings before giving you a single sourced verdict.\n\n[![Apache 2.0 License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)\n[![Google ADK 2.0](https://img.shields.io/badge/Google%20ADK-2.0-blue)](https://github.com/google/adk-python)\n[![Gemini 3 Flash](https://img.shields.io/badge/Gemini-3%20Flash%20Preview-blue)](https://ai.google.dev)\n[![Elastic Agent Builder](https://img.shields.io/badge/Elastic-Agent%20Builder%20MCP-pink)](https://www.elastic.co)\n[![FastAPI](https://img.shields.io/badge/FastAPI-async-green)](https://fastapi.tiangolo.com)\n[![Live Demo](https://img.shields.io/badge/Live%20Demo-blueprint-brightgreen)](https://blueprint-4foxtttuoa-uc.a.run.app)\n\nMost buyers close on a $500K–$1M home with a 30-minute walkthrough and a seller's disclosure. That disclosure won't mention the 12 open DOB permits, the Superfund site half a mile away, or the fact that the flood zone designation hasn't been updated since 2009. 
BLUEPRINT surfaces all of it in about 60 seconds.\n\n---\n\n## Architecture\n\n```mermaid\nflowchart TD\n    U([\"👤 User\"])\n\n    subgraph CLOUD[\"Google Cloud Run\"]\n        direction LR\n        FE[\"Frontend\\nVanilla JS\"]\n        BE[\"Backend\\nFastAPI · SSE stream\"]\n        FE --\u003e BE\n    end\n\n    subgraph ADK[\"Google Cloud ADK · SequentialAgent · Gemini 3 Flash · Vertex AI fallback\"]\n        direction LR\n        COL[\"Data Collection  ①–⑤\\nGeocoder · Deed · Permit\\nClimate · Neighbourhood\"]\n        SYN[\"⑥ SynthesisAgent\\nElastic MCP hybrid search\\n5 ES|QL queries · Risk Score\"]\n        DEB[\"⑦ DebateAgent\\nOptimist vs Pessimist\\nBUY / NEGOTIATE / AVOID\"]\n        COL --\u003e SYN --\u003e DEB\n    end\n\n    subgraph EL[\"Elastic Cloud Serverless · Agent Builder MCP\"]\n        direction LR\n        ES1[\"ELSER + RRF hybrid\\nText similarity reranker\"]\n        ES2[\"ES|QL · Percolator\\nGeo-distance · Sig. Terms\"]\n        ES3[\"Memory Layer · 6 Indices\\nevents · reports · cases\\nalerts · shared · watched\"]\n        ES1 --\u003e ES2 --\u003e ES3\n    end\n\n    subgraph DATA[\"Public Data Sources · Authoritative · Free\"]\n        direction LR\n        D1[\"FEMA NFHL · USGS\"] \n        D2[\"EPA EJSCREEN · OSM\"]\n        D3[\"NYC DOB · Socrata 50+ cities\"]\n    end\n\n    U --\u003e CLOUD --\u003e ADK\n    COL --\u003e DATA\n    SYN \u003c--\u003e|\"Agent Builder MCP\\nELSER · ES|QL tools\"| ES1\n    DEB --\u003e ES2\n```\n\nSeven agents run in sequence on Google Cloud ADK. The first five collect data from public APIs. SynthesisAgent uses Elastic Agent Builder MCP (ELSER hybrid search, five ES|QL cross-references) to build the risk score. DebateAgent then runs two opposing Gemini sub-agents (Optimist vs Pessimist) before the verdict reaches the buyer. 
Every finding persists to Elasticsearch, so the system compounds: each new analysis makes cross-property intelligence richer.\n\n→ [Full architecture walkthrough](docs/architecture.md) · [Why we built it this way](docs/adr/)\n\n---\n\n## The pipeline\n\n| # | Agent | What it actually does |\n|---|---|---|\n| 1 | GeocoderAgent | Normalises the address, geocodes to lat/lng, identifies county and FEMA flood zone, opens the Elasticsearch case file |\n| 2 | DeedAgent | Fetches deed and sale history from public county APIs. Flags price drops \u003e30% in \u003c12 months, rapid flips, and quitclaim deeds in purchase contexts |\n| 3 | PermitAgent | Queries 50+ city building permit databases via Socrata. Flags every open/unresolved permit: buyers inherit the liability at closing |\n| 4 | ClimateAgent | FEMA National Flood Hazard Layer for zone classification (AE, X, VE, AO), USGS Earthquake Catalog within 75 km |\n| 5 | NeighborhoodAgent | EPA EJSCREEN for PM2.5, Superfund proximity, traffic pollution. OSM Overpass for schools, parks, and transit within 500m |\n| 6 | SynthesisAgent | ELSER hybrid search + BM25 over all stored events, five ES|QL cross-reference queries, then Gemini produces the Buyer Risk Score, timeline, diligence questions, and Escape Plan |\n| 7 | DebateAgent | OptimistAgent argues the score is too high. PessimistAgent argues it's too low. 
VerdictAgent adjudicates → confidence-adjusted BUY / NEGOTIATE / AVOID |\n\n---\n\n## What you get\n\n- **Buyer Risk Score (0–100)**: composite from 7 data sources, stress-tested by the debate before you see it\n- **Escape Plan**: ranked steps to reduce your risk score, each with an estimated point impact\n- **Interactive map**: Leaflet with risk-coloured pin, 500m analysis radius, FEMA flood zone overlay\n- **Property timeline**: every dated public record (deeds, permits, flood events, earthquakes) in one filterable history, each citing its source\n- **Neighbourhood intelligence**: EPA air quality index, Superfund proximity, school/park/transit access\n- **Flip-fraud detection**: ES|QL cross-references permit filing dates against deed transfer dates\n- **Cross-property intelligence**: similar-risk properties from Elasticsearch's accumulated memory layer\n- **Property comparison**: two full pipelines in parallel, head-to-head verdict\n- **Share links**: 90-day public report URL, backed by Elasticsearch\n- **Watchlist**: properties scoring ≥75 are auto-watched for 24h re-analysis\n- **Q\u0026A chat**: ask Gemini follow-up questions about any open report\n- **HTML export**: standalone buyer brief with gauge, timeline, debate, and escape plan\n- **Slack alerts**: webhook notification when risk score meets a configurable threshold\n\n---\n\n## Stack\n\n| Layer | What's running |\n|---|---|\n| Agent framework | [Google ADK 2.0](https://github.com/google/adk-python): `SequentialAgent` + `LlmAgent` + `FunctionTool` + `MCPToolset` |\n| Primary model | Gemini 3 Flash Preview via AI Studio |\n| Fallback model | Gemini 2.5 Flash via Vertex AI (automatic) |\n| Search \u0026 memory | [Elastic Cloud Serverless](https://cloud.elastic.co): ELSER, Agent Builder MCP, ES|QL, reranker |\n| Backend | FastAPI + Uvicorn: async Python, SSE streaming |\n| Frontend | Vanilla JS + Leaflet.js: everything rendered from `/api/*`, nothing hardcoded |\n| Geocoding | OpenStreetMap Nominatim |\n| 
Permit data | 36 cities with schema-mapped Socrata feeds, 65 portals wired total |\n| Climate data | FEMA NFHL, USGS, EPA EJSCREEN: all 50 states |\n| Hosting | Google Cloud Run: Docker, scales to zero |\n\n---\n\n## Elasticsearch\n\nSix indices make up the intelligence layer:\n\n| Index | What's in it |\n|---|---|\n| `blueprint_cases` | One document per address: geocoded location with `geo_point` |\n| `blueprint_events` | All property events: permits, deeds, climate, neighbourhood (`semantic_text` for ELSER) |\n| `blueprint_reports` | Synthesised reports: risk scores, escape plans, debate verdicts |\n| `blueprint_shared` | Share links with 90-day expiry |\n| `blueprint_watched` | Watchlist: properties re-analysed every 24 hours |\n| `blueprint_alerts` | Percolator queries: saved risk profiles for proactive reverse-search alerting |\n\nEvery Elastic capability degrades gracefully to the next-best path. The live state of each is at `/api/elastic/status`, which drives the in-app **Elastic Intelligence** dashboard: nothing is hardcoded in the frontend.\n\n**Retrieval:** ELSER semantic (`semantic_text`, `.elser-2-elasticsearch`) → RRF hybrid (BM25 + ELSER) → `text_similarity_reranker` (`.rerank-v1-elasticsearch`) → BM25 fallback. Every analysis records which path ran.\n\n**ES|QL: five queries per analysis:**\n1. Event type distribution with value aggregates\n2. Permit-sale timing cross-reference (undisclosed construction detection)\n3. High-confidence events filter (confidence ≥ 0.9)\n4. Semantic RERANK: top 5 risk events via `.rerank-v1-elasticsearch`\n5. Flip-fraud detection: rapid deed transfer pattern\n\n**Beyond search:** `geo_distance` surfaces nearby analysed properties. `significant_terms` identifies risk flags statistically over-represented per band. `terms/stats/percentiles/date_histogram/cardinality` power the market intelligence dashboard at `/api/elastic/insights`. 
Percolator fires on every finished report.\n\n**Agent Builder MCP:** `platform.core.search` + `platform.core.execute_esql` over Streamable HTTP. Three custom ES|QL tools (`blueprint_flip_fraud`, `blueprint_permit_sale_timing`, `blueprint_top_risk_events`) are provisioned into Agent Builder via the Kibana API at startup, then wired into SynthesisAgent via `MCPToolset`.\n\n---\n\n## Permit coverage\n\nPermit data comes from Socrata open-data portals. 36 cities have fully schema-mapped feeds (real dataset IDs); the rest are wired and fall back gracefully. The live count is at `/api/coverage`.\n\n**Northeast:** New York City, Philadelphia, Baltimore, Washington DC, Boston, Pittsburgh  \n**Southeast:** Atlanta, Miami, Tampa, Orlando, Jacksonville, Charlotte, Raleigh, New Orleans, Nashville, Memphis  \n**Midwest:** Chicago, Columbus, Cincinnati, Cleveland, Detroit, Indianapolis, Minneapolis, Kansas City, St. Louis  \n**South:** Houston, Dallas, San Antonio, Austin, Fort Worth, El Paso  \n**West:** Los Angeles, San Diego, San Francisco, San Jose, Sacramento, Oakland, Phoenix, Denver, Las Vegas, Portland, Seattle\n\nAll other US addresses still get full climate, flood, earthquake, and environmental analysis via FEMA + USGS + EPA + OSM.\n\n---\n\n## Setup\n\n### What you need\n\n- Python 3.11+\n- A [Google Cloud project](https://console.cloud.google.com) with Vertex AI API enabled\n- An [Elastic Cloud Serverless account](https://cloud.elastic.co): free trial works fine\n- A [Gemini API key](https://aistudio.google.com): paid tier recommended (free tier: 15 req/min)\n\n### Elastic setup\n\n1. [cloud.elastic.co](https://cloud.elastic.co) → create a **Serverless Elasticsearch** project, pick Google Cloud as the region\n2. Kibana → **Agent Builder** → enable it (the MCP server starts automatically)\n3. **Agent Builder → Tools → MCP** → copy the endpoint URL\n4. 
**Stack Management → API keys** → create a key with `read` + `write` + `manage` on `blueprint_*` indices, plus `monitor_inference` cluster privilege\n5. Copy your Elasticsearch URL from the Connection details page\n\n### Configure\n\n```bash\ncp .env.example .env\n```\n\n```env\nGOOGLE_CLOUD_PROJECT=your-gcp-project-id\nGOOGLE_CLOUD_REGION=us-central1\nGEMINI_API_KEY=your-ai-studio-api-key\nGEMINI_MODEL=gemini-3-flash-preview\nVERTEX_MODEL=gemini-2.5-flash\n\nELASTIC_URL=https://your-deployment.es.us-central1.gcp.cloud.es.io\nELASTIC_API_KEY=your_api_key_here\nELASTIC_MCP_URL=https://your-deployment.kb.us-central1.gcp.cloud.es.io/api/agent_builder/mcp\n\n# Optional: leave blank to disable Slack alerts\nSLACK_WEBHOOK_URL=https://hooks.slack.com/services/...\nSLACK_ALERT_THRESHOLD=60\n\nAPP_URL=http://localhost:8080\nPORT=8080\n```\n\n### Run\n\n```bash\npip install -r requirements.txt\nuvicorn backend.main:app --reload --port 8080\n```\n\nOpen [http://localhost:8080](http://localhost:8080). Good addresses to start with:\n\n- **363 Van Brunt St, Brooklyn, NY**: Sandy flood history, open DOB permits\n- **2121 Airline Dr, Houston, TX**: Superfund proximity, hurricane zone, PM2.5\n- **2000 E Olympic Blvd, Los Angeles, CA**: Traffic pollution, earthquake zone\n\n```bash\ncurl http://localhost:8080/api/health\n# Should show: \"elasticsearch\": \"connected\", \"agents\": 7\n```\n\nIf `elastic_mcp` shows `\"unavailable (direct SDK fallback)\"`, your API key is missing Kibana privileges. The full pipeline still works, it just uses the Elasticsearch Python client directly instead of MCP.\n\n---\n\n## Slack alerts\n\n1. [api.slack.com/apps](https://api.slack.com/apps) → **Create New App** → **Incoming Webhooks** → enable → **Add New Webhook** → pick a channel\n2. Copy the webhook URL into `SLACK_WEBHOOK_URL` in `.env`\n3. 
Set `SLACK_ALERT_THRESHOLD` (default 60: alerts fire when the debate-adjusted score meets or exceeds this)\n\n---\n\n## Deploy to Cloud Run\n\n```bash\ngcloud auth login \u0026\u0026 gcloud auth application-default login\ngcloud config set project YOUR_PROJECT_ID\n\n# Store secrets\necho -n \"your-api-key\" | gcloud secrets create GEMINI_API_KEY --data-file=-\necho -n \"https://...\"  | gcloud secrets create ELASTIC_URL --data-file=-\necho -n \"your-key\"     | gcloud secrets create ELASTIC_API_KEY --data-file=-\necho -n \"https://...\"  | gcloud secrets create ELASTIC_MCP_URL --data-file=-\n\n./deploy.sh\n```\n\nCloud Build packages it, Cloud Run deploys it (2 vCPU / 2 GiB, scales to zero). The script prints the live URL: set that as `APP_URL` in your environment for correct share link generation.\n\n---\n\n## API\n\n| Method | Path | What it does |\n|---|---|---|\n| `GET` | `/api/analyze/stream` | SSE real-time streaming analysis |\n| `POST` | `/api/analyze` | One-shot JSON analysis |\n| `POST` | `/api/compare` | Two properties, parallel pipelines |\n| `POST` | `/api/ask` | Q\u0026A about a stored report |\n| `GET` | `/api/report/{hash}` | Retrieve stored report |\n| `GET` | `/api/export/{hash}` | Download standalone HTML brief |\n| `POST` | `/api/share/{hash}` | Create share link (90-day expiry) |\n| `GET` | `/api/share/{share_id}` | Open shared report |\n| `POST/GET/DELETE` | `/api/watch` | Watchlist management |\n| `GET` | `/api/similar/{hash}` | Similar-risk properties from memory layer |\n| `GET` | `/api/elastic/status` | Live Elastic capability matrix |\n| `GET` | `/api/elastic/insights` | Cross-property market aggregations |\n| `GET` | `/api/coverage` | Permit cities + nationwide sources |\n| `GET` | `/api/health` | Service health |\n| `GET` | `/api/about` | Methodology, glossary, agent descriptions |\n| `GET` | `/api/stats` | Platform statistics |\n\nSwagger at `/docs`, ReDoc at `/redoc`.\n\n---\n\n## Project layout\n\n```\nblueprint/\n├── backend/\n│   
├── main.py                  # FastAPI app, health/about/stats/similar/elastic endpoints\n│   ├── config.py                # All config from environment variables\n│   ├── routes/\n│   │   ├── analyze.py           # /api/analyze, SSE stream, Q\u0026A, recent\n│   │   ├── compare.py           # Parallel dual-pipeline comparison\n│   │   ├── export.py            # Gemini-generated HTML buyer brief\n│   │   ├── share.py             # Share links with expiry\n│   │   └── watch.py             # Watchlist CRUD + 24h background re-analysis\n│   └── services/\n│       ├── adk_runner.py        # 7-agent ADK pipeline + SSE queue\n│       ├── elastic_client.py    # Elasticsearch + Agent Builder MCP, ELSER, ES|QL\n│       ├── gemini.py            # Gemini + Vertex AI fallback\n│       ├── geocoder.py          # Nominatim\n│       ├── data_fetchers.py     # FEMA, USGS, EPA, OSM, Socrata 65+ cities\n│       └── slack.py             # Slack webhook alerts\n├── frontend/\n│   ├── index.html               # Landing page\n│   ├── app.html                 # Analysis app\n│   ├── app.js                   # SSE client, gauge, map, report rendering\n│   ├── style.css                # Dark/light theme, responsive\n│   ├── landing.js               # Landing page JS\n│   └── landing.css              # Landing page styles\n├── docs/\n│   ├── architecture.md          # Full system architecture + data flow\n│   └── adr/                     # Architecture decision records\n├── tests/                       # 86+ fast tests + full pipeline slow tests\n├── Dockerfile\n├── deploy.sh                    # Cloud Build + Cloud Run\n├── requirements.txt\n└── .env.example\n```\n\n---\n\n## A few caveats\n\nNYC and Austin have the most complete permit histories. Other cities use the Socrata generic schema, which varies in quality. Addresses outside the 65 covered cities still get full climate and environmental analysis.\n\nThe Gemini free tier caps at 15 requests/minute. 
The pipeline makes several model calls per analysis, so a paid AI Studio key is worth it for anything beyond casual use.\n\nBLUEPRINT is informational. The data comes from public records and automated analysis, not from licensed professionals. Verify anything that matters before signing.\n\n---\n\n## License\n\nApache 2.0: see [LICENSE](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fusv240%2Fblueprint","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fusv240%2Fblueprint","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fusv240%2Fblueprint/lists"}