https://github.com/zaydiscold/hooke-preview
FastAPI research app for hard-science questions with literature retrieval, genomic follow-up, and streaming briefs.
fastapi genomics research science sse
- Host: GitHub
- URL: https://github.com/zaydiscold/hooke-preview
- Owner: zaydiscold
- Created: 2026-03-15T22:56:06.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-16T03:47:33.000Z (3 months ago)
- Last Synced: 2026-03-16T14:43:58.709Z (3 months ago)
- Topics: fastapi, genomics, research, science, sse
- Language: Python
- Size: 60.5 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
hooke
agent-orchestrated research assistant for hard-science questions.
overview · what the app does · run locally · example questions
Hooke is an agent-orchestrated research assistant for hard-science questions.
It retrieves evidence from scientific and web sources, optionally adds genomic
follow-up, and returns a citation-grounded research brief in a streaming
interface.
The repository contains a local research workflow for questions that need
source collection, synthesis, and explicit next-step reasoning.
click the gif for the full demo video.
## Overview
When a user submits a question, Hooke classifies the request into one of three
investigation modes, runs the relevant agents, and streams both intermediate
logs and the final brief to the browser.
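The routing step can be sketched roughly as follows. This is a minimal illustration, not the code in `orchestrator.py`: the `Mode` enum, the stub agents, and `run_pipeline` are hypothetical names, and the classification step itself is left out because the README does not describe how it is done.

```python
import asyncio
from enum import Enum
from typing import List, Optional


class Mode(Enum):
    # Hypothetical names for the three investigation modes.
    LITERATURE_ONLY = "literature_only"
    PARALLEL_GENOMIC = "parallel_genomic"
    GENE_DISCOVERY = "gene_discovery"


async def run_literature(question: str) -> dict:
    # Stub agent: a real one would query the literature sources.
    return {"question": question, "genes": ["LCT"]}


async def run_genomic(genes: List[str]) -> dict:
    # Stub agent: a real one would run the genomic analysis.
    return {"genes": genes}


async def run_pipeline(
    question: str, mode: Mode, known_genes: Optional[List[str]] = None
) -> dict:
    if mode is Mode.LITERATURE_ONLY:
        return {"literature": await run_literature(question), "genomic": None}
    if mode is Mode.PARALLEL_GENOMIC:
        # Genes are already known, so both agents can run concurrently.
        literature, genomic = await asyncio.gather(
            run_literature(question), run_genomic(known_genes or [])
        )
        return {"literature": literature, "genomic": genomic}
    # Gene discovery: literature first, then genomic analysis on what it found.
    literature = await run_literature(question)
    genomic = await run_genomic(literature["genes"])
    return {"literature": literature, "genomic": genomic}


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("Where is LCT active?", Mode.GENE_DISCOVERY)))
```

The parallel mode uses `asyncio.gather` because neither agent depends on the other, while the discovery mode has to stay sequential.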
## What the app does
Hooke provides these capabilities:
- Retrieves literature from PubMed, Semantic Scholar, Tavily, OpenAlex, and
arXiv through the literature pipeline.
- Selects among three investigation modes: literature-only, parallel genomic
follow-up, or literature-first gene discovery followed by genomic analysis.
- Streams agent progress and final output to the frontend through server-sent
events.
- Uses AlphaGenome when available and falls back to Ensembl-based genomic
interpretation when needed.
- Produces a structured research brief with findings, research gaps, proposed
experiments, and citations.
- Generates compact lucky-mode starter queries for exploratory research.
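To illustrate the streaming capability, here is a small sketch of how log lines and a final brief could be framed as server-sent events. The event names `log` and `brief` and the payload shapes are assumptions made for this example; the actual event schema used by Hooke is not documented here.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, payload: dict) -> str:
    # One SSE message: an event name, a single-line JSON data field,
    # and a blank line that terminates the message.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_events(log_lines: Iterable[str], brief: dict) -> Iterator[str]:
    # Hypothetical ordering: progress logs first, then the finished brief.
    for line in log_lines:
        yield format_sse("log", {"message": line})
    yield format_sse("brief", brief)


if __name__ == "__main__":
    for chunk in stream_events(["searching PubMed"], {"findings": []}):
        print(chunk, end="")
```

Serializing the payload with `json.dumps` keeps each `data:` field on one line, which avoids having to split multi-line text across several `data:` fields.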
## Architecture
The application is split into a small number of focused components:
- `main.py`: FastAPI entrypoint, static file serving, lucky-query handling, and
SSE endpoints.
- `orchestrator.py`: query classification, mode routing, and pipeline control.
- `agents/literature.py`: source retrieval, filtering, and paper analysis.
- `agents/genomic.py`: AlphaGenome and Ensembl-backed genomic analysis.
- `agents/synthesis.py`: brief generation and JSON normalization.
- `static/index.html`: single-page interface for queries, logs, and research
briefs.
- `health_check.py`: provider and API connectivity checks.
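As an illustration of the "JSON normalization" role listed for `agents/synthesis.py`, the sketch below turns raw model output into a brief with the four sections named earlier. The `Brief` dataclass, the field names, and `normalize_brief` are hypothetical and may not match the repository's code.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List

FIELDS = ("findings", "research_gaps", "proposed_experiments", "citations")


@dataclass
class Brief:
    findings: List[Any] = field(default_factory=list)
    research_gaps: List[Any] = field(default_factory=list)
    proposed_experiments: List[Any] = field(default_factory=list)
    citations: List[Any] = field(default_factory=list)


def _as_list(value: Any) -> List[Any]:
    # Missing values become empty lists; scalars are wrapped in a list.
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def normalize_brief(raw: str) -> Brief:
    text = raw.strip()
    if text.startswith("```"):
        # Drop a surrounding markdown code fence if the model added one.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit("```", 1)[0]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return Brief(**{name: _as_list(data.get(name)) for name in FIELDS})
```

Falling back to an empty brief on malformed JSON is one possible design choice; a real pipeline might instead retry the model call or surface the error.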
## Requirements
Set up the app from the project root:
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
```
Define these variables in the environment file:
- `NEBIUS_API_KEY`
- `OPENROUTER_API_KEY`
- `TAVILY_API_KEY`
- `GOOGLE_API_KEY`
- `SEMANTIC_SCHOLAR_API_KEY` (optional; raises Semantic Scholar rate limits)
- `PUBMED_EMAIL`
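A quick way to catch a missing variable before starting the server is a check like the one below. It is a sketch, not part of the repository: `missing_variables` is a hypothetical helper, it treats `SEMANTIC_SCHOLAR_API_KEY` as optional (an assumption), and it only inspects the process environment, so the values from `.env` must already be loaded.

```python
import os
from typing import List, Mapping, Sequence

# Variables assumed to be required; SEMANTIC_SCHOLAR_API_KEY is left out
# on the assumption that it only affects rate limits.
REQUIRED = (
    "NEBIUS_API_KEY",
    "OPENROUTER_API_KEY",
    "TAVILY_API_KEY",
    "GOOGLE_API_KEY",
    "PUBMED_EMAIL",
)


def missing_variables(
    env: Mapping[str, str] = os.environ, required: Sequence[str] = REQUIRED
) -> List[str]:
    # A variable counts as missing if it is unset or only whitespace.
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```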
## Run locally
Start the development server with Uvicorn:
```bash
uvicorn main:app --reload --port 8000
```
Then open [http://127.0.0.1:8000](http://127.0.0.1:8000).
## Health check
Run the connectivity check before a demo or local test session:
```bash
python3 health_check.py
```
This script verifies whether the configured providers are reachable.
## Example questions
These prompts match the current demo flow:
1. How does Ozempic actually work at the molecular level, and why does it cause
muscle loss?
2. What tissues is the `LCT` gene most active in, and why can some adults
digest milk while others cannot?
3. Why do some people get severe kidney disease, and what genes are involved?
4. What makes some cancer tumors resistant to PD-1 or PD-L1 immunotherapy?
## Operational notes
Keep these constraints in mind when you run the app:
- Semantic Scholar can rate-limit unauthenticated requests.
- AlphaGenome is optional; Hooke falls back to Ensembl-based interpretation if
AlphaGenome is unavailable.
- Prompt-injection evaluation is not implemented yet. Promptfoo is a planned
addition for future prompt-injection testing and security review.
- Generated cache files remain local and are excluded from git.
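The AlphaGenome fallback noted above follows a common try-then-fall-back pattern, sketched here with stubs. None of these functions call the real AlphaGenome or Ensembl APIs; the names, the `alphagenome_available` flag, and the return shape are hypothetical.

```python
class AlphaGenomeUnavailable(Exception):
    """Raised by the stub when the primary backend cannot be used."""


def analyze_with_alphagenome(gene: str, available: bool) -> dict:
    # Stub for the primary backend; a real call could fail for many reasons.
    if not available:
        raise AlphaGenomeUnavailable("AlphaGenome is not configured")
    return {"gene": gene, "source": "alphagenome"}


def analyze_with_ensembl(gene: str) -> dict:
    # Stub for the fallback backend.
    return {"gene": gene, "source": "ensembl"}


def analyze_gene(gene: str, alphagenome_available: bool = True) -> dict:
    try:
        return analyze_with_alphagenome(gene, alphagenome_available)
    except AlphaGenomeUnavailable:
        # Fall back so the brief still gets a genomic section.
        return analyze_with_ensembl(gene)


if __name__ == "__main__":
    print(analyze_gene("LCT", alphagenome_available=False))
```

Recording the `source` in the result lets the brief state which backend produced the genomic interpretation.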
zayd / cold
zayd.wtf · twitter · github
icarus only fell because he flew
## To do
- [x] streaming brief and pipeline logs
- [ ] prompt-injection eval coverage (promptfoo)