{"id":48268980,"url":"https://github.com/finish06/ndc-loader","last_synced_at":"2026-04-04T22:01:54.723Z","repository":{"id":349030786,"uuid":"1193271308","full_name":"finish06/ndc-loader","owner":"finish06","description":"rx-dag — Fast, self-hosted FDA drug data API. Ingests NDC Directory \u0026 Drugs@FDA daily, serves via REST with sub-5ms lookups. Drop-in openFDA replacement.","archived":false,"fork":false,"pushed_at":"2026-04-03T21:29:30.000Z","size":343,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-03T22:32:09.918Z","etag":null,"topics":["docker","drug-data","fda","golang","ndc","openfda","postgresql","rest-api","rx-dag"],"latest_commit_sha":null,"homepage":"https://rx-dag.calebdunn.tech","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finish06.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-27T03:36:33.000Z","updated_at":"2026-04-03T21:31:47.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/finish06/ndc-loader","commit_stats":null,"previous_names":["finish06/ndc-loader"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/finish06/ndc-loader","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finish06%2Fndc-loader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finish06%2Fndc-loader/tags","releases_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finish06%2Fndc-loader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finish06%2Fndc-loader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finish06","download_url":"https://codeload.github.com/finish06/ndc-loader/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finish06%2Fndc-loader/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31416333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T20:09:54.854Z","status":"ssl_error","status_checked_at":"2026-04-04T20:09:44.350Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","drug-data","fda","golang","ndc","openfda","postgresql","rest-api","rx-dag"],"created_at":"2026-04-04T22:01:53.952Z","updated_at":"2026-04-04T22:01:54.710Z","avatar_url":"https://github.com/finish06.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ndc-loader\n\nFDA NDC Directory bulk loader and REST API. 
Downloads the complete NDC Directory and Drugs@FDA datasets daily from FDA bulk downloads, loads them into PostgreSQL, and serves them via a REST API with full-text search, NDC format normalization, and openFDA-compatible responses.\n\nReplaces the openFDA API dependency for [drug-cash](https://github.com/finish06) and internal microservices.\n\n## Quick Start\n\n```bash\n# Clone and start\ngit clone https://github.com/finish06/ndc-loader.git\ncd ndc-loader\ncp .env.example .env\n# Edit .env: set API_KEYS to a real secret\ndocker compose up -d\n\n# Trigger initial data load\ncurl -X POST http://localhost:8081/api/admin/load \\\n  -H \"X-API-Key: your-secret-key-here\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{}'\n\n# Search for a drug (URL quoted so the shell does not glob the `?`)\ncurl \"http://localhost:8081/api/ndc/search?q=metformin\" \\\n  -H \"X-API-Key: your-secret-key-here\"\n\n# Browse interactive API docs\nopen http://localhost:8081/swagger/\n```\n\n## API\n\nAll endpoints require an `X-API-Key` header unless noted. Interactive docs are at `/swagger/`.\n\n### Query Endpoints\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `GET` | `/api/ndc/{ndc}` | Lookup by NDC code (any format) |\n| `GET` | `/api/ndc/search?q={query}\u0026limit=50\u0026offset=0` | Full-text search |\n| `GET` | `/api/ndc/{ndc}/packages` | List packages for a product |\n| `GET` | `/api/ndc/stats` | Dataset statistics |\n\n### openFDA-Compatible Endpoint\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `GET` | `/api/openfda/ndc.json?search={query}\u0026limit=1\u0026skip=0` | Drop-in replacement for openFDA `/drug/ndc.json` |\n\nSupports openFDA search syntax: `brand_name:metformin`, `product_ndc:\"0002-1433\"`, AND via `+`.\n\n### Admin Endpoints\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `POST` | `/api/admin/load` | Trigger manual data load |\n| `GET` | `/api/admin/load/{id}` | Check load status with checkpoints |\n\n### Operations (no auth required)\n\n| Method | Endpoint | Description 
|\n|--------|----------|-------------|\n| `GET` | `/health` | Health check with postgres dep check, uptime, data freshness |\n| `GET` | `/version` | Build info (git commit, branch, Go version, OS, arch) |\n| `GET` | `/metrics` | Prometheus metrics |\n| `GET` | `/swagger/` | Interactive API documentation (Swagger UI) |\n\n### NDC Format Normalization\n\nAccepts any common NDC format:\n- Hyphenated 2-segment: `0002-1433` (product lookup)\n- Hyphenated 3-segment: `0002-1433-61` (package lookup, returns parent product)\n- Unhyphenated 10-digit: `0002143361` (tries 4-4-2, 5-3-2, 5-4-1 patterns)\n- Unhyphenated shorter: `00021433` (product lookup)\n\n### Example Response\n\n```json\n{\n  \"product_ndc\": \"0002-1433\",\n  \"brand_name\": \"Trulicity\",\n  \"generic_name\": \"DULAGLUTIDE\",\n  \"dosage_form\": \"INJECTION, SOLUTION\",\n  \"route\": \"SUBCUTANEOUS\",\n  \"manufacturer\": \"Eli Lilly and Company\",\n  \"pharm_classes\": \"GLP-1 Receptor Agonist [EPC], Glucagon-Like Peptide 1 [CS]\",\n  \"pharm_classes_structured\": {\n    \"epc\": [\"GLP-1 Receptor Agonist\"],\n    \"moa\": [\"Glucagon-like Peptide-1 (GLP-1) Agonists\"],\n    \"cs\": [\"Glucagon-Like Peptide 1\"],\n    \"pe\": [],\n    \"raw\": \"GLP-1 Receptor Agonist [EPC], Glucagon-Like Peptide 1 [CS], ...\"\n  },\n  \"packages\": [\n    {\"ndc\": \"0002-1433-80\", \"description\": \"4 SYRINGE in 1 CARTON\", \"sample\": false}\n  ]\n}\n```\n\n## Data Sources\n\n| Dataset | Source | Records | Refresh |\n|---------|--------|---------|---------|\n| NDC Directory | [FDA ndctext.zip](https://www.accessdata.fda.gov/cder/ndctext.zip) | ~112K products, ~212K packages | Daily 3am |\n| Drugs@FDA | [FDA media/89850](https://www.fda.gov/media/89850/download) | ~29K applications, ~51K products, ~191K submissions | Daily 3am |\n\nDatasets are joinable via `application_number` (NDC) to `appl_no` (Drugs@FDA) after stripping the type prefix (e.g., `ANDA076543` -\u003e `076543`).\n\nAdditional datasets can be added via 
`datasets.yaml` without code changes.\n\n## Configuration\n\nCopy `.env.example` to `.env` and configure:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `DATABASE_URL` | `postgres://ndc:ndc@localhost:5432/ndc` | PostgreSQL connection |\n| `API_KEYS` | (required) | Comma-separated API keys |\n| `LISTEN_ADDR` | `:8081` | HTTP listen address |\n| `LOAD_SCHEDULE` | `0 3 * * *` | Cron schedule for daily refresh |\n| `POSTGRES_PORT` | `5432` | Host port for PostgreSQL |\n| `APP_PORT` | `8081` | Host port for ndc-loader |\n| `LOG_LEVEL` | `info` | Log level (debug, info, warn, error) |\n| `LOG_FORMAT` | `json` | Log format (json, text) |\n\n## Development\n\n```bash\nmake build                # Build binary with version info\nmake test                 # Unit tests\nmake test-integration     # Integration tests (requires postgres)\nmake test-e2e             # E2E tests (downloads real FDA data)\nmake lint                 # golangci-lint\nmake docs                 # Regenerate Swagger spec\nmake docker-build         # Build Docker image locally\n```\n\n## Deployment\n\n### Staging (192.168.1.145)\n\n```bash\nmake deploy-staging-first  # First time: sync config + .env template\nmake deploy-staging        # Routine: pull latest image + restart\nmake staging-status        # Health check\nmake staging-logs          # Tail logs\nmake staging-load          # Trigger FDA data load\nmake staging-psql          # Open psql shell\n```\n\n### CI/CD Pipeline\n\nPush to `main` triggers: lint -\u003e test -\u003e build image -\u003e push to registry -\u003e staging webhook deploy.\n\nTag `v*` triggers: same pipeline + push `:version` and `:latest` tags.\n\n```\npush main → lint + test + vet → build :beta → push to registries → webhook → staging deploy\ntag v*    → lint + test + vet → build :tag + :latest → push to registries\n```\n\n## Architecture\n\n```\ncmd/ndc-loader/         Entry point (ldflags: version, git commit, branch)\ninternal/\n  api/        
          HTTP handlers, middleware, NDC normalization, openFDA compat\n  loader/               FDA download, parsing, orchestration, scheduling\n  store/                PostgreSQL queries, bulk loading, checkpoints\n  model/                Domain types\nmigrations/             SQL schema (embedded, auto-applied on startup)\ndatasets.yaml           Configurable dataset sources\ndocs/swagger/           Generated OpenAPI spec (served at /swagger/)\n```\n\n### System Overview\n\n```mermaid\ngraph LR\n    FDA[(FDA Bulk Downloads)] --\u003e|Daily cron| Loader\n    subgraph ndc-loader\n        Loader[Data Loader] --\u003e|COPY| PG[(PostgreSQL)]\n        PG --\u003e API[REST API]\n        PG --\u003e OpenFDA[openFDA Compat]\n    end\n    API --\u003e|JSON| DC[drug-cash]\n    OpenFDA --\u003e|openFDA format| DC\n    API --\u003e|JSON| MS[Microservices]\n    API --\u003e|/metrics| Prom[Prometheus]\n    API --\u003e|/swagger/| Docs[Swagger UI]\n```\n\n### Data Pipeline\n\n```mermaid\nflowchart TD\n    A[Cron Trigger / Manual POST] --\u003e B[Download FDA ZIPs]\n    B --\u003e|Retry with backoff| B\n    B --\u003e|Success| C[Extract to temp dir]\n    C --\u003e D[Parse tab-delimited files]\n    D --\u003e|UTF-8 sanitize| D1[Sanitize invalid bytes]\n    D1 --\u003e E[Map columns + coerce types]\n    E --\u003e|Date: YYYYMMDD → time.Time| E\n    E --\u003e|Bool: Y/N → bool| E\n    E --\u003e F[Bulk COPY into staging table]\n    F --\u003e G{Row count safe?}\n    G --\u003e|Drop \u003e 20%| H[Abort — keep existing data]\n    G --\u003e|OK| I[Atomic swap: staging → live]\n    I --\u003e J[Rebuild search vectors]\n    J --\u003e K[Record checkpoint metadata]\n    K --\u003e L{More tables?}\n    L --\u003e|Yes| D\n    L --\u003e|No| M[Load complete]\n\n    style H fill:#ef4444,color:#fff\n    style M fill:#22c55e,color:#fff\n```\n\n### Request Flow\n\n```mermaid\nflowchart LR\n    Client --\u003e|X-API-Key| MW[Auth Middleware]\n    MW --\u003e|401| Reject[Unauthorized]\n    MW 
--\u003e|Valid| Router\n\n    Router --\u003e Q1[\"GET /api/ndc/{ndc}\"]\n    Router --\u003e Q2[\"GET /api/ndc/search\"]\n    Router --\u003e Q3[\"GET /api/ndc/{ndc}/packages\"]\n    Router --\u003e Q4[\"GET /api/ndc/stats\"]\n    Router --\u003e OF[\"GET /api/openfda/ndc.json\"]\n    Router --\u003e A1[\"POST /api/admin/load\"]\n\n    Q1 --\u003e NDC[NDC Normalizer]\n    NDC --\u003e|2-segment| PL[Product Lookup]\n    NDC --\u003e|3-segment| PKL[Package Lookup]\n    NDC --\u003e|10-digit| VAR[Try 4-4-2 / 5-3-2 / 5-4-1]\n    VAR --\u003e PL\n    VAR --\u003e PKL\n\n    PL --\u003e DB[(PostgreSQL)]\n    PKL --\u003e DB\n    Q2 --\u003e|tsvector + ts_rank| DB\n    Q3 --\u003e DB\n    Q4 --\u003e DB\n    OF --\u003e|openFDA search parser| DB\n\n    style Reject fill:#ef4444,color:#fff\n```\n\n### Data Model\n\n```mermaid\nerDiagram\n    products ||--o{ packages : \"product_ndc\"\n    products }o--o| applications : \"application_number ~ appl_no\"\n    applications ||--o{ drugsfda_products : \"appl_no\"\n    applications ||--o{ submissions : \"appl_no\"\n    applications ||--o{ marketing_status : \"appl_no\"\n    applications ||--o{ te_codes : \"appl_no\"\n\n    products {\n        text product_id PK\n        text product_ndc\n        text proprietary_name\n        text nonproprietary_name\n        text labeler_name\n        text application_number\n        tsvector search_vector\n    }\n\n    packages {\n        serial id PK\n        text product_ndc\n        text ndc_package_code\n        text description\n    }\n\n    applications {\n        text appl_no PK\n        text appl_type\n        text sponsor_name\n    }\n\n    drugsfda_products {\n        serial id PK\n        text appl_no\n        text drug_name\n        text active_ingredient\n    }\n\n    submissions {\n        serial id PK\n        text appl_no\n        text submission_type\n        text submission_status\n    }\n\n    marketing_status {\n        serial id PK\n        text appl_no\n        text 
marketing_status_id\n    }\n\n    te_codes {\n        serial id PK\n        text appl_no\n        text te_code\n    }\n```\n\n### Checkpoint \u0026 Recovery\n\n```mermaid\nstateDiagram-v2\n    [*] --\u003e Pending\n    Pending --\u003e Downloading\n    Downloading --\u003e Downloaded: Success\n    Downloading --\u003e Failed: Error (retry exhausted)\n    Downloaded --\u003e Loading\n    Loading --\u003e Loaded: Atomic swap OK\n    Loading --\u003e Failed: Row count drop / DB error\n    Failed --\u003e Downloading: Resume load\n\n    note right of Failed\n        On resume, skip tables\n        with status = Loaded\n    end note\n```\n\n### Resilience\n\n- **Retry**: Downloads retry with exponential backoff (configurable max attempts)\n- **Checkpoints**: Per-table progress tracking; resume from failure point\n- **Row count safety**: Abort if row count drops \u003e20% from previous load\n- **Atomic swap**: Consumers never see partial data\n- **UTF-8 sanitization**: Handles Windows-1252 bytes in FDA data\n\n## drug-cash Integration\n\nndc-loader is a drop-in upstream replacement for the openFDA NDC API:\n\n```yaml\n# drug-cash config.yaml slugs:\n- slug: ndc-products\n  base_url: http://ndc-loader:8081\n  path: /api/openfda/ndc.json\n  search_params: [\"search={QUERY}\"]\n\n- slug: ndc-lookup\n  base_url: http://ndc-loader:8081\n  path: /api/openfda/ndc.json\n  search_params: [\"search=product_ndc:\\\"{NDC}\\\"\", \"limit=1\"]\n```\n\n## Tech Stack\n\n- **Go 1.26+** with Chi v5 router\n- **PostgreSQL 16+** with pgx v5 driver, GIN indexes for full-text search\n- **Docker Compose** for local dev and deployment\n- **Prometheus** metrics at `/metrics`\n- **Swagger UI** at `/swagger/` (OpenAPI spec via swaggo/swag)\n- **GitHub Actions** CI/CD with staging webhook deploy\n\n## License\n\nInternal use 
only.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinish06%2Fndc-loader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinish06%2Fndc-loader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinish06%2Fndc-loader/lists"}