{"id":32807140,"url":"https://github.com/vishal-codes/course-hero-rag-bot","last_synced_at":"2026-05-09T05:08:52.238Z","repository":{"id":322744723,"uuid":"1053118650","full_name":"vishal-codes/course-hero-rag-bot","owner":"vishal-codes","description":"A compact Retrieval-Augmented Generation (RAG) API on Cloudflare Workers (Python) that powers a chatbot designed to help CSUF students in course selection.","archived":false,"fork":false,"pushed_at":"2025-11-21T01:01:49.000Z","size":288,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-21T03:08:52.686Z","etag":null,"topics":["cloudflare-vectorize","cloudflare-workers","llama3","python3","rag","rag-chatbot"],"latest_commit_sha":null,"homepage":"https://csuf-python-rag-bot.vishal-codes.workers.dev/health","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vishal-codes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-09T02:43:00.000Z","updated_at":"2025-11-21T01:01:52.000Z","dependencies_parsed_at":"2025-11-06T06:06:04.469Z","dependency_job_id":null,"html_url":"https://github.com/vishal-codes/course-hero-rag-bot","commit_stats":null,"previous_names":["vishal-codes/course-hero-rag-bot"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vishal-codes/course-hero-rag-bot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vishal-codes%2Fcourse
-hero-rag-bot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vishal-codes%2Fcourse-hero-rag-bot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vishal-codes%2Fcourse-hero-rag-bot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vishal-codes%2Fcourse-hero-rag-bot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vishal-codes","download_url":"https://codeload.github.com/vishal-codes/course-hero-rag-bot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vishal-codes%2Fcourse-hero-rag-bot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32807872,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"online","status_checked_at":"2026-05-09T02:00:06.633Z","response_time":123,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloudflare-vectorize","cloudflare-workers","llama3","python3","rag","rag-chatbot"],"created_at":"2025-11-06T15:01:28.698Z","updated_at":"2026-05-09T05:08:52.233Z","avatar_url":"https://github.com/vishal-codes.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# CSUF Python RAG Bot\n\nA compact Retrieval-Augmented Generation (RAG) API on **Cloudflare Workers (Python)** that:\n\n* embeds user questions with **Workers AI**, queries a 
**Vectorize** index, and returns concise answers\n* enforces CORS and consistent JSON errors\n* supports per-IP rate limiting (3 requests/minute) via **Workers KV** \n\n\nThis API powers the [Course_Hero Frontend](https://github.com/vishal-codes/course-hero).\n\n## Directory structure\n\n```\nvishal-codes-course-hero-rag-bot/\n├── README.md\n├── example_query.py\n├── example_rag_answer.py\n├── package.json\n├── pyproject.toml\n├── requirements.txt\n├── uv.lock\n├── .python-version\n├── data_loader/\n│   ├── courses.csv\n│   ├── vector_builder.py\n│   └── vectors_to_upload.ndjson\n└── src/\n    ├── entry.py\n    └── app/\n        ├── __init__.py\n        ├── config.py\n        ├── httpHandler.py\n        ├── pipeline.py\n        ├── ratelimit.py\n        └── utils.py\n```\n\n\n\n## API endpoints\n\n* `GET /` – simple hello JSON for smoke testing. \n* `GET /health` – liveness probe. \n* `GET /version` – app version from env. \n* `POST /ask` – runs the RAG pipeline and returns `{ answer, sources }`. \n\n### Request shape\n\n```http\nPOST /ask\nContent-Type: application/json\n\n{\n  \"question\": \"How important is CPSC 131? Is it a prerequisite for other courses?\",\n  \"topK\": 5   // optional, 1..10\n}\n```\n\n### Success response (200)\n\n```json\n{\n  \"answer\": \"CPSC 131 is a foundation course and a prerequisite for several follow-on modules including Compilers and Database Systems.\",\n  \"sources\": [\n    {\n      \"id\": \"CPSC_323_Mohamadreza_Ahmadnia_42\",\n      \"score\": 0.688,\n      \"course\": \"CPSC 323\",\n      \"courseName\": \"Compilers and Languages\",\n      \"instructor\": \"Mohamadreza Ahmadnia\"\n    }\n  ]\n}\n```\n\n(Same shape as produced by `pipeline.build_context`.) 
\n\n### Error response (examples)\n\n* Bad body:\n\n```json\n{ \"error\": \"Body must be JSON object\" }\n```\n\n\n\n* Missing or invalid fields:\n\n```json\n{ \"error\": \"Missing 'question'\" }\n{ \"error\": \"Invalid 'topK'\" }\n```\n\n\n\n### Rate limiting\n\n* Limit: **3 requests per minute per IP**\n* Returns standard headers: `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`, and on block `Retry-After`.\n* CORS exposes these headers to the browser via `Access-Control-Expose-Headers`.  \n\n#### Example 429 response\n\n```json\n{\n  \"error\": \"Rate limit exceeded\",\n  \"detail\": \"3 requests per minute allowed for this IP.\"\n}\n```\n\n\n\n## Local setup\n\n1. **Install dependencies**\n\n```bash\npip install -r requirements.txt\n```\n\n\n\n2. **Create `.dev.vars`** with your Cloudflare credentials and app config:\n\n```dotenv\nCLOUDFLARE_ACCOUNT_ID=acc_XXXXXXXXXXXXXXXXXXXXXXXXX\nCLOUDFLARE_API_TOKEN=cf_api_token_with_ai_vectorize_permissions\n\n# App vars\nALLOWED_ORIGINS=http://localhost:3000\nAPP_VERSION=2025.09.02-1\nDEBUG=1\nCF_EMBED_MODEL=@cf/baai/bge-base-en-v1.5\nCF_CHAT_MODEL=@cf/meta/llama-3.1-8b-instruct-fast\nCF_TOPK=5\nGEN_TEMPERATURE=0.2\nGEN_MAX_TOKENS=350\n```\n\n\n\n3. **Ensure `wrangler.jsonc` has bindings** (AI, Vectorize, and your KV for rate limiting). 
Example:\n\n```jsonc\n{\n  \"$schema\": \"node_modules/wrangler/config-schema.json\",\n  \"name\": \"\u003cname\u003e\",\n  \"main\": \"src/entry.py\",\n  \"compatibility_date\": \"\u003cdate\u003e\",\n  \"compatibility_flags\": [\"python_workers\"],\n\n  \"ai\": { \"binding\": \"AI\" },\n  \"vectorize\": [{ \"binding\": \"\u003cbinding_namespace\u003e\", \"index_name\": \"\u003cindex_namespace\u003e\" }],\n\n  \"env\": {\n    \"staging\": {\n      \"ai\": { \"binding\": \"AI\" },\n      \"vectorize\": [{ \"binding\": \"\u003cCOURSES\u003e\", \"index_name\": \"\u003cindex_namespace\u003e\" }],\n      \"kv_namespaces\": [\n        { \"binding\": \"\u003cbinding_namespace\u003e\", \"id\": \"\u003cprod-id\u003e\", \"preview_id\": \"\u003cpreview-id\u003e\" }\n      ],\n      \"vars\": { \"ALLOWED_ORIGINS\": \"http://localhost:3000\", \"DEBUG\": \"1\", \"APP_VERSION\": \"\u003cdate\u003e\" }\n    }\n  }\n}\n```\n\n(Environments do not inherit bindings, so repeat them under `env.staging`.) \n\n4. **Build vectors (optional for local testing)**\n\n```bash\npython data_loader/vector_builder.py \\\n  --csv data_loader/courses.csv \\\n  --out data_loader/vectors_to_upload.ndjson \\\n  --index csuf-courses \\\n  --insert \\\n  --env ./.dev.vars\n```\n\n\n\n5. **Run dev server**\n\n```bash\nnpx wrangler dev --env staging --experimental-vectorize-bind-to-prod\n# → http://localhost:8787\n```\n\nQuick checks:\n\n```bash\ncurl -s http://localhost:8787/health\ncurl -s http://localhost:8787/version\n\n# CORS preflight\ncurl -i -X OPTIONS \"http://localhost:8787/ask\" \\\n  -H \"Origin: http://localhost:3000\" \\\n  -H \"Access-Control-Request-Method: POST\" \\\n  -H \"Access-Control-Request-Headers: content-type\"\n\n# Ask\ncurl -i -X POST \"http://localhost:8787/ask\" \\\n  -H \"Origin: http://localhost:3000\" \\\n  -H \"Content-Type: application/json\" \\\n  --data '{\"question\":\"How imp is this 131 course? 
is it a prereq for any other course?\"}'\n```\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvishal-codes%2Fcourse-hero-rag-bot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvishal-codes%2Fcourse-hero-rag-bot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvishal-codes%2Fcourse-hero-rag-bot/lists"}