{"id":18768775,"url":"https://github.com/get-convex/dryad","last_synced_at":"2025-10-08T20:24:37.479Z","repository":{"id":192877828,"uuid":"686124103","full_name":"get-convex/dryad","owner":"get-convex","description":"Dryad talks to you tree! Easy semantic code search on any repository","archived":false,"fork":false,"pushed_at":"2023-09-09T01:47:12.000Z","size":1540,"stargazers_count":36,"open_issues_count":10,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-02T01:30:38.690Z","etag":null,"topics":["ai","embedding-vectors","gpt-4"],"latest_commit_sha":null,"homepage":"https://dryad.gg","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/get-convex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-09-01T20:02:39.000Z","updated_at":"2025-09-02T18:20:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"edc3a3be-8ecc-44a5-93db-3defa6c239b5","html_url":"https://github.com/get-convex/dryad","commit_stats":null,"previous_names":["get-convex/dryad"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/get-convex/dryad","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/get-convex%2Fdryad","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/get-convex%2Fdryad/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/get-convex%2Fdryad/releases","manifests_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/get-convex%2Fdryad/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/get-convex","download_url":"https://codeload.github.com/get-convex/dryad/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/get-convex%2Fdryad/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279000659,"owners_count":26082817,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","embedding-vectors","gpt-4"],"created_at":"2024-11-07T19:13:57.381Z","updated_at":"2025-10-08T20:24:37.473Z","avatar_url":"https://github.com/get-convex.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# dryad - talk to your tree\n\nEasy semantic code search on any GitHub repository in ~1000 SLOC.\n\n[Check out the running demo](http://convex.dev/dryad)\n\n![dryad](dryad_ss.png)\n\nDryad is intended to be a useful demo project and starter template for building more sophisticated\nsemantic search web apps.\n\nFeatures:\n\n- Automatically tracks changes in the target repo and keeps the search index in sync with `HEAD`\n- Built with [Convex](https://convex.dev), [OpenAI](https://openai.com),\n  [Vite](https://vitejs.dev/) + [React](https://react.dev/).\n- Easy to 
read, fork, and modify.\n- Reconfigurable on the fly using the Convex dashboard\n\n# Running your own dryad (on your favorite codebase)\n\nFirst, clone the repository and start it up:\n\n    $ git clone https://github.com/get-convex/dryad.git\n    $ cd dryad\n    $ npm i\n    $ npm run dev\n\nThis will create your Convex backend deployment, which will\nattempt to start indexing the default repository (https://github.com/get-convex/convex-helpers).\nThen, the frontend will start up, running on vite's usual port 5173.\n\nIn another terminal in this same repository, launch the Convex dashboard and watch the logs to\nfollow along with backend indexing:\n\n    $ npx convex dashboard\n\nIn the `Logs` panel, you'll see errors about missing environment variables.\nWe have a little more setup to do!\n\n## 1. Set deployment environment variables for OpenAI and GitHub\n\n### OpenAI\n\nDryad uses OpenAI for summarization and embedding. You'll need an OpenAI platform account\nand an API key. Visit [platform.openai.com](https://platform.openai.com) to\ntake care of that.\n\n\u003e :warning: Summarizing and indexing even a moderate codebase consumes a fair amount of OpenAI\n\u003e credits. You will almost certainly need a paid account!\n\n### GitHub\n\nAnonymous use of the GitHub API gets rate limited very easily, so dryad requires that you\ngenerate a personal access token using your GitHub account. Visit\n[https://github.com/settings/tokens](https://github.com/settings/tokens) to generate\na token for dryad.\n\n### Setting these environment variables in your Convex deployment\n\nWith your OpenAI API key and GitHub access token in hand, go back to your\nConvex deployment's dashboard. In the left navigation panel, click \"Settings\",\nand then \"Environment Variables\".\n\nName the two secret environment variables `OPENAI_API_KEY` and `GITHUB_ACCESS_TOKEN`, like so:\n\n![dashboard environment variables](env_ss.png)\n\n## 2. 
Customize your dryad settings in the `settings` table\n\nIf you check the `Logs` view in your Convex dashboard, dryad should now\nbe running successfully! But it's indexing the default repository,\n`get-convex/convex-helpers`. You probably want it indexing your own\ncode instead.\n\nGood news! It's easy to customize dryad's behavior. Dryad keeps all\nits configuration in a `settings` table in your Convex database\nitself. Click on the `Data` view in the dashboard, and then choose\nthe `settings` table:\n\n![settings table](dryad_settings.png)\n\nDouble click any value in the settings document to edit it, or click the blue \"EDIT\" button to add missing fields to the document. Normally, you shouldn't need to do anything for your changes to take effect. But if you want to reindex anyway, click the `Fn` function runner in the lower right panel\nof the dashboard, and then choose to run `syncState:reset` from the dropdown. No arguments are required.\n\nThe schema of this table can be found in `convex/schema.ts` in this repository. Here's what it looks like:\n\n```tsx\n  // Various project settings you can tweak in the dashboard as we go.\n  settings: defineTable({\n    org: v.string(),\n    repo: v.string(),\n    branch: v.string(),\n    extensions: v.array(v.string()),\n    exclusions: v.optional(v.array(v.string())), // defaults to no exclusions\n    byteLimit: v.optional(v.number()), // defaults to 24,000 bytes\n    chatModel: v.optional(v.string()), // defaults to gpt-4\n  }),\n```\n\n### Settings fields\n\n- **org** - The organization owner of the target GitHub repo to index. For React (https://github.com/facebook/react), this is `facebook`.\n- **repo** - The repository name of the target GitHub repo to index. For React (https://github.com/facebook/react), this is `react`.\n- **branch** - The branch name in the repository to index. 
This is usually 'main' or 'master'.\n- **extensions** - An array of file extensions (like '.ts') that should be considered code and that dryad should therefore attempt to index.\n- **exclusions** - An array of relative file paths within the repository that you wish to explicitly skip when indexing.\n- **byteLimit** - Do not index files larger than this byte count. Large files will produce more tokens\n  than the OpenAI model is able to process in one pass.\n- **chatModel** - Which OpenAI chat model to use for summarizing the purposes of source files. Typical choices are `gpt-3.5-turbo` and `gpt-4`.\n\n# How dryad works\n\nThree main things to cover:\n\n1. Keeping up to date with repository changes\n2. Indexing source files\n3. Searching for semantic matches\n\n## 1. Keeping up to date with repository changes\n\nEvery minute, dryad calls a job named `repo:sync`. This\nis a Convex action which uses a table called `syncState` to\nloop between two states:\n\n1. Polling for a new commit on HEAD.\n2. Indexing that commit.\n\nWhile polling for a new commit, dryad uses the GitHub API (via Octokit)\nto check the sha of the target repo + branch. As long as the value coming back from GitHub\nremains the same as the last indexed sha in `syncState.commit`, `repo:sync` exits until the next poll.\n\nBut when a new commit is discovered, the `syncState.commit` field is set to\nthe new sha, and the `commitDone` field is set to false. This puts\ndryad into \"Indexing that commit\" mode.\n\nWhen indexing a commit, `repo:sync` first uses the GitHub \"trees\" API to fetch the entire\nfile tree of that commit, including the file checksums associated with every file.\n\nDryad then walks this whole tree, looking for source code files (according to the `settings`\ntable's extension specification). For every source file, it determines if the checksum\nhas changed since the last time the file was indexed. 
If the file is new or has changed,\nit is downloaded from the repo and re-indexed.\n\nOtherwise, the file is marked current: still valid in the new commit.\n\nFinally, after all files in the tree are properly indexed, any files that are no longer part of this new commit tree are removed from the index.\n\nAnd with that, `commitDone` is set to true and dryad goes back to polling for a new commit.\n\n## 2. Indexing source files\n\nIndexing source files involves three steps:\n\n1. Ask the configured OpenAI chat model to summarize the \"primary goals\" of the source file in JSON format.\n1. Take each of those goals and independently ask OpenAI to generate a vector embedding\n   for it. [Learn more about embeddings here.](https://youtu.be/m6eWdnRhBpA)\n1. Store each goal and associated vector into Convex's `fileGoals` table, with a reference to the parent source file record in `files`. The goal's vector field uses Convex's vector indexing to support fast searching from the web app.\n\n## 3. Searching for semantic matches\n\nWhen someone submits a query in the web app, dryad uses the same OpenAI embeddings API to generate\na vector, and then uses Convex's vector index to find source files with a semantically similar goal\nto the search term.\n\nSearch returns each source file only once, with the highest-ranked goal as the primary\nreason for that file's inclusion in the result set.\n\n# Exercises – Next improvements for dryad\n\nDryad is quite basic at this point! 
There are a lot of directions you could take the project in.\n\nThe project's issues have been seeded with [a collection of potential extensions and improvements to dryad](https://github.com/get-convex/dryad/labels/good%20first%20issue) to get the wheels turning\nabout more sophisticated things that could be built from dryad.\n\nHappy hacking!\n\n# Community\n\n[Join our Discord to talk about dryad.](https://convex.dev/community)\n\n# What is Convex?\n\n[Convex](https://convex.dev) is a hosted backend platform with a\nbuilt-in database that lets you write your\n[database schema](https://docs.convex.dev/database/schemas) and\n[server functions](https://docs.convex.dev/functions) in\n[TypeScript](https://docs.convex.dev/typescript). Server-side database\n[queries](https://docs.convex.dev/functions/query-functions) automatically\n[cache](https://docs.convex.dev/functions/query-functions#caching--reactivity) and\n[subscribe](https://docs.convex.dev/client/react#reactivity) to data, powering a\n[realtime `useQuery` hook](https://docs.convex.dev/client/react#fetching-data) in our\n[React client](https://docs.convex.dev/client/react). 
There are also clients for\n[Python](https://docs.convex.dev/client/python),\n[Rust](https://docs.convex.dev/client/rust),\n[ReactNative](https://docs.convex.dev/client/react-native), and\n[Node](https://docs.convex.dev/client/javascript), as well as a straightforward\n[HTTP API](https://github.com/get-convex/convex-js/blob/main/src/browser/http_client.ts#L40).\n\nThe database supports\n[NoSQL-style documents](https://docs.convex.dev/database/document-storage) with\n[relationships](https://docs.convex.dev/database/document-ids) and\n[custom indexes](https://docs.convex.dev/database/indexes/)\n(including on fields in nested objects).\n\nThe\n[`query`](https://docs.convex.dev/functions/query-functions) and\n[`mutation`](https://docs.convex.dev/functions/mutation-functions) server functions have transactional,\nlow latency access to the database and leverage our\n[`v8` runtime](https://docs.convex.dev/functions/runtimes) with\n[determinism guardrails](https://docs.convex.dev/functions/runtimes#using-randomness-and-time-in-queries-and-mutations)\nto provide the strongest ACID guarantees on the market:\nimmediate consistency,\nserializable isolation, and\nautomatic conflict resolution via\n[optimistic multi-version concurrency control](https://docs.convex.dev/database/advanced/occ) (OCC / MVCC).\n\nThe [`action` server functions](https://docs.convex.dev/functions/actions) have\naccess to external APIs and enable other side-effects and non-determinism in\neither our\n[optimized `v8` runtime](https://docs.convex.dev/functions/runtimes) or a more\n[flexible `node` runtime](https://docs.convex.dev/functions/runtimes#nodejs-runtime).\n\nFunctions can run in the background via\n[scheduling](https://docs.convex.dev/scheduling/scheduled-functions) and\n[cron jobs](https://docs.convex.dev/scheduling/cron-jobs).\n\nDevelopment is cloud-first, with\n[hot reloads for server function](https://docs.convex.dev/cli#run-the-convex-dev-server) editing via 
the\n[CLI](https://docs.convex.dev/cli). There is a\n[dashboard UI](https://docs.convex.dev/dashboard) to\n[browse and edit data](https://docs.convex.dev/dashboard/deployments/data),\n[edit environment variables](https://docs.convex.dev/production/environment-variables),\n[view logs](https://docs.convex.dev/dashboard/deployments/logs),\n[run server functions](https://docs.convex.dev/dashboard/deployments/functions), and more.\n\nThere are built-in features for\n[reactive pagination](https://docs.convex.dev/database/pagination),\n[file storage](https://docs.convex.dev/file-storage),\n[reactive search](https://docs.convex.dev/text-search),\n[https endpoints](https://docs.convex.dev/functions/http-actions) (for webhooks),\n[streaming import/export](https://docs.convex.dev/database/import-export/), and\n[runtime data validation](https://docs.convex.dev/database/schemas#validators) for\n[function arguments](https://docs.convex.dev/functions/args-validation) and\n[database data](https://docs.convex.dev/database/schemas#schema-validation).\n\nEverything scales automatically, and it’s [free to start](https://www.convex.dev/plans).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fget-convex%2Fdryad","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fget-convex%2Fdryad","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fget-convex%2Fdryad/lists"}