{"id":19236890,"url":"https://github.com/crunchydata/postgres-ai-tutorial","last_synced_at":"2025-04-21T05:32:44.368Z","repository":{"id":89933703,"uuid":"605207520","full_name":"CrunchyData/Postgres-AI-Tutorial","owner":"CrunchyData","description":null,"archived":false,"fork":false,"pushed_at":"2023-08-25T16:03:31.000Z","size":19626,"stargazers_count":58,"open_issues_count":0,"forks_count":7,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-01T10:35:34.698Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CrunchyData.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-22T17:13:21.000Z","updated_at":"2025-02-14T15:53:15.000Z","dependencies_parsed_at":"2024-11-09T16:37:15.401Z","dependency_job_id":null,"html_url":"https://github.com/CrunchyData/Postgres-AI-Tutorial","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrunchyData%2FPostgres-AI-Tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrunchyData%2FPostgres-AI-Tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrunchyData%2FPostgres-AI-Tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrunchyData%2FPostgres-AI-Tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CrunchyData","download_url":"https://codeload
.github.com/CrunchyData/Postgres-AI-Tutorial/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250002307,"owners_count":21359092,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T16:23:52.151Z","updated_at":"2025-04-21T05:32:44.350Z","avatar_url":"https://github.com/CrunchyData.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Postgres + AI Tutorial\n\nThis code and data package accompanies a blog post written by the Crunchy Data team. Check it out at blog.crunchydata.com\n\n## Contents\n\nThe package contains the following:\n\n- ArmedForcesRecipes.xml - list of recipes\n- parser.rb - a script that parses the XML and loads it into a Postgres database\n- classfier.rb - a script that fetches the OpenAI embeddings for the recipes\n- recipe-tracker.sql - a SQL dump from a database ready to query\n\n## Speedrun Walk Through\n\n1. Load the data into a Postgres database:\n\n```bash\ncat recipe-tracker.sql | psql recipe_tracker\n```\n\n2. 
Run the following query against your database:\n\n```sql\nSELECT\n        recipe_1.id,\n        recipe_1.name,\n        recipe_2.id,\n        recipe_2.name\nFROM (SELECT * FROM recipes WHERE name = 'Dish, turkey, curry') recipe_1,\n        recipes AS recipe_2\nWHERE recipe_1.id != recipe_2.id\nORDER BY recipe_1.embedding \u003c=\u003e recipe_2.embedding\nLIMIT 10;\n```\n\nYou’ll get the following similar meals as recommendations:\n\n```\n id  |        name         | id  |                  name\n-----+---------------------+-----+----------------------------------------\n 272 | Dish, turkey, curry | 271 | Dish, turkey \u0026 noodles, baked\n 272 | Dish, turkey, curry | 251 | Dish, pot pie, turkey\n 272 | Dish, turkey, curry | 675 | Soup, tomato\n 272 | Dish, turkey, curry | 659 | Soup, chicken rice\n 272 | Dish, turkey, curry | 672 | Soup, rice w/beef\n 272 | Dish, turkey, curry | 660 | Soup, chicken vegetable/Mulligatawny\n 272 | Dish, turkey, curry | 199 | Dish, chicken, a la king\n 272 | Dish, turkey, curry |  38 | Beef, simmered\n 272 | Dish, turkey, curry | 689 | Stuffing, savory bread\n 272 | Dish, turkey, curry | 686 | Stew, beef chunks, w/juices \u0026 veg, cnd\n(10 rows)\n```\n\n## Typical Walk Through\n\n1. Install the necessary Ruby gems: `bundle install`\n2. Create a Postgres database, and run `create extension vector;`. If you need a Postgres database, check out [Crunchy Bridge](https://crunchybridge.com/).\n3. Sign up for OpenAI, and get your API token\n4. Set `DATABASE_URL` to the Postgres connection string, and set `OPENAI_API_KEY` to the token from step 3\n5. Parse the `ArmedForcesRecipes.xml` file and load it into the database by running `ruby parser.rb`\n6. Pull down the OpenAI embeddings for your recipes by running `ruby classfier.rb` (takes about 5 minutes due to rate limiting)\n7. 
Query the database to find similar recipes:\n\n```sql\nSELECT\n        recipe_1.id,\n        recipe_1.name,\n        recipe_2.id,\n        recipe_2.name\nFROM (SELECT * FROM recipes WHERE name = 'Dish, turkey, curry') recipe_1,\n        recipes AS recipe_2\nWHERE recipe_1.id != recipe_2.id\nORDER BY recipe_1.embedding \u003c=\u003e recipe_2.embedding\nLIMIT 10;\n```\n\nThat's it. You now have an AI-powered recommendation engine.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrunchydata%2Fpostgres-ai-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrunchydata%2Fpostgres-ai-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrunchydata%2Fpostgres-ai-tutorial/lists"}