{"id":18577126,"url":"https://github.com/brianlesko/rag-text-search","last_synced_at":"2026-04-16T10:35:30.599Z","repository":{"id":205685026,"uuid":"714759405","full_name":"BrianLesko/RAG-text-search","owner":"BrianLesko","description":"This git repository hosts a user interface for a chat-app, with integrated text similarity search for querying a document. Think of it as an upgrded Cmd+F search. It's written in Pure Python. Created for Learning Purposes.","archived":false,"fork":false,"pushed_at":"2024-01-17T23:16:50.000Z","size":8567,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-16T01:11:19.649Z","etag":null,"topics":["cosine-similarity","gpt","llm","openai","python","search-engine","streamlit","text","text-embedding","text-processing","ui"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BrianLesko.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-11-05T19:22:25.000Z","updated_at":"2024-03-18T22:12:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"54029147-44e6-4a8a-822e-79e3b365e4d0","html_url":"https://github.com/BrianLesko/RAG-text-search","commit_stats":null,"previous_names":["brianlesko/text-similarity-search","brianlesko/rag-text-search"],"tags_count":0,"template":true,"template_full_name":null,"purl":"pkg:github/BrianLesko/RAG-text-search","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrianLesko%2FRAG-text-search","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/BrianLesko%2FRAG-text-search/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrianLesko%2FRAG-text-search/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrianLesko%2FRAG-text-search/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BrianLesko","download_url":"https://codeload.github.com/BrianLesko/RAG-text-search/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrianLesko%2FRAG-text-search/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31882652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-16T09:23:21.276Z","status":"ssl_error","status_checked_at":"2026-04-16T09:23:15.028Z","response_time":69,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cosine-similarity","gpt","llm","openai","python","search-engine","streamlit","text","text-embedding","text-processing","ui"],"created_at":"2024-11-06T23:27:54.858Z","updated_at":"2026-04-16T10:35:30.567Z","avatar_url":"https://github.com/BrianLesko.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Text Similarity Search\nThis code implements a chat-app with text similarity search for querying a document. Think of it as an upgraded Cmd+F search. 
It's written in [Pure Python](https://github.com/BrianLesko/text-similarity-search/blob/main/app.py). Created for Learning Purposes.\n\n\n\u0026nbsp;\n\n\u003cdiv align=\"center\"\u003e\u003cimg src=\"docs/preview.png\" width=\"800\"\u003e\u003c/div\u003e\n\n\u0026nbsp;\n\n## Dependencies\n\nThis code uses the following libraries:\n- `streamlit`: for building the user interface.\n- `openai`: for generating responses to user questions.\n- `tiktoken`: for tokenizing text.\n- `scikit-learn`: for finding the relevant text chunks based on a user's question.\n- `numpy`: for creating arrays.\n- `pandas`: for creating dataframes.\n\n\u0026nbsp;\n\n## Usage\n\nTo run this code, you need an OpenAI API key. You can get one by creating an account on the OpenAI website. Copy it to your clipboard and paste it into the app once it's running. All the dependencies are handled automatically from the requirements.txt file.\n\nRun the following commands:\n```\npip install --upgrade streamlit\nstreamlit run https://github.com/BrianLesko/text-similarity-search/blob/main/app.py\n```\n\nThis will start the Streamlit server, and you can access the chatbot by opening a web browser and navigating to `http://localhost:8501`.\n\n\u0026nbsp;\n\n## How it Works\n\nThe chatbot works as follows:\n1. The user enters a question in the input field.\n2. The chatbot retrieves relevant text chunks based on the user's question using scikit-learn cosine similarity search.\n3. The chatbot adds the user's question to the retrieved text chunks to create an augmented query.\n4. The chatbot generates a response to the augmented query using OpenAI's GPT-3.5 (ChatGPT) language model.\n5. 
The chatbot displays the response to the user, along with the chat history.\n\nThe chat history is saved in `st.session_state`, a dictionary-like object that persists across script reruns within a single Streamlit session.\n\n\u0026nbsp;\n## Repository Structure\n```\ndoc-chat/\n├── .streamlit/\n│   └── config.toml # theme info for the UI\n├── docs/\n│   └── content.png\n├── app.py # the code and UI integrated together live here\n├── about.py # for the UI\n├── requirements.txt # the python packages needed to run locally\n└── .gitignore # includes the api key file and the local virtual environment\n```\n\n\u0026nbsp;\n\n## Topics \n```\nPython | Streamlit | Git | Low Code UI\nTemplate Repository | Chat interface | LLM\nText similarity | Text embeddings | Cosine Similarity \nSklearn | OpenAI\n```\n\u0026nbsp;\n\n\u003chr\u003e\n\n\u0026nbsp;\n\n\u003cdiv align=\"center\"\u003e\n\n\n\n╭━━╮╭━━━┳━━┳━━━┳━╮╱╭╮        ╭╮╱╱╭━━━┳━━━┳╮╭━┳━━━╮\n┃╭╮┃┃╭━╮┣┫┣┫╭━╮┃┃╰╮┃┃        ┃┃╱╱┃╭━━┫╭━╮┃┃┃╭┫╭━╮┃\n┃╰╯╰┫╰━╯┃┃┃┃┃╱┃┃╭╮╰╯┃        ┃┃╱╱┃╰━━┫╰━━┫╰╯╯┃┃╱┃┃\n┃╭━╮┃╭╮╭╯┃┃┃╰━╯┃┃╰╮┃┃        ┃┃╱╭┫╭━━┻━━╮┃╭╮┃┃┃╱┃┃\n┃╰━╯┃┃┃╰┳┫┣┫╭━╮┃┃╱┃┃┃        ┃╰━╯┃╰━━┫╰━╯┃┃┃╰┫╰━╯┃\n╰━━━┻╯╰━┻━━┻╯╱╰┻╯╱╰━╯        ╰━━━┻━━━┻━━━┻╯╰━┻━━━╯\n  \n\n\n\u0026nbsp;\n\n\n\u003ca href=\"https://twitter.com/BrianJosephLeko\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/BrianLesko/BrianLesko/f7be693250033b9d28c2224c9c1042bb6859bfe9/.socials/svg-white/x-logo-white.svg\" width=\"30\" alt=\"X Logo\"\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u003ca href=\"https://github.com/BrianLesko\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/BrianLesko/BrianLesko/f7be693250033b9d28c2224c9c1042bb6859bfe9/.socials/svg-white/github-mark-white.svg\" width=\"30\" alt=\"GitHub\"\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u003ca href=\"https://www.linkedin.com/in/brianlesko/\"\u003e\u003cimg 
src=\"https://raw.githubusercontent.com/BrianLesko/BrianLesko/f7be693250033b9d28c2224c9c1042bb6859bfe9/.socials/svg-white/linkedin-icon-white.svg\" width=\"30\" alt=\"LinkedIn\"\u003e\u003c/a\u003e\n\nfollow all of these or i will kick you\n\n\u003c/div\u003e\n\n\n\u0026nbsp;\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrianlesko%2Frag-text-search","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrianlesko%2Frag-text-search","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrianlesko%2Frag-text-search/lists"}