{"id":15203989,"url":"https://github.com/behkamfallah/chat-duck","last_synced_at":"2026-02-23T18:06:59.152Z","repository":{"id":246558725,"uuid":"821342971","full_name":"behkamfallah/Chat-Duck","owner":"behkamfallah","description":"This repository is a 'Chat-with-your-PDF' project using RAG approach. ","archived":false,"fork":false,"pushed_at":"2025-01-06T12:54:29.000Z","size":7181,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-09T03:51:10.506Z","etag":null,"topics":["elasticsearch","huggingface","hybrid-retrieval","knn","langchain","openai","pinecone","rag","rrf","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/behkamfallah.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-28T10:25:16.000Z","updated_at":"2025-01-06T12:54:33.000Z","dependencies_parsed_at":"2024-08-16T12:09:07.323Z","dependency_job_id":"dcf81e05-3abe-4d81-8e7c-d317f2159e8d","html_url":"https://github.com/behkamfallah/Chat-Duck","commit_stats":{"total_commits":37,"total_committers":2,"mean_commits":18.5,"dds":0.3783783783783784,"last_synced_commit":"c13105da2093c38454d7c87db2ead065e3e186c8"},"previous_names":["behkamfallah/chat-duck"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/behkamfallah/Chat-Duck","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behkamfallah%2FChat-Duck","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behk
amfallah%2FChat-Duck/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behkamfallah%2FChat-Duck/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behkamfallah%2FChat-Duck/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/behkamfallah","download_url":"https://codeload.github.com/behkamfallah/Chat-Duck/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behkamfallah%2FChat-Duck/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278976696,"owners_count":26078881,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elasticsearch","huggingface","hybrid-retrieval","knn","langchain","openai","pinecone","rag","rrf","streamlit"],"created_at":"2024-09-28T05:04:49.954Z","updated_at":"2025-10-08T16:28:48.041Z","avatar_url":"https://github.com/behkamfallah.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### About the Project\nChat with your PDF\n\nThis repository is a 'Chat-with-your-PDF' project using two different implementations, namely Light and Enterprise. 
@Pardis-Rahbarsooreh and I worked on this project.\n\n\n### Prerequisites\nEnsure that you have installed the libraries listed in `requirements.txt`, which is located at `.\\source\\requirements.txt`.\nYou can install them from the terminal:\n```sh\npip install -r requirements.txt\n```\n\n\nIf you get a \"recursive_guard\" error while running the code, try using Python 3.11.\n\n\nIf you fork the repository, be sure to create a `.env` file in `./source` and put the API keys in it.\nThese keys are needed to run all of the code:\n```py\nOPENAI_API_KEY='...'\nELASTIC_API_KEY='...'\nELASTIC_CLOUD_ID='...'\nELASTIC_END_POINT='...'\nUNSTRUCTURED_API_KEY='...'\nUNSTRUCTURED_SERVER_URL='...'\nPINECONE_API_KEY='...'\n```\n\n### Files and Folders\n\nThis repository has three main folders:\n1. ```./data``` is the folder where you should put your PDF file.\n\n2. ```./source``` is the folder that contains the ```.py``` files.\nIt has the following Python files:\n   1. To insert data into the databases, use these files:\n      \n      1. ```data_to_ElasticCloud.py```\n      2. ```data_to_Pinecone.py```\n      \n      Simply specify your file on line 12 and run the file.\n   2. To run the whole application on Streamlit you will need ```streamlit_app.py```:\n         Open a terminal, change directory to ```./source```, and then type:\n         ```sh\n      streamlit run streamlit_app.py\n      ```      \n   3. ```document_loader.py``` is responsible for loading PDFs. You can create an instance of the LoadDocument class implemented in this file.\n   4. ```chunker.py``` is responsible for chunking the data. It is used only for data that will be indexed into the Pinecone database.\n   5. ```pinecone_handler.py``` handles the client and connection to Pinecone servers. It also retrieves data.\n   6. ```elasticsearchhandler.py``` handles the client and connection to Elastic Cloud.\n   7. 
```unstructured_io_handler.py``` handles the connection to the 'Unstructured.io' servers and retrieves results from them.\n   8. ```light_model.py``` contains the chain for the Light model.\n   9. ```enterprise_model.py``` contains the chain for the Enterprise model.\n   10. ```test_synthetic_data.py``` is for testing the app via benchmarks. If you want to run this file, remember to change the context window of the Light model and use ```enterprise_model_for_test.py``` instead of ```enterprise_model.py```.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbehkamfallah%2Fchat-duck","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbehkamfallah%2Fchat-duck","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbehkamfallah%2Fchat-duck/lists"}