{"id":29222799,"url":"https://github.com/sortphy/chatgpdune","last_synced_at":"2025-07-03T04:02:20.945Z","repository":{"id":301058265,"uuid":"1008031286","full_name":"sortphy/chatGPDUNE","owner":"sortphy","description":"Dune themed RAG based LLM ChatBot. | Using Ollama, DeepSeek, Neo4J and LangChain","archived":false,"fork":false,"pushed_at":"2025-07-03T02:11:19.000Z","size":10076,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-03T03:23:55.730Z","etag":null,"topics":["chatbot","deepseek","dune","langchain","llm","neo4j","ollama","rag"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sortphy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-24T23:21:48.000Z","updated_at":"2025-07-03T02:11:23.000Z","dependencies_parsed_at":"2025-06-25T00:38:12.629Z","dependency_job_id":null,"html_url":"https://github.com/sortphy/chatGPDUNE","commit_stats":null,"previous_names":["sortphy/chatgpdune"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sortphy/chatGPDUNE","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sortphy%2FchatGPDUNE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sortphy%2FchatGPDUNE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sortphy%2FchatGPDUNE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sortphy%2FchatGPDUNE/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sortphy","download_url":"https://codeload.github.com/sortphy/chatGPDUNE/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sortphy%2FchatGPDUNE/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263256549,"owners_count":23438262,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","deepseek","dune","langchain","llm","neo4j","ollama","rag"],"created_at":"2025-07-03T04:02:20.004Z","updated_at":"2025-07-03T04:02:20.903Z","avatar_url":"https://github.com/sortphy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ChatGPDune\n### Dune themed RAG based LLM ChatBot. 
| Using Ollama, DeepSeek, Neo4J and LangChain\n\n\n#### Group: Gustavo Henrique, Icaro Botelho, Maruan Biasi, Mauricio Nunes\n\n-------------------------------\n\n# To run:\n\n## Ollama Setup:\n- Install Ollama from https://ollama.com/download\n- Pull whichever model you want to use. By default the project uses only deepseek-r1, which you can pull with the following command:\n- ```ollama pull deepseek-r1:latest```\n\n## Neo4j Setup:\n- Install Neo4j Desktop from https://neo4j.com/download/\n- Open Neo4j Desktop and create a database, preferably called chatgpdune\n- When creating the database, make sure to add your user and password to the .env file\n\n## Project Setup\n- clone the git repo\n- create a venv via ```python -m venv venv``` and activate it\n- install python requirements via ```pip install -r requirements.txt```\n\n## Database Setup\n- you need a populated database to use the RAG\n- the easiest way is to import the pre-processed embeddings from the Dune 1 book using the csv file\n- open Neo4j Desktop, connect to the database and import the file located inside our project at /database/Ingested/book-1-only/node-export.csv\n- This csv file contains all embeddings from the Dune 1 book, which would take hours to process.\n- If you want more data inside your database, which is recommended, follow the tutorial below on how to process data locally.\n\n## Run Backend\n- from project root do the following\n- ```cd backend```\n- ```uvicorn app:app --reload```\n\n## Run Frontend\n- from project root do the following\n- ```cd frontend/chatgpdune```\n- ```npm i```\n- ```npm run start```\n\n## (Optional) How to process data locally for the RAG\n- To process data locally, that is, to generate the embeddings for chunks of text on your own machine, do the following\n- Pull the nomic-embed-text model via ollama using ```ollama pull nomic-embed-text```\n- ```cd RAG```\n- Everything inside the \"data\" folder will be processed and its embeddings will be added to the database.\n- If you 
want to ignore a file (that is, skip its embeddings), put it inside /data/ignore. All other subfolders of /data are processed; only /data/ignore is skipped.\n- It supports the following file formats: [.txt, .pdf, .html, .htm, .md, .markdown]\n- It can also scrape the Dune fandom wiki; you can turn this feature on or off when you run the data ingestion system.\n- Inside the file \"data_ingestion.py\" you can customize different parameters in the \"Tweakable settings\" section.\n- On line 294, you can customize how many wiki pages to scrape.\n- To run the data ingestion system, use the following command: ```python data_ingestion.py```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsortphy%2Fchatgpdune","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsortphy%2Fchatgpdune","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsortphy%2Fchatgpdune/lists"}