https://github.com/venuv/langchain_semantic_search

Search and indexing your own Google Drive Files using GPT3, LangChain, and Python

Host: GitHub
URL: https://github.com/venuv/langchain_semantic_search
Owner: venuv
Created: 2023-02-07T06:43:55.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-02-07T11:42:25.000Z (over 2 years ago)
Last Synced: 2024-11-06T15:43:52.504Z (8 months ago)
Language: Jupyter Notebook
Homepage:
Size: 13.7 KB
Stars: 41
Watchers: 1
Forks: 8
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-langchain - Langchain Semantic Search
awesome-langchain-zh - Langchain Semantic Search

README

## Search and indexing your own Google Drive Files using GPT3, LangChain, and Python.

The Jupyter notebook included here (langchain_semantic_search.ipynb) lets you build a FAISS index over a document corpus of your choice and query it with semantic search. The flowchart below is described in detail in https://medium.com/@venuv62/can-chatgpt-be-your-bff-code-companion-4375fd73ec3a.

![image](https://user-images.githubusercontent.com/1031925/217168553-d74ef962-1a9d-4351-8c96-9033e65d58ab.png)
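The index-then-search flow can be illustrated without any external services. The sketch below is not the notebook's code: it swaps in a toy bag-of-words "embedding" and a brute-force cosine-similarity scan where the notebook uses a real embedding model and FAISS. All function names (`embed`, `build_index`, `search`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks):
    """Index step: embed every chunk once and keep the vector next to its text."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def search(index, query, k=2):
    """Search step: embed the query, rank chunks by similarity, return the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Hypothetical document chunks, standing in for text extracted from Drive files.
chunks = [
    "sleep deprivation affects memory and mood",
    "vagus nerve stimulation is a form of neuromodulation",
    "faiss builds an index over dense vectors",
]
index = build_index(chunks)
top = search(index, "how does sleep affect memory", k=1)
print(top)
```

Word overlap is only a crude proxy for meaning. A learned embedding would also match paraphrases that share no words with the query, and an index such as FAISS avoids scanning every vector on each search.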

I've provided a test directory of neuromodulation papers that you can use as a sample Drive folder: https://drive.google.com/drive/folders/1eIBnSO7MVOW9-BKPCJhs7JuBDRyXPOFC?usp=sharing. Since the code needs a Google Drive directory path (not an https URL) to work with, you will have to:
- copy the contents of this directory into a GDrive subdirectory of your own
- set the gdrive_path variable in the Jupyter notebook to that subdirectory
- set the question within print_answer, for instance to 'is sleep a health epidemic', which should give you a non-null answer
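The path requirement in the steps above can be checked up front. This is a minimal sketch, assuming the Drive folder is reachable as an ordinary directory (for example after mounting Drive in a hosted notebook). Only `gdrive_path` comes from the notebook; the helper `list_documents` and the file suffixes are hypothetical.

```python
import tempfile
from pathlib import Path

def list_documents(gdrive_path, suffixes=(".pdf", ".txt")):
    """Return the document files under gdrive_path, rejecting https URLs early."""
    if str(gdrive_path).startswith(("http://", "https://")):
        raise ValueError("gdrive_path must be a directory path, not a URL")
    root = Path(gdrive_path)
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {root}")
    # Walk the folder recursively and keep only files with a wanted suffix.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )

# Demo against a temporary directory that stands in for a Drive subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "paper_a.pdf").write_bytes(b"%PDF-")
    (Path(tmp) / "notes.md").write_text("skipped: suffix not in the list")
    found = [p.name for p in list_documents(tmp)]

print(found)
```

Failing fast on a URL gives a clearer error than letting the indexing step run over an empty or missing directory.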

I will be working on a few enhancements to speed up the indexing (perhaps using a Vectorstore) and to optimize the query cost (using ideas from https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html).
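To see where the query cost comes from, a rough estimate can be made before calling the model. This is a sketch under stated assumptions, not the method from the linked page. It uses the common rule of thumb of about 4 characters per token for English text, and `price_per_1k_tokens` is a placeholder you would fill in from your provider's current pricing. Both function names are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4):
    """Approximate token count from character length (heuristic, not a tokenizer)."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))

def estimate_query_cost(question, context_chunks, price_per_1k_tokens, answer_tokens=256):
    """Estimate total tokens and cost for one question plus its retrieved context."""
    prompt_tokens = estimate_tokens(question) + sum(
        estimate_tokens(chunk) for chunk in context_chunks
    )
    total_tokens = prompt_tokens + answer_tokens  # answer_tokens is an assumed budget
    return total_tokens, total_tokens / 1000 * price_per_1k_tokens

# Hypothetical inputs: a short question and two retrieved chunks.
total, cost = estimate_query_cost(
    question="is sleep a health epidemic",
    context_chunks=["x" * 400, "y" * 800],
    price_per_1k_tokens=0.02,  # placeholder price, not a real quote
    answer_tokens=100,
)
print(total, cost)
```

Because the retrieved context usually dominates the prompt, sending fewer or shorter chunks per question is the most direct way to lower the estimate.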
