{"id":15129819,"url":"https://github.com/apsinghanalytics/finragify_app","last_synced_at":"2026-01-19T05:32:39.396Z","repository":{"id":254074411,"uuid":"845398965","full_name":"apsinghAnalytics/FinRAGify_App","owner":"apsinghAnalytics","description":"An LLM app leveraging RAG with LangChain and GPT-4 mini to analyze earnings call transcripts, assess company performance, using natural language queries (NLP), FAISS (vector database), and Hugging Face re-ranking models.","archived":false,"fork":false,"pushed_at":"2024-08-28T04:15:32.000Z","size":5085,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-05T19:19:03.255Z","etag":null,"topics":["aws-ec2","cloud-application","docker-container","earnings-transcripts","faiss-vector-database","finance","fine-tuning","gpt-4o-mini","huggingface-models","langchain-python","large-language-model","natural-language-processing","pretrained-language-model","prompt-engineering","question-answering-system","reranking","retrieval-augmented-generation","stocks","vector-embeddings"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apsinghAnalytics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-21T07:10:14.000Z","updated_at":"2025-04-02T11:09:47.000Z","dependencies_parsed_at":"2024-10-31T10:44:03.249Z","dependency_job_id":"01f65d62-c07e-4750-9404-87a652dc6d8e","html_url":"https://github.com/apsinghAnalytics/FinRAGify_App","commit_stats":null,"previous_names":["apsinghanalytics/finragify_app"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apsinghAnalytics%2FFinRAGify_App","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apsinghAnalytics%2FFinRAGify_App/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apsinghAnalytics%2FFinRAGify_App/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apsinghAnalytics%2FFinRAGify_App/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apsinghAnalytics","download_url":"https://codeload.github.com/apsinghAnalytics/FinRAGify_App/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247386407,"owners_count":20930634,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-ec2","cloud-application","docker-container","earnings-transcripts","faiss-vector-database","finance","fine-tuning","gpt-4o-mini","huggingface-models","langchain-python","large-language-model","natural-language-processing","pretrained-language-model","prompt-engineering","question-answering-system","reranking","retrieval-augmented-generation","stocks","vector-embeddings"],"created_at":"2024-09-26T02:21:03.732Z","updated_at":"2026-01-19T05:32:39.391Z","avatar_url":"https://github.com/apsinghAnalytics.png","language":"Jupyter Notebook","readme":"*An LLM app leveraging RAG with LangChain and GPT-4o mini to analyze earnings call transcripts, assess company performance, and evaluate management's track record using natural language queries (NLP), FAISS (vector database), and Hugging Face re-ranking models.*\n\nCheck out the deployed app here: [http://ec2-40-177-46-181.ca-west-1.compute.amazonaws.com:8501](http://ec2-40-177-46-181.ca-west-1.compute.amazonaws.com:8501)\n\nThis readme provides a brief overview of the app, focusing on the installation instructions. For more details, please check out the [blog here](https://apsinghanalytics.github.io/2024/08/26/FinRAGifyApp/) \n\n# FinRAGify_App: \n\n\u003cp align=\"center\"\u003e \u003cimg width=\"150\" src=\"https://raw.githubusercontent.com/apsinghAnalytics/FinRAGify_App/main/images/finragify.png\"\u003e \u003c/p\u003e\n\nFinRAGify is a user-friendly research tool designed to simplify the process of retrieving information from earnings calls of publicly traded companies. 
Users can select a company from a limited list (available for this proof-of-concept) and ask one of the preset questions or write a custom query, such as *\"Were any new products launched?\"* or *\"What are the company’s future plans and outlook?\"* The app then searches (using embeddings) the last two years (8 quarters) of quarterly earnings calls by leveraging **RAG (Retrieval-Augmented Generation)** technology, a machine learning technique that combines retrieval-based and generative models (GPT, LLMs), to find and present contextually relevant answers.\n\n\u003cp align=\"center\"\u003e \u003cimg width=\"800\" src=\"https://raw.githubusercontent.com/apsinghAnalytics/FinRAGify_App/main/images/finragify_UI.gif\"\u003e \u003c/p\u003e\n\n\n\n## Features\n\n- **Load and Process Earnings Call Transcripts:** Fetch earnings call transcripts for selected stocks through the [*FinancialModelingPrep API*](https://site.financialmodelingprep.com/developer/docs#earnings-transcripts), retrieving up to 8 quarters of data and sorting them by year and quarter.\n- **Embedding and Vector Store Creation:** Construct embedding vectors using *OpenAI's embeddings* and store them in a [*FAISS (Facebook AI Similarity Search) vector store*](https://python.langchain.com/v0.2/docs/integrations/vectorstores/faiss/) for fast and effective retrieval of relevant transcript chunks.\n- **Re-rank Documents for Relevance:** Use a [*CrossEncoder model (ms-marco-MiniLM-L-6-v2)*](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) available on *Hugging Face* to re-rank retrieved transcript chunks and select a smaller pool of the most relevant information for answering user queries.\n- **Preset and Custom Financial Questions:** Offer a selection of preset financial questions focused on key business areas (e.g., future plans, product launches) with the flexibility to input custom queries.\n- **Management Consistency Analysis:** Evaluate management's track record by comparing past promises 
with actual outcomes across multiple quarters, summarizing how often targets were met.\n\n## Project Structure\n\n- main.py: The main Streamlit application script.\n- backend_functions: The functions for the app are defined here. \n- requirements.txt: A list of required Python packages for the project.\n- .env: Configuration file for storing your OpenAI and FinancialModelingPrep API keys.\n- dockerfile: The Docker file used to build the Docker image if you prefer to run the app in a container.\n- lean_finragify: The folder containing the lightweight version of this app, which uses the [Cohere Rerank API](https://docs.cohere.com/reference/rerank) instead of the open source [*CrossEncoder model (ms-marco-MiniLM-L-6-v2)*](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) for reranking the retrieved data chunks. This *reduces the RAM requirements from 300-600 MB to about 150-300 MB*, which helps when deploying the app to smaller cloud compute instances like the AWS EC2 t3.micro. Please refer to the [readme](https://github.com/apsinghAnalytics/FinRAGify_App/blob/main/lean_finragify/README.md) inside for installation instructions for the lightweight version. \n\n## Installation\n\n### Method 1: Cloning GitHub Repo to Local Machine\n\n1. Clone this repository to your local machine using:\n\n```bash\ngit clone https://github.com/apsinghAnalytics/FinRAGify_App.git\n```\n\n2. Navigate to the project directory:\n\n```bash\ncd FinRAGify_App\n```\n\n3. Create a local Python environment and activate it:\n\n```bash\npython3.10 -m venv venv\nsource venv/bin/activate  # On Windows, use `venv\\Scripts\\activate`\n```\n\n4. Install the required packages, starting with the specific version of Torch (**this must be installed before installing from requirements.txt**):\n\n```bash\npip install torch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 --index-url https://download.pytorch.org/whl/cpu\n```\n\n5. 
Install the remaining dependencies from `requirements.txt`:\n\n```bash\npip install -r requirements.txt\n```\n\n6. Set up your API keys by creating a `.env` file in the project root and adding your keys:\n\n```bash\nOPENAI_API_KEY='your_openai_api_key_here'\nFMP_API_KEY='your_fmp_api_key_here'\n```\n\n\u003cp align=\"center\"\u003e \u003cimg width=\"600\" src=\"https://raw.githubusercontent.com/apsinghAnalytics/FinRAGify_App/main/images/env_file.png\"\u003e \u003c/p\u003e\n\n7. Run the Streamlit app by executing:\n\n```bash\nstreamlit run main.py\n```\n\n### Method 2: Docker Containerization\n\n**Note:** Deploying this app in a Docker container requires about 200 MB more RAM.\n\n1. Copy the `Dockerfile` and `.env` file to the same folder on your local machine.\n\n2. Open PowerShell (or your preferred terminal) and navigate to this folder:\n\n```bash\ncd path_to_your_folder\n```\n\n3. Build the Docker image using the following command:\n\n```bash\ndocker build -t finragify_app:latest .\n```\n\n**Note:** *Ensure that Docker (Docker Desktop on Windows) is installed and running before using docker commands.*\n\n4. Once the Docker image is built, run the Docker container, mapping the exposed port `8501` (see the dockerfile) to an available port on your local machine (e.g., `8501`, `8502`, `8503`):\n\n```bash\ndocker run -d -p 8503:8501 --name finragify_container finragify_app:latest # maps local port 8503 to the app's exposed port 8501\n```\n\n5. 
Access the Streamlit app by navigating to `http://localhost:8503` (or the port you've mapped) in your web browser.\n\n### Deployment Note\n\nIf you prefer to **deploy this application on an AWS EC2 instance**, you can follow the general EC2 Streamlit app deployment steps mentioned in my previous README for another app [here](https://github.com/apsinghAnalytics/streamlit_VentureGen).\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapsinghanalytics%2Ffinragify_app","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapsinghanalytics%2Ffinragify_app","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapsinghanalytics%2Ffinragify_app/lists"}