{"id":19893542,"url":"https://github.com/deepchecks/qa-over-csv","last_synced_at":"2025-03-01T05:27:42.069Z","repository":{"id":194097774,"uuid":"689910494","full_name":"deepchecks/qa-over-csv","owner":"deepchecks","description":null,"archived":false,"fork":false,"pushed_at":"2023-12-15T17:53:30.000Z","size":904,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-01-11T19:50:42.215Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deepchecks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-11T06:57:30.000Z","updated_at":"2023-09-11T07:29:32.000Z","dependencies_parsed_at":"2023-09-11T21:08:48.926Z","dependency_job_id":"d89023f1-2080-4d08-be19-efddae097bb2","html_url":"https://github.com/deepchecks/qa-over-csv","commit_stats":null,"previous_names":["deepchecks/qa-over-csv"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepchecks%2Fqa-over-csv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepchecks%2Fqa-over-csv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepchecks%2Fqa-over-csv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepchecks%2Fqa-over-csv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deepchecks","download_url":"https://codeload.github.com/deepc
hecks/qa-over-csv/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241322507,"owners_count":19944069,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T18:29:44.557Z","updated_at":"2025-03-01T05:27:42.051Z","avatar_url":"https://github.com/deepchecks.png","language":"Python","funding_links":[],"categories":["Resources"],"sub_categories":["Examples"],"readme":"# CSV Question Answering\n\n\u003cimg src=\"./assets/deepchecks_llm_app.svg\"\u003e\n\n🤖 Integrate your Agent-based LLM application with Deepchecks LLM Evaluation using the Deepchecks LLM SDK 🤖\n\n- [App description](#app-description)\n- [Environment Setup](#environment-setup)\n- [How to use Deepchecks LLM SDK?](#how-to-use-deepchecks-llm-sdk)\n  - [Instantiate Deepchecks LLM SDK client](#instantiate-deepchecks-llm-sdk-client)\n  - [Process the user queries in real time](#process-the-user-queries-in-real-time)\n  - [Annotate the LLM response](#annotate-the-llm-response)\n- [Deploy the app to Streamlit](#deploy-the-app-to-streamlit)\n\n# App description\nThis application uses LangChain's pandas DataFrame agent to answer questions about a pandas DataFrame. You can upload any CSV or Excel file and then ask questions about it. The app can be configured to use either GPT-3.5-turbo or GPT-4. 
To capture the retrieved information, the Python code snippet the agent runs against the DataFrame, the agent's intermediate steps, and so on, the app logs everything to a text file and then extracts the required information from it. You can test the app at [Deepy Bot](https://question-answer-over-csv-deepchecks.streamlit.app/)\n\n\u003e **User Input:** How many rows are there?\u003cbr\u003e\n  **LLM Response:** There are 1460 rows in the dataframe.\n\n\u003e **User Input:** What is the price of the property having the largest number of pools?\u003cbr\u003e\n  **LLM Response:** The price of the property with the largest number of pools is $274,970.\n\n\n## Environment Setup\n\nThe application works on Windows, Linux, and Mac. To set up your environment, create a virtual environment and install all requirements:\n\n```shell\npython -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n```\n\nThen rename the `.env.example` file to `.env` and update the following keys:\n\n```shell\n# Set the OpenAI API key\nOPENAI_API_KEY='\u003cOPENAI_API_KEY\u003e'\n# Log in to the Deepchecks service, generate a new API key (Configuration -\u003e API Key), and place it here\nDEEPCHECKS_LLM_API_KEY='\u003cDEEPCHECKS_LLM_API_KEY\u003e'\n# Fill in the Deepchecks host name here\nDEEPCHECKS_LLM_HOST_URL='\u003cDEEPCHECKS_LLM_HOST_URL\u003e'\n```\n\nNow you are ready to start the Streamlit app locally by running the following command:\n```shell\nstreamlit run main.py\n```\n\nAfter running the application, if your Deepchecks LLM application name and version name do not match the names in the Deepchecks LLM app, update them from the UI in the **Settings** section, as shown in the image below:\n\n\u003cimg src=\"./assets/settings-section.png\"\u003e\n\nYou can also update the GPT model from the **Settings** section. 
By default, the GPT model is set to *\"gpt-3.5-turbo\"*.\n\n# How to use Deepchecks LLM SDK?\nBefore proceeding, make sure that you have an app created in the Deepchecks LLM Evaluation application. We do not use the SDK's auto-collect feature here, since an agent pipeline can make multiple LLM calls per query. Instead, we log each interaction explicitly using the `log_interaction()` function provided by the SDK.\n\n## Instantiate Deepchecks LLM SDK client\n\n```python\nfrom deepchecks_llm_client.client import dc_client\n\ndc_client.init(host=DEEPCHECKS_LLM_HOST_URL,\n               api_token=DEEPCHECKS_LLM_API_KEY,\n               app_name=DEEPCHECKS_LLM_APP_NAME,\n               version_name=DEEPCHECKS_LLM_APP_VERSION_NAME,\n               env_type=EnvType.PROD,\n               auto_collect=False  # Setting auto collect to False\n               )\n```\n\n## Process the user queries in real time\n\n```python\nresult = call_llm_with_chatopenai(st.session_state.dataset, user_input)\n\ndc_client.log_interaction(user_input=result['user_input'],\n                          model_response=result['response'],\n                          full_prompt=result['llm_prompt'],\n                          information_retrieval=str(result['information_retrieval']),\n                          ext_interaction_id=user_generated_unique_key)\n```\n\n## Annotate the LLM response\n\n```python\nfrom deepchecks_llm_client.api import AnnotationType\n\n# If you want to annotate the LLM response as 'Good'\ndc_client.annotate(ext_interaction_id=user_generated_unique_key, annotation=AnnotationType.GOOD)\n\n# If you want to annotate the LLM response as 'Bad'\ndc_client.annotate(ext_interaction_id=user_generated_unique_key, annotation=AnnotationType.BAD)\n```\n\n# Deploy the app to Streamlit\nThe code to run the Streamlit app is in `main.py`. 
Note that when setting up your Streamlit app, make sure to add all the environment variables from your `.env` file as Secrets in the Settings of your deployed Streamlit app.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepchecks%2Fqa-over-csv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeepchecks%2Fqa-over-csv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepchecks%2Fqa-over-csv/lists"}