{"id":20355811,"url":"https://github.com/zackproser/pinecone-speed-experiments","last_synced_at":"2025-07-04T18:34:42.057Z","repository":{"id":186995756,"uuid":"675879904","full_name":"zackproser/pinecone-speed-experiments","owner":"zackproser","description":null,"archived":false,"fork":false,"pushed_at":"2023-08-08T15:09:27.000Z","size":134,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-15T01:07:52.564Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zackproser.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-08-08T00:25:19.000Z","updated_at":"2023-08-08T00:26:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"b08270fb-8746-4c39-8edc-932f47daedfc","html_url":"https://github.com/zackproser/pinecone-speed-experiments","commit_stats":null,"previous_names":["zackproser/pinecone-speed-experiments"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Fpinecone-speed-experiments","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Fpinecone-speed-experiments/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Fpinecone-speed-experiments/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zackproser%2Fpinecone-speed-experiments/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zackproser","download_url":"https://co
deload.github.com/zackproser/pinecone-speed-experiments/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241889494,"owners_count":20037518,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T23:14:06.698Z","updated_at":"2025-03-04T17:30:55.407Z","avatar_url":"https://github.com/zackproser.png","language":"Python","readme":"# Overview\nThis is a work-in-progress [WIP] test repository for profiling the Pinecone Python client. \n\nIt is derived from [this LangChain example / Jupyter notebook](https://github.com/langchain-ai/langchain/blob/master/docs/extras/integrations/vectorstores/pinecone.ipynb)\n\nIt installs `pyinstrument` in addition to the requirements for the notebook itself. It uses virtualenv to allow the user to modify library code locally, which is\nuseful for chasing down slow codepaths and testing out fixes.\n\n# Usage \n\n## Set up virtualenv \n\nIn order to be able to modify libraries like `langchain` and the Pinecone client, it's necessary to set up a virtualenv:\n\n```bash\n# Install virtualenv \npip install virtualenv\n\n# Create a new virtualenv for Python 3.8\nvirtualenv -p python3.8 myenv\n\n# Activate the virtualenv\nsource myenv/bin/activate\n\n# Now, you can install the requirements; they'll be installed in \n# `myenv/lib/python3.8/site-packages/\u003cpackage-name\u003e/`\npip3 install -r requirements.txt\n```\n\n## Export env vars in your shell \n\nIf you don't have them set, the script will prompt you for them. 
\n\n## IMPORTANT - you must run your test scripts in `module` mode \n\nNote the `-m`\n```bash\npython -m pinecone-speed-test.py\n```\n\n## Profiling code\n\nThere are challenges with running `pyinstrument` directly against a test script when you also want your test script to load packages / libraries from your virtualenv. \n\nFindings: \n\n1. Name your overall repo checkout directory the same as your test script: `pinecone-speed-test`, in this case. \n2. Ensure that directory has an `__init__.py` file in it, which signals that it is a package.\n3. Add the pyinstrument code to your actual test script, like so: \n\n```python\n# At the top of your test script: \nfrom pyinstrument import Profiler\nprofiler = Profiler()\nprofiler.start()\n \n# Perform whatever jobs or processing is necessary or that you're trying to profile\n# \u003ccode to profile\u003e\n\n\n# At the end of your test script, have the pyinstrument profiler print out its findings: \nprofiler.stop()\nprint(profiler.output_text(unicode=True, color=True))\n```\n\nYou can then run your test script like so: \n\n```bash\npython -m pinecone-speed-test\n```\n\nWhen your test script is finished running you should get output similar to this: \n\n```bash\n  _     ._   __/__   _ _  _  _ _/_   Recorded: 10:02:50  Samples:  1927\n /_//_/// /_\\ / //_// / //_'/ //     Duration: 25.185    CPU time: 2.651\n/   _/                      v4.5.1\n\nProgram: /home/zachary/Pinecone/pinecone-speed-test/pinecone-speed-test.py\n\n25.184 \u003cmodule\u003e  pinecone-speed-test.py:2\n├─ 21.623 Pinecone.from_documents  langchain/vectorstores/base.py:410\n│     [108 frames hidden]  langchain, pinecone, urllib3, http, s...\n│        8.791 _SSLSocket.read  \u003cbuilt-in\u003e\n│        6.136 _SSLSocket.read  \u003cbuilt-in\u003e\n├─ 2.070 list_indexes  pinecone/manage.py:182\n│     [26 frames hidden]  pinecone, urllib3, http, socket, ssl,...\n├─ 0.805 init  pinecone/config.py:235\n│     [17 frames hidden]  pinecone, requests, urllib3\n└─ 
0.664 Pinecone.similarity_search  langchain/vectorstores/pinecone.py:148\n      [47 frames hidden]  langchain, tenacity, openai, requests...\n```\n\n## Integration with `pyenv`\n\n[pyenv](https://github.com/pyenv/pyenv) is a tool that allows you to quickly install and switch between multiple versions of Python locally, which is very useful. \n\nUnfortunately, using pyenv successfully with the virtualenv workflow described here involves some more setup: \n\n1. Install `pyenv-virtualenv` (a plugin to manage virtual environments for `pyenv`) by cloning it from GitHub: \n\n```bash\ngit clone https://github.com/pyenv/pyenv-virtualenv.git $(pyenv root)/plugins/pyenv-virtualenv\n```\n\n1. Add initialization to your shell. The correct file to edit will vary depending on your preferred shell. I use ZSH, so I edit `~/.zshrc`: \n```bash\n# Add this to your .bashrc, .bash_profile, or .zshrc:\n# If you already use pyenv and have the following snippet in place: \n# \n# if command -v pyenv 1\u003e/dev/null 2\u003e\u00261; then\n#  eval \"$(pyenv init -)\"\n# fi\n#\n# then you can add the following within the same if statement, like so: \n#\n# if command -v pyenv 1\u003e/dev/null 2\u003e\u00261; then\n#  eval \"$(pyenv init -)\"\n#  eval \"$(pyenv virtualenv-init -)\"\n# fi\n#\n# Otherwise, ensure this line is present in your preferred shell's configuration file:\neval \"$(pyenv virtualenv-init -)\"\n```\n\nBe sure to reload your shell with `exec \"$SHELL\"`\n\n1. Create a virtual environment using the specific Python version you want: \n```bash\npyenv virtualenv 3.8.1 myenv\n```\n\n1. Activate the virtual environment \n\n```bash\npyenv activate myenv\n```\n\n1. 
Deactivate when you're done\n\n```bash\npyenv deactivate\n```\n\n## Use the `pyenv local` command to set a specific virtualenv to be active in a given directory\n\n```bash\npyenv local myenv\n```\n\nThis will create a `.python-version` file in that directory, and anytime you `cd` into this directory, \nthe specified virtual environment will be automatically activated. \n\n## Finding and modifying virtualenv code with pyenv and the virtualenv plugin\n\nTo find where pyenv installed your virtualenv, you can run: \n```bash\npyenv prefix myenv\n```\n\nThis gives you the path to the virtual environment named `myenv`. \n\nNext, navigate to the directory returned by `pyenv prefix myenv` and you'll find the Python installation\nfor that virtual environment. To find and modify library code, you can navigate to the `site-packages` directory \nwhere the installed packages reside: \n\n```bash\n# (Replace 3.8 with the exact version of Python you specified for your virtualenv if it's different)\n\ncd $(pyenv prefix myenv)/lib/python3.8/site-packages\n```\n\nHere, you'll find the source code for all the installed packages, and you can edit it as needed to test out \nyour changes. Changes made to the library code here will only affect your currently activated Python virtual \nenvironment. \n\n## Running your test script against the modified library code in your virtualenv\n\nIn your test directory, with your `pyenv-virtualenv` plugin installed and your\nvirtualenv created and activated as described above, you can now run \n\n`python -m pinecone-speed-test.py`, for example, in order to perform profiling and \narbitrary tests against modified library code. 
\n\n## Test documents\n\nThere are files of various sizes in `./test-documents/` that may be useful for running various \nPinecone and LangChain operations against.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzackproser%2Fpinecone-speed-experiments","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzackproser%2Fpinecone-speed-experiments","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzackproser%2Fpinecone-speed-experiments/lists"}