{"id":16749009,"url":"https://github.com/cthoyt/embeddingdb","last_synced_at":"2026-05-16T06:36:24.672Z","repository":{"id":77808300,"uuid":"192898201","full_name":"cthoyt/embeddingdb","owner":"cthoyt","description":"A database for storing and comparing entity embeddings","archived":false,"fork":false,"pushed_at":"2019-06-26T21:37:57.000Z","size":33,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-09-22T14:42:36.214Z","etag":null,"topics":["database","network-representation-learning","representation-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cthoyt.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-20T10:15:04.000Z","updated_at":"2022-10-24T20:34:47.000Z","dependencies_parsed_at":"2023-04-03T13:34:41.122Z","dependency_job_id":null,"html_url":"https://github.com/cthoyt/embeddingdb","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/cthoyt/embeddingdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cthoyt%2Fembeddingdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cthoyt%2Fembeddingdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cthoyt%2Fembeddingdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cthoyt%2Fembeddingdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/cthoyt","download_url":"https://codeload.github.com/cthoyt/embeddingdb/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cthoyt%2Fembeddingdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33092717,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T04:41:52.686Z","status":"ssl_error","status_checked_at":"2026-05-16T04:41:52.009Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","network-representation-learning","representation-learning"],"created_at":"2024-10-13T02:23:35.673Z","updated_at":"2026-05-16T06:36:24.656Z","avatar_url":"https://github.com/cthoyt.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Embedding Database |zenodo|\n===========================\nThis package provides a database schema and Python wrapper\nfor storing the embeddings generated through various representation\nlearning packages.\n\nCurrently, this package focuses on using a SQL database with SQLAlchemy,\nbut might be extended to use a NoSQL database as an alternative.\n\nInstallation\n------------\nInstall ``embeddingdb`` from `PyPI \u003chttps://pypi.org/project/embeddingdb/\u003e`_ with:\n\n.. 
code-block:: sh\n\n   $ pip install embeddingdb\n\nAlternatively, install the latest development version of ``embeddingdb`` directly\nfrom GitHub with:\n\n.. code-block:: sh\n\n   $ pip install git+https://github.com/cthoyt/embeddingdb\n\nFor developers, install ``embeddingdb`` in development mode from GitHub with:\n\n.. code-block:: sh\n\n   $ git clone https://github.com/cthoyt/embeddingdb.git\n   $ cd embeddingdb\n   $ pip install -e .\n\nSet the environment variable ``EMBEDDINGDB_CONNECTION`` to a valid\nSQLAlchemy connection string for a PostgreSQL instance, as this package uses\nthe PostgreSQL-specific ``ARRAY`` type.\n\nCommand Line Interface\n----------------------\nThis package installs an entrypoint ``embeddingdb`` that can be used directly from\nthe shell.\n\nUploading Entity Embeddings\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\nEntities can be embedded and stored from various types of representation learning,\nincluding network representation learning, knowledge graph embedding, and textual\nlearning.\n\nUpload embeddings generated by ``word2vec`` by specifying the file path with:\n\n.. code-block:: sh\n\n   $ embeddingdb upload --fmt word2vec --path ~/path/to/file.txt\n\nUpload embeddings generated by ``pykeen`` by specifying the output directory\nwith:\n\n.. code-block:: sh\n\n   $ embeddingdb upload --fmt keen --path ~/path/to/directory/\n\nListing Entity Embeddings\n~~~~~~~~~~~~~~~~~~~~~~~~~\nAfter uploading, the collections can be listed with:\n\n.. 
code-block:: sh\n\n   $ embeddingdb ls\n\nAnalyzing Entity Embeddings' Correlations\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nOne of the motivations for building this repository was to provide a convenient way to\ncompare the embeddings for entities generated through orthogonal embedding techniques.\nFor example, we wanted to know to what extent the embeddings for proteins generated from\ntheir sequences with ``ratvec`` contained the same information as the embeddings generated\nfrom protein-protein interaction networks with ``pykeen`` or ``nrl``.\n\nThe two positional arguments correspond to the collection identifiers in the database.\n\n.. code-block:: sh\n\n   $ embeddingdb analyze 1 2\n\nRunning with Docker\n-------------------\nAfter installing Docker, the entire web application can be instantiated with:\n\n.. code-block:: sh\n\n   $ docker-compose up\n\nRequest the ``/test`` endpoint to instantiate the database and add a test collection.\n\n.. |zenodo| image:: https://zenodo.org/badge/192898201.svg\n   :target: https://zenodo.org/badge/latestdoi/192898201\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcthoyt%2Fembeddingdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcthoyt%2Fembeddingdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcthoyt%2Fembeddingdb/lists"}