{"id":48060354,"url":"https://github.com/andrefs/embeddings-cos-sim","last_synced_at":"2026-04-04T14:31:53.407Z","repository":{"id":346247004,"uuid":"1189071883","full_name":"andrefs/embeddings-cos-sim","owner":"andrefs","description":"Calculate the cosine similarity of embeddings.","archived":false,"fork":false,"pushed_at":"2026-03-23T19:00:51.000Z","size":133,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-23T20:24:13.943Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andrefs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-23T00:13:55.000Z","updated_at":"2026-03-23T19:00:55.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/andrefs/embeddings-cos-sim","commit_stats":null,"previous_names":["andrefs/embeddings-cos-sim"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/andrefs/embeddings-cos-sim","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fembeddings-cos-sim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fembeddings-cos-sim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fembeddings-cos-sim/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/andrefs%2Fembeddings-cos-sim/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andrefs","download_url":"https://codeload.github.com/andrefs/embeddings-cos-sim/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fembeddings-cos-sim/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31402700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T10:20:44.708Z","status":"ssl_error","status_checked_at":"2026-04-04T10:20:06.846Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-04T14:31:53.228Z","updated_at":"2026-04-04T14:31:53.390Z","avatar_url":"https://github.com/andrefs.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# embeddings-cos-sim\n\n\u003e **Note:** This project is the continuation of the now-defunct [we-cos-sim](https://github.com/andrefs/we-cos-sim) project, which was limited to word embeddings. This version supports any embedding type including graph-based node embeddings.\n\nA versatile tool for calculating cosine similarity using embeddings. 
Supports word embeddings (FastText) and graph-based node embeddings (Node2Vec, RDF2Vec), or any custom embeddings in the standard text vector format.\n\n## Features\n\n- Pre-configured embeddings: FastText word vectors, DBpedia Node2Vec, DBpedia RDF2Vec\n- Support for custom embeddings via simple configuration\n- Convert embedding files to LevelDB for fast lookups\n- Calculate cosine similarity between any two keys (words or nodes)\n- Unified interface for all embedding types\n\n## Installation\n\n```bash\nnpm install\nnpm run build\n```\n\nFor global CLI access:\n\n```bash\nnpm install -g .\n```\n\n## Built-in Embeddings\n\nThe following embeddings are pre-configured out of the box:\n\n| Name | Type | Description |\n|------|------|-------------|\n| `fasttext-en`, `fasttext-de`, `fasttext-fr`, `fasttext-es` | Word | FastText Common Crawl vectors (300d) |\n| `node2vec-dbpedia` | Node | DBpedia embeddings via Node2Vec (300d) |\n| `rdf2vec-dbpedia` | Node | DBpedia embeddings via RDF2Vec (300d) |\n\n## CLI Usage\n\n### Calculate Similarity\n\n```bash\nembeddings-cos-sim \u003cembeddingName\u003e \u003ckey1\u003e \u003ckey2\u003e\n```\n\nOr with an explicit flag:\n\n```bash\nembeddings-cos-sim --embedding \u003cname\u003e \u003ckey1\u003e \u003ckey2\u003e\n```\n\n**Examples:**\n\n```bash\n# FastText word similarity\nembeddings-cos-sim fasttext-en king queen\n\n# Node similarity (full URIs as keys)\nembeddings-cos-sim node2vec-dbpedia \"http://dbpedia.org/resource/Paris\" \"http://dbpedia.org/resource/France\"\n```\n\n### Download a Model\n\n```bash\nembeddings-cos-sim-download \u003cembeddingName\u003e\n```\n\nThis downloads the source file and converts it to LevelDB in one step.\n\n**Example:**\n\n```bash\nembeddings-cos-sim-download fasttext-en\n```\n\n### Convert Model to LevelDB\n\n```bash\nembeddings-cos-sim-level \u003csourceFilePath\u003e \u003ctargetLevelDbPath\u003e [-v|--verbose|-p|--progress]\n```\n\nOr using a pre-configured 
embedding:\n\n```bash\nembeddings-cos-sim-level --embedding \u003cname\u003e [-v|--verbose|-p|--progress]\n```\n\n**Examples:**\n\n```bash\n# With explicit paths\nembeddings-cos-sim-level vectors_dbpedia_Node2Vec.txt.gz ~/.embeddings-cos-sim/level/node2vec.lvl -p\n\n# With a predefined embedding\nembeddings-cos-sim-level --embedding node2vec-dbpedia -p\n```\n\n### Verify a LevelDB\n\n```bash\nembeddings-cos-sim-verify \u003clevelPath\u003e [key1] [key2] ...\n```\n\nOr using a registered embedding:\n\n```bash\nembeddings-cos-sim-verify --embedding \u003cname\u003e [key1] [key2] ...\n```\n\n### Manage Custom Embeddings\n\nList all registered embeddings:\n\n```bash\nembeddings-cos-sim-embeddings list\n```\n\nAdd a custom embedding:\n\n```bash\nembeddings-cos-sim-embeddings add \u003cname\u003e \u003clevelPath\u003e [--model \u003cmodelPath\u003e] [--url \u003curl\u003e] [--desc \u003cdescription\u003e]\n```\n\nRemove a custom embedding:\n\n```bash\nembeddings-cos-sim-embeddings remove \u003cname\u003e\n```\n\n**Example:**\n\n```bash\nembeddings-cos-sim-embeddings add my-custom-emb ~/.embeddings-cos-sim/level/myemb.lvl --model ~/downloads/myvectors.vec.gz --url \"https://example.com/myvectors.vec.gz\"\n```\n\n## Usage as a Library\n\n```typescript\nimport { loadVec, buildCosSimFn } from \"embeddings-cos-sim/lib/cosSim\";\nimport { getEmbeddingConfig } from \"embeddings-cos-sim/lib/utils\";\n```\n\n## File Format\n\nEmbedding files should be in FastText `.vec` format:\n- Plain text (gzip-compressed or not)\n- Space-separated values\n- First token is the key (word or URI)\n- Remaining tokens are floating-point vector components\n\nExample:\n```\nking 0.345 0.123 -0.456 ...(300 dimensions total)\nqueen 0.312 0.156 -0.389 ...\nhttp://dbpedia.org/resource/Paris 0.234 -0.567 ...\n```\n\n## Paths\n\nBy default, configs and data are stored 
under `~/.embeddings-cos-sim/`:\n\n- `vectors/` - vector files (`.vec.gz` or `.txt.gz`)\n- `level/` - LevelDB databases\n- `embeddings.json` - custom embedding configurations\n\nPaths in embedding configs can be absolute or relative to `~/.embeddings-cos-sim/`.\n\n## Testing\n\n```bash\nnpm test\n```\n\n## License\n\nISC\n\n## Author\n\nAndré Santos\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrefs%2Fembeddings-cos-sim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrefs%2Fembeddings-cos-sim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrefs%2Fembeddings-cos-sim/lists"}