{"id":48002022,"url":"https://github.com/finalfusion/finalfusion-python","last_synced_at":"2026-04-04T12:23:55.946Z","repository":{"id":57429543,"uuid":"149509818","full_name":"finalfusion/finalfusion-python","owner":"finalfusion","description":"Finalfusion embeddings in Python","archived":false,"fork":false,"pushed_at":"2021-11-30T13:33:34.000Z","size":525,"stargazers_count":9,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-27T23:23:12.419Z","etag":null,"topics":["embeddings","finalfusion","python","subword-units","word"],"latest_commit_sha":null,"homepage":"https://finalfusion.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finalfusion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-19T20:42:00.000Z","updated_at":"2023-10-27T13:45:30.000Z","dependencies_parsed_at":"2022-08-27T17:02:16.525Z","dependency_job_id":null,"html_url":"https://github.com/finalfusion/finalfusion-python","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/finalfusion/finalfusion-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finalfusion%2Ffinalfusion-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finalfusion%2Ffinalfusion-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finalfusion%2Ffinalfusion-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finalfusion%2Ffinalfusion-python/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/finalfusion","download_url":"https://codeload.github.com/finalfusion/finalfusion-python/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finalfusion%2Ffinalfusion-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31399819,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T10:20:44.708Z","status":"ssl_error","status_checked_at":"2026-04-04T10:20:06.846Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embeddings","finalfusion","python","subword-units","word"],"created_at":"2026-04-04T12:23:55.866Z","updated_at":"2026-04-04T12:23:55.932Z","avatar_url":"https://github.com/finalfusion.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# finalfusion-python\n[![Documentation Status](https://readthedocs.org/projects/finalfusion-python/badge/?version=latest)](https://finalfusion-python.readthedocs.io/en/latest/?badge=latest)\n\n## Introduction\n\n`finalfusion` is a Python package for reading, writing and using \n[finalfusion](https://finalfusion.github.io) embeddings, but also\nsupports other commonly used embeddings like fastText, GloVe and\nword2vec. 
\n\nThe Python package supports the same types of embeddings as the\n[finalfusion-rust crate](https://docs.rs/finalfusion/):\n\n* Vocabulary:\n  * No subwords\n  * Subwords\n* Embedding matrix:\n  * Array\n  * Memory-mapped\n  * Quantized\n* Norms\n* Metadata\n\n## Installation\n\nThe finalfusion module is\n[available](https://pypi.org/project/finalfusion/#files) on PyPI for Linux,\nMac and Windows. You can use `pip` to install the module:\n\n~~~shell\n$ pip install --upgrade finalfusion\n~~~\n\n## Installing from source\n\nBuilding from source depends on `Cython`. If you install the package using\n`pip`, you don't need to explicitly install the dependency since it is\nspecified in `pyproject.toml`.\n\n~~~shell\n$ git clone https://github.com/finalfusion/finalfusion-python\n$ cd finalfusion-python\n$ pip install .\n~~~\n\nIf you want to build wheels from source, `wheel` needs to be installed.\nIt's then possible to build wheels through:\n\n~~~shell\n$ python setup.py bdist_wheel\n~~~\n\nThe wheels can be found in `dist`.\n\n## Package Usage\n\n### Basic usage\n\n~~~python\nimport finalfusion\n# loading from different formats\nw2v_embeds = finalfusion.load_word2vec(\"/path/to/w2v.bin\")\ntext_embeds = finalfusion.load_text(\"/path/to/embeds.txt\")\ntext_dims_embeds = finalfusion.load_text_dims(\"/path/to/embeds.dims.txt\")\nfasttext_embeds = finalfusion.load_fasttext(\"/path/to/fasttext.bin\")\nfifu_embeds = finalfusion.load_finalfusion(\"/path/to/embeddings.fifu\")\n\n# serialization to formats works similarly\nfinalfusion.compat.write_word2vec(\"to_word2vec.bin\", fifu_embeds)\n\n# embedding lookup\nembedding = fifu_embeds[\"Test\"]\n\n# reading an embedding into a buffer\nimport numpy as np\nbuffer = np.zeros(fifu_embeds.storage.shape[1], dtype=np.float32)\nfifu_embeds.embedding(\"Test\", out=buffer)\n\n# similarity and analogy query\nsim_query = fifu_embeds.word_similarity(\"Test\")\nanalogy_query = fifu_embeds.analogy(\"A\", \"B\", \"C\")\n\n# accessing the 
vocab and printing the first 10 words\nvocab = fifu_embeds.vocab\nprint(vocab.words[:10])\n\n# SubwordVocabs give access to the subword indexer:\nsubword_indexer = vocab.subword_indexer\nprint(subword_indexer.subword_indices(\"Test\", with_ngrams=True))\n\n# accessing the storage and calculating its dot product with an embedding\nres = fifu_embeds.storage.dot(embedding)\n\n# printing metadata\nprint(fifu_embeds.metadata) \n~~~\n\n### Beyond Embeddings\n\n~~~Python\n# load only a vocab from a finalfusion file\nfrom finalfusion import load_vocab\nvocab = load_vocab(\"/path/to/finalfusion_file.fifu\")\n\n# serialize vocab to single file\nvocab.write(\"/path/to/vocab_file.fifu.voc\")\n\n# more specific loading functions exist\nfrom finalfusion.vocab import load_finalfusion_bucket_vocab\nfifu_bucket_vocab = load_finalfusion_bucket_vocab(\"/path/to/vocab_file.fifu.voc\")\n~~~\n\nThe package supports loading and writing all `finalfusion` chunks this way.\nSingle-chunk files are only supported by the Python package; reading them will\nfail with other implementations such as `finalfusion-rust`.\n\n## Scripts\n\n`finalfusion` also includes a conversion script `ffp-convert` to convert\nbetween the supported formats.\n~~~shell\n# convert from fastText format to finalfusion\n$ ffp-convert -f fasttext fasttext.bin -t finalfusion embeddings.fifu\n~~~\n\n`ffp-bucket-to-explicit` can be used to convert bucket embeddings to embeddings\nwith an explicit ngram lookup.\n~~~shell\n# convert finalfusion bucket embeddings to explicit\n$ ffp-bucket-to-explicit -f finalfusion embeddings.fifu explicit.fifu\n~~~ \n\n`ffp-select` generates new embedding files based on some embeddings and a word\nlist. Using `ffp-select` with embeddings with a simple vocab results in a\nsubset of the original embeddings. With subword embeddings, vectors for unknown\nwords in the word list are computed and added to the new embeddings. 
The\nresulting embeddings **cannot** provide representations for OOV words anymore.\nThe new vocabulary covers only the words in the word list.\n~~~shell\n$ ffp-select large-embeddings.fifu subset-embeddings.fifu words.txt\n~~~\n\nFinally, the package comes with `ffp-similar` and `ffp-analogy` to do\nanalogy and similarity queries.\n~~~shell\n# get the 5 nearest neighbours of \"Tübingen\"\n$ echo Tübingen | ffp-similar embeddings.fifu\n# get the 5 top answers for \"Tübingen\" is to \"Stuttgart\" like \"Heidelberg\" to...\n$ echo Tübingen Stuttgart Heidelberg | ffp-analogy embeddings.fifu\n~~~\n\n## Where to go from here\n\n  * [documentation](https://finalfusion-python.readthedocs.io/en/latest)\n  * [finalfrontier](https://finalfusion.github.io/finalfrontier)\n  * [finalfusion](https://finalfusion.github.io/)\n  * [pretrained embeddings](https://finalfusion.github.io/pretrained)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinalfusion%2Ffinalfusion-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinalfusion%2Ffinalfusion-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinalfusion%2Ffinalfusion-python/lists"}