{"id":50333365,"url":"https://github.com/casperdcl/ellama","last_synced_at":"2026-05-29T11:30:30.091Z","repository":{"id":354293250,"uuid":"1220257168","full_name":"casperdcl/ellama","owner":"casperdcl","description":"Embeddings interface for Ollama","archived":false,"fork":false,"pushed_at":"2026-05-09T19:56:47.000Z","size":41,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-09T20:39:13.008Z","etag":null,"topics":["embeddings","faiss","langchain","ollama","pca","semantic-search","t-sne","umap","vector","vector-database","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/casperdcl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-24T18:00:38.000Z","updated_at":"2026-05-09T19:52:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/casperdcl/ellama","commit_stats":null,"previous_names":["casperdcl/ellama"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/casperdcl/ellama","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperdcl%2Fellama","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperdcl%2Fellama/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperdcl%2Fellama/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/casperdcl%2Fellama/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/casperdcl","download_url":"https://codeload.github.com/casperdcl/ellama/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperdcl%2Fellama/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33650712,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embeddings","faiss","langchain","ollama","pca","semantic-search","t-sne","umap","vector","vector-database","vector-search"],"created_at":"2026-05-29T11:30:29.250Z","updated_at":"2026-05-29T11:30:30.075Z","avatar_url":"https://github.com/casperdcl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Ellama\n\n*Embeddings 
library*\n\n[![ollama](https://img.shields.io/badge/models-ollama-black?logo=ollama)](https://ollama.com/search?c=embedding)\n[![unsloth](https://img.shields.io/badge/LoRA-unsloth-55b48c?logo=faker\u0026logoColor=55b48c)](https://unsloth.ai/docs/basics/embedding-finetuning)\n[![faiss](https://img.shields.io/badge/database-faiss-0866ff?logo=facebook)](https://github.com/facebookresearch/faiss)\n[![langchain](https://img.shields.io/badge/glue-langchain-7fc8ff?logo=langchain)](https://github.com/langchain-ai/langchain)\n\n[![pca](https://img.shields.io/badge/projection-PCA-f7931e?logo=scikit-learn)](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)\n[![t-sne](https://img.shields.io/badge/projection-t--SNE-f7931e?logo=scikit-learn)](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html)\n[![umap](https://img.shields.io/badge/projection-UMAP-f7931e)](https://umap-learn.readthedocs.io)\n[![matplotlib](https://img.shields.io/badge/plot-matplotlib-green)](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.scatter.html)\n\n[![test](https://github.com/casperdcl/ellama/actions/workflows/test.yml/badge.svg)](https://github.com/casperdcl/ellama/actions/workflows/test.yml)\n\nUnlike the overwhelming majority of alternatives:\n\n- handles long inputs without truncation even if the underlying model has a small context window\n- minimal config\n- 100% human-written (thus sane \u0026 clean) codebase\n\n\u003e [!TIP]\n\u003e Please open an issue if you know of any better alternatives. 
I would love to archive this repo.\n\n## search\n\nDatabase semantic search:\n\n```py\nfrom ellama import EllamaDB, Document\n\ndb = EllamaDB(\"test\")\ndb.add_documents([\n    Document(\"hello world\", id=\"salutation\"),\n    Document(\"goodbye and goodnight\", id=\"farewell\")])\ndocs = db.similarity_search(\"Greetings, Earth!\", k=1)\nassert len(docs) == 1\nassert docs[0].id == \"salutation\"\n```\n\n## plot\n\nEmbeddings visualisation:\n\n```py\nimport matplotlib.pyplot as plt\nfrom ellama import EllamaDB, Document\nfrom sklearn.datasets import fetch_20newsgroups\n\nraw = fetch_20newsgroups(data_home='.cache')\ndb = EllamaDB(\"20newsgroups\")\ndb.add_documents([\n    Document(raw.data[i], id=str(i), metadata={'name': raw.target_names[raw.target[i]]})\n    for i in range(200)])\n\nfor group in ['alt.atheism', 'comp', 'misc.forsale', 'rec', 'rec.sport', 'sci',\n              'soc.religion', 'talk.politics', 'talk.religion']:\n    db.plot('t-SNE', label=group,\n            filter=lambda metadata: metadata['name'].startswith(group))\n\nplt.title(f\"Newsgroup {db.embeddings.model} embeddings t-SNE\")\nplt.legend()\nplt.show()\n```\n\n![](https://img.cdcl.ml/ellama-plot.svg)\n\n## fine-tune\n\nLow-rank adaptation (LoRA) re-using the newsgroups database created above:\n\n```py\nfrom ellama import EllamaDB\ndb = EllamaDB(\"20newsgroups\")\ndb.lora(['name'], \"ellama/lora:news\", epochs=600)\n\n# create new database using the new model\ndb_lora = EllamaDB(\"20newsgroups_lora\", \"ellama/lora:news\")\ndb_lora.add_documents(db.get_docs({}))\n```\n\n![](https://img.cdcl.ml/ellama-lora.svg)\n\n## install\n\n### `pip` (CPU)\n\n```sh\npip install \"ellama[cpu]\"           # basic\npip install \"ellama[cpu,plot]\"      # plot('PCA' or 't-SNE')\npip install \"ellama[cpu,plot,umap]\" # plot('UMAP')\npip install \"ellama[lora]\"          # fine-tuning\n```\n\n### `conda` (GPU)\n\n```yml\nname: ellama\nchannels: [pytorch, nvidia, conda-forge]\ndependencies:\n- langchain 
1.*\n- langchain-community\n- faiss-gpu\n- requests\n- tqdm\n#- matplotlib   # ellama plot()\n#- scikit-learn # ellama plot()\n#- umap-learn   # ellama plot('UMAP')\n#- unsloth      # ellama lora()\n- pip\n- pip:\n  - ellama\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasperdcl%2Fellama","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcasperdcl%2Fellama","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasperdcl%2Fellama/lists"}