{"id":13656668,"url":"https://github.com/jina-ai/finetuner","last_synced_at":"2025-10-07T11:31:56.851Z","repository":{"id":37080951,"uuid":"394994747","full_name":"jina-ai/finetuner","owner":"jina-ai","description":":dart: Task-oriented embedding tuning for BERT, CLIP, etc.","archived":true,"fork":false,"pushed_at":"2024-03-11T08:05:13.000Z","size":74984,"stargazers_count":1486,"open_issues_count":8,"forks_count":67,"subscribers_count":28,"default_branch":"main","last_synced_at":"2025-01-20T09:46:57.738Z","etag":null,"topics":["bert","few-shot-learning","fine-tuning","finetuning","jina","metric-learning","negative-sampling","neural-search","openai-clip","pretrained-models","siamese-network","similarity-learning","transfer-learning","triplet-loss"],"latest_commit_sha":null,"homepage":"https://finetuner.jina.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jina-ai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2021-08-11T13:15:43.000Z","updated_at":"2025-01-20T03:38:24.000Z","dependencies_parsed_at":"2023-02-14T07:45:56.707Z","dependency_job_id":"18835f7a-7a9f-41bd-ac19-9f74816dd778","html_url":"https://github.com/jina-ai/finetuner","commit_stats":{"total_commits":642,"total_committers":40,"mean_commits":16.05,"dds":0.7476635514018692,"last_synced_commit":"9ef750be420abec77c70e06e0ef6708f56d62cc2"},"previous_names":[],"tags_count":43,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Ffinetuner","tags_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/jina-ai%2Ffinetuner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Ffinetuner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Ffinetuner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jina-ai","download_url":"https://codeload.github.com/jina-ai/finetuner/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235627412,"owners_count":19020520,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","few-shot-learning","fine-tuning","finetuning","jina","metric-learning","negative-sampling","neural-search","openai-clip","pretrained-models","siamese-network","similarity-learning","transfer-learning","triplet-loss"],"created_at":"2024-08-02T05:00:30.053Z","updated_at":"2025-10-07T11:31:56.262Z","avatar_url":"https://github.com/jina-ai.png","language":"Python","funding_links":[],"categories":["参数优化","Open-Source Tools","Python","Neural Search and Retrieval"],"sub_categories":[],"readme":"\u003cbr\u003e\u003cbr\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://github.com/jina-ai/finetuner/blob/main/docs/_static/finetuner-logo-ani.svg?raw=true\" alt=\"Finetuner logo: Finetuner helps you to create experiments in order to improve embeddings on search tasks. 
It accompanies you to deliver the last mile of performance-tuning for neural search applications.\" width=\"150px\"\u003e\n\u003c/p\u003e\n\n\n\u003cp align=\"center\"\u003e\n\u003cb\u003eTask-oriented finetuning for better embeddings on neural search\u003c/b\u003e\n\u003c/p\u003e\n\n\u003cp align=center\u003e\n\u003ca href=\"https://pypi.org/project/finetuner/\"\u003e\u003cimg alt=\"PyPI\" src=\"https://img.shields.io/pypi/v/finetuner?label=Release\u0026style=flat-square\"\u003e\u003c/a\u003e\n\u003ca href=\"https://codecov.io/gh/jina-ai/finetuner\"\u003e\u003cimg alt=\"Codecov branch\" src=\"https://img.shields.io/codecov/c/github/jina-ai/finetuner/main?logo=Codecov\u0026logoColor=white\u0026style=flat-square\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypistats.org/packages/finetuner\"\u003e\u003cimg alt=\"PyPI - Downloads from official pypistats\" src=\"https://img.shields.io/pypi/dm/finetuner?style=flat-square\"\u003e\u003c/a\u003e\n\u003ca href=\"https://discord.jina.ai\"\u003e\u003cimg src=\"https://img.shields.io/discord/1106542220112302130?logo=discord\u0026logoColor=white\u0026style=flat-square\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003c!-- start elevator-pitch --\u003e\n\nFine-tuning is an effective way to improve performance on [neural search](https://jina.ai/news/what-is-neural-search-and-learn-to-build-a-neural-search-engine/) tasks.\nHowever, setting up and performing fine-tuning can be very time-consuming and resource-intensive.\n\nJina AI's Finetuner makes fine-tuning easier and faster by streamlining the workflow and handling all the complexity and infrastructure in the cloud.\nWith Finetuner, you can easily enhance the performance of pre-trained models,\nmaking them production-ready [without extensive labeling](https://jina.ai/news/fine-tuning-with-low-budget-and-high-expectations/) or expensive hardware.\n\n🎏 **Better embeddings**: Create high-quality embeddings for semantic search, visual similarity search, cross-modal 
text\u003c-\u003eimage search, recommendation systems,\nclustering, duplication detection, anomaly detection, or other uses.\n\n⏰ **Low budget, high expectations**: Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.\n\n📈 **Performance promise**: Enhance the performance of pre-trained models so that they deliver state-of-the-art performance on \ndomain-specific applications.\n\n🔱 **Simple yet powerful**: Easy access to 40+ mainstream loss functions, 10+ optimizers, layer pruning, weight\nfreezing, dimensionality reduction, hard-negative mining, cross-modal models, and distributed training. \n\n☁ **All-in-cloud**: Train using our GPU infrastructure, manage runs, experiments, and artifacts on Jina AI Cloud\nwithout worrying about resource availability, complex integration, or infrastructure costs.\n\n\u003c!-- end elevator-pitch --\u003e\n\n## [Documentation](https://finetuner.jina.ai/)\n\n## Pretrained Text Embedding Models\n\n| name                   | parameter | dimension | Huggingface                                            |\n|------------------------|-----------|-----------|--------------------------------------------------------|\n| jina-embedding-t-en-v1 | 14m       | 312             | [link](https://huggingface.co/jinaai/jina-embedding-t-en-v1) |\n| jina-embedding-s-en-v1 | 35m       | 512             | [link](https://huggingface.co/jinaai/jina-embedding-s-en-v1) |\n| jina-embedding-b-en-v1 | 110m      | 768             | [link](https://huggingface.co/jinaai/jina-embedding-b-en-v1) |\n| jina-embedding-l-en-v1 | 330m      | 1024            | [link](https://huggingface.co/jinaai/jina-embedding-l-en-v1) |\n\n## Benchmarks\n\n\u003ctable\u003e\n\u003cthead\u003e\n  \u003ctr\u003e\n    \u003cth\u003eModel\u003c/th\u003e\n    \u003cth\u003eTask\u003c/th\u003e\n    \u003cth\u003eMetric\u003c/th\u003e\n    
\u003cth\u003ePretrained\u003c/th\u003e\n    \u003cth\u003eFinetuned\u003c/th\u003e\n    \u003cth\u003eDelta\u003c/th\u003e\n    \u003cth\u003eRun it!\u003c/th\u003e\n  \u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n  \u003ctr\u003e\n    \u003ctd rowspan=\"2\"\u003eBERT\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003ca href=\"https://www.kaggle.com/c/quora-question-pairs\"\u003eQuora\u003c/a\u003e Question Answering\u003c/td\u003e\n    \u003ctd\u003emRR\u003c/td\u003e\n    \u003ctd\u003e0.835\u003c/td\u003e\n    \u003ctd\u003e0.967\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e15.8%\u003c/span\u003e\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003cp align=center\u003e\u003ca href=\"https://colab.research.google.com/drive/1Ui3Gw3ZL785I7AuzlHv3I0-jTvFFxJ4_?usp=sharing\"\u003e\u003cimg alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\u003c/p\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eRecall\u003c/td\u003e\n    \u003ctd\u003e0.915\u003c/td\u003e\n    \u003ctd\u003e0.963\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e5.3%\u003c/span\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd rowspan=\"2\"\u003eResNet\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003eVisual similarity search on \u003ca href=\"https://sites.google.com/view/totally-looks-like-dataset\"\u003eTLL\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003emAP\u003c/td\u003e\n    \u003ctd\u003e0.110\u003c/td\u003e\n    \u003ctd\u003e0.196\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e78.2%\u003c/span\u003e\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003cp align=center\u003e\u003ca href=\"https://colab.research.google.com/drive/1QuUTy3iVR-kTPljkwplKYaJ-NTCgPEc_?usp=sharing\"\u003e\u003cimg alt=\"Open In Colab\" 
src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\u003c/p\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eRecall\u003c/td\u003e\n    \u003ctd\u003e0.249\u003c/td\u003e\n    \u003ctd\u003e0.460\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e84.7%\u003c/span\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd rowspan=\"2\"\u003eCLIP\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003ca href=\"https://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html\"\u003eDeep Fashion\u003c/a\u003e text-to-image search\u003c/td\u003e\n    \u003ctd\u003emRR\u003c/td\u003e\n    \u003ctd\u003e0.575\u003c/td\u003e\n    \u003ctd\u003e0.676\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e17.4%\u003c/span\u003e\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003cp align=center\u003e\u003ca href=\"https://colab.research.google.com/drive/1yKnmy2Qotrh3OhgwWRsMWPFwOSAecBxg?usp=sharing\"\u003e\u003cimg alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\u003c/p\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eRecall\u003c/td\u003e\n    \u003ctd\u003e0.473\u003c/td\u003e\n    \u003ctd\u003e0.564\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e19.2%\u003c/span\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd rowspan=\"2\"\u003eM-CLIP\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003ca href=\"https://xmrec.github.io/\"\u003eCross market\u003c/a\u003e product recommendation (German)\u003c/td\u003e\n    \u003ctd\u003emRR\u003c/td\u003e\n    \u003ctd\u003e0.430\u003c/td\u003e\n    \u003ctd\u003e0.648\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e50.7%\u003c/span\u003e\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003cp align=center\u003e\u003ca 
href=\"https://colab.research.google.com/drive/10Wldbu0Zugj7NmQyZwZzuorZ6SSAhtIo\"\u003e\u003cimg alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\u003c/p\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eRecall\u003c/td\u003e\n    \u003ctd\u003e0.247\u003c/td\u003e\n    \u003ctd\u003e0.340\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e37.7%\u003c/span\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd rowspan=\"2\"\u003ePointNet++\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003ca href=\"https://modelnet.cs.princeton.edu/\"\u003eModelNet40\u003c/a\u003e 3D Mesh Search\u003c/td\u003e\n    \u003ctd\u003emRR\u003c/td\u003e\n    \u003ctd\u003e0.791\u003c/td\u003e\n    \u003ctd\u003e0.891\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e12.7%\u003c/span\u003e\u003c/td\u003e\n    \u003ctd rowspan=\"2\"\u003e\u003cp align=center\u003e\u003ca href=\"https://colab.research.google.com/drive/1lIMDFkUVsWMshU-akJ_hwzBfJ37zLFzU?usp=sharing\"\u003e\u003cimg alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\u003c/a\u003e\u003c/p\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eRecall\u003c/td\u003e\n    \u003ctd\u003e0.154\u003c/td\u003e\n    \u003ctd\u003e0.242\u003c/td\u003e\n    \u003ctd\u003e\u003cspan style=\"color:green\"\u003e57.1%\u003c/span\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n\u003csub\u003e\u003csup\u003eAll metrics were evaluated for k@20 after training for 5 epochs using the Adam optimizer with learning rates of 1e-4 for ResNet, 1e-7 for CLIP, 1e-5 for the BERT models, and 5e-4 for PointNet++.\u003c/sup\u003e\u003c/sub\u003e\n\n\u003c!-- start install-instruction --\u003e\n\n## Install\n\nMake sure you have Python 3.8+ installed. 
Finetuner can be installed via `pip` by executing:\n\n```bash\npip install -U finetuner\n```\n\nIf you want to submit a fine-tuning job on the cloud, please use:\n\n```bash\npip install \"finetuner[full]\"\n```\n\n\u003c!-- end install-instruction --\u003e\n\n\u003e ⚠️ Starting with version 0.5.0, Finetuner computing is performed on Jina AI Cloud. The last local version is `0.4.1`. \n\u003e This version is still available for installation via `pip`. See [Finetuner git tags and releases](https://github.com/jina-ai/finetuner/releases).\n\n\u003c!-- start finetuner-articles --\u003e\n## Articles about Finetuner\n\nCheck out our published blog posts and tutorials to see Finetuner in action!\n\n- [Fine-tuning with Low Budget and High Expectations](https://jina.ai/news/fine-tuning-with-low-budget-and-high-expectations/)\n- [Hype and Hybrids: Search is more than Keywords and Vectors](https://jina.ai/news/hype-and-hybrids-multimodal-search-means-more-than-keywords-and-vectors-2/)\n- [Improving Search Quality for Non-English Queries with Fine-tuned Multilingual CLIP Models](https://jina.ai/news/improving-search-quality-non-english-queries-fine-tuned-multilingual-clip-models/)\n- [How Much Do We Get by Finetuning CLIP?](https://jina.ai/news/applying-jina-ai-finetuner-to-clip-less-data-smaller-models-higher-performance/)\n\n\u003c!-- end finetuner-articles --\u003e\n\n\u003c!-- start citations --\u003e\nIf you find Jina Embeddings useful in your research, please cite the following paper:\n\n```text\n@misc{günther2023jina,\n      title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models}, \n      author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},\n      year={2023},\n      eprint={2307.11224},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n\n```\n\u003c!-- end citations --\u003e\n\n\u003c!-- start support-pitch --\u003e\n## Support\n\n- Use 
[Discussions](https://github.com/jina-ai/finetuner/discussions) to talk about your use cases, questions, and\n  support queries.\n- Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.\n- Join our [Engineering All Hands](https://youtube.com/playlist?list=PL3UBBWOUVhFYRUa_gpYYKBqEAkO4sxmne) meet-up to discuss your use case and learn about new Jina AI features.\n    - **When?** The second Tuesday of every month\n    - **Where?**\n      Zoom ([see our public events calendar](https://calendar.google.com/calendar/embed?src=c_1t5ogfp2d45v8fit981j08mcm4%40group.calendar.google.com\u0026ctz=Europe%2FBerlin)/[.ical](https://calendar.google.com/calendar/ical/c_1t5ogfp2d45v8fit981j08mcm4%40group.calendar.google.com/public/basic.ics))\n      and [live stream on YouTube](https://youtube.com/c/jina-ai)\n- Subscribe to the latest video tutorials on our [YouTube channel](https://youtube.com/c/jina-ai)\n\n## Join Us\n\nFinetuner is backed by [Jina AI](https://jina.ai) and licensed under [Apache-2.0](./LICENSE). \n\n[We are actively hiring](https://jobs.jina.ai) AI engineers and solution engineers to build the next generation of\nopen-source AI ecosystems.\n\n\u003c!-- end support-pitch --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjina-ai%2Ffinetuner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjina-ai%2Ffinetuner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjina-ai%2Ffinetuner/lists"}