{"id":18400681,"url":"https://github.com/databricks/databricks-ml-examples","last_synced_at":"2025-04-05T18:04:59.929Z","repository":{"id":66099377,"uuid":"92883248","full_name":"databricks/databricks-ml-examples","owner":"databricks","description":null,"archived":false,"fork":false,"pushed_at":"2024-03-28T23:50:34.000Z","size":2298,"stargazers_count":349,"open_issues_count":9,"forks_count":121,"subscribers_count":35,"default_branch":"master","last_synced_at":"2025-03-29T17:02:13.294Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databricks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-30T22:57:50.000Z","updated_at":"2025-03-21T03:51:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"d54a9802-c091-4b43-8e77-27dc6e17cd06","html_url":"https://github.com/databricks/databricks-ml-examples","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fdatabricks-ml-examples","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fdatabricks-ml-examples/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fdatabricks-ml-examples/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fdatabricks-ml-examples/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/databricks","download_url":"https://codeload.github.com/databricks/databricks-ml-examples/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247378135,"owners_count":20929296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T02:36:00.113Z","updated_at":"2025-04-05T18:04:59.902Z","avatar_url":"https://github.com/databricks.png","language":"Python","readme":"\n\n# databricks-ml-examples\n\n`databricks/databricks-ml-examples` is a repository of machine learning examples for the Databricks platform.\n\nCurrently this repository contains:\n- `llm-models/`: Example notebooks for using different **state-of-the-art (SOTA) models on Databricks**.\n- `llm-fine-tuning/`: Scripts and notebooks for fine-tuning **state-of-the-art (SOTA) models on Databricks**.\n\n## SOTA LLM examples\n\nDatabricks works with thousands of customers to build generative AI applications. While you can use Databricks to work with any generative AI model, commercial or research, the table below lists our current model recommendations for popular use cases. **Note:** The table lists only open source models that are free for commercial use. 
\n\n\u003c!---\n\u003cstyle\u003e\ntable th:first-of-type {\n    width: 10%;\n}\ntable th:nth-of-type(2) {\n    width: 30%;\n}\ntable th:nth-of-type(3) {\n    width: 30%;\n}\ntable th:nth-of-type(4) {\n    width: 30%;\n}\n\u003c/style\u003e\n--\u003e\n\n| Use case                               | Quality-optimized                                                                                                                                                                                                                                                 | Balanced                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | Speed-optimized                                                                                          
|\n|----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|\n| Text generation following instructions | [Mixtral-8x7B-Instruct-v0.1](llm-models/mixtral-8x7b)  \u003cbr\u003e \u003cbr\u003e [Llama-2-70b-chat-hf](llm-models/llamav2/llamav2-70b)                                                                                                                                                       | [mistral-7b](llm-models/mistral/mistral-7b) \u003cbr\u003e\u003cbr\u003e [MPT-7B-Instruct](llm-models/mpt/mpt-7b) \u003cbr\u003e [MPT-7B-8k-Instruct](llm-models/mpt/mpt-7b-8k) \u003cbr\u003e \u003cbr\u003e [Llama-2-7b-chat-hf](llm-models/llamav2/llamav2-7b) \u003cbr\u003e [Llama-2-13b-chat-hf](llm-models/llamav2/llamav2-13b)                                                                                                                                                                                                                                                         |  [phi-2](llm-models/phi-2)                                          |\n| Text embeddings (English only)         |   
[e5-mistral-7b-instruct(7B)](llm-models/embedding/e5-mistral-7b-instruct)                                                                                                                                                                                               | [bge-large-en-v1.5(0.3B)](llm-models/embedding/bge) \u003cbr\u003e [e5-large-v2 (0.3B)](llm-models/embedding/e5-v2)                                                                                                                                                                                                                                                                                                                                                                                                                | [bge-base-en-v1.5 (0.1B)](llm-models/embedding/bge) \u003cbr\u003e [e5-base-v2 (0.1B)](llm-models/embedding/e5-v2) |\n| Transcription (speech to text)         |                                                                                                                                                                                                                                                                   | [whisper-large-v2](llm-models/transcription/whisper)(1.6B) \u003cbr\u003e [whisper-medium](llm-models/transcription/whisper) (0.8B)                                                                                                                                                                                                                                                                                                                                                                                                          |                                                                                                          |\n| Image generation                       |                                                                                                                       
                                                                                                                                            | [stable-diffusion-xl](llm-models/image_generation/stable_diffusion)                                                                                                                                                                                                                                                                                                                                                                                                                                                                |                                                                                                          |\n| Code generation                        | [CodeLlama-70b-hf](llm-models/code_generation/codellama/codellama-70b) \u003cbr\u003e [CodeLlama-70b-Instruct-hf](llm-models/code_generation/codellama/codellama-70b) \u003cbr\u003e [CodeLlama-70b-Python-hf](llm-models/code_generation/codellama/codellama-70b) (Python optimized) \u003cbr\u003e[CodeLlama-34b-hf](llm-models/code_generation/codellama/codellama-34b) \u003cbr\u003e [CodeLlama-34b-Instruct-hf](llm-models/code_generation/codellama/codellama-34b) \u003cbr\u003e [CodeLlama-34b-Python-hf](llm-models/code_generation/codellama/codellama-34b) (Python optimized) | [CodeLlama-13b-hf](llm-models/code_generation/codellama/codellama-13b) \u003cbr\u003e [CodeLlama-13b-Instruct-hf](llm-models/code_generation/codellama/codellama-13b) \u003cbr\u003e [CodeLlama-13b-Python-hf](llm-models/code_generation/codellama/codellama-13b) (Python optimized) \u003cbr\u003e [CodeLlama-7b-hf](llm-models/code_generation/codellama/codellama-7b) \u003cbr\u003e [CodeLlama-7b-Instruct-hf](llm-models/code_generation/codellama/codellama-7b) \u003cbr\u003e [CodeLlama-7b-Python-hf](llm-models/code_generation/codellama/codellama-7b) (Python optimized) |                                                    
|\n\n* To get better performance from instructor-xl, you may follow [the unified template to write instructions](https://huggingface.co/hkunlp/instructor-xl#calculate-embeddings-for-your-customized-texts).\n\n## Model Evaluation Leaderboard\n**Text generation models**\n\nThe model evaluation results presented below are measured by the [Mosaic Eval Gauntlet](https://www.mosaicml.com/llm-evaluation) framework. This framework comprises a series of tasks specifically designed to assess the performance of language models, including widely adopted benchmarks such as MMLU, Big-Bench, and HellaSwag.\n\n| Model Name                                                                            |   Core Average |   World Knowledge |   Commonsense Reasoning |   Language Understanding |   Symbolic Problem Solving |   Reading Comprehension |\n|:--------------------------------------------------------------------------------------|---------------:|------------------:|------------------------:|-------------------------:|---------------------------:|------------------------:|\n| [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)                   |          0.522 |             0.558 |                   0.513 |                    0.555 |                      0.342 |                   0.641 |\n| [falcon-40b](https://huggingface.co/tiiuae/falcon-40b)                                |          0.501 |             0.556 |                   0.55  |                    0.535 |                      0.269 |                   0.597 |\n| [falcon-40b-instruct](https://huggingface.co/tiiuae/falcon-40b-instruct)              |          0.5   |             0.542 |                   0.571 |                    0.544 |                      0.264 |                   0.58  |\n| [Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)                    |          0.479 |             0.515 |                   0.482 |      
              0.52  |                      0.279 |                   0.597 |\n| [Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)          |          0.476 |             0.522 |                   0.512 |                    0.514 |                      0.271 |                   0.559 |\n| [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) |          0.469 |             0.48  |                   0.502 |                    0.492 |                      0.266 |                   0.604 |\n| [mpt-30b-instruct](https://huggingface.co/mosaicml/mpt-30b-instruct)                  |          0.465 |             0.48  |                   0.513 |                    0.494 |                      0.238 |                   0.599 |\n| [mpt-30b](https://huggingface.co/mosaicml/mpt-30b)                                    |          0.431 |             0.494 |                   0.47  |                    0.477 |                      0.234 |                   0.481 |\n| [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)            |          0.42  |             0.476 |                   0.447 |                    0.478 |                      0.221 |                   0.478 |\n| [Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)                      |          0.401 |             0.457 |                   0.41  |                    0.454 |                      0.217 |                   0.465 |\n| [mpt-7b-8k-instruct](https://huggingface.co/mosaicml/mpt-7b-8k-instruct)              |          0.36  |             0.363 |                   0.41  |                    0.405 |                      0.165 |                   0.458 |\n| [mpt-7b-instruct](https://huggingface.co/mosaicml/mpt-7b-instruct)                    |          0.354 |             0.399 |                   0.415 |                    0.372 |                      0.171 |                   0.415 |\n| 
[mpt-7b-8k](https://huggingface.co/mosaicml/mpt-7b-8k)                                |          0.354 |             0.427 |                   0.368 |                    0.426 |                      0.171 |                   0.378 |\n| [falcon-7b](https://huggingface.co/tiiuae/falcon-7b)                                  |          0.335 |             0.371 |                   0.421 |                    0.37  |                      0.159 |                   0.355 |\n| [mpt-7b](https://huggingface.co/mosaicml/mpt-7b)                                      |          0.324 |             0.356 |                   0.384 |                    0.38  |                      0.163 |                   0.336 |\n| [falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)                |          0.307 |             0.34  |                   0.372 |                    0.333 |                      0.108 |                   0.38  |\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/databricks/databricks-ml-examples/assets/12763339/acdfb7ce-c233-4ede-884c-4e0b4ce0a4f6\" /\u003e\n\u003c/p\u003e\n\n## Other examples:\n\n- [DIY LLM QA Bot Accelerator](https://github.com/databricks-industry-solutions/diy-llm-qa-bot)\n- [Biomedical Question Answering over Custom Datasets with LangChain and Llama 2 from Hugging Face](https://github.com/databricks-industry-solutions/hls-llm-doc-qa)\n- [DIY QA LLM BOT](https://github.com/puneet-jain159/DSS_LLM_QA_Retrieval_Session/tree/main)\n- [Tuning the Finetuning: An exploration of achieving success with QLoRA](https://github.com/avisoori-databricks/Tuning-the-Finetuning)\n- 
[databricks-llm-fine-tuning](https://github.com/mshtelma/databricks-llm-fine-tuning)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fdatabricks-ml-examples","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabricks%2Fdatabricks-ml-examples","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fdatabricks-ml-examples/lists"}