{"id":14483080,"url":"https://github.com/Blaizzy/mlx-embeddings","last_synced_at":"2025-08-30T03:32:57.704Z","repository":{"id":253564747,"uuid":"829521860","full_name":"Blaizzy/mlx-embeddings","owner":"Blaizzy","description":"MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.","archived":false,"fork":false,"pushed_at":"2024-08-20T21:30:13.000Z","size":74,"stargazers_count":51,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-08-20T23:22:26.053Z","etag":null,"topics":["chatbot","embeddings","llms","rag","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Blaizzy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-16T15:44:10.000Z","updated_at":"2024-08-20T19:45:39.000Z","dependencies_parsed_at":"2024-08-20T23:25:58.009Z","dependency_job_id":null,"html_url":"https://github.com/Blaizzy/mlx-embeddings","commit_stats":null,"previous_names":["blaizzy/mlx-embeddings"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2Fmlx-embeddings","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2Fmlx-embeddings/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2Fmlx-embeddings/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%
2Fmlx-embeddings/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Blaizzy","download_url":"https://codeload.github.com/Blaizzy/mlx-embeddings/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":217593012,"owners_count":16201561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","embeddings","llms","rag","retrieval-augmented-generation"],"created_at":"2024-09-03T00:01:29.083Z","updated_at":"2025-08-30T03:32:57.679Z","avatar_url":"https://github.com/Blaizzy.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# MLX-Embeddings\n\n[![image](https://img.shields.io/pypi/v/mlx-embeddings.svg)](https://pypi.python.org/pypi/mlx-embeddings)\n\n**MLX-Embeddings is a package for running Vision and Language Embedding models locally on your Mac using MLX.**\n\n- Free software: GNU General Public License v3\n\n## Features\n\n- Generate embeddings for text and images using MLX models\n- Support for single-item and batch processing\n- Utilities for comparing text similarities\n\n## Supported Model Architectures\nMLX-Embeddings supports a variety of model architectures for text embedding tasks. 
Here's a breakdown of the currently supported architectures:\n- XLM-RoBERTa (Cross-lingual Language Model - Robustly Optimized BERT Approach)\n- BERT (Bidirectional Encoder Representations from Transformers)\n- ModernBERT (modernized bidirectional encoder-only Transformer model)\n- Qwen3 (Qwen3's embedding model)\n\nWe're continuously working to expand our support for additional model architectures. Check our GitHub repository or documentation for the most up-to-date list of supported models and their specific versions.\n\n## Installation\n\nYou can install mlx-embeddings using pip:\n\n```bash\npip install mlx-embeddings\n```\n\n## Usage\n\n### Single Item Embedding\n\n\n#### Text Embedding\nTo generate an embedding for a single piece of text:\n\n```python\nfrom mlx_embeddings.utils import load\n\n# Load the model and tokenizer\nmodel_name = \"mlx-community/all-MiniLM-L6-v2-4bit\"\nmodel, tokenizer = load(model_name)\n\n# Prepare the text\ntext = \"I like reading\"\n\n# Tokenize and generate embedding\ninput_ids = tokenizer.encode(text, return_tensors=\"mlx\")\noutputs = model(input_ids)\nraw_embeds = outputs.last_hidden_state[:, 0, :] # CLS token\ntext_embeds = outputs.text_embeds # mean pooled and normalized embeddings\n```\n\nNote: `text_embeds` uses mean pooling for BERT and XLM-RoBERTa. 
For ModernBERT, the pooling strategy is set through the config file and defaults to mean pooling.\n\n#### Masked Language Modeling\n\nTo predict the masked token in a single text:\n\n```python\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\n\n# Load ModernBERT model and tokenizer\nmodel, tokenizer = load(\"mlx-community/answerdotai-ModernBERT-base-4bit\")\n\n# Masked Language Modeling example\ntext = \"The capital of France is [MASK].\"\ninputs = tokenizer.encode(text, return_tensors=\"mlx\")\noutputs = model(inputs)\n\n# Get predictions for the masked token\nmasked_index = inputs.tolist()[0].index(tokenizer.mask_token_id)\npredicted_token_id = mx.argmax(outputs.pooler_output[0, masked_index]).tolist()\npredicted_token = tokenizer.decode(predicted_token_id)\nprint(\"Predicted token:\", predicted_token)  # Should output: Paris\n```\n\n#### Sequence Classification\n```python\nfrom mlx_embeddings.utils import load\n\n# Load the classification model and tokenizer\nmodel, tokenizer = load(\n    \"NousResearch/Minos-v1\",\n)\n\n# Mapping from class index to label name\nid2label = model.config.id2label\n\n# Sequence classification example\ntext = \"\u003c|user|\u003e Explain the theory of relativity in simple terms. \u003c|assistant|\u003e Imagine space and time are like a stretchy fabric. Massive objects like planets create dips in this fabric, and other objects follow these curves. That's gravity! 
Also, the faster you move, the slower time passes for you compared to someone standing still\"\ninputs = tokenizer.encode(text, return_tensors=\"mlx\")\noutputs = model(inputs)\n\n# Get the per-label scores for the input sequence\npredictions = outputs.pooler_output[0] # Shape: (num_labels,)\nprint(text)\n\n# Print results\nprint(\"\\nTop predictions for classification:\")\nfor idx, logit in enumerate(predictions.tolist()):\n    label = id2label[str(idx)]\n    print(f\"{label}: {logit:.3f}\")\n```\n\n### Batch Processing\n\n#### Multiple Texts Comparison\n\nTo embed multiple texts and compare them using their embeddings:\n\n```python\nfrom sklearn.metrics.pairwise import cosine_similarity\nimport matplotlib.pyplot as plt\nimport seaborn as sns\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\n\n# Load the model and tokenizer\nmodel, tokenizer = load(\"mlx-community/all-MiniLM-L6-v2-4bit\")\n\ndef get_embedding(texts, model, tokenizer):\n    inputs = tokenizer.batch_encode_plus(texts, return_tensors=\"mlx\", padding=True, truncation=True, max_length=512)\n    outputs = model(\n        inputs[\"input_ids\"],\n        attention_mask=inputs[\"attention_mask\"]\n    )\n    return outputs.text_embeds # mean pooled and normalized embeddings\n\ndef compute_and_print_similarity(embeddings):\n    B, _ = embeddings.shape\n    similarity_matrix = cosine_similarity(embeddings)\n    print(\"Similarity matrix between sequences:\")\n    print(similarity_matrix)\n    print(\"\\n\")\n\n    for i in range(B):\n        for j in range(i+1, B):\n            print(f\"Similarity between sequence {i+1} and sequence {j+1}: {similarity_matrix[i][j]:.4f}\")\n\n    return similarity_matrix\n\n# Visualize results\ndef plot_similarity_matrix(similarity_matrix, labels):\n    plt.figure(figsize=(5, 4))\n    sns.heatmap(similarity_matrix, annot=True, cmap='coolwarm', xticklabels=labels, yticklabels=labels)\n    plt.title('Similarity Matrix Heatmap')\n    plt.tight_layout()\n    plt.show()\n\n# 
Sample texts\ntexts = [\n    \"I like grapes\",\n    \"I like fruits\",\n    \"The slow green turtle crawls under the busy ant.\"\n]\n\nembeddings = get_embedding(texts, model, tokenizer)\nsimilarity_matrix = compute_and_print_similarity(embeddings)\n\n# Visualize results\nlabels = [f\"Text {i+1}\" for i in range(len(texts))]\nplot_similarity_matrix(similarity_matrix, labels)\n```\n\n#### Masked Language Modeling\n\nTo get predictions for the masked token in multiple texts:\n\n```python\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\n\n# Load the model and tokenizer\nmodel, tokenizer = load(\"mlx-community/answerdotai-ModernBERT-base-4bit\")\n\ntext = [\"The capital of France is [MASK].\", \"The capital of Poland is [MASK].\"]\ninputs = tokenizer.batch_encode_plus(text, return_tensors=\"mlx\", padding=True, truncation=True, max_length=512)\noutputs = model(**inputs)\n\n# To get predictions for the mask:\n# Find mask token indices for each sequence in the batch\n# Find mask indices for all sequences in batch\nmask_indices = mx.array([ids.tolist().index(tokenizer.mask_token_id) for ids in inputs[\"input_ids\"]])\n\n# Get predictions for all masked tokens at once\nbatch_indices = mx.arange(len(mask_indices))\npredicted_token_ids = mx.argmax(outputs.pooler_output[batch_indices, mask_indices], axis=-1).tolist()\n\n# Decode the predicted tokens\npredicted_token = tokenizer.batch_decode(predicted_token_ids)\n\nprint(\"Predicted token:\", predicted_token)\n# Predicted token:  Paris, Warsaw\n```\n\n\n## Vision Transformer Models\n\nMLX-Embeddings also supports vision models that can generate embeddings for images or image-text pairs.\n\n### Single Image Processing\n\nTo evaluate how well an image matches different text descriptions:\n\n```python\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\nimport requests\nfrom PIL import Image\n\n# Load vision model and processor\nmodel, processor = load(\"mlx-community/siglip-so400m-patch14-384\")\n\n# 
Load an image\nurl = \"http://images.cocodataset.org/val2017/000000039769.jpg\"\nimage = Image.open(requests.get(url, stream=True).raw)\n\n# Create text descriptions to compare with the image\ntexts = [\"a photo of 2 dogs\", \"a photo of 2 cats\"]\n\n# Process inputs\ninputs = processor(text=texts, images=image, padding=\"max_length\", return_tensors=\"pt\")\npixel_values = mx.array(inputs.pixel_values).transpose(0, 2, 3, 1).astype(mx.float32)\ninput_ids = mx.array(inputs.input_ids)\n\n# Generate embeddings and calculate similarity\noutputs = model(pixel_values=pixel_values, input_ids=input_ids)\nlogits_per_image = outputs.logits_per_image\nprobs = mx.sigmoid(logits_per_image)  # probabilities of image matching each text\n\n# Print results\nprint(f\"{probs[0][0]:.1%} that image matches '{texts[0]}'\")\nprint(f\"{probs[0][1]:.1%} that image matches '{texts[1]}'\")\n```\n\n### Batch Processing for Multiple Images comparison\n\nProcess multiple images and compare them with text descriptions:\n\n```python\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\nimport requests\nfrom PIL import Image\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n\n# Load vision model and processor\nmodel, processor = load(\"mlx-community/siglip-so400m-patch14-384\")\n\n# Load multiple images\nimage_urls = [\n    \"./images/cats.jpg\",  # cats\n    \"./images/desktop_setup.png\"   # desktop setup\n]\nimages = [Image.open(requests.get(url, stream=True).raw) if url.startswith(\"http\") else Image.open(url) for url in image_urls]\n\n# Text descriptions\ntexts = [\"a photo of cats\", \"a photo of a desktop setup\", \"a photo of a person\"]\n\n# Process all image-text pairs\nall_probs = []\n\n\n# Process all image-text pairs in batch\ninputs = processor(text=texts, images=images, padding=\"max_length\", return_tensors=\"pt\")\npixel_values = mx.array(inputs.pixel_values).transpose(0, 2, 3, 1).astype(mx.float32)\ninput_ids = mx.array(inputs.input_ids)\n\n# Generate embeddings 
and calculate similarity\noutputs = model(pixel_values=pixel_values, input_ids=input_ids)\nlogits_per_image = outputs.logits_per_image\nprobs = mx.sigmoid(logits_per_image) # match probability for each image-text pair\nall_probs.append(probs.tolist())\n\n\n# Print results for each image\nfor i, image in enumerate(images):\n    print(f\"Image {i+1}:\")\n    for j, text in enumerate(texts):\n        print(f\"  {probs[i][j]:.1%} match with '{text}'\")\n    print()\n\n# Visualize results with a heatmap\ndef plot_similarity_matrix(probs_matrix, image_labels, text_labels):\n    # Convert to 2D numpy array if needed\n    import numpy as np\n    probs_matrix = np.array(probs_matrix)\n\n    # Ensure we have a 2D matrix for the heatmap\n    if probs_matrix.ndim \u003e 2:\n        probs_matrix = probs_matrix.squeeze()\n\n    plt.figure(figsize=(8, 5))\n    sns.heatmap(probs_matrix, annot=True, cmap='viridis',\n                xticklabels=text_labels, yticklabels=image_labels,\n                fmt=\".1%\", vmin=0, vmax=1)\n    plt.title('Image-Text Match Probability')\n    plt.tight_layout()\n    plt.show()\n\n# Plot the images for reference\nplt.figure(figsize=(8, 5))\nfor i, image in enumerate(images):\n    plt.subplot(1, len(images), i+1)\n    plt.imshow(image)\n    plt.title(f\"Image {i+1}\")\n    plt.axis('off')\nplt.tight_layout()\nplt.show()\n\nimage_labels = [f\"Image {i+1}\" for i in range(len(images))]\nplot_similarity_matrix(all_probs, image_labels, texts)\n```\n\n### Late Interaction Multimodal Retrieval Models (ColPali/ColQwen)\n\n```python\nimport mlx.core as mx\nfrom mlx_embeddings.utils import load\nimport requests\nfrom PIL import Image\nimport torch\n\n# Load vision model and processor\nmodel, processor = load(\"qnguyen3/colqwen2.5-v0.2-mlx\")\n\n# Load the images (Image.open needs a file or file-like object, so download each URL first)\n\nurl_1 = \"https://upload.wikimedia.org/wikipedia/commons/8/89/US-original-Declaration-1776.jpg\"\nimage_1 = Image.open(requests.get(url_1, stream=True).raw)\n\nurl_2 = 
\"https://upload.wikimedia.org/wikipedia/commons/thumb/4/4c/Romeoandjuliet1597.jpg/500px-Romeoandjuliet1597.jpg\"\nimage_2 = Image.open(requests.get(url_2, stream=True).raw)\n\n# Create text queries to compare with the images\ntexts = [\"how many percent of data are books?\", \"evaluation results between models\"]\n\n# Process inputs - text and images need to be processed separately for ColQwen2.5\ntext_inputs = processor(text=texts, padding=True, return_tensors=\"pt\")\nimage_inputs = processor(images=[image_1, image_2], padding=True, return_tensors=\"pt\")\n\n# Convert to MLX arrays\ntext_input_ids = mx.array(text_inputs.input_ids)\n\nimage_input_ids = mx.array(image_inputs.input_ids)\npixel_values = mx.array(image_inputs.pixel_values)\nimage_grid_thw = mx.array(image_inputs.image_grid_thw)\n\ntext_embeddings = model(input_ids=text_input_ids)\nimage_embeddings = model(\n    input_ids=image_input_ids,\n    pixel_values=pixel_values,\n    image_grid_thw=image_grid_thw,\n)\n\nprint(text_embeddings.text_embeds.shape)\nprint(image_embeddings.image_embeds.shape)\n\n# Convert to torch tensors for scoring\ntext_embeddings = torch.tensor(text_embeddings.text_embeds)\nimage_embeddings = torch.tensor(image_embeddings.image_embeds)\n\nscores = processor.score_retrieval(text_embeddings, image_embeddings)\nprint(scores)\n```\n\n## Contributing\n\nContributions to MLX-Embeddings are welcome! Please refer to our contribution guidelines for more information.\n\n## License\n\nThis project is licensed under the GNU General Public License v3.\n\n## Contact\n\nFor any questions or issues, please open an issue on the [GitHub repository](https://github.com/Blaizzy/mlx-embeddings).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlaizzy%2Fmlx-embeddings","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBlaizzy%2Fmlx-embeddings","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlaizzy%2Fmlx-embeddings/lists"}