https://github.com/divakarkumarp/phi-3-vision-ms-multimodal
Phi-3-Vision-128K-Instruct Demo
- Host: GitHub
- URL: https://github.com/divakarkumarp/phi-3-vision-ms-multimodal
- Owner: divakarkumarp
- License: MIT
- Created: 2024-06-06T17:31:27.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-06-08T12:39:08.000Z (11 months ago)
- Last Synced: 2025-01-22T08:13:18.324Z (4 months ago)
- Topics: phi-3-vision, python
- Language: Jupyter Notebook
- Homepage:
- Size: 604 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Phi-3-Vision-Microsoft-Multimodal
Phi-3 Vision is Microsoft's first multimodal model, bringing together language and vision capabilities and supporting a context length of 128K tokens. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization, to ensure precise instruction adherence and robust safety measures.
[Demo with Huggingface🤗](https://github.com/divakarkumarp/Phi-3-Vision-MS-Multimodal/blob/main/Phi_3_vision_128k_instruct.ipynb)

* Hugging Face🤗 : [Phi-3-vision-128k-instruct model card](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct?library-transformers)
* Hugging Face🤗 : [Llama-3-8B-Instruct-Gradient-1048k](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k)
* Hugging Face🤗 : [Llama 3 in the Transformers docs](https://huggingface.co/docs/transformers/main/en/model_doc/llama3)
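
The kind of image-question round trip the demo notebook performs can be sketched roughly as follows with the `transformers` library. This is a minimal illustration, not the repo's own code: `build_messages` and `describe_image` are hypothetical helper names, and the dtype, device, and generation parameters are assumptions.

```python
# Sketch: querying microsoft/Phi-3-vision-128k-instruct about an image
# via Hugging Face transformers. Helper names are illustrative only.
from typing import Dict, List

MODEL_ID = "microsoft/Phi-3-vision-128k-instruct"


def build_messages(question: str, n_images: int = 1) -> List[Dict[str, str]]:
    """Build a user chat message with one <|image_k|> placeholder per image,
    the format Phi-3-Vision's chat template expects."""
    placeholders = "".join(f"<|image_{k}|>\n" for k in range(1, n_images + 1))
    return [{"role": "user", "content": placeholders + question}]


def describe_image(image, question: str) -> str:
    """Run one image-question inference (downloads several GB of weights)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    # trust_remote_code is required because the model ships custom code.
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype=torch.float16,  # assumed precision; fp32 also works
        device_map="auto",
    )

    prompt = processor.tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = processor(prompt, [image], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)

    # Drop the prompt tokens so only the generated answer is decoded.
    answer_ids = out[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(answer_ids, skip_special_tokens=True)[0]
```

With an image loaded via `PIL.Image.open(...)`, a call like `describe_image(img, "What is shown in this image?")` returns the model's answer as plain text.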