An open API service indexing awesome lists of open source software.

https://github.com/divakarkumarp/phi-3-vision-ms-multimodal

Phi-3-Vision-128K-Instruct Demo
https://github.com/divakarkumarp/phi-3-vision-ms-multimodal

phi-3-vision python

Last synced: 2 months ago
JSON representation

Phi-3-Vision-128K-Instruct Demo

Awesome Lists containing this project

README

        

# Phi-3-Vision-Microsoft-Multimodal

Microsoft Phi-3 Vision-the first Multimodal model By Microsoft, a multimodal model that brings together language and vision capabilities. the multimodal version comes with 128K context length (in tokens) it can support. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.

[Demo with Huggingface🤗](https://github.com/divakarkumarp/Phi-3-Vision-MS-Multimodal/blob/main/Phi_3_vision_128k_instruct.ipynb)

![image](https://github.com/divakarkumarp/Phi-3-Vision-MS-Multimodal/assets/32620288/28e3b588-64d1-423e-881e-8e59384204bd)

* Hugging Face🤗 : [click-1](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct?library-transformers)
* Hugging Face🤗 : [click-2](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k)
* Hugging Face🤗 : [click-3](https://huggingface.co/docs/transformers/main/en/model_doc/llama3)