Chat with Phi 3.5/3 Vision LLMs

phi-3.5-vision-demo.mp4

Overview

Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model trained on datasets that include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data for both text and vision.

This model enables multi-frame image understanding, image comparison, multi-image summarization/storytelling, and video summarization, which have broad applications in office scenarios.

Getting Started

Follow these steps to set up and run the project:

1. Install Dependencies

Ensure all necessary packages are installed by running:

pip install -r requirements.txt
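
To confirm the install worked before moving on, a quick check along these lines may help. It is only a sketch: it assumes that requirements.txt includes the `litserve` and `streamlit` packages, which is inferred from the steps in this README rather than read from the file itself, so adjust the list to match the actual requirements.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package names; edit this list to match requirements.txt.
    missing = missing_packages(["litserve", "streamlit"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

If anything is reported missing, re-running the pip command above inside the intended virtual environment is usually the first thing to try.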

2. Start the API Server

Launch the API server powered by LitServe:

python server.py
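
The model can take a while to load, so it can be useful to wait until the server accepts connections before starting the app. The helper below is a minimal sketch using only the standard library. The function name `wait_for_port` is illustrative, and the port 8000 mentioned in the usage note is an assumption based on LitServe's default; server.py may configure a different one.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); pause briefly and retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8000, timeout=120)` would block for up to two minutes and report whether the server came up. Note that this only checks that the port is open, not that the model has finished loading.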

3. Launch the Streamlit App

Start the Streamlit application with the following command:

streamlit run app.py
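
For scripted access to the API server without the Streamlit UI, a request could look roughly like the sketch below. Everything about the request shape here is an assumption: the URL uses LitServe's default port and `/predict` path, and the `prompt` and `image` field names are hypothetical. Check server.py for the schema it actually expects.

```python
import base64
import json
import urllib.request


def build_payload(prompt, image_bytes):
    """Build a JSON-serializable request body (hypothetical field names)."""
    return {
        "prompt": prompt,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }


def send_request(payload, url="http://localhost:8000/predict"):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, one could read the sample image from the repository (`dogs-playing-in-grassy-field.jpg`) in binary mode, pass its bytes to `build_payload` along with a question, and hand the result to `send_request`.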

About

This project is developed and maintained with ❤️ by Bhimraj Yadav.

