Microsoft releases multimodal small AI model Phi-3-vision

2024-05-22

At the 2024 Microsoft Build conference, Microsoft announced the newest members of its Phi-3 family of small open models. Of particular interest is Phi-3-vision, a multimodal model with 4.2 billion parameters that combines language and vision capabilities.

Key points include:

- Phi-3-vision: A multimodal model that can understand and generate insights from text and images, including charts and diagrams.
- Phi-3-small and Phi-3-medium: These previously announced models are now available on Microsoft Azure, giving developers additional options for building generative AI applications.
- Phi-3-mini: The first model in the Phi-3 family is now available through Azure AI's model-as-a-service offering, making it easier for users to get started.

Phi-3-vision excels at tasks such as optical character recognition (OCR), chart analysis, and diagram understanding. It is designed to process and reason over real-world images, which makes it a useful tool for developers working with visual data.

According to Microsoft, the Phi-3 models offer strong performance and lower cost compared with larger language models. For example, Phi-3-small, with only 7 billion parameters, outperforms larger models such as GPT-3.5 Turbo. Phi-3-vision continues this trend by surpassing larger models such as Claude-3 Haiku and Gemini 1.0 Pro V on visual reasoning tasks.

The compact design of the Phi-3 models allows them to be deployed on devices, enabling low-latency AI experiences without a network connection. The models are also more cost-effective: according to Sébastien Bubeck, Vice President of GenAI Research at Microsoft, the cost of Phi-3 has been "significantly reduced."
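To make the on-device claim more concrete, here is a rough back-of-the-envelope sketch of how much memory the model weights alone might need. The parameter counts (4.2 billion and 7 billion) come from the article; the numeric precisions and the helper names `weight_memory_gb` and `estimate_table` are illustrative assumptions, not figures or tooling published by Microsoft.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are taken from the article. The precisions listed
# below are assumptions chosen for illustration only.

PARAMS = {
    "Phi-3-vision": 4.2e9,
    "Phi-3-small": 7e9,
}

# Bytes needed to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def estimate_table() -> dict:
    """Map each model to its estimated weight size at each precision."""
    return {
        model: {
            precision: weight_memory_gb(count, size)
            for precision, size in BYTES_PER_PARAM.items()
        }
        for model, count in PARAMS.items()
    }


if __name__ == "__main__":
    for model, row in estimate_table().items():
        cells = ", ".join(f"{p}: {gb:.1f} GB" for p, gb in row.items())
        print(f"{model}: {cells}")
```

These figures cover weights only and ignore activations and other runtime overhead, so actual memory use on a device would be higher. Even so, the arithmetic suggests why models of this size are plausible candidates for laptops and phones, especially at reduced precision.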
As the range of available models continues to grow, the right choice will depend on the specific use case and business needs. The expanded Phi-3 family gives developers a versatile set of tools for building generative AI applications, and its combination of performance and cost-effectiveness shows the potential of small language models.
