Average Ratings 0 Ratings
Average Ratings 0 Ratings
Description
GLM-4.5V is an evolution of the GLM-4.5-Air model, incorporating a Mixture-of-Experts (MoE) framework that boasts a remarkable total of 106 billion parameters, with 12 billion specifically dedicated to activation. This model stands out by delivering top-tier performance among open-source vision-language models (VLMs) of comparable scale, demonstrating exceptional capabilities across 42 public benchmarks in diverse contexts such as images, videos, documents, and GUI interactions. It offers an extensive array of multimodal functionalities, encompassing image reasoning tasks like scene understanding, spatial recognition, and multi-image analysis, alongside video comprehension tasks that include segmentation and event recognition. Furthermore, it excels in parsing complex charts and lengthy documents, facilitating GUI-agent workflows through tasks like screen reading and desktop automation, while also providing accurate visual grounding by locating objects and generating bounding boxes. Additionally, the introduction of a "Thinking Mode" switch enhances user experience by allowing the selection of either rapid responses or more thoughtful reasoning based on the situation at hand. This innovative feature makes GLM-4.5V not only versatile but also adaptable to various user needs.
Description
Pixtral Large is an expansive multimodal model featuring 124 billion parameters, crafted by Mistral AI and enhancing their previous Mistral Large 2 framework. This model combines a 123-billion-parameter multimodal decoder with a 1-billion-parameter vision encoder, allowing it to excel in the interpretation of various content types, including documents, charts, and natural images, all while retaining superior text comprehension abilities. With the capability to manage a context window of 128,000 tokens, Pixtral Large can efficiently analyze at least 30 high-resolution images at once. It has achieved remarkable results on benchmarks like MathVista, DocVQA, and VQAv2, outpacing competitors such as GPT-4o and Gemini-1.5 Pro. Available for research and educational purposes under the Mistral Research License, it also has a Mistral Commercial License for business applications. This versatility makes Pixtral Large a valuable tool for both academic research and commercial innovations.
API Access
Has API
API Access
Has API
Screenshots View All
No images available
Integrations
Amazon Bedrock
BlueGPT
Continue
Diaflow
Fleak
GMTech
GaiaNet
Klee
Literal AI
MindMac
Integrations
Amazon Bedrock
BlueGPT
Continue
Diaflow
Fleak
GMTech
GaiaNet
Klee
Literal AI
MindMac
Pricing Details
Free
Free Trial
Free Version
Pricing Details
Free
Free Trial
Free Version
Deployment
Web-Based
On-Premises
iPhone App
iPad App
Android App
Windows
Mac
Linux
Chromebook
Deployment
Web-Based
On-Premises
iPhone App
iPad App
Android App
Windows
Mac
Linux
Chromebook
Customer Support
Business Hours
Live Rep (24/7)
Online Support
Customer Support
Business Hours
Live Rep (24/7)
Online Support
Types of Training
Training Docs
Webinars
Live Training (Online)
In Person
Types of Training
Training Docs
Webinars
Live Training (Online)
In Person
Vendor Details
Company Name
Zhipu AI
Founded
2023
Country
China
Website
chat.z.ai/
Vendor Details
Company Name
Mistral AI
Founded
2023
Country
France
Website
mistral.ai/news/pixtral-large/