Mistral Two important product updates were released in one fell swoop last night: the chat tool - Le Chat and the multimodal model - Pixtral Large.
The new Le Chat
New features added: canvas, web search, image generation, image understanding, still like their base Mistral API The service is the same, keep it free.
Le Chat Canvas
The new content can be generated by selecting parts of the generated content and suggesting changes. On the right side is the modification record and labeled version, and on the left canvas is the main content generated.
Le Chat Web Search
Rely on the Brave search engine to output answer citation source URLs
Le Chat Image Recognition
Image content and meaning recognition calibration, while the Chinese single recognition inaccurate, if the length of the Chinese, can be more accurate recognition, but individual complex Chinese characters will still recognize errors or output meaning similar to the single word is not the same. This is similar to the openai model, do not expect too much.
The following example of an OCR can be a good illustration of the situation: the picture only points to the text recognition error, the original text in the figure is "inducer", indicating that the image recognition of text or rely on "guessing", the Chinese compared to the English slightly guessing does not happen.
Le Chat Image Generation
Check Image generation, enter a prompt with the words create, draw, generate, etc., and the Flux Pro model will be automatically invoked to generate an image.
Wake Up Agent
This is a long released feature... A lot of people don't notice that he can select all of Mistral's released models to customize the intelligent assistant to talk to, and you can type into the dialog box@
Evoke the intelligent assistant. In most tasks, Mistral's large-size model complete all types of tasks is excellent.
Introducing the newly released multimodal large model, Pixtral Large.
- Frontier level multimodal performance
- State-of-the-art on MathVista, DocVQA, VQAv2
- Extends the functionality of Mistral Large 2 without affecting text performance.
- 123B multimodal decoder, 1B parametric visual coder
- 128K context window: supports at least 30 high-resolution images
- Usage:
- Attempts at le Chat
- in our API hit the nail on the head
pixtral-large-latest
- available at here (literary) downloading
Today, we released Pixtral Large, a 124B open-weighted multimodal model built on Mistral Large 2. Pixtral Large is the second model in our multimodal series that demonstrates cutting-edge image understanding capabilities. In particular, the model is able to understand documents, diagrams, and natural images while maintaining the leading text understanding capabilities of Mistral Large 2.
The model is open for research and educational use under the Mistral Research License (MRL); for experimental, testing, and production use for commercial purposes, it may be used under the Mistral Commercial License.
performances
We evaluated the performance of Pixtral Large with the Frontier model on a standard set of multimodal benchmarks using a generalized testing tool. On MathVista, our model achieved 69.4%, outperforming all other models and demonstrating the ability to reason complex math on visual data. To assess its ability to reason on complex charts and documents, we evaluated it using ChartQA and DocVQA, where Pixtral Large outperformed GPT-4o and Gemini-1.5 Pro.
Finally, the Pixtral Large shows competitiveness on the MM-MT-Bench, outperforming the Claude-3.5 Sonnet (new), the Gemini-1.5 Pro, and the GPT-4o (newest).The MM-MT-Bench is a expand one's financial resources A review-based evaluation designed to reflect real-world scenarios of multimodal large language modeling (for more information, see Pixtral 12B Technical Report).
qualitative example
Multilingual OCR and inference...
Cue word:
I bought coffee and sausage. How much should I pay? Plus 18% for the tip.
Answer:
To calculate the total amount including the 18% tip, follow the steps below:
- Calculate the cost of coffee and sausage:
- 2x Latte Macchiato: 2 * 4.50 CHF = 9.00 CHF
- 1x Schweinschnitzel: 22.00 CHF
- Total cost of coffee and sausage: 9.00 CHF + 22.00 CHF = 31.00 CHF
- Calculate the tip for 18%:
- Tip: 31.00 CHF * 0.18 = 5.58 CHF
- Add the tip to the total cost:
- Total due: 31.00 CHF + 5.58 CHF = 36.58 CHF
Therefore, you will pay a total of 36.58 CHF.
Graphical understanding...
Cue word:
When did the Dark Dragon model start having problems?
Answer:
The training loss for the "dark-dragon-50" model begins to show significant instability at about 10,000 steps. There is a clear peak in the loss values at this point, and another major peak at about 20,000 steps. These peaks indicate that the training process of the "dark-dragon-50" model starts to have problems from about 10,000 steps.
More examples...
Cue word:
Which companies use the Mistral AI The model?
Answer:
Companies using the Mistral AI model are shown in the chart and include:
- BNP Paribas
- Brave
- Cloudflare
- CMA CGM (Duffy Shipping Group, France)
- front
One more thing...
With the release of Pixtral LargeMistral Large, our state-of-the-art text model has also been updated. The model has been updated in our API embedded in pixtral-large-latest
Use is also available for research on HuggingFace under a Mistral Research License (MRL) or for commercial use through a commercial license from Mistral AI.
This latest model is a significant upgrade from the previous Mistral Large 24.07, with significant advances in long context understanding, new system hints, and more accurate function calls. The model is well suited for RAG and intelligent body workflows that can be used in enterprise-level application scenarios such as knowledge exploration and sharing, document semantic understanding, task automation, and enhancing customer experience.
Mistral Large 24.11 will go live through our cloud provider partners soon, starting with Google Cloud and Microsoft Azure, and is expected to be available within a week.