Mistral AI has introduced a new large language model, ‘Mistral Small 3’, in collaboration with the Allen Institute. It is a pre-trained and instruction-tuned model catering to roughly 80% of generative AI tasks.
Mistral Small 3: 3x Faster than Llama 3.3 70B or Qwen 32B
According to the blog post, Mistral Small 3 is currently the most efficient model in its category, achieving over 81% accuracy on MMLU at 150 tokens/s.
The model competes with prominent LLMs such as Llama 3.3 70B and Qwen 32B while running more than 3x faster than these rivals. It also serves as a replacement for opaque proprietary models like GPT-4o mini.
Mistral Small 3 delivers saturated performance at a size suitable for local deployment. Its speed comes from having fewer layers than competing models, which reduces the time needed per forward pass. The model has 24 billion parameters and, once quantized, can run on certain MacBooks.
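A rough back-of-the-envelope calculation (counting weights only and ignoring activation and KV-cache overhead, which add more) illustrates why quantization brings a 24-billion-parameter model within reach of consumer hardware:

```python
# Approximate weight-memory footprint of a 24B-parameter model
# at different numeric precisions. Weights only; activations and
# the KV cache require additional memory on top of this.
PARAMS = 24e9  # 24 billion parameters

def weight_memory_gb(bits_per_param: float) -> float:
    """Gigabytes needed to hold the weights at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for label, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{label}: ~{weight_memory_gb(bits):.0f} GB")
# fp16: ~48 GB, int8: ~24 GB, int4: ~12 GB
```

At 4-bit precision the weights alone drop to roughly 12 GB, which is consistent with the claim that the quantized model can run locally on higher-memory MacBooks.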
The model went through an unusual development process in which the developers skipped the subsequent refinement stages after building the base model. This approach leaves users an earlier-stage checkpoint they can fine-tune with their own requisite data.
With fast-response conversational assistance and low-latency function calling, Mistral Small 3 targets numerous use cases for pre-trained models. The model is expected to make waves across three industries: financial services, healthcare, and robotics & automation.
In a blog post, the Mistral AI team writes, “Our instruction tuned model performs competitively with open weight models three times its size and with proprietary GPT4o-mini model across Code, Math, General knowledge and Instruction following benchmarks.”
Source: https://mistral.ai/news/mistral-small-3/