Best Mathstral Alternatives in 2024
Find the top alternatives to Mathstral currently available. Compare ratings, reviews, pricing, and features of Mathstral alternatives in 2024. Slashdot lists the best Mathstral alternatives on the market that offer competing products that are similar to Mathstral. Sort through Mathstral alternatives below to make the best choice for your needs
-
1
Codestral Mamba
Mistral AI
Codestral Mamba is a Mamba2 model that specializes in code generation. It is available under the Apache 2.0 license. Codestral Mamba represents another step in our efforts to study and provide architectures. We hope that it will open up new perspectives in architecture research. Mamba models have the advantage of linear inference of time and the theoretical ability of modeling sequences of unlimited length. Users can interact with the model in a more extensive way with rapid responses, regardless of the input length. This efficiency is particularly relevant for code productivity use-cases. We trained this model with advanced reasoning and code capabilities, enabling the model to perform at par with SOTA Transformer-based models. -
2
Mistral NeMo
Mistral AI
FreeMistral NeMo, our new best small model. A state-of the-art 12B with 128k context and released under Apache 2.0 license. Mistral NeMo, a 12B-model built in collaboration with NVIDIA, is available. Mistral NeMo has a large context of up to 128k Tokens. Its reasoning, world-knowledge, and coding precision are among the best in its size category. Mistral NeMo, which relies on a standard architecture, is easy to use. It can be used as a replacement for any system that uses Mistral 7B. We have released Apache 2.0 licensed pre-trained checkpoints and instruction-tuned base checkpoints to encourage adoption by researchers and enterprises. Mistral NeMo has been trained with quantization awareness to enable FP8 inferences without performance loss. The model was designed for global applications that are multilingual. It is trained in function calling, and has a large contextual window. It is better than Mistral 7B at following instructions, reasoning and handling multi-turn conversation. -
3
Claude 3.5 Sonnet
Anthropic
FreeClaude 3.5 Sonnet is a new benchmark for the industry in terms of graduate-level reasoning (GPQA), undergrad-level knowledge (MMLU), as well as coding proficiency (HumanEval). It is exceptional in writing high-quality, relatable content that is written with a natural and relatable tone. It also shows marked improvements in understanding nuance, humor and complex instructions. Claude 3.5 Sonnet is twice as fast as Claude 3 Opus. Claude 3.5 Sonnet is ideal for complex tasks, such as providing context-sensitive support to customers and orchestrating workflows. Claude 3.5 Sonnet can be downloaded for free from Claude.ai and Claude iOS, and subscribers to the Claude Pro and Team plans will have access to it at rates that are significantly higher. It is also accessible via the Anthropic AI, Amazon Bedrock and Google Cloud Vertex AI. The model costs $3 for every million input tokens. It costs $15 for every million output tokens. There is a 200K token window. -
4
Galactica
Meta
Information overload is a major barrier to scientific progress. The explosion of scientific literature and data makes it harder to find useful insights among a vast amount of information. Search engines are used to access scientific knowledge today, but they cannot organize it. Galactica is an extensive language model which can store, combine, and reason about scientific information. We train using a large corpus of scientific papers, reference material and knowledge bases, among other sources. We outperform other models in a variety of scientific tasks. Galactica performs better than the latest GPT-3 on technical knowledge probes like LaTeX Equations by 68.2% to 49.0%. Galactica is also good at reasoning. It outperforms Chinchilla in mathematical MMLU with a score between 41.3% and 35.7%. And PaLM 540B in MATH, with a score between 20.4% and 8.8%. -
5
Smaug-72B
Abacus
FreeSmaug 72B is an open-source large-language model (LLM), which is known for its key features. High Performance: It is currently ranked first on the Hugging face Open LLM leaderboard. This model has surpassed models such as GPT-3.5 across a range of benchmarks. This means that it excels in tasks such as understanding, responding to and generating text similar to human speech. Open Source: Smaug-72B, unlike many other advanced LLMs is available to anyone for free use and modification, fostering collaboration, innovation, and creativity in the AI community. Focus on Math and Reasoning: It excels at handling mathematical and reasoning tasks. This is attributed to the unique fine-tuning technologies developed by Abacus, the creators Smaug 72B. Based on Qwen 72B: This is a finely tuned version of another powerful LLM, called Qwen 72B, released by Alibaba. It further improves its capabilities. Smaug-72B is a significant advance in open-source AI. -
6
Claude 3 Opus
Anthropic
FreeOpus, our intelligent model, is superior to its peers in most of the common benchmarks for AI systems. These include undergraduate level expert knowledge, graduate level expert reasoning, basic mathematics, and more. It displays near-human levels in terms of comprehension and fluency when tackling complex tasks. This is at the forefront of general intelligence. All Claude 3 models have increased capabilities for analysis and forecasting. They also offer nuanced content generation, code generation and the ability to converse in non-English language such as Spanish, Japanese and French. -
7
Mistral Large 2
Mistral AI
FreeMistral Large 2 comes with a 128k window that supports dozens of different languages, including French, German and Spanish. It also supports Arabic, Hindi, Russian and Chinese. It also supports 80+ programming languages, including Python, Java and C++. Mistral Large 2 was designed with single-node applications in mind. Its size of 123 million parameters allows it to run fast on a single computer. Mistral Large 2 is released under the Mistral Research License which allows modification and usage for research and noncommercial purposes. -
8
OpenAI o1-mini
OpenAI
OpenAI o1 mini is a new and cost-effective AI designed to enhance reasoning, especially in STEM fields such as mathematics and coding. It is part of the o1 Series, which focuses on solving problems by spending more "thinking" time through solutions. The o1 mini is 80% cheaper and smaller than its sibling. It performs well in coding and mathematical reasoning tasks. -
9
Llama 2
Meta
FreeThe next generation of the large language model. This release includes modelweights and starting code to pretrained and fine tuned Llama languages models, ranging from 7B-70B parameters. Llama 1 models have a context length of 2 trillion tokens. Llama 2 models have a context length double that of Llama 1. The fine-tuned Llama 2 models have been trained using over 1,000,000 human annotations. Llama 2, a new open-source language model, outperforms many other open-source language models in external benchmarks. These include tests of reasoning, coding and proficiency, as well as knowledge tests. Llama 2 has been pre-trained using publicly available online data sources. Llama-2 chat, a fine-tuned version of the model, is based on publicly available instruction datasets, and more than 1 million human annotations. We have a wide range of supporters in the world who are committed to our open approach for today's AI. These companies have provided early feedback and have expressed excitement to build with Llama 2 -
10
Qwen2-VL
Alibaba
FreeQwen2-VL, the latest version in the Qwen model family of vision language models, is based on Qwen2. Qwen2-VL is a newer version of Qwen-VL that has: SoTA understanding of images with different resolutions & ratios: Qwen2-VL reaches state-of-the art performance on visual understanding benchmarks including MathVista DocVQA RealWorldQA MTVQA etc. Understanding videos over 20 min: Qwen2-VL is able to understand videos longer than 20 minutes, allowing for high-quality video-based questions, dialogs, content creation, and more. Agent that can control your mobiles, robotics, etc. Qwen2-VL, with its complex reasoning and decision-making abilities, can be integrated into devices such as mobile phones, robots and other devices for automatic operation using visual environment and text instruction. Multilingual Support - To serve users worldwide, Qwen2-VL supports texts in other languages within images, besides English or Chinese. -
11
Gemma
Google
Gemma is the family of lightweight open models that are built using the same research and technology as the Gemini models. Gemma was developed by Google DeepMind, along with other teams within Google. The name is derived from the Latin gemma meaning "precious stones". We're also releasing new tools to encourage developer innovation, encourage collaboration, and guide responsible use of Gemma model. Gemma models are based on the same infrastructure and technical components as Gemini, Google's largest and most powerful AI model. Gemma 2B, 7B and other open models can achieve the best performance possible for their size. Gemma models can run directly on a desktop or laptop computer for developers. Gemma is able to surpass much larger models in key benchmarks, while adhering our rigorous standards of safe and responsible outputs. -
12
Mistral 7B
Mistral AI
We solve the most difficult problems to make AI models efficient, helpful and reliable. We are the pioneers of open models. We give them to our users, and empower them to share their ideas. Mistral-7B is a powerful, small model that can be adapted to many different use-cases. Mistral 7B outperforms Llama 13B in all benchmarks. It has 8k sequence length, natural coding capabilities, and is faster than Llama 2. It is released under Apache 2.0 License and we made it simple to deploy on any cloud. -
13
Stable LM
Stability AI
FreeStableLM: Stability AI language models StableLM builds upon our experience with open-sourcing previous language models in collaboration with EleutherAI. This nonprofit research hub. These models include GPTJ, GPTNeoX and the Pythia Suite, which were all trained on The Pile dataset. Cerebras GPT and Dolly-2 are two recent open-source models that continue to build upon these efforts. StableLM was trained on a new dataset that is three times bigger than The Pile and contains 1.5 trillion tokens. We will provide more details about the dataset at a later date. StableLM's richness allows it to perform well in conversational and coding challenges, despite the small size of its dataset (3-7 billion parameters, compared to GPT-3's 175 billion). The development of Stable LM 3B broadens the range of applications that are viable on the edge or on home PCs. This means that individuals and companies can now develop cutting-edge technologies with strong conversational capabilities – like creative writing assistance – while keeping costs low and performance high. -
14
Gemma 2
Google
Gemini models are a family of light-open, state-of-the art models that was created using the same research and technology as Gemini models. These models include comprehensive security measures, and help to ensure responsible and reliable AI through selected data sets. Gemma models have exceptional comparative results, even surpassing some larger open models, in their 2B and 7B sizes. Keras 3.0 offers seamless compatibility with JAX TensorFlow PyTorch and JAX. Gemma 2 has been redesigned to deliver unmatched performance and efficiency. It is optimized for inference on a variety of hardware. The Gemma models are available in a variety of models that can be customized to meet your specific needs. The Gemma models consist of large text-to text lightweight language models that have a decoder and are trained on a large set of text, code, or mathematical content. -
15
Jurassic-1
AI21 Labs
Jurassic-1 comes in two sizes. The Jumbo version is the most advanced language model, with 178B parameters. It was released to developers for general use. AI21 Studio, currently in open beta allows anyone to sign up for the service and immediately begin querying Jurassic-1 with our API and interactive website environment. AI21 Labs' mission is to fundamentally change the way humans read and compose by introducing machines as partners in thought. We can only achieve this if we work together. Since the Mesozoic Era, or 2017, we have been researching language models. Jurassic-1 is based on this research and is the first generation we are making available to wide use. -
16
Baichuan-13B
Baichuan Intelligent Technology
FreeBaichuan-13B, a large-scale language model with 13 billion parameters that is open source and available commercially by Baichuan Intelligent, was developed following Baichuan -7B. It has the best results for a language model of the same size in authoritative Chinese and English benchmarks. This release includes two versions of pretraining (Baichuan-13B Base) and alignment (Baichuan-13B Chat). Baichuan-13B has more data and a larger size. It expands the number parameters to 13 billion based on Baichuan -7B, and trains 1.4 trillion coins on high-quality corpus. This is 40% more than LLaMA-13B. It is open source and currently the model with the most training data in 13B size. Support Chinese and English bi-lingual, use ALiBi code, context window is 4096. -
17
Qwen2
Alibaba
FreeQwen2 is a large language model developed by Qwen Team, Alibaba Cloud. Qwen2 is an extensive series of large language model developed by the Qwen Team at Alibaba Cloud. It includes both base models and instruction-tuned versions, with parameters ranging from 0.5 to 72 billion. It also features dense models and a Mixture of Experts model. The Qwen2 Series is designed to surpass previous open-weight models including its predecessor Qwen1.5 and to compete with proprietary model across a wide spectrum of benchmarks, such as language understanding, generation and multilingual capabilities. -
18
Pixtral 12B
Mistral AI
FreePixtral 12B, a multimodal AI model pioneered by Mistral AI and designed to process and understand both text and images data seamlessly, is a groundbreaking AI model. This model represents a significant advance in the integration of data types. It allows for more intuitive interaction and enhanced content creation abilities. Pixtral 12B, which is based on Mistral's NeMo 12B Text Model, incorporates an additional Vision Adapter that adds 400 million parameters. This allows it to handle visual inputs of up to 1024x1024 pixels. This model is capable of a wide range of applications from image analysis to answering visual content questions. Its versatility is demonstrated in real-world scenarios. Pixtral 12B is a powerful tool for developers, as it not only has a large context of 128k tokens, but also uses innovative techniques such as GeLU activation and RoPE 2D for its vision components. -
19
OpenAI o1
OpenAI
OpenAI o1 is a new series AI models developed by OpenAI that focuses on enhanced reasoning abilities. These models, such as o1 preview and o1 mini, are trained with a novel reinforcement-learning approach that allows them to spend more time "thinking through" problems before presenting answers. This allows o1 excel in complex problem solving tasks in areas such as coding, mathematics, or science, outperforming other models like GPT-4o. The o1 series is designed to tackle problems that require deeper thinking processes. This marks a significant step in AI systems that can think more like humans. -
20
OpenAI o1 Pro
OpenAI
$200/month OpenAI o1 pro is an enhanced version of OpenAI’s o1 model. It was designed to handle more complex and demanding tasks, with greater reliability. It has significant performance improvements compared to its predecessor, the OpenAI o1 Preview, with a noticeable 34% reduction in errors and the ability think 50% faster. This model excels at math, physics and coding where it can provide accurate and detailed solutions. The o1 Pro mode is also capable of processing multimodal inputs including text and images. It is especially adept at reasoning tasks requiring deep thought and problem solving. ChatGPT Pro subscriptions offer unlimited usage as well as enhanced capabilities to users who need advanced AI assistance. -
21
Command R+
Cohere
FreeCommand R+, Cohere's latest large language model, is optimized for conversational interactions and tasks with a long context. It is designed to be extremely performant and enable companies to move from proof-of-concept into production. We recommend Command R+ when working with workflows that rely on complex RAG functionality or multi-step tool usage (agents). Command R is better suited for retrieval augmented creation (RAG) tasks and single-step tool usage, or applications where cost is a key consideration. -
22
Azure OpenAI Service
Microsoft
$0.0004 per 1000 tokensYou can use advanced language models and coding to solve a variety of problems. To build cutting-edge applications, leverage large-scale, generative AI models that have deep understandings of code and language to allow for new reasoning and comprehension. These coding and language models can be applied to a variety use cases, including writing assistance, code generation, reasoning over data, and code generation. Access enterprise-grade Azure security and detect and mitigate harmful use. Access generative models that have been pretrained with trillions upon trillions of words. You can use them to create new scenarios, including code, reasoning, inferencing and comprehension. A simple REST API allows you to customize generative models with labeled information for your particular scenario. To improve the accuracy of your outputs, fine-tune the hyperparameters of your model. You can use the API's few-shot learning capability for more relevant results and to provide examples. -
23
Hermes 3
Nous Research
FreeHermes 3 contains advanced long-term context retention and multi-turn conversation capabilities, complex roleplaying and internal monologue abilities, and enhanced agentic function-calling. Hermes 3 has advanced long-term contextual retention, multi-turn conversation capabilities, complex roleplaying, internal monologue, and enhanced agentic functions-calling. Our training data encourages the model in a very aggressive way to follow the system prompts and instructions exactly and in a highly adaptive manner. Hermes 3 was developed by fine-tuning Llama 3.0 8B, 70B and 405B and training with a dataset primarily containing synthetic responses. The model has a performance that is comparable to Llama 3.1, but with deeper reasoning and creative abilities. Hermes 3 is an instruct and tool-use model series with strong reasoning and creativity abilities. -
24
PaLM 2
Google
PaLM 2 is Google's next-generation large language model, which builds on Google’s research and development in machine learning. It excels in advanced reasoning tasks including code and mathematics, classification and question-answering, translation and multilingual competency, and natural-language generation better than previous state-of the-art LLMs including PaLM. It is able to accomplish these tasks due to the way it has been built - combining compute-optimal scale, an improved dataset mix, and model architecture improvement. PaLM 2 is based on Google's approach for building and deploying AI responsibly. It was rigorously evaluated for its potential biases and harms, as well as its capabilities and downstream applications in research and product applications. It is being used to power generative AI tools and features at Google like Bard, the PaLM API, and other state-ofthe-art models like Sec-PaLM and Med-PaLM 2. -
25
GPT-4 Turbo
OpenAI
$0.0200 per 1000 tokens 1 RatingGPT-4, a large multimodal (accepting text and image inputs) model that can solve complex problems with greater accuracy thanks to its advanced reasoning abilities and broader general knowledge than any of our other models. GPT-4 can be found in the OpenAI API for paying customers. GPT-4, like gpt 3.5-turbo is optimized for chat, but also works well with traditional completion tasks using the Chat Completions API. Our GPT guide will teach you how to use GPT-4. GPT-4 is a newer GPT-4 model that features improved instruction following, JSON Mode, reproducible outputs and parallel function calls. Returns up to 4,096 tokens. This preview model has not yet been adapted for production traffic. -
26
EXAONE
LG
EXAONE, a large-scale language model developed by LG AI Research, aims to nurture "Expert AI" across multiple domains. The Expert AI alliance was formed by leading companies from various fields in order to advance EXAONE's capabilities. Partner companies in the alliance will act as mentors and provide EXAONE with skills, knowledge, data, and other resources to help it gain expertise in relevant fields. EXAONE is akin to an advanced college student who has taken elective courses in general. It requires intensive training to become a specialist in a specific area. LG AI Research has already demonstrated EXAONE’s abilities in real-world applications such as Tilda AI human artist, which debuted at New York Fashion Week. AI applications have also been developed to summarize customer service conversations, and extract information from complex academic documents. -
27
Phi-2
Microsoft
Phi-2 is a 2.7-billion-parameter language-model that shows outstanding reasoning and language-understanding capabilities. It represents the state-of-the art performance among language-base models with less than thirteen billion parameters. Phi-2 can match or even outperform models 25x larger on complex benchmarks, thanks to innovations in model scaling. Phi-2's compact size makes it an ideal playground for researchers. It can be used for exploring mechanistic interpretationability, safety improvements or fine-tuning experiments on a variety tasks. We have included Phi-2 in the Azure AI Studio catalog to encourage research and development of language models. -
28
GPT-4o mini
OpenAI
A small model with superior textual Intelligence and multimodal reasoning. GPT-4o Mini's low cost and low latency enable a wide range of tasks, including applications that chain or paralelize multiple model calls (e.g. calling multiple APIs), send a large amount of context to the models (e.g. full code base or history of conversations), or interact with clients through real-time, fast text responses (e.g. customer support chatbots). GPT-4o Mini supports text and vision today in the API. In the future, it will support text, image and video inputs and outputs. The model supports up to 16K outputs tokens per request and has knowledge until October 2023. It has a context of 128K tokens. The improved tokenizer shared by GPT-4o makes it easier to handle non-English text. -
29
Stable Beluga
Stability AI
FreeStability AI, in collaboration with its CarperAI Lab, announces Stable Beluga 1 (formerly codenamed FreeWilly) and its successor Stable Beluga 2 - two powerful, new Large Language Models. Both models show exceptional reasoning abilities across a variety of benchmarks. Stable Beluga 1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tune (SFT) in standard Alpaca format. Stable Beluga 2 uses the LLaMA 270B foundation model for industry-leading performance. -
30
CodeGemma
Google
CodeGemma consists of powerful lightweight models that are capable of performing a variety coding tasks, including fill-in the middle code completion, code creation, natural language understanding and mathematical reasoning. CodeGemma offers 3 variants: a 7B model that is pre-trained to perform code completion, code generation, and natural language-to code chat. A 7B model that is instruction-tuned for instruction following and natural language-to code chat. You can complete lines, functions, or even entire blocks of code whether you are working locally or with Google Cloud resources. CodeGemma models are trained on 500 billion tokens primarily of English language data taken from web documents, mathematics and code. They generate code that is not only syntactically accurate but also semantically meaningful. This reduces errors and debugging times. -
31
LongLLaMA
LongLLaMA
FreeThis repository contains a research preview of LongLLaMA. It is a large language-model capable of handling contexts up to 256k tokens. LongLLaMA was built on the foundation of OpenLLaMA, and fine-tuned with the Focused Transformer method. LongLLaMA code was built on the foundation of Code Llama. We release a smaller base variant of the LongLLaMA (not instruction-tuned) on a permissive licence (Apache 2.0), and inference code that supports longer contexts for hugging face. Our model weights are a drop-in replacement for LLaMA (for short contexts up to 2048 tokens) in existing implementations. We also provide evaluation results, and comparisons with the original OpenLLaMA model. -
32
Qwen2.5
QwenLM
FreeQwen2.5, an advanced multimodal AI system, is designed to provide highly accurate responses that are context-aware across a variety of applications. It builds on its predecessors' capabilities, integrating cutting edge natural language understanding, enhanced reasoning, creativity and multimodal processing. Qwen2.5 is able to analyze and generate text as well as interpret images and interact with complex data in real-time. It is highly adaptable and excels at personalized assistance, data analytics, creative content creation, and academic research. This makes it a versatile tool that can be used by professionals and everyday users. Its user-centric approach emphasizes transparency, efficiency and alignment with ethical AI. -
33
Claude Pro
Anthropic
$18/month Claude Pro is a large language model that can handle complex tasks with a friendly and accessible demeanor. It is trained on high-quality, extensive data and excels at understanding contexts, interpreting subtleties, and producing well structured, coherent responses to a variety of topics. Claude Pro is able to create detailed reports, write creative content, summarize long documents, and assist with coding tasks by leveraging its robust reasoning capabilities and refined knowledge base. Its adaptive algorithms constantly improve its ability learn from feedback. This ensures that its output is accurate, reliable and helpful. Whether Claude Pro is serving professionals looking for expert support or individuals seeking quick, informative answers - it delivers a versatile, productive conversational experience. -
34
GPT-5
OpenAI
$0.0200 per 1000 tokensGPT-5 is OpenAI's Generative Pretrained Transformer. It is a large-language model (LLM), which is still in development. LLMs have been trained to work with massive amounts of text and can generate realistic and coherent texts, translate languages, create different types of creative content and answer your question in a way that is informative. It's still not available to the public. OpenAI has not announced a release schedule, but some believe it could launch in 2024. It's expected that GPT-5 will be even more powerful. GPT-4 has already proven to be impressive. It is capable of writing creative content, translating languages and generating text of human-quality. GPT-5 will be expected to improve these abilities, with improved reasoning, factual accuracy and ability to follow directions. -
35
Mixtral 8x22B
Mistral AI
FreeMixtral 8x22B is our latest open model. It sets new standards for performance and efficiency in the AI community. It is a sparse Mixture-of-Experts model (SMoE), which uses only 39B active variables out of 141B. This offers unparalleled cost efficiency in relation to its size. It is fluently bilingual in English, French Italian, German and Spanish. It has strong math and coding skills. It is natively able to call functions; this, along with the constrained-output mode implemented on La Plateforme, enables application development at scale and modernization of tech stacks. Its 64K context window allows for precise information retrieval from large documents. We build models with unmatched cost-efficiency for their respective sizes. This allows us to deliver the best performance-tocost ratio among models provided by the Community. Mixtral 8x22B continues our open model family. Its sparse patterns of activation make it faster than any 70B model. -
36
Command R
Cohere
Command's outputs are accompanied by clear citations, which reduce the risk of hallucinations. They also allow for the retrieval of additional context in the source material. Command can help you write product descriptions, draft emails, provide example press releases and more. Ask Command a series of questions about a particular document to assign it a category, extract information, or answer an overall question. Answering a few questions can save you minutes, but doing it for thousands can save an entire company years. This family of scalable AI models balances high accuracy with high efficiency to allow enterprises to move beyond proof-of-concept into production grade AI. -
37
XLNet
XLNet
FreeXLNet, a new unsupervised language representation method, is based on a novel generalized Permutation Language Modeling Objective. XLNet uses Transformer-XL as its backbone model. This model is excellent for language tasks that require long context. Overall, XLNet achieves state of the art (SOTA) results in various downstream language tasks, including question answering, natural languages inference, sentiment analysis and document ranking. -
38
Qwen-7B
Alibaba
FreeQwen-7B, also known as Qwen-7B, is the 7B-parameter variant of the large language models series Qwen. Tongyi Qianwen, proposed by Alibaba Cloud. Qwen-7B, a Transformer-based language model, is pretrained using a large volume data, such as web texts, books, code, etc. Qwen-7B is also used to train Qwen-7B Chat, an AI assistant that uses large models and alignment techniques. The Qwen-7B features include: Pre-trained with high quality data. We have pretrained Qwen-7B using a large-scale, high-quality dataset that we constructed ourselves. The dataset contains over 2.2 trillion tokens. The dataset contains plain texts and codes and covers a wide range domains including general domain data as well as professional domain data. Strong performance. We outperform our competitors in a series benchmark datasets that evaluate natural language understanding, mathematics and coding. And more. -
39
Jamba
AI21 Labs
Jamba is a powerful and efficient long context model that is open to builders, but built for enterprises. Jamba's latency is superior to all other leading models of similar size. Jamba's 256k window is the longest available. Jamba's Mamba Transformer MoE Architecture is designed to increase efficiency and reduce costs. Jamba includes key features from OOTB, including function calls, JSON output, document objects and citation mode. Jamba 1.5 models deliver high performance throughout the entire context window. Jamba 1.5 models score highly in common quality benchmarks. Secure deployment tailored to your enterprise. Start using Jamba immediately on our production-grade SaaS Platform. Our strategic partners can deploy the Jamba model family. For enterprises who require custom solutions, we offer VPC and on-premise deployments. We offer hands-on management and continuous pre-training for enterprises with unique, bespoke needs. -
40
GPT-4 (Generative Pretrained Transformer 4) a large-scale, unsupervised language model that is yet to be released. GPT-4, which is the successor of GPT-3, is part of the GPT -n series of natural-language processing models. It was trained using a dataset of 45TB text to produce text generation and understanding abilities that are human-like. GPT-4 is not dependent on additional training data, unlike other NLP models. It can generate text and answer questions using its own context. GPT-4 has been demonstrated to be capable of performing a wide range of tasks without any task-specific training data, such as translation, summarization and sentiment analysis.
-
41
MPT-7B
MosaicML
FreeIntroducing MPT-7B - the latest addition to our MosaicML Foundation Series. MPT-7B, a transformer that is trained from scratch using 1T tokens of code and text, is the latest entry in our MosaicML Foundation Series. It is open-source, available for commercial purposes, and has the same quality as LLaMA-7B. MPT-7B trained on the MosaicML Platform in 9.5 days, with zero human interaction at a cost $200k. You can now train, fine-tune and deploy your private MPT models. You can either start from one of our checkpoints, or you can start from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens! -
42
LLaVA
LLaVA
FreeLLaVA is a multimodal model that combines a Vicuna language model with a vision encoder to facilitate comprehensive visual-language understanding. LLaVA's chat capabilities are impressive, emulating multimodal functionality of models such as GPT-4. LLaVA 1.5 has achieved the best performance in 11 benchmarks using publicly available data. It completed training on a single 8A100 node in about one day, beating methods that rely upon billion-scale datasets. The development of LLaVA involved the creation of a multimodal instruction-following dataset, generated using language-only GPT-4. This dataset comprises 158,000 unique language-image instruction-following samples, including conversations, detailed descriptions, and complex reasoning tasks. This data has been crucial in training LLaVA for a wide range of visual and linguistic tasks. -
43
Med-PaLM 2
Google Cloud
Through scientific rigor and human insight, healthcare breakthroughs can change the world, bringing hope to humanity. We believe that AI can help in this area, through collaboration between researchers, healthcare organisations, and the wider ecosystem. Today, we are sharing exciting progress in these initiatives with the announcement that Google's large language model (LLM) for medical applications, called Med PaLM 2, will be available to a limited number of customers. In the coming weeks, it will be available to a small group of Google Cloud users for limited testing. We will explore use cases, share feedback, and investigate safe, responsible and meaningful ways to utilize this technology. Med-PaLM 2, which harnesses Google's LLMs aligned with the medical domain, is able to answer medical questions more accurately and safely. Med-PaLM 2 is the first LLM that has performed at an "expert" level on the MedQA dataset consisting of US Medical Licensing Examination-style questions. -
44
Hippocratic AI
Hippocratic AI
Hippocratic AI, the new SOTA model, is outperforming GPT-4 in 105 of 114 healthcare certifications and exams. Hippocratic AI outperformed GPT-4 in 105 of 114 tests, outperforming by a margin greater than five percent on 74 certifications and by a larger margin on 43 certifications. Most language models are pre-trained on the common crawling of the Internet. This may include incorrect or misleading information. Hippocratic AI, unlike these LLMs is heavily investing in legally acquiring evidenced-based healthcare content. We use healthcare professionals to train the model and validate its readiness for deployment. This is called RLHF-HP. Hippocratic AI won't release the model until many of these licensed professionals have deemed it safe. -
45
Jurassic-2
AI21
$29 per monthJurassic-2 is the latest generation AI21 Studio foundation models. It's a game changer in the field AI, with new capabilities and top-tier quality. We're also releasing task-specific APIs with superior reading and writing capabilities. AI21 Studio's focus is to help businesses and developers leverage reading and writing AI in order to build real-world, tangible products. The release of Task-Specific and Jurassic-2 APIs marks two significant milestones. They will enable you to bring generative AI into production. Jurassic-2 (or J2, as we like to call it) is the next generation of our foundation models with significant improvements in quality and new capabilities including zero-shot instruction-following, reduced latency, and multi-language support. Task-specific APIs offer developers industry-leading APIs for performing specialized reading and/or writing tasks. -
46
Phi-3
Microsoft
Small language models (SLMs), a powerful family of small language models, with low cost and low-latency performance. Maximize AI capabilities and lower resource usage, while ensuring cost-effective generative AI implementations across your applications. Accelerate response time in real-time interaction, autonomous systems, low latency apps, and other critical scenarios. Phi-3 can be run in the cloud, on the edge or on the device. This allows for greater flexibility in deployment and operation. Phi-3 models have been developed according to Microsoft AI principles, including accountability, transparency and fairness, reliability, safety and security, privacy, and inclusivity. Operate efficiently in offline environments, where data privacy or connectivity are limited. Expanded context window allows for more accurate, contextually relevant and coherent outputs. Deploy at edge to deliver faster response. -
47
Chinchilla
Google DeepMind
Chinchilla has a large language. Chinchilla has the same compute budget of Gopher, but 70B more parameters and 4x as much data. Chinchilla consistently and significantly outperforms Gopher 280B, GPT-3 175B, Jurassic-1 178B, and Megatron-Turing (530B) in a wide range of downstream evaluation tasks. Chinchilla also uses less compute to perform fine-tuning, inference and other tasks. This makes it easier for downstream users to use. Chinchilla reaches a high-level average accuracy of 67.5% for the MMLU benchmark. This is a greater than 7% improvement compared to Gopher. -
48
Gemini Ultra
Google
Gemini Ultra is an advanced new language model by Google DeepMind. It is the most powerful and largest model in the Gemini Family, which includes Gemini Pro & Gemini Nano. Gemini Ultra was designed to handle highly complex tasks such as machine translation, code generation, and natural language processing. It is the first language model that has outperformed human experts in the Massive Multitask Language Understanding test (MMLU), achieving a score 90%. -
49
Claude 3.5 Haiku
Anthropic
Our fastest model, which delivers advanced coding, tool usage, and reasoning for an affordable price Claude 3.5 Haiku, our next-generation model, is our fastest. Claude 3.5 Haiku is faster than Claude 3 Haiku and has improved in every skill set. It also surpasses Claude 3 Opus on many intelligence benchmarks. Claude 3.5 Haiku can be accessed via our first-party APIs, Amazon Bedrock and Google Cloud Vertex AI. Initially, it is available as a text only model, with image input coming later. -
50
FLAN-T5
Google
FreeFLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models - it is an enhanced version of T5 that has been finetuned in a mixture of tasks.