Best Jurassic-1 Alternatives in 2024
Find the top alternatives to Jurassic-1 currently available. Compare ratings, reviews, pricing, and features of Jurassic-1 alternatives in 2024. Slashdot lists the best Jurassic-1 alternatives on the market: competing products that are similar to Jurassic-1. Sort through the Jurassic-1 alternatives below to make the best choice for your needs.
-
1
Megatron-Turing
NVIDIA
The Megatron-Turing Natural Language Generation model (MT-NLG) is described by its developers as the largest and most powerful monolithic English language model, with 530 billion parameters. This 105-layer transformer-based model improves on previous state-of-the-art models in zero-, one-, and few-shot settings. It is reported to deliver unmatched accuracy across a wide range of natural language tasks, including completion prediction and reading comprehension. NVIDIA has announced an Early Access Program for its managed API service to the MT-NLG model. This program will allow customers to experiment with, employ, and apply a large language model on downstream language tasks. -
2
Jurassic-2
AI21
$29 per month
Jurassic-2 is the latest generation of AI21 Studio foundation models. It's a game changer in the field of AI, with new capabilities and top-tier quality. We're also releasing task-specific APIs with superior reading and writing capabilities. AI21 Studio's focus is to help businesses and developers leverage reading and writing AI in order to build real-world, tangible products. The release of the Task-Specific and Jurassic-2 APIs marks two significant milestones. They will enable you to bring generative AI into production. Jurassic-2 (or J2, as we like to call it) is the next generation of our foundation models, with significant improvements in quality and new capabilities including zero-shot instruction-following, reduced latency, and multi-language support. Task-specific APIs offer developers industry-leading APIs for performing specialized reading and/or writing tasks. -
3
Phi-2
Microsoft
Phi-2 is a 2.7-billion-parameter language model that shows outstanding reasoning and language-understanding capabilities. It represents state-of-the-art performance among base language models with fewer than 13 billion parameters. Phi-2 can match or even outperform models 25x larger on complex benchmarks, thanks to innovations in model scaling. Phi-2's compact size makes it an ideal playground for researchers. It can be used for exploring mechanistic interpretability, safety improvements, or fine-tuning experiments on a variety of tasks. We have included Phi-2 in the Azure AI Studio catalog to encourage research and development of language models. -
4
Alpaca
Stanford Center for Research on Foundation Models (CRFM)
Instruction-following models such as GPT-3.5 (text-davinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. These models are now used by many people, and some even rely on them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. It is vital that the academic community engages in order to make maximum progress towards addressing these pressing issues. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI's text-davinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta's LLaMA 7B model. -
5
DeepSeek LLM
DeepSeek
Introducing DeepSeek LLM - an advanced language model with 67 billion parameters. It was trained from scratch using a massive dataset of 2 trillion tokens, both in English and Chinese. To encourage research, we made DeepSeek LLM 67B Base and DeepSeek LLM 67B Chat available as open source to the research community. -
6
Gopher
DeepMind
Language, and its role in demonstrating and facilitating understanding (or intelligence, as it is sometimes called), is fundamental to being human. It allows people to express themselves, build memories, and communicate ideas. These are the foundational components of social intelligence. Our teams at DeepMind are interested in the language processing and communication aspects, both for artificial agents and humans. As part of a broader portfolio of AI research, we believe that the development and study of more powerful language models, systems that predict and create text, have tremendous potential for building advanced AI systems. These systems can be used safely and effectively to summarise information, provide expert advice, and follow instructions given in natural language. Research is needed to determine the potential risks and benefits of language models before they can be developed. -
7
Aya
Cohere AI
Aya is an open-source, state-of-the-art, massively multilingual large language model (LLM) for research, which covers 101 different languages. This is more than twice the number of languages covered by existing open-source models. Aya helps researchers unlock the powerful potential of LLMs for dozens of cultures and languages that are largely ignored by the most advanced models available today. We open-source both the Aya model and the most comprehensive multilingual instruction dataset, with 513 million prompt-and-completion instances covering 114 different languages. This data collection contains rare annotations by native and fluent speakers from around the world. This helps AI technology effectively serve a global audience that has had limited access up until now. -
8
Stable LM
Stability AI
Free
StableLM: Stability AI language models. StableLM builds upon our experience with open-sourcing previous language models in collaboration with EleutherAI, a nonprofit research hub. These models include GPT-J, GPT-NeoX, and the Pythia suite, which were all trained on The Pile dataset. Cerebras-GPT and Dolly-2 are two recent open-source models that continue to build upon these efforts. StableLM was trained on a new dataset that is three times bigger than The Pile and contains 1.5 trillion tokens. We will provide more details about the dataset at a later date. The richness of this dataset allows StableLM to perform well in conversational and coding tasks despite its small size (3-7 billion parameters, compared to GPT-3's 175 billion). The development of Stable LM 3B broadens the range of applications that are viable on the edge or on home PCs. This means that individuals and companies can now develop cutting-edge technologies with strong conversational capabilities, such as creative writing assistance, while keeping costs low and performance high. -
9
PanGu-Σ
Huawei
The expansion of large language models has led to significant advancements in natural language processing, understanding, and generation. This study introduces a system that uses Ascend 910 AI processors and the MindSpore framework to train a language model with over one trillion parameters (1.085T specifically), called PanGu-Σ. This model, which builds on the foundation laid by PanGu-α, transforms the traditional dense Transformer model into a sparse model using a concept called Random Routed Experts. The model was trained efficiently on a dataset of 329 billion tokens using a technique known as Expert Computation and Storage Separation. This led to a 6.3-fold increase in training throughput via heterogeneous computing. The experiments show that PanGu-Σ sets a new standard for zero-shot learning in various downstream Chinese NLP tasks. -
10
OPT
Meta
Large language models, which are often trained for hundreds of thousands of compute days, have shown a remarkable ability to learn in zero- and few-shot settings. These models are expensive to replicate due to their high computational cost. The few models that are available via APIs do not allow access to the full model weights, making them difficult to study. Open Pre-trained Transformers (OPT) is a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters. We aim to share this fully and responsibly with interested researchers. We show that OPT-175B is comparable to GPT-3 while requiring only 1/7th of the carbon footprint to develop. We will also release our logbook, which details the infrastructure challenges we encountered, as well as code for experimenting with all of the released models. -
11
AI21 Studio
AI21 Studio
$29 per month
AI21 Studio provides API access to Jurassic-1 large language models. Our models are used to generate text and provide comprehension features in thousands of applications. You can tackle any language task. Our Jurassic-1 models can follow natural language instructions and need only a few examples to adapt to new tasks. Our APIs are perfect for common tasks such as paraphrasing, summarization, and more. Get superior results at a lower price without having to reinvent the wheel. Do you need to fine-tune your own custom model? It's just 3 clicks away. Training is quick and affordable, and models can be deployed immediately. Embed an AI co-writer into your app to give your users superpowers. Features like paraphrasing, long-form draft generation, repurposing, and custom auto-complete can increase user engagement and help you achieve success. -
12
GPT-J
EleutherAI
Free
GPT-J is a cutting-edge language model developed by EleutherAI. GPT-J's performance is comparable to OpenAI's GPT-3 model on a variety of zero-shot tasks. GPT-J, in particular, has shown that it can surpass GPT-3 at tasks relating to code generation. The latest version of this language model is GPT-J-6B, which is trained on a language dataset called The Pile. This dataset is publicly available and contains 825 gibibytes of language data organized into 22 subsets. GPT-J has some similarities with ChatGPT. However, GPT-J is not intended to be a chatbot; its primary function is to predict text. In a major development in March 2023, Databricks introduced Dolly, an Apache-licensed instruction-following model built on GPT-J. -
13
Qwen
Alibaba
Free
Qwen LLM is a family of large language models (LLMs) developed by Damo Academy, an Alibaba Cloud subsidiary. These models are trained on a large dataset of text and code, enabling them to understand and generate human-like text, translate languages, create different types of creative content, and answer questions in an informative manner. Here are some of the key features of Qwen LLMs. Variety of sizes: the Qwen series ranges from 1.8 billion to 72 billion parameters, offering options that meet different needs and performance levels. Open source: certain versions of Qwen have open-source code, which is available to anyone for use and modification. Multilingual: Qwen can understand and translate multiple languages, including English, Chinese, and Japanese. Broad task coverage: Qwen models are capable of a wide range of tasks, including text summarization, code generation, text generation, and translation. -
14
Codestral Mamba
Mistral AI
Codestral Mamba is a Mamba2 model that specializes in code generation. It is available under the Apache 2.0 license. Codestral Mamba represents another step in our efforts to study and provide new architectures. We hope that it will open up new perspectives in architecture research. Mamba models offer the advantage of linear-time inference and the theoretical ability to model sequences of unlimited length. Users can interact with the model extensively and get rapid responses, regardless of the input length. This efficiency is particularly relevant for code productivity use cases. We trained this model with advanced reasoning and code capabilities, enabling it to perform on par with SOTA Transformer-based models. -
15
Llama 2
Meta
Free
The next generation of our open-source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models, ranging from 7B to 70B parameters. Llama 2 pretrained models are trained on 2 trillion tokens and have double the context length of Llama 1. The fine-tuned Llama 2 models have been trained on over 1,000,000 human annotations. Llama 2 outperforms many other open-source language models on external benchmarks, including reasoning, coding, proficiency, and knowledge tests. Llama 2 has been pre-trained using publicly available online data sources. Llama-2-chat, a fine-tuned version of the model, leverages publicly available instruction datasets and more than 1 million human annotations. We have a wide range of supporters around the world who are committed to our open approach to today's AI. These companies have provided early feedback and have expressed excitement to build with Llama 2. -
16
Baichuan-13B
Baichuan Intelligent Technology
Free
Baichuan-13B is an open-source, commercially usable large-scale language model with 13 billion parameters, developed by Baichuan Intelligent as the successor to Baichuan-7B. It achieves the best results for a language model of its size on authoritative Chinese and English benchmarks. This release includes two versions: a pretrained model (Baichuan-13B-Base) and an aligned model (Baichuan-13B-Chat). Baichuan-13B has a larger size and more training data. It expands the number of parameters to 13 billion, building on Baichuan-7B, and is trained on 1.4 trillion tokens of high-quality corpus. This is 40% more than LLaMA-13B, making it currently the open-source 13B-size model with the most training data. It supports both Chinese and English, uses ALiBi positional encoding, and has a context window of 4,096 tokens. -
17
OpenScholar
Ai2
Ai2 OpenScholar, a collaboration between the Allen Institute for AI and the University of Washington, is designed to help scientists navigate and synthesize the vast expanse of the scientific literature. OpenScholar uses a retrieval-augmented language model to answer user queries. It does this by identifying relevant papers and then generating answers based on those sources. This helps keep information accurate and linked directly to existing research. OpenScholar-8B set new standards for factuality and citation accuracy on the ScholarQABench benchmark. In the biomedical domain, for example, OpenScholar-8B maintains a solid grounding in real retrieved articles. This is in contrast to models like GPT-4, which tend to hallucinate references. To assess its real-world applicability, twenty scientists from computer science, biomedicine, and physics evaluated OpenScholar's answers against expert-written responses. -
18
Qwen-7B
Alibaba
Free
Qwen-7B is the 7B-parameter variant of the Qwen (Tongyi Qianwen) large language model series proposed by Alibaba Cloud. Qwen-7B, a Transformer-based language model, is pretrained on a large volume of data, such as web texts, books, and code. Qwen-7B also serves as the base for Qwen-7B-Chat, an AI assistant built with alignment techniques. The features of Qwen-7B include: Pre-trained with high-quality data. We have pretrained Qwen-7B using a large-scale, high-quality dataset that we constructed ourselves. The dataset contains over 2.2 trillion tokens of plain text and code, and covers a wide range of domains, including general-domain data as well as professional-domain data. Strong performance. We outperform our competitors on a series of benchmark datasets that evaluate natural language understanding, mathematics, and coding. And more. -
19
Qwen2
Alibaba
Free
Qwen2 is an extensive series of large language models developed by the Qwen Team at Alibaba Cloud. It includes both base models and instruction-tuned versions, with parameters ranging from 0.5 billion to 72 billion. It also features dense models and a Mixture-of-Experts model. The Qwen2 series is designed to surpass previous open-weight models, including its predecessor Qwen1.5, and to compete with proprietary models across a wide spectrum of benchmarks, such as language understanding, generation, and multilingual capabilities. -
20
Smaug-72B
Abacus
Free
Smaug-72B is an open-source large language model (LLM) with the following key features. High Performance: It is currently ranked first on the Hugging Face Open LLM Leaderboard and has surpassed models such as GPT-3.5 across a range of benchmarks. This means that it excels in tasks such as understanding, responding to, and generating human-like text. Open Source: Smaug-72B, unlike many other advanced LLMs, is available to anyone for free use and modification, fostering collaboration, innovation, and creativity in the AI community. Focus on Math and Reasoning: It excels at handling mathematical and reasoning tasks. This is attributed to the unique fine-tuning techniques developed by Abacus, the creators of Smaug-72B. Based on Qwen-72B: It is a fine-tuned version of another powerful LLM, Qwen-72B, released by Alibaba, and further improves on its capabilities. Smaug-72B is a significant advance in open-source AI. -
21
Gemini Ultra
Google
Gemini Ultra is an advanced new language model by Google DeepMind. It is the largest and most powerful model in the Gemini family, which also includes Gemini Pro and Gemini Nano. Gemini Ultra was designed to handle highly complex tasks such as machine translation, code generation, and natural language processing. It is the first language model to have outperformed human experts on the Massive Multitask Language Understanding (MMLU) benchmark, achieving a score of 90%. -
22
ERNIE 3.0 Titan
Baidu
Pre-trained language models have achieved state-of-the-art results on various Natural Language Processing (NLP) tasks. GPT-3 has demonstrated that scaling up pre-trained language models can further exploit their immense potential. Recently, a framework named ERNIE 3.0 was proposed for pre-training large-scale knowledge-enhanced models. This framework was used to train a model with 10 billion parameters. ERNIE 3.0 performed better than state-of-the-art models on a variety of NLP tasks. To explore the effect of scaling up ERNIE 3.0, we train a hundred-billion-parameter-scale model called ERNIE 3.0 Titan, with up to 260 billion parameters, on the PaddlePaddle platform. We also design a self-supervised adversarial loss and a controllable language modeling loss to make ERNIE 3.0 Titan generate credible and controllable texts. -
23
Gemma
Google
Gemma is a family of lightweight open models built using the same research and technology as the Gemini models. Gemma was developed by Google DeepMind, along with other teams within Google. The name is derived from the Latin gemma, meaning "precious stone". We're also releasing new tools to support developer innovation, foster collaboration, and guide responsible use of Gemma models. Gemma models are based on the same infrastructure and technical components as Gemini, Google's largest and most powerful AI model. Gemma 2B and 7B achieve best-in-class performance for their sizes compared with other open models. Gemma models can run directly on a developer's desktop or laptop computer. Gemma is able to surpass much larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs. -
24
LLaMA
Meta
LLaMA (Large Language Model Meta AI) is a state-of-the-art foundational large language model created to help researchers advance their work in this subfield of AI. Smaller, more efficient models such as LLaMA let researchers who lack access to large amounts of infrastructure study these models, further democratizing access to this rapidly changing field. Training smaller foundation models like LLaMA is a desirable option because it takes far less computing power and resources to test new approaches, validate others' work, and explore new use cases. Foundation models are trained on large amounts of unlabeled data, which makes them well suited to fine-tuning for many tasks. We make LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters) and also share a LLaMA model card that explains how the model was built in line with our Responsible AI practices. -
25
Cerebras-GPT
Cerebras
Free
Training state-of-the-art language models is extremely difficult. It requires large compute budgets, complex distributed computing techniques, and deep ML knowledge. Few organizations are able to train large language models from scratch, and a growing number of those that have the expertise and resources to do so are not open-sourcing their results. We at Cerebras believe in open access to the latest models. Cerebras is proud to announce that Cerebras-GPT, a family of GPT models with 111 million to 13 billion parameters, has been released to the open-source community. These models are trained using the Chinchilla formula and provide the highest accuracy within a given compute budget. Cerebras-GPT has faster training times and lower training costs, and it consumes less energy than any other publicly available model. -
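The "Chinchilla formula" mentioned in the Cerebras-GPT entry above can be made concrete with a little arithmetic. The sketch below is not Cerebras code: it assumes the commonly cited rule of thumb of roughly 20 training tokens per parameter, and the standard approximation that training compute is about 6 x parameters x tokens FLOPs. Both constants and all function names are illustrative assumptions.

```python
# Illustrative sketch only; constants are widely used approximations,
# not values taken from the listing above.
TOKENS_PER_PARAM = 20   # assumed Chinchilla-style rule of thumb
FLOPS_PER_PARAM_TOKEN = 6  # assumed forward + backward cost approximation


def compute_optimal_tokens(num_params, tokens_per_param=TOKENS_PER_PARAM):
    """Return the approximate compute-optimal number of training tokens."""
    return num_params * tokens_per_param


def approx_training_flops(num_params, num_tokens):
    """Estimate total training FLOPs as 6 * parameters * tokens."""
    return FLOPS_PER_PARAM_TOKEN * num_params * num_tokens


if __name__ == "__main__":
    # The smallest and largest sizes named in the entry above.
    for params in (111e6, 13e9):
        tokens = compute_optimal_tokens(params)
        flops = approx_training_flops(params, tokens)
        print(f"{params:.3g} params -> {tokens:.3g} tokens, ~{flops:.3g} FLOPs")
```

Under these assumptions, token count grows linearly with model size, so the estimated training compute grows with the square of the parameter count.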
26
Alpa
Alpa
Free
Alpa aims to automate large-scale distributed training. Alpa was originally developed by people at UC Berkeley's Sky Lab. Alpa's advanced techniques were described in a paper published at OSDI 2022. The Alpa community is growing, with new contributors from Google. A language model is a probability distribution over sequences of words. It uses all the words it has seen so far to predict the next word. It is useful in a variety of AI applications, such as email auto-completion or chatbot services. You can find more information on the language model Wikipedia page. GPT-3 is a large language model with 175 billion parameters that uses deep learning to produce human-like text. GPT-3 has been described by many researchers and news articles as "one of the most important and interesting AI systems ever created." GPT-3 is being used as a backbone for the latest NLP research. -
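The definition of a language model in the Alpa entry above (a probability distribution over word sequences, used to predict the next word) can be illustrated with a toy example. The sketch below is not related to Alpa's code. It uses a bigram model, which conditions only on the single previous word rather than on all preceding words, so it is a deliberate simplification; the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Return P(next word | previous word) as a dict; empty if prev is unseen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in followers.items()}


def sequence_prob(counts, tokens):
    """Probability of the sequence given its first word (chain rule over bigrams)."""
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        prob *= next_word_probs(counts, prev).get(nxt, 0.0)
    return prob


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ran".split()  # toy corpus
    model = train_bigram(corpus)
    print(next_word_probs(model, "the"))
    print(sequence_prob(model, ["the", "cat", "sat"]))
```

Large models such as GPT-3 replace the count table with a neural network that conditions on a long context, but the output has the same shape: a probability for each candidate next token.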
27
VideoPoet
Google
VideoPoet is a simple modeling technique that can convert any autoregressive language model or large language model into a high-quality video generator. It is composed of a few components. The autoregressive model learns from video, image, text, and audio modalities in order to predict the next video or audio token in the sequence. The LLM training framework introduces a mixture of multimodal generative objectives, including text-to-video, text-to-image, image-to-video, video frame continuation, video inpainting and outpainting, video stylization, and video-to-audio. Moreover, these tasks can be combined to provide additional zero-shot capabilities. This simple recipe shows how language models can synthesize and edit videos with a high level of temporal consistency. -
28
NVIDIA Nemotron
NVIDIA
NVIDIA Nemotron is a family of open-source models created by NVIDIA and designed to generate synthetic language data for commercial applications. Nemotron-4 340B is an important release in this family. It offers developers a powerful tool for generating high-quality data and filtering it by various attributes using a reward model. -
29
Mistral Large 2
Mistral AI
Free
Mistral Large 2 comes with a 128k context window and supports dozens of languages, including French, German, Spanish, Arabic, Hindi, Russian, and Chinese. It also supports 80+ programming languages, including Python, Java, and C++. Mistral Large 2 was designed with single-node inference in mind. Its size of 123 billion parameters allows it to run at high throughput on a single node. Mistral Large 2 is released under the Mistral Research License, which allows modification and usage for research and noncommercial purposes. -
30
OpenGPT-X
OpenGPT-X
Free
OpenGPT-X is a German initiative that focuses on developing large AI language models tailored to European requirements, with an emphasis on versatility, trustworthiness, multilingual capabilities, and open-source accessibility. The project brings together partners to cover the whole generative AI value chain, from scalable GPU-based infrastructure and data for training large language models to model design, practical applications, prototypes, and proofs of concept. OpenGPT-X aims to advance cutting-edge research with a focus on business applications, accelerating the adoption of generative AI within the German economy. The project also stresses responsible AI development to ensure that the models are reliable and aligned with European values and laws. The project provides resources, such as the LLM Workbook and a three-part reference guide with examples, to help users better understand the key features and characteristics of large AI language models. -
31
Code Llama
Meta
Free
Code Llama is a large language model (LLM) that can generate code from text prompts. Code Llama, described as the most advanced publicly available LLM for code tasks, has the potential to improve workflows for developers and lower the barrier for those learning to code. Code Llama can be used as a productivity and educational tool to help programmers create more robust, well-documented software. Code Llama is a state-of-the-art LLM capable of generating both code and natural language about code, from both code and natural-language prompts. Code Llama is free for research and commercial use. Code Llama is built on Llama 2 and is available in three models: Code Llama, the foundational code model; Code Llama - Python, specialized for Python; and Code Llama - Instruct, which is fine-tuned for understanding natural-language instructions. -
32
ChatGLM
Zhipu AI
Free
ChatGLM-6B is a Chinese-English bilingual dialogue model based on the General Language Model (GLM) architecture, with 6.2 billion parameters. With model quantization, users can deploy it locally on consumer-grade graphics cards (only 6 GB of video memory is required at the INT4 quantization level). ChatGLM-6B is based on technology similar to ChatGPT and is optimized for Chinese dialogue and Q&A. After bilingual Chinese and English training on approximately 1T tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B is able to generate answers that are in line with human preferences. -
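The ChatGLM-6B entry above ties the 6 GB video-memory figure to INT4 quantization. A rough back-of-the-envelope check is sketched below. It estimates only the memory needed to hold the weights, which is an assumption of this sketch; real deployments also need memory for activations, caches, and framework overhead, which is presumably why the quoted requirement is higher than the weight footprint alone. The function name is illustrative.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    num_params = 6.2e9  # parameter count stated in the entry above
    for name, bits in (("FP16", 16), ("INT8", 8), ("INT4", 4)):
        print(f"{name}: ~{weight_memory_gb(num_params, bits):.1f} GB for weights")
```

At 4 bits per parameter the weights take roughly a quarter of the space they need at 16 bits, which is what brings a model of this size within reach of consumer graphics cards.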
33
Giga ML
Giga ML
We have just launched the X1 Large model series. Giga ML's most powerful model can be used for pre-training, fine-tuning, and on-prem deployment. We are OpenAI-compatible, so your existing integrations, such as LangChain, LlamaIndex, and others, will work seamlessly. You can continue to pre-train LLMs using domain-specific data books, docs, or company documents. The world of large language models (LLMs), which offer unprecedented opportunities for natural language processing across different domains, is rapidly expanding. Despite this, some critical challenges remain unresolved. Giga ML proudly introduces the X1 Large 32k model, a pioneering on-premise LLM solution that addresses these critical challenges. -
34
LTM-1
Magic AI
Magic's LTM-1 provides context windows 50x larger than those of transformers. Magic has trained a Large Language Model that can take in huge amounts of context to generate suggestions. Magic, our coding assistant, can now see all of your code. With larger context windows, AI models can refer to more explicit, factual information. They can also reference their own action history. We hope this research will improve reliability and coherence. -
35
Medical LLM
John Snow Labs
John Snow Labs Medical LLM is a domain-specific large language model (LLM) that revolutionizes the way healthcare organizations harness artificial intelligence. This innovative platform was designed specifically for the healthcare sector, combining cutting-edge natural language processing capabilities with a profound understanding of medical terminology and clinical workflows. The result is a tool that allows healthcare providers, researchers, and administrators to unlock new insights, improve patient outcomes, and drive operational efficiency. The Medical LLM's comprehensive training is at the core of its functionality. This includes a vast amount of healthcare data such as clinical notes, research papers, and regulatory documents. This specialized training allows the model to accurately generate and interpret medical text. It is an invaluable tool for tasks such as clinical documentation, automated coding, and medical research. -
36
LongLLaMA
LongLLaMA
Free
This repository contains a research preview of LongLLaMA, a large language model capable of handling contexts of up to 256k tokens. LongLLaMA is built on the foundation of OpenLLaMA and fine-tuned with the Focused Transformer method. LongLLaMA Code is built on the foundation of Code Llama. We release a smaller base variant of LongLLaMA (not instruction-tuned) under a permissive license (Apache 2.0), along with inference code that supports longer contexts on Hugging Face. Our model weights are a drop-in replacement for LLaMA (for short contexts of up to 2048 tokens) in existing implementations. We also provide evaluation results and comparisons with the original OpenLLaMA model. -
37
GPT-4 (Generative Pretrained Transformer 4) is a large-scale, unsupervised language model. GPT-4, which is the successor of GPT-3, is part of the GPT-n series of natural language processing models. It was trained using a dataset of 45TB of text to produce human-like text generation and understanding abilities. Unlike many other NLP models, GPT-4 does not depend on additional task-specific training. It can generate text and answer questions using its own context. GPT-4 has been demonstrated to be capable of performing a wide range of tasks without any task-specific training data, such as translation, summarization, and sentiment analysis.
-
38
Codestral
Mistral AI
Free
We are proud to introduce Codestral, the first code model we have ever created. Codestral is an open-weight generative AI model specifically designed for code generation. It allows developers to write and interact with code through a shared API endpoint for instructions and completion. Because it masters both code and English, software developers can use it to build advanced AI applications. Codestral has been trained on a large dataset of 80+ programming languages, including some of the most popular, such as Python, Java, C, C++, JavaScript, and Bash. It also performs well with more specific ones, such as Swift and Fortran. Codestral's broad language base allows it to assist developers in a variety of coding environments and projects. -
39
With just a few lines of code, you can integrate natural language understanding and generation into your product. The Cohere API gives you access to models that have read billions of pages and learned the meaning, sentiment, and intent of the words we use. You can use the Cohere API to generate human-like text: simply fill in a prompt or complete blanks. You can create code, write copy, summarize text, and much more. You can also calculate the likelihood of text and retrieve representations from the model. With the likelihood API, you can filter text based on selected criteria or categories. Using representations, you can create your own downstream models for a variety of domain-specific natural language tasks. The Cohere API is able to compute the similarity of pieces of text and make categorical predictions based on the likelihood of different text options. The model can see ideas through multiple lenses, so it can identify abstract similarities between concepts as distinct as DNA and computers.
-
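The Cohere entry above mentions computing the similarity of pieces of text from model representations. A common way to compare such representations is cosine similarity between embedding vectors. The sketch below does not call the Cohere API: the three-dimensional vectors are invented placeholders standing in for real embeddings, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings):
    """Return the key whose vector is closest to the query's vector, excluding the query."""
    query_vec = embeddings[query]
    candidates = [key for key in embeddings if key != query]
    return max(candidates, key=lambda key: cosine_similarity(query_vec, embeddings[key]))


if __name__ == "__main__":
    # Hypothetical embeddings; a real system would obtain these from a model.
    embeddings = {
        "DNA": [0.9, 0.1, 0.3],
        "computer": [0.8, 0.2, 0.4],
        "banana": [0.1, 0.9, 0.2],
    }
    print(most_similar("DNA", embeddings))
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default for comparing text embeddings.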
40
Claude is an artificial intelligence language model from Anthropic that can understand and generate human-like text. Anthropic is an AI safety and research company that focuses on building reliable, interpretable, and steerable AI systems. While large, general systems can provide significant benefits, they can also be unpredictable, unreliable, and opaque. Our goal is to make progress in these areas. We are currently focusing on research to achieve these goals. However, we see many opportunities for our work in the future to create value both commercially and for the public good.
-
41
Med-PaLM 2
Google Cloud
Through scientific rigor and human insight, healthcare breakthroughs can change the world, bringing hope to humanity. We believe that AI can help in this area, through collaboration between researchers, healthcare organisations, and the wider ecosystem. Today, we are sharing exciting progress on these initiatives with the announcement that Google's large language model (LLM) for medical applications, called Med-PaLM 2, will be available to a small group of Google Cloud customers for limited testing in the coming weeks. We will explore use cases, share feedback, and investigate safe, responsible, and meaningful ways to use this technology. Med-PaLM 2, which harnesses Google's LLMs aligned with the medical domain, is able to answer medical questions more accurately and safely. Med-PaLM 2 is the first LLM to perform at an "expert" level on the MedQA dataset of US Medical Licensing Examination-style questions. -
42
PaLM 2
Google
PaLM 2 is Google's next-generation large language model, which builds on Google’s research and development in machine learning. It outperforms previous state-of-the-art LLMs, including PaLM, at advanced reasoning tasks such as code and mathematics, classification and question answering, translation and multilingual proficiency, and natural-language generation. It can accomplish these tasks because of the way it was built: it combines compute-optimal scaling, an improved dataset mixture, and model architecture improvements. PaLM 2 is grounded in Google's approach to building and deploying AI responsibly. It was rigorously evaluated for its potential biases and harms, as well as for its capabilities and downstream uses in research and in products. It is being used to power generative AI tools and features at Google such as Bard and the PaLM API, and other state-of-the-art models such as Sec-PaLM and Med-PaLM 2. -
43
Granite Code
IBM
Free
We introduce the Granite family of decoder-only code models for code generation tasks (e.g. fixing bugs, explaining code, documenting code), trained on code written in 116 programming languages. The Granite Code family has been evaluated on a variety of tasks, and the models consistently rank among the best open-source code LLMs. Granite Code models have a number of key advantages. They perform competitively, or even at the state of the art, across a variety of code-related tasks, including code generation, explanation, fixing, translation, editing, and more, demonstrating their ability to solve diverse coding tasks. IBM's Corporate Legal team guides all models for trustworthy enterprise use. All models are trained on license-permissible data collected according to IBM's AI Ethics Principles. -
44
Ntropy
Ntropy
Integrate our Python SDK and REST API within minutes to ship faster. No data formatting or setup is required. As soon as your first customer and data are in, you can start using the system. We have developed and fine-tuned custom language models that recognize entities, crawl the web in real time, and select the best match. They can also assign labels with superhuman precision in a fraction of the time. Everyone has a data-enrichment model that tries to excel at one thing, whether it is for the US or Europe, or for businesses or consumers. Such models cannot generalize and cannot produce output at the level of a human. You can embed the largest and most efficient models in your products at a fraction of the cost and time. -
45
PygmalionAI
PygmalionAI
Free
PygmalionAI is a community of open-source projects based on EleutherAI’s GPT-J 6B and Meta’s LLaMA models. Pygmalion AI is designed for roleplaying and chatting. The 7B variant, which is based on Meta AI’s LLaMA model, is the one currently under active support. Despite its modest size, Pygmalion offers better chat capabilities than larger language models that require far more resources. Our curated datasets of high-quality roleplaying data ensure that your bot is the best RP partner. Both the model weights and the code used to train the model are open-source. You can modify and redistribute them for any purpose you like. Pygmalion and other language models run on GPUs because they require fast memory and massive processing power to produce coherent text at a reasonable speed. -
46
Gemini 2.0
Google
Free
Gemini 2.0, an advanced AI model developed by Google, is designed to offer groundbreaking capabilities for natural language understanding, reasoning, and multimodal interaction. Gemini 2.0 builds on the success of its predecessor by integrating large-scale language processing with enhanced problem-solving, decision-making, and interpretation abilities. This allows it to interpret input and produce human-like responses that are more accurate and nuanced. Unlike traditional AI models, Gemini 2.0 is trained to handle a variety of data types at once, including text, code, and images. This makes it a versatile tool for research, education, business, and the creative industries. Its core improvements are better contextual understanding, reduced bias, and a more efficient architecture that delivers quicker, more reliable results. Gemini 2.0 is positioned as a major step in the evolution of AI, pushing the limits of human-computer interaction. -
47
DBRX
Databricks
Databricks has created an open, general-purpose LLM called DBRX. DBRX sets a new benchmark for open LLMs. It also gives open communities and enterprises that are building their own LLMs capabilities that were previously available only through closed model APIs. According to our measurements, DBRX surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro. As a code model, it is more capable than specialized models such as CodeLLaMA-70B, and it also has the strength of a general-purpose LLM. This state-of-the-art quality is accompanied by marked improvements in both training and inference performance. DBRX is the most efficient open model thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX has about 40% fewer parameters than Grok-1 in both total and active counts. -
48
Qwen2.5
QwenLM
Free
Qwen2.5, an advanced multimodal AI system, is designed to provide highly accurate, context-aware responses across a variety of applications. It builds on its predecessors' capabilities, integrating cutting-edge natural language understanding, enhanced reasoning, creativity, and multimodal processing. Qwen2.5 can analyze and generate text, interpret images, and interact with complex data in real time. It is highly adaptable and excels at personalized assistance, data analytics, creative content creation, and academic research. This makes it a versatile tool for both professionals and everyday users. Its user-centric approach emphasizes transparency, efficiency, and alignment with ethical AI. -
49
EXAONE
LG
EXAONE, a large-scale language model developed by LG AI Research, aims to nurture "Expert AI" across multiple domains. The Expert AI Alliance was formed by leading companies from various fields to advance EXAONE's capabilities. Partner companies in the alliance act as mentors, providing EXAONE with skills, knowledge, data, and other resources to help it gain expertise in the relevant fields. EXAONE is akin to an advanced college student who has completed general elective courses: it still requires intensive training to become a specialist in a specific area. LG AI Research has already demonstrated EXAONE’s abilities in real-world applications such as Tilda, an AI human artist that debuted at New York Fashion Week. AI applications have also been developed to summarize customer service conversations and to extract information from complex academic documents. -
50
GPT-5
OpenAI
$0.0200 per 1000 tokens
GPT-5 is OpenAI's Generative Pretrained Transformer. It is a large language model (LLM) that is still in development. LLMs are trained on massive amounts of text and can generate realistic and coherent text, translate languages, create different types of creative content, and answer questions informatively. GPT-5 is not yet available to the public. OpenAI has not announced a release schedule, but some believe it could launch in 2024. GPT-5 is expected to be even more powerful than GPT-4, which has already proven impressive: it can write creative content, translate languages, and generate text of human quality. GPT-5 is expected to build on these abilities with improved reasoning, factual accuracy, and ability to follow directions.
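For a sense of scale, per-token pricing like the rate listed above can be turned into a rough budget estimate. The sketch below assumes a single flat rate for all tokens, which the listing does not confirm; the `estimate_cost` helper and the token count are hypothetical.

```python
from decimal import Decimal

# Rate taken from the listing; assumed here to apply uniformly to all tokens.
PRICE_PER_1000_TOKENS = Decimal("0.0200")


def estimate_cost(total_tokens: int) -> Decimal:
    """Estimate the dollar cost of processing `total_tokens` at the flat listed rate."""
    return Decimal(total_tokens) / Decimal(1000) * PRICE_PER_1000_TOKENS


# Hypothetical workload: 250,000 tokens in a month.
print(estimate_cost(250_000))
```

`Decimal` is used instead of floating point so that small per-token prices do not pick up rounding noise when multiplied across large token counts.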