Best Dolly Alternatives in 2024

Find the top alternatives to Dolly currently available. Compare ratings, reviews, pricing, and features of Dolly alternatives in 2024. Slashdot lists the best Dolly alternatives on the market that offer competing products that are similar to Dolly. Sort through Dolly alternatives below to make the best choice for your needs

  • 1
    MosaicML Reviews
    With a single command, you can train and serve large AI models in scale. You can simply point to your S3 bucket. We take care of the rest: orchestration, efficiency and node failures. Simple and scalable. MosaicML allows you to train and deploy large AI model on your data in a secure environment. Keep up with the latest techniques, recipes, and foundation models. Our research team has developed and rigorously tested these recipes. In just a few easy steps, you can deploy your private cloud. Your data and models will never leave the firewalls. You can start in one cloud and continue in another without missing a beat. Own the model trained on your data. Model decisions can be better explained by examining them. Filter content and data according to your business needs. Integrate seamlessly with your existing data pipelines and experiment trackers. We are cloud-agnostic and enterprise-proven.
  • 2
    PaLM Reviews
    PaLM API allows you to easily and safely build on top our best language models. We are currently making an efficient model, both in terms of size, and capabilities, available today. We will soon add more sizes. MakerSuite is an intuitive tool that allows you to quickly prototype ideas. Over time, it will include features for prompt engineering and synthetic data generation. It also supports custom-model tuning. All of this is supported by robust safety tools. Only a few developers have access to the PaLM API and MakerSuite in private preview today. Stay tuned for our waitlist.
  • 3
    GPT-J Reviews
    GPT-J, a cutting edge language model developed by EleutherAI, is a leading-edge language model. GPT-J's performance is comparable to OpenAI's GPT-3 model on a variety of zero-shot tasks. GPT-J, in particular, has shown that it can surpass GPT-3 at tasks relating to code generation. The latest version of this language model is GPT-J-6B and is built on a linguistic data set called The Pile. This dataset is publically available and contains 825 gibibytes worth of language data organized into 22 subsets. GPT-J has some similarities with ChatGPT. However, GPTJ is not intended to be a chatbot. Its primary function is to predict texts. Databricks made a major development in March 2023 when they introduced Dolly, an Apache-licensed model that follows instructions.
  • 4
    mT5 Reviews
    Multilingual T5 is a massively pretrained text-totext transformer model that has been trained using a similar recipe to T5. This repo can used to reproduce the experiments described in the mT5 article. The mC4 corpus covers 101 languages. Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, and more.
  • 5
    Alpaca Reviews

    Alpaca

    Stanford Center for Research on Foundation Models (CRFM)

    Instruction-following models such as GPT-3.5 (text-DaVinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. These models are now used by many users, and some even for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. It is vital that the academic community engages in order to make maximum progress towards addressing these pressing issues. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI's text-DaVinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta's LLaMA 7B model.
  • 6
    Stable LM Reviews
    StableLM: Stability AI language models StableLM builds upon our experience with open-sourcing previous language models in collaboration with EleutherAI. This nonprofit research hub. These models include GPTJ, GPTNeoX and the Pythia Suite, which were all trained on The Pile dataset. Cerebras GPT and Dolly-2 are two recent open-source models that continue to build upon these efforts. StableLM was trained on a new dataset that is three times bigger than The Pile and contains 1.5 trillion tokens. We will provide more details about the dataset at a later date. StableLM's richness allows it to perform well in conversational and coding challenges, despite the small size of its dataset (3-7 billion parameters, compared to GPT-3's 175 billion). The development of Stable LM 3B broadens the range of applications that are viable on the edge or on home PCs. This means that individuals and companies can now develop cutting-edge technologies with strong conversational capabilities – like creative writing assistance – while keeping costs low and performance high.
  • 7
    Baichuan-13B Reviews

    Baichuan-13B

    Baichuan Intelligent Technology

    Free
    Baichuan-13B, a large-scale language model with 13 billion parameters that is open source and available commercially by Baichuan Intelligent, was developed following Baichuan -7B. It has the best results for a language model of the same size in authoritative Chinese and English benchmarks. This release includes two versions of pretraining (Baichuan-13B Base) and alignment (Baichuan-13B Chat). Baichuan-13B has more data and a larger size. It expands the number parameters to 13 billion based on Baichuan -7B, and trains 1.4 trillion coins on high-quality corpus. This is 40% more than LLaMA-13B. It is open source and currently the model with the most training data in 13B size. Support Chinese and English bi-lingual, use ALiBi code, context window is 4096.
  • 8
    PygmalionAI Reviews
    PygmalionAI, a community of open-source projects based upon EleutherAI’s GPT-J 6B models and Meta’s LLaMA model, was founded in 2009. Pygmalion AI is designed for roleplaying and chatting. The 7B variant of the Pygmalion AI is currently actively supported. It is based on Meta AI’s LLaMA AI model. Pygmalion's chat capabilities are superior to larger language models that require much more resources. Our curated datasets of high-quality data on roleplaying ensure that your bot is the best RP partner. The model weights as well as the code used to train the model are both open-source. You can modify/re-distribute them for any purpose you like. Pygmalion and other language models run on GPUs because they require fast memory and massive processing to produce coherent text at a reasonable speed.
  • 9
    Falcon-40B Reviews

    Falcon-40B

    Technology Innovation Institute (TII)

    Free
    Falcon-40B is a 40B parameter causal decoder model, built by TII. It was trained on 1,000B tokens from RefinedWeb enhanced by curated corpora. It is available under the Apache 2.0 licence. Why use Falcon-40B Falcon-40B is the best open source model available. Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, etc. OpenLLM Leaderboard. It has an architecture optimized for inference with FlashAttention, multiquery and multiquery. It is available under an Apache 2.0 license that allows commercial use without any restrictions or royalties. This is a raw model that should be finetuned to fit most uses. If you're looking for a model that can take generic instructions in chat format, we suggest Falcon-40B Instruct.
  • 10
    Vicuna Reviews
    Vicuna-13B, an open-source chatbot, is trained by fine-tuning LLaMA using user-shared conversations from ShareGPT. Vicuna-13B's preliminary evaluation using GPT-4, as a judge, shows that it achieves a quality of more than 90%* for OpenAI ChatGPT or Google Bard and outperforms other models such as LLaMA or Stanford Alpaca. Vicuna-13B costs around $300 to train. The online demo and the code, along with weights, are available to non-commercial users.
  • 11
    Llama 2 Reviews
    The next generation of the large language model. This release includes modelweights and starting code to pretrained and fine tuned Llama languages models, ranging from 7B-70B parameters. Llama 1 models have a context length of 2 trillion tokens. Llama 2 models have a context length double that of Llama 1. The fine-tuned Llama 2 models have been trained using over 1,000,000 human annotations. Llama 2, a new open-source language model, outperforms many other open-source language models in external benchmarks. These include tests of reasoning, coding and proficiency, as well as knowledge tests. Llama 2 has been pre-trained using publicly available online data sources. Llama-2 chat, a fine-tuned version of the model, is based on publicly available instruction datasets, and more than 1 million human annotations. We have a wide range of supporters in the world who are committed to our open approach for today's AI. These companies have provided early feedback and have expressed excitement to build with Llama 2
  • 12
    GPT-4 Turbo Reviews

    GPT-4 Turbo

    OpenAI

    $0.0200 per 1000 tokens
    1 Rating
    GPT-4, a large multimodal (accepting text and image inputs) model that can solve complex problems with greater accuracy thanks to its advanced reasoning abilities and broader general knowledge than any of our other models. GPT-4 can be found in the OpenAI API for paying customers. GPT-4, like gpt 3.5-turbo is optimized for chat, but also works well with traditional completion tasks using the Chat Completions API. Our GPT guide will teach you how to use GPT-4. GPT-4 is a newer GPT-4 model that features improved instruction following, JSON Mode, reproducible outputs and parallel function calls. Returns up to 4,096 tokens. This preview model has not yet been adapted for production traffic.
  • 13
    ChatGLM-6B Reviews
    ChatGLM-6B, a Chinese-English bilingual dialogue model based on General Language Model architecture (GLM), has 6.2 billion parameters. Users can deploy model quantization locally on consumer-grade graphic cards (only 6GB video memory required at INT4 quantization levels). ChatGLM-6B is based on technology similar to ChatGPT and optimized for Chinese dialogue and Q&A. After approximately 1T identifiers for Chinese and English bilingual training and supplemented with supervision and fine-tuning as well as feedback self-help and human feedback reinforcement learning, ChatGLM-6B, with 6.2 billion parameters, has been able generate answers that are in line with human preference.
  • 14
    GPT-5 Reviews

    GPT-5

    OpenAI

    $0.0200 per 1000 tokens
    GPT-5 is OpenAI's Generative Pretrained Transformer. It is a large-language model (LLM), which is still in development. LLMs have been trained to work with massive amounts of text and can generate realistic and coherent texts, translate languages, create different types of creative content and answer your question in a way that is informative. It's still not available to the public. OpenAI has not announced a release schedule, but some believe it could launch in 2024. It's expected that GPT-5 will be even more powerful. GPT-4 has already proven to be impressive. It is capable of writing creative content, translating languages and generating text of human-quality. GPT-5 will be expected to improve these abilities, with improved reasoning, factual accuracy and ability to follow directions.
  • 15
    FreeWilly Reviews
    Stability AI, in collaboration with its CarperAI Lab, is proud to announce FreeWilly1 (and its successor FreeWilly2), two powerful, new Large Language Models. Both models show exceptional reasoning abilities across a variety of benchmarks. FreeWilly1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tune (SFT) in standard Alpaca format. FreeWilly2 uses the LLaMA 70B foundation model in order to achieve a performance that is comparable with GPT-3.5 on some tasks. The FreeWilly models were inspired by Microsoft's "Orca: Progressive Learning from Complex Explanation traces of GPT-4" paper. While our data generation processes are similar, our data sources differ.
  • 16
    Qwen Reviews
    Qwen LLM is a family of large-language models (LLMs), developed by Damo Academy, an Alibaba Cloud subsidiary. These models are trained using a large dataset of text and codes, allowing them the ability to understand and generate text that is human-like, translate languages, create different types of creative content and answer your question in an informative manner. Here are some of the key features of Qwen LLMs. Variety of sizes: Qwen's series includes sizes ranging from 1.8 billion parameters to 72 billion, offering options that meet different needs and performance levels. Open source: Certain versions of Qwen have open-source code, which is available to anyone for use and modification. Qwen is multilingual and can translate multiple languages including English, Chinese and Japanese. Qwen models are capable of a wide range of tasks, including text summarization and code generation, as well as generation and translation.
  • 17
    ChatGPT Plus Reviews
    We've developed a model, called ChatGPT, that interacts in a conversational manner. ChatGPT can use the dialogue format to answer questions, admit mistakes, challenge incorrect premises and reject inappropriate requests. ChatGPT is the sibling model of InstructGPT. InstructGPT is trained to follow a prompt, and then provide a detailed answer. ChatGPT Plus, a subscription plan to ChatGPT, a conversational AI. ChatGPT Plus is $20/month and subscribers receive a variety of benefits. - ChatGPT is available to all users, even at peak times - Faster response time Access to GPT-4 ChatGPT plugins Chat with Web-browsingGPT - Priority access for new features and improvements ChatGPT Plus will be available to all customers in the United States. We will begin inviting people on our waitlist within the next few weeks. We plan to extend access and support to other countries and regions in the near future.
  • 18
    InstructGPT Reviews

    InstructGPT

    OpenAI

    $0.0200 per 1000 tokens
    InstructGPT is an open source framework that trains language models to generate natural language instruction from visual input. It uses a generative, pre-trained transformer model (GPT) and the state of the art object detector Mask R-CNN to detect objects in images. Natural language sentences are then generated that describe the image. InstructGPT has been designed to be useful in all domains including robotics, gaming, and education. It can help robots navigate complex tasks using natural language instructions or it can help students learn by giving descriptive explanations of events or processes.
  • 19
    GPT-3.5 Reviews

    GPT-3.5

    OpenAI

    $0.0200 per 1000 tokens
    1 Rating
    GPT-3.5 is the next evolution to GPT 3 large language model, OpenAI. GPT-3.5 models are able to understand and generate natural languages. There are four main models available with different power levels that can be used for different tasks. The main GPT-3.5 models can be used with the text completion endpoint. There are models that can be used with other endpoints. Davinci is the most versatile model family. It can perform all tasks that other models can do, often with less instruction. Davinci is the best choice for applications that require a deep understanding of the content. This includes summarizations for specific audiences and creative content generation. These higher capabilities mean that Davinci is more expensive per API call and takes longer to process than other models.
  • 20
    ChatGPT Reviews
    ChatGPT is an OpenAI language model. It can generate human-like responses to a variety prompts, and has been trained on a wide range of internet texts. ChatGPT can be used to perform natural language processing tasks such as conversation, question answering, and text generation. ChatGPT is a pretrained language model that uses deep-learning algorithms to generate text. It was trained using large amounts of text data. This allows it to respond to a wide variety of prompts with human-like ease. It has a transformer architecture that has been proven to be efficient in many NLP tasks. ChatGPT can generate text in addition to answering questions, text classification and language translation. This allows developers to create powerful NLP applications that can do specific tasks more accurately. ChatGPT can also process code and generate it.
  • 21
    Aya Reviews
    Aya is an open-source, state-of-the art, massively multilingual large language research model (LLM), which covers 101 different languages. This is more than twice the number of languages that are covered by open-source models. Aya helps researchers unlock LLMs' powerful potential for dozens of cultures and languages that are largely ignored by the most advanced models available today. We open-source both the Aya Model, as well as the most comprehensive multilingual instruction dataset with 513 million words covering 114 different languages. This data collection contains rare annotations by native and fluent speakers from around the world. This ensures that AI technology is able to effectively serve a global audience who have had limited access up until now.
  • 22
    GPT4All Reviews
    GPT4All provides an ecosystem for training and deploying large language models, which run locally on consumer CPUs. The goal is to be the best assistant-style language models that anyone or any enterprise can freely use and distribute. A GPT4All is a 3GB to 8GB file you can download and plug in the GPT4All ecosystem software. Nomic AI maintains and supports this software ecosystem in order to enforce quality and safety, and to enable any person or company to easily train and deploy large language models on the edge. Data is a key ingredient in building a powerful and general-purpose large-language model. The GPT4All Community has created the GPT4All Open Source Data Lake as a staging area for contributing instruction and assistance tuning data for future GPT4All Model Trains.
  • 23
    Samsung Gauss Reviews
    Samsung Gauss, a new AI-model developed by Samsung Electronics, is a powerful AI tool. It is a large-language model (LLM) which has been trained using a massive dataset. Samsung Gauss can generate text, translate different languages, create creative content and answer questions in a helpful way. Samsung Gauss, which is still in development, has already mastered many tasks, including Follow instructions and complete requests with care. Answering questions in an informative and comprehensive way, even when they are open-ended, challenging or strange. Creating different creative text formats such as poems, code, musical pieces, emails, letters, etc. Here are some examples to show what Samsung Gauss is capable of: Translation: Samsung Gauss is able to translate text between many languages, including English and German, as well as Spanish, Chinese, Japanese and Korean. Coding: Samsung Gauss can generate code.
  • 24
    LongLLaMA Reviews
    This repository contains a research preview of LongLLaMA. It is a large language-model capable of handling contexts up to 256k tokens. LongLLaMA was built on the foundation of OpenLLaMA, and fine-tuned with the Focused Transformer method. LongLLaMA code was built on the foundation of Code Llama. We release a smaller base variant of the LongLLaMA (not instruction-tuned) on a permissive licence (Apache 2.0), and inference code that supports longer contexts for hugging face. Our model weights are a drop-in replacement for LLaMA (for short contexts up to 2048 tokens) in existing implementations. We also provide evaluation results, and comparisons with the original OpenLLaMA model.
  • 25
    Jurassic-2 Reviews
    Jurassic-2 is the latest generation AI21 Studio foundation models. It's a game changer in the field AI, with new capabilities and top-tier quality. We're also releasing task-specific APIs with superior reading and writing capabilities. AI21 Studio's focus is to help businesses and developers leverage reading and writing AI in order to build real-world, tangible products. The release of Task-Specific and Jurassic-2 APIs marks two significant milestones. They will enable you to bring generative AI into production. Jurassic-2 (or J2, as we like to call it) is the next generation of our foundation models with significant improvements in quality and new capabilities including zero-shot instruction-following, reduced latency, and multi-language support. Task-specific APIs offer developers industry-leading APIs for performing specialized reading and/or writing tasks.
  • 26
    MPT-7B Reviews
    Introducing MPT-7B - the latest addition to our MosaicML Foundation Series. MPT-7B, a transformer that is trained from scratch using 1T tokens of code and text, is the latest entry in our MosaicML Foundation Series. It is open-source, available for commercial purposes, and has the same quality as LLaMA-7B. MPT-7B trained on the MosaicML Platform in 9.5 days, with zero human interaction at a cost $200k. You can now train, fine-tune and deploy your private MPT models. You can either start from one of our checkpoints, or you can start from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
  • 27
    Cerebras-GPT Reviews
    The training of state-of-the art language models is extremely difficult. They require large compute budgets, complex distributed computing techniques and deep ML knowledge. Few organizations are able to train large language models from scratch. The number of organizations that do not open source their results is increasing, even though they have the expertise and resources to do so. We at Cerebras believe in open access to the latest models. Cerebras is proud to announce that Cerebras GPT, a family GPT models with 111 million to thirteen billion parameters, has been released to the open-source community. These models are trained using the Chinchilla Formula and provide the highest accuracy within a given computing budget. Cerebras GPT has faster training times and lower training costs. It also consumes less power than any other publicly available model.
  • 28
    GPT-NeoX Reviews
    A model parallel autoregressive transformator implementation on GPUs based on the DeepSpeed Library. This repository contains EleutherAI’s library for training large language models on GPUs. Our current framework is based upon NVIDIA's Megatron Language Model, and has been enhanced with techniques from DeepSpeed, as well as some novel improvements. This repo is intended to be a central and accessible place for techniques to train large-scale autoregressive models and to accelerate research into large scale training.
  • 29
    OPT Reviews
    The ability of large language models to learn in zero- and few shots, despite being trained for hundreds of thousands or even millions of days, has been remarkable. These models are expensive to replicate, due to their high computational cost. The few models that are available via APIs do not allow access to the full weights of the model, making it difficult to study. Open Pre-trained Transformers is a suite decoder-only pre-trained transforms with parameters ranging from 175B to 125M. We aim to share this fully and responsibly with interested researchers. We show that OPT-175B has a carbon footprint of 1/7th that of GPT-3. We will also release our logbook, which details the infrastructure challenges we encountered, as well as code for experimenting on all of the released model.
  • 30
    RedPajama Reviews
    GPT-4 and other foundation models have accelerated AI's development. The most powerful models, however, are closed commercial models or partially open. RedPajama aims to create a set leading, open-source models. Today, we're excited to announce that the first phase of this project is complete: the reproduction of LLaMA's training dataset of more than 1.2 trillion tokens. The most capable foundations models are currently closed behind commercial APIs. This limits research, customization and their use with sensitive information. If the open community can bridge the quality gap between closed and open models, fully open-source models could be the answer to these limitations. Recent progress has been made in this area. AI is in many ways having its Linux moment. Stable Diffusion demonstrated that open-source software can not only compete with commercial offerings such as DALL-E, but also lead to incredible creative results from community participation.
  • 31
    DeepSeek LLM Reviews
    Introducing DeepSeek LLM - an advanced language model with 67 billion parameters. It was trained from scratch using a massive dataset of 2 trillion tokens, both in English and Chinese. To encourage research, we made DeepSeek LLM 67B Base and DeepSeek LLM 67B Chat available as open source to the research community.
  • 32
    DBRX Reviews
    Databricks has created an open, general purpose LLM called DBRX. DBRX is the new benchmark for open LLMs. It also provides open communities and enterprises that are building their own LLMs capabilities that were previously only available through closed model APIs. According to our measurements, DBRX surpasses GPT 3.5 and is competitive with Gemini 1.0 Pro. It is a code model that is more capable than specialized models such as CodeLLaMA 70B, and it also has the strength of a general-purpose LLM. This state-of the-art quality is accompanied by marked improvements in both training and inference performances. DBRX is the most efficient open model thanks to its finely-grained architecture of mixtures of experts (MoE). Inference is 2x faster than LLaMA2-70B and DBRX has about 40% less parameters in total and active count compared to Grok-1.
  • 33
    Smaug-72B Reviews
    Smaug 72B is an open-source large-language model (LLM), which is known for its key features. High Performance: It is currently ranked first on the Hugging face Open LLM leaderboard. This model has surpassed models such as GPT-3.5 across a range of benchmarks. This means that it excels in tasks such as understanding, responding to and generating text similar to human speech. Open Source: Smaug-72B, unlike many other advanced LLMs is available to anyone for free use and modification, fostering collaboration, innovation, and creativity in the AI community. Focus on Math and Reasoning: It excels at handling mathematical and reasoning tasks. This is attributed to the unique fine-tuning technologies developed by Abacus, the creators Smaug 72B. Based on Qwen 72B: This is a finely tuned version of another powerful LLM, called Qwen 72B, released by Alibaba. It further improves its capabilities. Smaug-72B is a significant advance in open-source AI.
  • 34
    Falcon-7B Reviews

    Falcon-7B

    Technology Innovation Institute (TII)

    Free
    Falcon-7B is a 7B parameter causal decoder model, built by TII. It was trained on 1,500B tokens from RefinedWeb enhanced by curated corpora. It is available under the Apache 2.0 licence. Why use Falcon-7B Falcon-7B? It outperforms similar open-source models, such as MPT-7B StableLM RedPajama, etc. It is a result of being trained using 1,500B tokens from RefinedWeb enhanced by curated corpora. OpenLLM Leaderboard. It has an architecture optimized for inference with FlashAttention, multiquery and multiquery. It is available under an Apache 2.0 license that allows commercial use without any restrictions or royalties.
  • 35
    GPT-3 Reviews

    GPT-3

    OpenAI

    $0.0200 per 1000 tokens
    1 Rating
    GPT-3 models are capable of understanding and generating natural language. There are four main models available, each with a different level of power and suitable for different tasks. Ada is the fastest and most capable model while Davinci is our most powerful. GPT-3 models are designed to be used in conjunction with the text completion endpoint. There are models that can be used with other endpoints. Davinci is the most versatile model family. It can perform all tasks that other models can do, often with less instruction. Davinci is the best choice for applications that require a deep understanding of the content. This includes summarizations for specific audiences and creative content generation. These higher capabilities mean that Davinci is more expensive per API call and takes longer to process than other models.
  • 36
    Code Llama Reviews
    Code Llama, a large-language model (LLM), can generate code using text prompts. Code Llama, the most advanced publicly available LLM for code tasks, has the potential to improve workflows for developers and reduce the barrier for those learning to code. Code Llama can be used to improve productivity and educate programmers to create more robust, well documented software. Code Llama, a state-of the-art LLM, is capable of generating both code, and natural languages about code, based on both code and natural-language prompts. Code Llama can be used for free in research and commercial purposes. Code Llama is a new model that is built on Llama 2. It is available in 3 models: Code Llama is the foundational model of code; Codel Llama is a Python-specific language. Code Llama-Instruct is a finely tuned natural language instruction interpreter.
  • 37
    Ferret Reviews
    A MLLM system that accepts any form of referral and grounds anything in response. Ferret Model- Hybrid Region representation + Spatial-aware visual sampler allows for fine-grained and open vocabulary referring and grounding. GRIT Dataset - A large-scale, hierarchical, robust ground-and refer instruction tuning dataset. Ferret Bench - A multimodal benchmark that requires Referring/Grounding as well as Semantics, Knowledge and Reasoning.
  • 38
    AI21 Studio Reviews

    AI21 Studio

    AI21 Studio

    $29 per month
    AI21 Studio provides API access to Jurassic-1 large-language-models. Our models are used to generate text and provide comprehension features in thousands upon thousands of applications. You can tackle any language task. Our Jurassic-1 models can follow natural language instructions and only need a few examples to adapt for new tasks. Our APIs are perfect for common tasks such as paraphrasing, summarization, and more. Superior results at a lower price without having to reinvent the wheel Do you need to fine-tune your custom model? Just 3 clicks away. Training is quick, affordable, and models can be deployed immediately. Embed an AI co-writer into your app to give your users superpowers. Features like paraphrasing, long-form draft generation, repurposing, and custom auto-complete can increase user engagement and help you to achieve success.
  • 39
    Gopher Reviews
    Language and its role as a means of demonstrating and facilitating understanding - or intelligence, as it is sometimes called - are fundamental to being human. It allows people to express themselves, build memories, and communicate ideas. These are the foundational components of social intelligence. Our teams at DeepMind are interested in the language processing and communication aspects, both for artificial agents and humans. As part of an broader portfolio of AI Research, we believe that the development and study more powerful language models, systems that predict and create text, have tremendous potential to build advanced AI systems. These systems can be used safely and effectively to summarise and provide expert advice, and follow instructions using natural language. Research is needed to determine the potential risks and benefits of language models before they can be developed.
  • 40
    Codestral Reviews
    We are proud to introduce Codestral, the first code model we have ever created. Codestral is a generative AI model that is open-weight and specifically designed for code generation. It allows developers to interact and write code using a shared API endpoint for instructions and completion. It can be used for advanced AI applications by software developers as it is able to master both code and English. Codestral has been trained on a large dataset of 80+ languages, including some of the most popular, such as Python and Java. It also includes C, C++ JavaScript, Bash, C, C++. It also performs well with more specific ones, such as Swift and Fortran. Codestral's broad language base allows it to assist developers in a variety of coding environments and projects.
  • 41
    CodeGen Reviews
    CodeGen is a model for program synthesis that is open-source. Trained on TPU v4. OpenAI Codex is competitive with TPU-v4.
  • 42
    Galactica Reviews
    Information overload is a major barrier to scientific progress. The explosion of scientific literature and data makes it harder to find useful insights among a vast amount of information. Search engines are used to access scientific knowledge today, but they cannot organize it. Galactica is an extensive language model which can store, combine, and reason about scientific information. We train using a large corpus of scientific papers, reference material and knowledge bases, among other sources. We outperform other models in a variety of scientific tasks. Galactica performs better than the latest GPT-3 on technical knowledge probes like LaTeX Equations by 68.2% to 49.0%. Galactica is also good at reasoning. It outperforms Chinchilla in mathematical MMLU with a score between 41.3% and 35.7%. And PaLM 540B in MATH, with a score between 20.4% and 8.8%.
  • 43
    Gemma Reviews
    Gemma is the family of lightweight open models that are built using the same research and technology as the Gemini models. Gemma was developed by Google DeepMind, along with other teams within Google. The name is derived from the Latin gemma meaning "precious stones". We're also releasing new tools to encourage developer innovation, encourage collaboration, and guide responsible use of Gemma model. Gemma models are based on the same infrastructure and technical components as Gemini, Google's largest and most powerful AI model. Gemma 2B, 7B and other open models can achieve the best performance possible for their size. Gemma models can run directly on a desktop or laptop computer for developers. Gemma is able to surpass much larger models in key benchmarks, while adhering our rigorous standards of safe and responsible outputs.
  • 44
    GPT-4V (Vision) Reviews
    GPT-4 with Vision (GPT-4V), our latest capability, allows users to instruct GPT-4 on how to analyze images input by the user. Some researchers and developers of artificial intelligence consider the incorporation of additional modalities, such as image inputs, into large language models. Multimodal LLMs can be used to expand the impact of existing language-only systems by providing them with novel interfaces, capabilities and experiences. In this system card we analyze the GPT-4V safety properties. We have built on the safety work for GPT-4V and here we go deeper into the evaluations and preparations for image inputs.
  • 45
    StarCoder Reviews
    StarCoderBase and StarCoder are Large Language Models (Code LLMs), trained on permissively-licensed data from GitHub. This includes data from 80+ programming language, Git commits and issues, Jupyter Notebooks, and Git commits. We trained a 15B-parameter model for 1 trillion tokens, similar to LLaMA. We refined the StarCoderBase for 35B Python tokens. The result is a new model we call StarCoder. StarCoderBase is a model that outperforms other open Code LLMs in popular programming benchmarks. It also matches or exceeds closed models like code-cushman001 from OpenAI, the original Codex model which powered early versions GitHub Copilot. StarCoder models are able to process more input with a context length over 8,000 tokens than any other open LLM. This allows for a variety of interesting applications. By prompting the StarCoder model with a series dialogues, we allowed them to act like a technical assistant.
  • 46
    Phi-2 Reviews
    Phi-2 is a 2.7-billion-parameter language-model that shows outstanding reasoning and language-understanding capabilities. It represents the state-of-the art performance among language-base models with less than thirteen billion parameters. Phi-2 can match or even outperform models 25x larger on complex benchmarks, thanks to innovations in model scaling. Phi-2's compact size makes it an ideal playground for researchers. It can be used for exploring mechanistic interpretationability, safety improvements or fine-tuning experiments on a variety tasks. We have included Phi-2 in the Azure AI Studio catalog to encourage research and development of language models.
  • 47
    Jurassic-1 Reviews
    Jurassic-1 comes in two sizes. The Jumbo version is the most advanced language model, with 178B parameters. It was released to developers for general use. AI21 Studio, currently in open beta allows anyone to sign up for the service and immediately begin querying Jurassic-1 with our API and interactive website environment. AI21 Labs' mission is to fundamentally change the way humans read and compose by introducing machines as partners in thought. We can only achieve this if we work together. Since the Mesozoic Era, or 2017, we have been researching language models. Jurassic-1 is based on this research and is the first generation we are making available to wide use.
  • 48
    VideoPoet Reviews
    VideoPoet, a simple modeling technique, can convert any large language model or autoregressive model into a high quality video generator. It is composed of a few components. The autoregressive model learns from video, image, text, and audio modalities in order to predict the next audio or video token in the sequence. The LLM training framework introduces a mixture of multimodal generative objectives, including text to video, text to image, image-to video, video frame continuation and inpainting/outpainting, styled video, and video-to audio. Moreover, these tasks can be combined to provide additional zero-shot capabilities. This simple recipe shows how language models can edit and synthesize videos with a high level of temporal consistency.
  • 49
    OpenLLaMA Reviews
    OpenLLaMA, a permissively-licensed open source reproduction of Meta AI’s LLaMA 7B, is trained on the RedPajama data set. Our model weights are a drop-in replacement for LLaMA7B in existing implementations. We also offer a smaller 3B version of the LLaMA Model.
  • 50
    NLP Cloud Reviews

    NLP Cloud

    NLP Cloud

    $29 per month
    Production-ready AI models that are fast and accurate. High-availability inference API that leverages the most advanced NVIDIA GPUs. We have selected the most popular open-source natural language processing models (NLP) and deployed them for the community. You can fine-tune your models (including GPT-J) or upload your custom models. Then, deploy them to production. Upload your AI models, including GPT-J, to your dashboard and immediately use them in production.