TinyLlama Description
The TinyLlama project aims to pretrain a 1.1B-parameter Llama model on 3 trillion tokens. With some optimization, this can be achieved in "just" 90 days using 16 A100-40G GPUs. TinyLlama adopts exactly the same architecture and tokenizer as Llama 2, so it is compatible with many open-source projects built on Llama. With only 1.1B parameters, TinyLlama's compactness makes it suitable for a variety of applications that require a small computation and memory footprint.
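Because TinyLlama shares Llama 2's architecture and tokenizer, it loads with standard Llama tooling. Below is a minimal sketch using Hugging Face transformers; the repo id is an assumption (the chat checkpoint is commonly published as TinyLlama/TinyLlama-1.1B-Chat-v1.0), so swap in whichever TinyLlama checkpoint you actually use.

# Minimal sketch: loading TinyLlama with standard Llama tooling.
# The repo id below is an assumption; point it at the checkpoint you use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)   # the Llama 2 tokenizer
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("TinyLlama is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))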
TinyLlama Alternatives
OLMo 2
OLMo 2 is an open language model family developed by the Allen Institute for AI (AI2). It provides researchers and developers with open-source code and reproducible training recipes. The models are trained on up to 5 trillion tokens and are competitive with open-weight models such as Llama 3.1 on English academic benchmarks. OLMo 2 focuses on training stability, implementing techniques that prevent loss spikes in long training runs, and uses staged training interventions to address capability deficits during late pretraining. The models also incorporate the latest post-training methods from AI2's Tulu 3, resulting in OLMo 2-Instruct. To guide improvements throughout development, AI2 built the Open Language Modeling Evaluation System (OLMES), a suite of 20 evaluation benchmarks assessing key capabilities.
Llama 2
Llama 2 is the next generation of Meta's open-source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models, ranging from 7B to 70B parameters.
Llama 2 models were trained on 2 trillion tokens and have double the context length of Llama 1. The fine-tuned Llama 2 models have been trained on over 1 million human annotations.
Llama 2 outperforms many other open-source language models on external benchmarks, including tests of reasoning, coding, proficiency, and knowledge.
Llama 2 was pretrained on publicly available online data sources. Llama 2 Chat, the fine-tuned version of the model, leverages publicly available instruction datasets and over 1 million human annotations.
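As a rough sketch of using the released weights, the snippet below loads one of the fine-tuned checkpoints through Hugging Face transformers. It assumes the 7B chat variant is published as meta-llama/Llama-2-7b-chat-hf, which is a gated repo: you must accept Meta's license on Hugging Face before downloading.

# Rough sketch: running a fine-tuned Llama 2 chat model (gated repo;
# accept Meta's license on Hugging Face before downloading).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed 7B chat checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Double Llama 1's context: max_position_embeddings should read 4096.
print(model.config.max_position_embeddings)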
We have a broad range of supporters around the world who are committed to our open approach to today's AI. These companies have provided early feedback and expressed excitement to build with Llama 2.
Baichuan-13B
Baichuan-13B is an open-source, commercially available large language model with 13 billion parameters, developed by Baichuan Intelligent as the successor to Baichuan-7B. It achieves the best results among models of the same size on authoritative Chinese and English benchmarks. This release includes two versions: a pretrained base model (Baichuan-13B-Base) and an aligned chat model (Baichuan-13B-Chat).
Baichuan-13B uses more data and a larger model size. It expands the parameter count to 13 billion on top of Baichuan-7B and is trained on 1.4 trillion tokens of high-quality corpus, 40% more than LLaMA-13B, making it the open-source model with the most training data at the 13B size. It supports both Chinese and English, uses ALiBi positional encoding, and has a context window of 4,096 tokens.
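To illustrate the ALiBi scheme just mentioned: instead of adding positional embeddings, ALiBi subtracts a head-specific linear penalty from each attention score in proportion to the query-key distance. A self-contained sketch of the bias computation follows; the slope formula assumes a power-of-two head count (the ALiBi paper interpolates for other counts).

# Sketch of the ALiBi attention bias: a per-head linear penalty on
# attention logits that grows with query-key distance.
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Geometric slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8); valid when
    # num_heads is a power of two (the paper interpolates otherwise).
    slopes = torch.tensor([2.0 ** (-8.0 * (i + 1) / num_heads)
                           for i in range(num_heads)])
    # distance[i, j] = j - i, clamped so past keys get a penalty <= 0
    # and future positions contribute nothing (the causal mask handles them).
    pos = torch.arange(seq_len)
    distance = (pos[None, :] - pos[:, None]).clamp(max=0).float()
    # Shape (num_heads, seq_len, seq_len); added to attention logits
    # before the causal mask and softmax.
    return slopes[:, None, None] * distance[None, :, :]

bias = alibi_bias(num_heads=8, seq_len=4)
print(bias.shape)  # torch.Size([8, 4, 4])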
Falcon-40B
Falcon-40B is a 40B-parameter causal decoder-only model built by TII. It was trained on 1,000B tokens from RefinedWeb enhanced with curated corpora, and is available under the Apache 2.0 license.
Why use Falcon-40B?
It is the best open-source model available: Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, and others on the OpenLLM Leaderboard.
It features an architecture optimized for inference, with FlashAttention and multiquery attention.
It is available under an Apache 2.0 license that allows commercial use without restrictions or royalties.
This is a raw model that should be fine-tuned for most use cases. If you are looking for a model that can take generic instructions in a chat format, we suggest Falcon-40B-Instruct.
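As a sketch of that intended workflow, the snippet below loads the raw checkpoint as a starting point for fine-tuning. It assumes the weights are hosted as tiiuae/falcon-40b on Hugging Face (tiiuae/falcon-40b-instruct is the chat-ready variant) and that enough GPU memory is available for a 40B model.

# Sketch: loading raw Falcon-40B as a base for fine-tuning. A 40B model
# needs on the order of 80 GB of GPU memory even in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"  # assumed repo id; use the Instruct variant for chat
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory versus float32
    device_map="auto",           # shards across available GPUs (needs `accelerate`)
)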
Pricing
Pricing Starts At:
Free
Pricing Information:
Open source
Free Version:
Yes
Integrations
No Integrations at this time
Company Details
Company:
TinyLlama
Website:
github.com/jzhang38/TinyLlama
Product Details
Platforms
Windows
Mac
Linux
Type of Training
Documentation