StableVicuna Description
StableVicuna is the first large-scale open-source chatbot trained using reinforcement learning from human feedback (RLHF). It is a further fine-tuned and RLHF-trained version of Vicuna v0 13b, which is itself a fine-tuned LLaMA model.
To achieve StableVicuna’s strong performance, we use Vicuna as the base model and follow the typical three-stage RLHF pipeline outlined by Stiennon et al. and Ouyang et al. Concretely, we further train the base Vicuna model with supervised fine-tuning (SFT) on a mixture of three datasets:
OpenAssistant Conversations Dataset (OASST1), a human-generated, human-annotated corpus of assistant-style conversations comprising 161,443 messages distributed across 66,497 conversation trees in 35 different languages;
GPT4All Prompt Generations, a dataset of 437,605 prompts and responses generated by GPT-3.5 Turbo;
Alpaca, a dataset of over 52,000 instructions and demonstrations generated by OpenAI’s text-davinci-003 engine.