Karlo Description
Karlo is a text-conditional image generation model. It builds on OpenAI's unCLIP architecture and improves on the standard super-resolution model, recovering fine details at 256px resolution in only a small number of denoising steps.
We trained Karlo from scratch on a large dataset of 115 million image-text pairs, drawn from COYO-100M, CC3M and CC12M. The Prior and Decoder components use ViT-L/14, the CLIP model released by OpenAI. To improve efficiency, we made one significant change to the original unCLIP implementation: the decoder uses the ViT-L/14 text encoder in place of a trainable transformer.