Featherless, an AI model provider, offers its subscribers access to an ever-expanding library of Hugging Face models. With hundreds of new models appearing daily, you need dedicated tools to keep pace with the hype. Featherless lets you find and use the latest AI models, no matter what your use case is. Supported model architectures currently include LLaMA-3 and QWEN-2. Note that QWEN-2 models are supported only up to a context length of 16,000 tokens. More architectures are planned for the near future. New models are added continually as they become available on Hugging Face, and the long-term goal is to automate the process so that every publicly available Hugging Face model with a compatible architecture is included. To ensure fair account usage, the number of concurrent requests is limited according to the selected plan. Output is delivered at 10 to 40 tokens per second, depending on the prompt size and the model.
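As a rough illustration of what the concurrency limit and the quoted output speed mean for a client, the sketch below caps the number of in-flight requests with a semaphore and estimates how long a reply might take. It is a minimal sketch under stated assumptions, not Featherless client code: `fake_request`, `run_with_limit`, `estimate_seconds`, and the limit of 2 are hypothetical names and values chosen for the example, and no real API is called.

```python
import asyncio

# Hypothetical limit for illustration; the real value depends on the plan.
MAX_CONCURRENT_REQUESTS = 2


async def fake_request(prompt: str) -> str:
    """Stand-in for a real API call: it only sleeps briefly and echoes the prompt."""
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"


async def run_with_limit(prompts, limit=MAX_CONCURRENT_REQUESTS, request=fake_request):
    """Run request() for every prompt, keeping at most `limit` calls in flight.

    Returns the results in prompt order and the peak number of concurrent calls.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(prompt):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while `limit` calls are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await request(prompt)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(p) for p in prompts))
    return results, peak


def estimate_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time for a reply of `output_tokens` tokens."""
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    replies, peak = asyncio.run(run_with_limit(["a", "b", "c", "d", "e"]))
    print(len(replies), "replies, peak concurrency:", peak)
    # A 400-token reply at the quoted 10 to 40 tokens per second:
    print(estimate_seconds(400, 40), "to", estimate_seconds(400, 10), "seconds")
```

Queuing requests on the client side like this keeps a batch job within the plan's limit instead of relying on the server to reject the excess. The time estimate covers generation only and ignores network latency and prompt processing.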