Zyphra is proud to announce the beta release of Zonos-v0.1, which features two expressive, real-time text-to-speech (TTS) models with high-fidelity voice cloning. We are releasing the 1.6B Transformer and the 1.6B Hybrid under an Apache 2.0 license. Although audio quality is difficult to quantify, we found that Zonos' generation quality matches or exceeds that of leading proprietary TTS model providers. We also believe that releasing these models openly will have a significant impact on TTS research. The Zonos model weights are available on Hugging Face, and sample inference code for the models is available on our GitHub. You can also access Zonos through our API and model playground, with simple and competitive flat-rate pricing. Because quantitative evaluations cannot accurately measure output quality in the audio domain, we provide a number of samples from both Zonos and proprietary models for demonstration.