An API powered by Google's AI technology converts speech into text accurately. You can caption your content, improve the user experience of products that use voice commands, and gain insight from customer interactions to improve your service. It applies Google's deep learning neural network algorithms to automatic speech recognition (ASR). Speech-to-Text lets you experiment with, create, manage, and customize custom resources. You can deploy speech recognition wherever you need it, whether in the cloud using the API or on-premises using Speech-to-Text On-Prem. You can customize speech recognition to transcribe domain-specific terms or rare words. It automatically converts spoken numbers into addresses, years, and currencies. The user interface makes it easy to experiment with your speech audio.

Get quality translations for your app, website, game, supporting documentation, and more. Invite your own translation team or work with professional translation agencies within Crowdin.
Features that ensure quality translations and speed up the process:
• Glossary – create a list of terms to get consistent translations
• Translation Memory (TM) – no need to translate identical strings
• Screenshots – tag source strings to get context-relevant translations
• Integrations – set up integration with GitHub, Google Play, API, CLI, Android Studio, and more
• QA checks – make sure that all translations have the same meaning and function as the source strings
• In-Context – proofreading within the actual web application
• Machine Translations (MT) – pre-translate via translation engine
• Reports – get insights, plan and manage the project
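The Translation Memory idea in the list above can be illustrated with a minimal sketch. This is not Crowdin's API or implementation; the `TranslationMemory` class and `pretranslate` helper are hypothetical names used only to show how identical source strings can be served from stored translations instead of being translated again.

```python
# Hypothetical in-memory translation memory (illustration only, not Crowdin's API).
class TranslationMemory:
    def __init__(self):
        # Maps (source string, target language) to a stored translation.
        self._entries = {}

    def add(self, source, target_lang, translation):
        self._entries[(source, target_lang)] = translation

    def lookup(self, source, target_lang):
        # Returns None when no identical string has been translated before.
        return self._entries.get((source, target_lang))


def pretranslate(strings, target_lang, tm):
    """Split strings into those already covered by the TM and those still pending."""
    translated, pending = {}, []
    for s in strings:
        hit = tm.lookup(s, target_lang)
        if hit is not None:
            translated[s] = hit
        elif s not in pending:
            pending.append(s)
    return translated, pending
```

In this sketch, only exact matches are reused; strings without a match are returned in `pending` for a translator or a machine translation engine to handle.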
Crowdin supports more than 30 file formats for mobile, software, documents, subtitles, graphics and assets:
.xml, .strings, .json, .html, .xliff, .csv, .php, .resx, .yaml, and more.
GPT-Realtime-Translate
OpenAI’s GPT-Realtime-Translate is a dynamic translation model aimed at facilitating multilingual voice interactions, enabling individuals to converse in their chosen languages while receiving immediate translations and transcriptions. With a capacity to accommodate over 70 input languages and 13 output languages, it proves invaluable for various applications, including customer service, international sales, educational settings, events, media, and platforms catering to diverse global audiences. Its design focuses on maintaining the integrity of the original message while adapting to the speaker's pace, handling natural speech patterns, context shifts, regional accents, and specialized terminology. By integrating low-latency responses and enhanced fluency, GPT-Realtime-Translate offers a seamless API workflow for real-time speech translation, fostering more organic cross-lingual dialogues. This technology not only translates conversations in real time but also ensures that spoken information is readily accessible to diverse audiences, enhancing overall communication effectiveness. Ultimately, the model aims to bridge language gaps, making interactions smoother and more inclusive for everyone involved.
Vavus AI
Vavus AI serves as a comprehensive translation and dictation solution tailored for individuals, healthcare professionals, and corporate teams alike. This innovative app seamlessly integrates live two-way voice translation, translated phone and video calls, secure messaging with individual message translation, document and image translation utilizing OCR, speech-to-text capabilities, and a translating keyboard that functions within any application, covering over 200 languages across iPhone, Android, web, and desktop platforms. By enabling users to speak instead of type, it allows for productivity gains of up to four times. Additionally, it is designed with a strong focus on privacy, incorporating client-side encryption and offering HIPAA-compliant healthcare account options, ensuring that user data remains secure and confidential. With these features, Vavus AI stands out as a versatile tool for effective communication in a diverse array of settings.