Google Cloud Text-to-Speech
Google Cloud Text-to-Speech converts text into natural-sounding speech through an API. It builds on DeepMind's speech synthesis expertise to deliver voices with human-like intonation. Choose from 220+ voices in 40+ languages, including Mandarin, Hindi, Spanish, Arabic, Russian, and more, and pick the voice that best suits your users and application. Rather than sharing a voice with other organizations, you can create one that is unique to your brand and use it across all customer touchpoints. Training a custom model on your own audio recordings produces a more natural-sounding voice. You can define the voice profile for your organization and quickly adapt to changing voice requirements without recording new phrases.
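As a rough illustration of the API usage described above, the sketch below builds the JSON body for the public REST `text:synthesize` method without sending it. The helper name `build_synthesize_request` and its defaults are assumptions made for this example, not part of Google's client libraries; a real call also needs authentication and a voice name taken from the current voice list.

```python
import json

# Public REST endpoint for synthesis. It is not called here; it is shown
# only for orientation.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, language_code="en-US",
                             voice_name=None, audio_encoding="MP3"):
    """Return a request body for text:synthesize (hypothetical helper)."""
    voice = {"languageCode": language_code}
    if voice_name:
        # Optional: a specific voice; if omitted, the service picks one
        # for the given language.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {"audioEncoding": audio_encoding},
    }


if __name__ == "__main__":
    body = build_synthesize_request("Hola, mundo", language_code="es-ES")
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

The response to such a request carries the synthesized audio as base64-encoded bytes, which the caller decodes and writes to a file.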
Learn more
Rev
Rev offers premium on-demand manual and automated transcription, closed captioning, and foreign subtitling services. Rev has 170,000+ clients, ranging from freelance journalists to global corporations. Rev processes more audio/video than any other provider and can scale to meet any customer's requirements. Pricing is straightforward: automated speech-to-text starts at $0.25 per audio/video minute, and manual transcription with 99% accuracy costs $1.25 per minute. Rev.ai, Rev's speech recognition engine, is available to companies on request.
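To make the per-minute pricing concrete, here is a small cost estimate using only the two rates quoted above. The function name `estimate_cost` is hypothetical, and the calculation assumes simple per-minute billing with no minimums, rounding rules, or volume discounts, which may not match actual invoices.

```python
from decimal import Decimal

# Rates quoted in the listing (USD per audio/video minute).
# The automated rate is a starting price, so treat results as a lower bound.
AUTOMATED_RATE = Decimal("0.25")
MANUAL_RATE = Decimal("1.25")


def estimate_cost(minutes, manual=False):
    """Estimate transcription cost in USD for a given media length."""
    rate = MANUAL_RATE if manual else AUTOMATED_RATE
    # Convert via str so float inputs like 12.5 stay exact.
    return rate * Decimal(str(minutes))


if __name__ == "__main__":
    for label, manual in (("automated", False), ("manual", True)):
        print(f"60 min {label}: ${estimate_cost(60, manual):.2f}")
```

Under these assumptions, an hour of audio costs five times as much to transcribe manually as it does with the automated service.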
Learn more
Deepgram
Deepgram lets you run accurate speech recognition at scale and continuously improve model performance by labeling data and training models from a single console. We provide state-of-the-art speech recognition and understanding at scale through cutting-edge model training, data labeling, and flexible deployment options. Our platform recognizes multiple languages and accents and adapts to your business's needs with each training session. The result is enterprise speech transcription software that is fast, accurate, reliable, and scalable. We have reinvented ASR with 100% deep learning, which allows companies to keep improving their accuracy. Stop waiting for big tech companies to improve their software, and stop forcing your developers to manually boost accuracy with keywords in every API call. Train your speech model now and reap the benefits in weeks instead of months or even years.
Learn more
Speechmatics
Speechmatics is the most accurate and inclusive speech-to-text API ever released.
Speechmatics is the world’s leading expert in Speech Technology, combining the latest breakthroughs in AI and ML to unlock the business value in human speech.
Businesses worldwide use Speechmatics to accurately understand and transcribe human speech into text, regardless of demographic, age, gender, accent, dialect, or location, both in real time and on recorded media. Combining these transcripts with the latest AI-driven speech capabilities, businesses build products that use summarization, topic detection, sentiment analysis, translation, and more.
How is Speechmatics different?
* The most accurate speech recognition on the market
* 55 languages with vast accent and dialect coverage
* Cloud-based or on-premises deployment options for data security
* Real-time transcription with low latency and high accuracy
* Real-time translation with 69 language pairs
* Speech Understanding features such as Summaries, Sentiment, Topic Detection, Chapters, Audio Events
* Fast and secure transcriptions for pre-recorded audio
* Automatic translation and language identification
* A culture of R&D in deep learning and speech recognition
Learn more