Vocapia’s VoxSigma Speech-to-Text software suite is a leading-edge speech processing technology that offers large-vocabulary continuous speech recognition in multiple languages for a variety of audio data types. It enables the transcription of large quantities of audio and video documents, such as broadcast data, either in batch mode or in real time.
It also provides audio segmentation and partitioning, speaker identification and language recognition. The software suite is available as a web service through a REST API over HTTPS, offering full speech transcription, audio indexing and speech-text alignment.
Additionally, language technologies such as language identification and speaker diarization help transform raw audio data into structured, searchable XML documents, giving users access to the content of audio and video documents.
It is used for applications such as broadcast and telephone data mining, speech analytics, media monitoring, media asset management, speech transcription, subtitling and more. The speech recognition software is available for over 82 languages and clients can create models for their desired language set.
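To illustrate what "structured and searchable XML" can mean in practice, here is a minimal Python sketch that looks up a word in a transcript and reports who said it and when. The XML layout used here (the transcript, segment and word elements and their speaker, start and dur attributes) is invented for illustration and is not taken from Vocapia's documentation; the actual VoxSigma output may use different names.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript layout, invented for illustration only;
# the real VoxSigma XML schema may use different element and attribute names.
SAMPLE_XML = """
<transcript lang="en">
  <segment speaker="spk1" start="0.00" end="2.40">
    <word start="0.10" dur="0.35">good</word>
    <word start="0.50" dur="0.40">evening</word>
  </segment>
  <segment speaker="spk2" start="2.40" end="5.10">
    <word start="2.60" dur="0.30">the</word>
    <word start="2.95" dur="0.55">evening</word>
    <word start="3.55" dur="0.45">news</word>
  </segment>
</transcript>
"""


def find_word(xml_text, query):
    """Return (speaker, start_time) for each case-insensitive match of query."""
    root = ET.fromstring(xml_text)
    hits = []
    for segment in root.iter("segment"):
        speaker = segment.get("speaker")
        for word in segment.iter("word"):
            if (word.text or "").strip().lower() == query.lower():
                hits.append((speaker, float(word.get("start"))))
    return hits


if __name__ == "__main__":
    for speaker, start in find_word(SAMPLE_XML, "evening"):
        print(f"{speaker} says 'evening' at {start:.2f}s")
```

Because each word carries a speaker label and a time offset in this sketch, a text search can point back to the exact moment in the audio or video, which is the kind of access to content the description refers to.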
More details about Vocapia
Is real-time transcription possible with Vocapia?
Yes. Vocapia can transcribe large quantities of audio and video documents, such as broadcast data, either in batch mode or in real time.
How many languages does Vocapia support?
Vocapia supports over 82 languages. It offers speech-to-text transcription for languages including Arabic, Cantonese, Czech, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Latvian, Lithuanian, Mandarin, Pashto, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Turkish, Ukrainian and Urdu, among others.
What types of data does Vocapia’s software handle?
Vocapia’s VoxSigma software suite handles various types of audio data, including, but not limited to, broadcast data, parliamentary hearings, conversational data, telephone data and call-center data.