URL: https://apify.com/vivid_astronaut/pronunciation-assessment-mcp

Speech AI MCP Server - Pronunciation, STT, TTS · Apify


Pricing

Pay per usage

Speech AI MCP Server

Speech AI MCP server with 9 tools: pronunciation scoring (0-100 at phoneme/word/sentence level), speech-to-text with timestamps, text-to-speech with 12 English voices, and multilingual Whisper transcription (99 languages + speaker diarization). Sub-300ms latency. Pay-per-use: $0.02/call.

Rating

0.0

(0)

Developer

๐Ÿ‘ Fabio Suizu

Fabio Suizu

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 8
  • Monthly active users: 0
  • Last modified: 3 months ago

AI-powered speech tools for MCP-enabled AI agents: pronunciation scoring, speech-to-text, text-to-speech, and multilingual transcription.

Tools

Tool                         Description
assess_pronunciation         Score English pronunciation from audio (0-100 at overall, sentence, word, and phoneme levels)
transcribe_audio             Convert spoken English to text with word-level timestamps
synthesize_speech            Generate natural speech from text (12 voices, American & British)
transcribe_audio_pro         Whisper Large V3 Turbo: 99 languages, speaker diarization
list_tts_voices              List available text-to-speech voices
check_pronunciation_service  Health check for pronunciation backend
check_stt_service            Health check for STT backend
check_tts_service            Health check for TTS backend
check_whisper_service        Health check for Whisper backend

Pronunciation Scoring

Returns scores (0-100) at four granularity levels:

Level     Description
Overall   Global pronunciation quality
Sentence  Sentence-level fluency and accuracy
Word      Per-word pronunciation scores
Phoneme   Individual sound accuracy (IPA + ARPAbet)

Performance

  • Accuracy: Exceeds human inter-annotator agreement, measured by Pearson correlation coefficient (PCC 0.576 vs 0.555)
  • Validated: 9,259 utterances across 7 first-language (L1) backgrounds, zero errors
  • Latency: Sub-300ms for pronunciation and STT
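For readers unfamiliar with the accuracy metric, the sketch below shows how a Pearson correlation coefficient between model scores and human ratings can be computed. The score lists are made-up toy values for illustration only; they are not taken from the validation set, and the function name `pearson` is an assumption of this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy, made-up scores (0-100) purely for illustration.
model_scores = [82, 45, 91, 60, 73]
human_scores = [80, 50, 95, 55, 70]
r = pearson(model_scores, human_scores)
print(f"PCC = {r:.3f}")
```

A PCC of 1.0 means the two sets of scores rise and fall together perfectly; the figures quoted above compare model-to-human correlation against human-to-human correlation.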

How to Use

MCP Endpoint

https://Ym2gS88TksnTdTcPq.apify.actor/mcp?token=YOUR_APIFY_TOKEN
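As a rough sketch of what a client sends to this endpoint: MCP is built on JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The helper names below (`build_endpoint`, `build_tool_call`) are hypothetical, and the example assumes that `check_tts_service` takes no arguments. In practice an MCP client library would also handle the initialization handshake and transport details, which are not shown here.

```python
import json
from urllib.parse import urlencode

MCP_BASE_URL = "https://Ym2gS88TksnTdTcPq.apify.actor/mcp"


def build_endpoint(token: str) -> str:
    """Append the Apify token as a query parameter (hypothetical helper)."""
    return f"{MCP_BASE_URL}?{urlencode({'token': token})}"


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


url = build_endpoint("YOUR_APIFY_TOKEN")
# Assumption: the health-check tool accepts an empty arguments object.
request_body = json.dumps(build_tool_call("check_tts_service", {}))
print(url)
print(request_body)
```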

Example: Pronunciation Assessment

{
  "audio_base64": "<base64-encoded-audio>",
  "text": "The quick brown fox jumps over the lazy dog"
}
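One way to produce the `audio_base64` value is sketched below in Python. To stay self-contained, it generates a short silent WAV in memory as a stand-in for a real recording; the helper names and the 16 kHz mono format are assumptions of this example, not requirements stated by the service.

```python
import base64
import io
import json
import wave


def make_silent_wav(seconds: float = 0.1, sample_rate: int = 16000) -> bytes:
    """Create a short mono 16-bit WAV in memory (placeholder for real audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def build_assessment_args(audio_bytes: bytes, reference_text: str) -> dict:
    """Build the arguments object shown in the example above."""
    return {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "text": reference_text,
    }


audio = make_silent_wav()
args = build_assessment_args(audio, "The quick brown fox jumps over the lazy dog")
payload = json.dumps(args)
```

For a real recording, read the file in binary mode (for example `open(path, "rb").read()`) and pass the bytes to `build_assessment_args` instead.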

Example: Text-to-Speech

{
  "text": "Hello, how are you today?",
  "voice": "af_heart",
  "speed": 1.0
}

Pricing

$0.02 per tool call (pay-per-event).
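A quick way to budget usage is to multiply the expected number of tool calls by the per-call price. The sketch below assumes every tool call, including health checks, is billed at the same $0.02 rate; the function name `estimate_cost` and the classroom scenario are illustrative only.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.02")  # USD per tool call, from the pricing above


def estimate_cost(tool_calls: int) -> Decimal:
    """Estimate the total cost in USD for a number of tool calls."""
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    return PRICE_PER_CALL * tool_calls


# Hypothetical scenario: 30 learners, each assessed on 20 sentences.
calls = 30 * 20
total = estimate_cost(calls)
print(f"{calls} calls cost ${total}")
```

`Decimal` is used instead of floats so that currency arithmetic stays exact.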

Technical Details

  • Pronunciation Model: Conformer-CTC Small (17MB, INT8 quantized)
  • TTS Model: Kokoro-82M (12 English voices, 24kHz WAV)
  • STT Pro: Whisper Large V3 Turbo (99 languages, speaker diarization)
  • Audio: Supports WAV, MP3, OGG, FLAC, WebM
  • Backend: Azure Container Apps, auto-scaling
