Pronunciation API

The Dictionary Pronunciation API generates realistic speech audio from text. It covers single-word pronunciations, full-sentence readings, and custom voice configurations.

💡

Tip: New here? Start with the Quick Start guide to make your first API call in under 2 minutes.

Key Features

100+ Languages — From English and Spanish to Japanese and Arabic
500+ Natural Voices — Male, female, and neutral voice options
Multiple Formats — MP3, WAV, OGG, and FLAC output
SSML Support — Fine-grained control with Speech Synthesis Markup Language
Sub-200ms Latency — Optimized for real-time applications
Phonetic Precision — IPA transcription included in responses

Quick Start

Follow these steps to make your first Pronunciation API call.

Step 1: Get Your API Key

Step 2: Make Your First Request

Here's a simple curl example to pronounce the word "serendipity":

Bash / curl

curl https://api.dictionary.com/v2/pronunciation \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "serendipity",
    "voice": "en-US-AriaNeural",
    "format": "mp3"
  }'

Step 3: Handle the Response

The API returns an audio URL along with a phonetic transcription:

200 OK

JSON Response

{
  "word": "serendipity",
  "phonetic": "ˌser.ənˈdɪp.ə.t̬i",
  "audio_url": "https://cdn.dictionary.com/audio/pron_8f3a2b1c.mp3",
  "format": "mp3",
  "voice": "en-US-AriaNeural",
  "duration_ms": 1840,
  "cache_key": "pron_8f3a2b1c"
}

✅

That's it! You can now stream or download the audio from the audio_url. Cached audio URLs remain valid for 24 hours.
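As a rough sketch of this step, the snippet below derives a local filename from the response body and downloads the audio using only Python's standard library. The helper names `local_filename` and `download_audio` are illustrative assumptions, not part of the API or an official SDK. The field names are taken from the sample response above.

```python
import os
import urllib.request


def local_filename(result: dict) -> str:
    """Build a local filename such as 'pron_8f3a2b1c.mp3' from a response body."""
    return f"{result['cache_key']}.{result['format']}"


def download_audio(result: dict, directory: str = ".") -> str:
    """Download result['audio_url'] into `directory` and return the saved path."""
    path = os.path.join(directory, local_filename(result))
    # Cached URLs are only valid for a limited time, so fetch soon after the call.
    urllib.request.urlretrieve(result["audio_url"], path)
    return path


# Fields copied from the sample response; no network call is made here.
sample = {
    "audio_url": "https://cdn.dictionary.com/audio/pron_8f3a2b1c.mp3",
    "format": "mp3",
    "cache_key": "pron_8f3a2b1c",
}
print(local_filename(sample))  # pron_8f3a2b1c.mp3
```

Call `download_audio(response_body)` with a real response to save the file next to your script.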

Authentication

All API requests require authentication using a Bearer token. Include your API key in the Authorization header.

HTTP Header

Authorization: Bearer sk_live_your_api_key_here

Key Types

Prefix	Type	Usage
`sk_live_`	Production	Use in your deployed applications
`sk_test_`	Sandbox	Use during development and testing
`sk_pub_`	Public	Client-side only (rate limited to 10 req/min)

⚠️

Security: Never expose sk_live_ or sk_test_ keys in client-side code. Use sk_pub_ keys or a backend proxy for browser-based applications.
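One way to enforce this rule in a build or configuration step is to classify keys by prefix before they are shipped. The following is a minimal sketch; the function names are hypothetical, and the prefixes come from the Key Types table.

```python
# Prefix-to-type mapping taken from the Key Types table.
KEY_TYPES = {
    "sk_live_": "production",
    "sk_test_": "sandbox",
    "sk_pub_": "public",
}


def key_type(api_key: str) -> str:
    """Return the key type for a known prefix, or raise ValueError."""
    for prefix, kind in KEY_TYPES.items():
        if api_key.startswith(prefix):
            return kind
    raise ValueError("unrecognized API key prefix")


def safe_for_client(api_key: str) -> bool:
    """Only public keys should ever be embedded in browser or mobile code."""
    return key_type(api_key) == "public"


print(safe_for_client("sk_pub_example"))   # True
print(safe_for_client("sk_live_example"))  # False
```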

Base URL

All API requests are made to the following base URL:

https://api.dictionary.com/v2

ℹ️

All endpoints are relative to the base URL. We use HTTPS exclusively — HTTP requests will be rejected.

Rate Limits

Rate limits depend on your subscription tier. All limits are calculated per API key.

Plan	Requests / Minute	Requests / Day	Max Audio Length
Free	30	1,000	10 seconds
Pro	300	50,000	60 seconds
Business	2,000	500,000	300 seconds
Enterprise	Custom	Unlimited	Custom

Rate Limit Headers

Every response includes rate limit information in headers:

Header	Description
`X-RateLimit-Limit`	Maximum requests per window
`X-RateLimit-Remaining`	Remaining requests in current window
`X-RateLimit-Reset`	Unix timestamp when the window resets
`Retry-After`	Seconds to wait when rate limited (429)
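A client can use these headers to decide how long to pause before its next request. The sketch below is one possible interpretation, not an official algorithm: `seconds_to_wait` is a hypothetical helper, and it assumes the headers are available as a plain dictionary with the exact casing shown in the table.

```python
import time
from typing import Optional


def seconds_to_wait(status: int, headers: dict, now: Optional[float] = None) -> float:
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    now = time.time() if now is None else now
    # On a 429, prefer the server's explicit Retry-After value.
    if status == 429 and "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Requests remain in the current window, so no pause is needed.
    if int(headers.get("X-RateLimit-Remaining", 1)) > 0:
        return 0.0
    # Window exhausted: wait until the reset timestamp.
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)


print(seconds_to_wait(429, {"Retry-After": "12"}))  # 12.0
```

Passing `now` explicitly keeps the function easy to test. In production code you would normally omit it.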

Error Handling

The API uses standard HTTP status codes and returns structured error responses.

Code	Status	Description
400	Bad Request	Invalid parameters or malformed request body
401	Unauthorized	Missing or invalid API key
403	Forbidden	API key revoked or insufficient permissions
404	Not Found	Requested voice or language not available
429	Too Many Requests	Rate limit exceeded
500	Internal Error	Server-side error — retry with exponential backoff
503	Service Unavailable	Temporary maintenance or overload
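The table recommends exponential backoff for 500 errors. The sketch below shows one common way to compute the delays, with optional random jitter. Treating 429 and 503 as retryable is an assumption on our part, and the base delay and cap are illustrative values rather than documented ones.

```python
import random

# Assumption: 429 and 503 are retried alongside 500; other 4xx codes are not.
RETRYABLE = {429, 500, 503}


def should_retry(status: int) -> bool:
    """Return True for status codes that are worth retrying."""
    return status in RETRYABLE


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  jitter: bool = True) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))  # 0.5, 1, 2, 4, ... up to the cap
    # "Full jitter": pick a random point in [0, delay] to spread out retries.
    return random.uniform(0, delay) if jitter else delay


print([backoff_delay(n, jitter=False) for n in range(5)])  # [0.5, 1.0, 2.0, 4.0, 8.0]
```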

Error Response Format

JSON Error Response

{
  "error": {
    "code": "invalid_voice",
    "message": "Voice 'de-DE-MaxxNeural' is not available for the requested language",
    "status": 400,
    "details": [
      {
        "field": "voice",
        "value": "de-DE-MaxxNeural",
        "reason": "Voice does not support language 'fr-FR'"
      }
    ]
  }
}
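To work with this structure in client code, you might convert the error body into an exception. This is a minimal sketch: `PronunciationAPIError` and `parse_error` are hypothetical names, not part of an official SDK, and the field names follow the example above.

```python
import json


class PronunciationAPIError(Exception):
    """Hypothetical exception carrying the fields of an API error body."""

    def __init__(self, status, code, message, details):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.details = details


def parse_error(body: str) -> PronunciationAPIError:
    """Turn a JSON error body into an exception instance."""
    err = json.loads(body)["error"]
    return PronunciationAPIError(
        err["status"], err["code"], err["message"], err.get("details", [])
    )


body = json.dumps({"error": {
    "code": "invalid_voice",
    "message": "Voice 'de-DE-MaxxNeural' is not available for the requested language",
    "status": 400,
    "details": [{"field": "voice", "value": "de-DE-MaxxNeural",
                 "reason": "Voice does not support language 'fr-FR'"}],
}})
error = parse_error(body)
print(error.code, [d["field"] for d in error.details])  # invalid_voice ['voice']
```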

Pronounce a Word

Generate pronunciation audio for a single word with phonetic transcription.

GET /v2/pronunciation/word/{word}

Path Parameters

Parameter	Type	Required	Description
`word`	string	Required	The word to pronounce (1-50 characters)

Query Parameters

Parameter	Type	Required	Default	Description
`lang`	string	Optional	`auto`	Language code (e.g., `en-US`, `fr-FR`, `ja-JP`)
`voice`	string	Optional	`default`	Specific voice ID. See List Available Voices
`format`	string	Optional	`mp3`	Audio format: `mp3`, `wav`, `ogg`, `flac`
`speed`	number	Optional	`1.0`	Speech rate from `0.5` (slow) to `2.0` (fast)
`pitch`	number	Optional	`0`	Pitch adjustment from `-50` (low) to `+50` (high)
`include_phonetic`	boolean	Optional	`true`	Include IPA phonetic transcription in response
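The parameter ranges above can be checked on the client before a request is sent. The following sketch builds the request URL with Python's standard library. `word_url` is a hypothetical helper, and validating locally is a design choice on our part rather than an API requirement.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dictionary.com/v2"
FORMATS = {"mp3", "wav", "ogg", "flac"}


def word_url(word, lang="auto", fmt="mp3", speed=1.0, pitch=0):
    """Validate the documented ranges and return the full request URL."""
    if not 1 <= len(word) <= 50:
        raise ValueError("word must be 1-50 characters")
    if fmt not in FORMATS:
        raise ValueError("format must be one of mp3, wav, ogg, flac")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -50 <= pitch <= 50:
        raise ValueError("pitch must be between -50 and 50")
    query = urlencode({"lang": lang, "format": fmt, "speed": speed, "pitch": pitch})
    # Percent-encode the word so non-ASCII characters and slashes are safe in the path.
    return f"{BASE_URL}/pronunciation/word/{quote(word, safe='')}?{query}"


print(word_url("serendipity", lang="en-US"))
```

Send the resulting URL with the `Authorization` header described under Authentication.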

Response Example

200 OK

JSON

{
  "word": "serendipity",
  "phonetic": "ˌser.ənˈdɪp.ə.t̬i",
  "audio_url": "https://cdn.dictionary.com/audio/pron_8f3a2b1c.mp3",
  "format": "mp3",
  "voice": "en-US-AriaNeural",
  "duration_ms": 1840,
  "cache_key": "pron_8f3a2b1c",
  "language": "en-US",
  "syllables": 5,
  "stress_pattern": "00100"
}

Pronounce Text

Generate speech audio for arbitrary text, sentences, or paragraphs.

POST /v2/pronunciation/text

Request Body

Field	Type	Required	Description
`text`	string	Required	Text to pronounce (1-5000 characters). Plain text or SSML.
`language`	string	Required	BCP-47 language tag (e.g., `en-US`)
`voice`	string	Optional	Specific voice ID
`output_format`	string	Optional	`mp3`, `wav`, `ogg`, `flac` (default: `mp3`)
`speed`	number	Optional	Speech rate (0.5–2.0, default: 1.0)
`pitch`	number	Optional	Pitch adjustment (-50 to +50, default: 0)
`volume`	number	Optional	Volume level (0.0–1.0, default: 1.0)

Code Examples

cURL

curl -X POST https://api.dictionary.com/v2/pronunciation/text \
  -H "Authorization: Bearer sk_live_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The quick brown fox jumps over the lazy dog.",
    "language": "en-US",
    "voice": "en-US-GuyNeural",
    "output_format": "mp3",
    "speed": 0.9
  }'

Python

import requests

response = requests.post(
    "https://api.dictionary.com/v2/pronunciation/text",
    headers={
        "Authorization": "Bearer sk_live_your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "text": "The quick brown fox jumps over the lazy dog.",
        "language": "en-US",
        "voice": "en-US-GuyNeural",
        "speed": 0.9
    }
)

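response.raise_for_status()  # fail fast on 4xx/5xx responses
print(response.json()["audio_url"])  # json() is a method, so it must be called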

Node.js

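const axios = require('axios');

// CommonJS modules do not support top-level await, so wrap the call in an async function.
async function main() {
  const response = await axios.post(
    'https://api.dictionary.com/v2/pronunciation/text',
    {
      text: 'The quick brown fox jumps over the lazy dog.',
      language: 'en-US',
      voice: 'en-US-GuyNeural',
      speed: 0.9
    },
    {
      headers: {
        Authorization: 'Bearer sk_live_your_api_key'
      }
    }
  );

  console.log(response.data.audio_url);
}

// Log request failures (network errors or non-2xx responses) instead of crashing.
main().catch((err) => console.error(err.message));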

JavaScript (Fetch)

const response = await fetch(
  'https://api.dictionary.com/v2/pronunciation/text',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer sk_live_your_api_key`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      text: 'The quick brown fox jumps over the lazy dog.',
      language: 'en-US',
      voice: 'en-US-GuyNeural',
      speed: 0.9
    })
  }
);

const data = await response.json();
console.log(data.audio_url);

Response

200 OK

JSON

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "audio_url": "https://cdn.dictionary.com/audio/pron_a1b2c3d4.mp3",
  "format": "mp3",
  "voice": "en-US-GuyNeural",
  "language": "en-US",
  "duration_ms": 3200,
  "character_count": 44,
  "cache_key": "pron_a1b2c3d4"
}

Phonetic Lookup

Retrieve IPA phonetic transcription without generating audio.

GET /v2/phonetics/{word}

Request

curl "https://api.dictionary.com/v2/phonetics/serendipity?lang=en-US" \
  -H "Authorization: Bearer sk_live_your_api_key"

200 OK

Response

{
  "word": "serendipity",
  "language": "en-US",
  "ipa": "ˌser.ənˈdɪp.ə.t̬i",
  "arpabet": "S EH0 R AH0 N D IH1 P AH0 T IY0",
  "syllables": [
    { "text": "ser", "stress": 0 },
    { "text": "en", "stress": 0 },
    { "text": "dip", "stress": 1 },
    { "text": "i", "stress": 0 },
    { "text": "ty", "stress": 0 }
  ],
  "stress_pattern": "00100",
  "primary_stress": 2
}
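The `syllables` array is enough to rebuild the stress pattern or to render the word with its stressed syllable highlighted. The sketch below shows one way to do that. The helper names are hypothetical, and it assumes a `stress` value of 1 marks primary stress, as in the sample response.

```python
def mark_stress(syllables):
    """Join syllables with hyphens, upper-casing the one with primary stress."""
    return "-".join(
        s["text"].upper() if s["stress"] == 1 else s["text"] for s in syllables
    )


def stress_pattern(syllables):
    """Rebuild the stress pattern string from the per-syllable values."""
    return "".join(str(s["stress"]) for s in syllables)


syllables = [
    {"text": "ser", "stress": 0},
    {"text": "en", "stress": 0},
    {"text": "dip", "stress": 1},
    {"text": "i", "stress": 0},
    {"text": "ty", "stress": 0},
]
print(mark_stress(syllables))     # ser-en-DIP-i-ty
print(stress_pattern(syllables))  # 00100
```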

Compare Pronunciation

Upload an audio recording and compare it against the correct pronunciation. Returns a similarity score and feedback.

POST /v2/pronunciation/compare

🎙️

Audio Requirements: Upload WAV or MP3 files of at most 30 seconds. For WAV, the recommended format is 16 kHz, mono, 16-bit PCM.

Request Body (multipart/form-data)

Field	Type	Required	Description
`audio`	file	Required	User's audio recording (WAV/MP3, max 30s)
`word`	string	Required	The word that was being pronounced
`language`	string	Required	BCP-47 language tag

cURL

curl -X POST https://api.dictionary.com/v2/pronunciation/compare \
  -H "Authorization: Bearer sk_live_your_api_key" \
  -F "audio=@recording.wav" \
  -F "word=beautiful" \
  -F "language=en-US"

200 OK

Response

{
  "word": "beautiful",
  "overall_score": 0.87,
  "ratings": {
    "accuracy": 0.89,
    "fluency": 0.92,
    "completeness": 1.0,
    "pronunciation": 0.83
  },
  "feedback": [
    {
      "phoneme": "juː",
      "score": 0.71,
      "suggestion": "The 'yu' sound needs more lip rounding"
    }
  ],
  "reference_audio": "https://cdn.dictionary.com/audio/ref_b3e4f5a2.mp3"
}
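A learning app would typically surface only the weakest parts of this response. The following sketch picks out low-scoring phonemes and the lowest overall rating. The 0.8 threshold and the helper names are assumptions for illustration; the API does not define a passing score.

```python
def weak_phonemes(result, threshold=0.8):
    """Return feedback entries scoring below `threshold`, weakest first."""
    items = [f for f in result["feedback"] if f["score"] < threshold]
    return sorted(items, key=lambda f: f["score"])


def lowest_rating(result):
    """Return the (name, score) pair of the lowest entry in `ratings`."""
    return min(result["ratings"].items(), key=lambda kv: kv[1])


# Trimmed copy of the sample response above.
result = {
    "word": "beautiful",
    "overall_score": 0.87,
    "ratings": {"accuracy": 0.89, "fluency": 0.92,
                "completeness": 1.0, "pronunciation": 0.83},
    "feedback": [{"phoneme": "juː", "score": 0.71,
                  "suggestion": "The 'yu' sound needs more lip rounding"}],
}
for item in weak_phonemes(result):
    print(item["phoneme"], item["suggestion"])
print(lowest_rating(result))  # ('pronunciation', 0.83)
```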

List Available Voices

Retrieve all available voices for a specific language or across all languages.

GET /v2/voices

Query Parameters

Parameter	Type	Description
`language`	string	Filter by language code (e.g., `en-US`)
`gender`	string	Filter by gender: `male`, `female`, `neutral`
`style`	string	Filter by style: `natural`, `formal`, `casual`, `newscast`
`limit`	integer	Max results per page (default: 50, max: 200)

Request

curl "https://api.dictionary.com/v2/voices?language=en-US&gender=female" \
  -H "Authorization: Bearer sk_live_your_api_key"

200 OK

Response

{
  "voices": [
    {
      "voice_id": "en-US-AriaNeural",
      "name": "Aria",
      "language": "en-US",
      "gender": "female",
      "style": "natural",
      "sample_url": "https://cdn.dictionary.com/voice-samples/en-US-AriaNeural.mp3",
      "locales": ["en-US"]
    },
    {
      "voice_id": "en-US-JennyNeural",
      "name": "Jenny",
      "language": "en-US",
      "gender": "female",
      "style": "newscast",
      "sample_url": "https://cdn.dictionary.com/voice-samples/en-US-JennyNeural.mp3",
      "locales": ["en-US"]
    }
  ],
  "total": 24,
  "page": 1,
  "per_page": 50
}
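The `total` and `per_page` fields indicate how many pages a query spans, and the `voices` array can be filtered further on the client. The sketch below illustrates both with hypothetical helpers. How to request later pages is not shown in this reference, so the example only computes the page count.

```python
import math


def pages_needed(total: int, per_page: int) -> int:
    """Number of pages required to list `total` voices."""
    return math.ceil(total / per_page)


def voices_by_style(response: dict, style: str) -> list:
    """Return the voice IDs in a response that match a given style."""
    return [v["voice_id"] for v in response["voices"] if v["style"] == style]


# Trimmed copy of the sample response above.
response = {
    "voices": [
        {"voice_id": "en-US-AriaNeural", "style": "natural"},
        {"voice_id": "en-US-JennyNeural", "style": "newscast"},
    ],
    "total": 24,
    "page": 1,
    "per_page": 50,
}
print(pages_needed(response["total"], response["per_page"]))  # 1
print(voices_by_style(response, "newscast"))  # ['en-US-JennyNeural']
```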

Custom Pronunciation

Create and manage custom pronunciation rules for proper nouns, brand names, and domain-specific terminology.

POST /v2/custom-pronunciations

Request Body

{
  "name": "Brand Names Dictionary",
  "entries": [
    {
      "text": "Kodak",
      "phoneme": "ˈkoʊ.dæk",
      "language": "en-US"
    },
    {
      "text": "Nissan",
      "phoneme": "ˈniː.sən",
      "language": "en-US"
    }
  ]
}
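If your terms live in a spreadsheet or a configuration file, you can generate this request body programmatically. The sketch below builds it from a simple mapping of text to IPA. `build_dictionary` is a hypothetical helper, and the single shared `language` argument is a simplification.

```python
import json


def build_dictionary(name: str, pronunciations: dict, language: str = "en-US") -> dict:
    """Build a request body from a {text: IPA phoneme string} mapping."""
    if not pronunciations:
        raise ValueError("at least one entry is required")
    entries = [
        {"text": text, "phoneme": phoneme, "language": language}
        for text, phoneme in pronunciations.items()
    ]
    return {"name": name, "entries": entries}


payload = build_dictionary("Brand Names Dictionary", {"Nissan": "ˈniː.sən"})
# ensure_ascii=False keeps the IPA characters readable in the serialized JSON.
print(json.dumps(payload, ensure_ascii=False, indent=2))
```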

Using Custom Dictionaries

Reference your custom dictionary in pronunciation requests using the custom_dict_id parameter:

Request

curl -X POST https://api.dictionary.com/v2/pronunciation/text \
  -H "Authorization: Bearer sk_live_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Visit Kodak for your camera needs.",
    "language": "en-US",
    "custom_dict_id": "cd_abc123"
  }'

Batch Pronounce

Generate pronunciations for up to 100 words in a single request, which reduces API call overhead and cost.

POST /v2/pronunciation/batch

💡

Batch Pricing: Each word in a batch counts as 0.5 API calls, so a batch costs half as much as requesting the same words individually.

Request Body

{
  "words": [
    { "text": "serendipity", "language": "en-US" },
    { "text": "ephemeral", "language": "en-US" },
    { "text": "ubiquitous", "language": "en-US" }
  ],
  "format": "mp3",
  "voice": "en-US-AriaNeural"
}

200 OK

Response

{
  "results": [
    {
      "word": "serendipity",
      "audio_url": "https://cdn.dictionary.com/audio/pron_8f3a2b1c.mp3",
      "phonetic": "ˌser.ənˈdɪp.ə.t̬i",
      "status": "success"
    },
    {
      "word": "ephemeral",
      "audio_url": "https://cdn.dictionary.com/audio/pron_2d4e6f8a.mp3",
      "phonetic": "əˈfem.ər.əl",
      "status": "success"
    },
    {
      "word": "ubiquitous",
      "audio_url": "https://cdn.dictionary.com/audio/pron_9c1b3d5e.mp3",
      "phonetic": "juːˈbɪk.wə.t̬əs",
      "status": "success"
    }
  ],
  "batch_id": "batch_x7y8z9",
  "credits_used": 1.5
}
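Word lists longer than 100 entries have to be split across several batch requests. The sketch below chunks a list, builds one request body per chunk, and estimates the credits used at 0.5 per word. The helper names are hypothetical, and applying one language to every word is a simplification.

```python
MAX_BATCH = 100          # documented per-request limit
CREDITS_PER_WORD = 0.5   # documented batch pricing


def chunk_words(words, size=MAX_BATCH):
    """Split a word list into consecutive chunks of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def batch_payload(words, language="en-US", fmt="mp3"):
    """Build the request body for one batch."""
    return {
        "words": [{"text": w, "language": language} for w in words],
        "format": fmt,
    }


def credits_for(words):
    """Estimate the credits consumed by batching these words."""
    return len(words) * CREDITS_PER_WORD


words = ["serendipity", "ephemeral", "ubiquitous"]
payloads = [batch_payload(chunk) for chunk in chunk_words(words)]
print(len(payloads), credits_for(words))  # 1 1.5
```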

SDKs & Libraries

We provide official SDKs for popular programming languages to make integration seamless.

Language	Package	Install Command	Repository
Python	`dictionary-pronunciation`	`pip install dictionary-pronunciation`	GitHub →
JavaScript	`@dictionary/pronunciation`	`npm install @dictionary/pronunciation`	GitHub →
Java	`com.dictionary:pronunciation-sdk`	Maven / Gradle	GitHub →
Swift	`DictionaryPronunciation`	Swift Package Manager	GitHub →
Kotlin	`com.dictionary:pronunciation-kotlin`	Gradle	GitHub →
Go	`github.com/dictionary/pronunciation-go`	`go get github.com/dictionary/pronunciation-go`	GitHub →
Ruby	`dictionary-pronunciation`	`gem install dictionary-pronunciation`	GitHub →
PHP	`dictionary/pronunciation-php`	`composer require dictionary/pronunciation-php`	GitHub →

Postman Collection

Import our ready-to-use Postman collection to test all Pronunciation API endpoints interactively.

Import

# Direct URL for Postman import
https://api.dictionary.com/openapi/postman-collection.json

# Or use the OpenAPI 3.0 spec
https://api.dictionary.com/openapi/spec.json

📦

The Postman collection includes pre-configured environment variables. Set your API_KEY variable after importing.

Changelog

v2.4 (Latest), December 2025

Added pitch and volume parameters to /pronunciation/text
New batch endpoint: POST /pronunciation/batch (up to 100 words)
15 new voices for Japanese (ja-JP) and Korean (ko-KR)
SSML support in text pronunciation endpoint

v2.3, October 2025

Custom pronunciation dictionary support
Phonetic lookup now returns syllable breakdowns
Improved audio quality with neural voice models
FLAC output format added

v2.2, August 2025

Pronunciation comparison endpoint with similarity scoring
Added stress_pattern to phonetic responses
Rate limit headers now included in all responses

v2.1, June 2025

Initial v2 release with breaking changes from v1
New authentication model (Bearer tokens)
500+ voices across 100+ languages
Sub-200ms latency improvements

Support

Need help with the Pronunciation API? Here are the best ways to get support:

Channel	Response Time	Best For
Developer Forum	Community-driven	General questions, code examples
Email Support	< 4 hours	Technical issues, billing
Slack Community	Real-time	Quick questions, networking
Enterprise SLA	< 1 hour	Critical issues (Enterprise plan)

🚀

Ready to start building? Get your free API key and make your first call in under 2 minutes.
