Getting started
Make your first API request in minutes.
Output reference
Learn the JSON schema and fields.
API overview
Langcraft Speech API provides pronunciation assessment and prosody analysis in a single call. It’s designed for EdTech, speech therapy, and linguistic analysis, and returns structured JSON you can use directly in your app.
Key capabilities:
- Word and phoneme alignment with millisecond timing
- Phoneme‑level scoring and error detection
- Automatic transcription with word‑level timestamps when no reference text is provided
- Multilingual support (40+ languages)
- Model selection: send model=aurora-1 for languages other than English, or leave model unset for standard English analysis
Highlights
- Per‑phoneme scores with timestamps
- Per‑word rollups and summaries
- Pitch and stress contours at the phoneme and word levels
- Alignment metadata to connect audio, phones, and text
Inputs
You can analyze speech using any of the following:
- A reference text (reference_text) plus language code (lang): the API runs grapheme‑to‑phoneme (G2P) generation to derive canonical phones
- A direct IPA phone sequence (reference_phones): bypasses G2P, useful for pronunciation contrast tests
- Audio only: the API runs automatic transcription and uses the transcript as the reference
reference_text accepts the alias text. reference_phones accepts the alias ipa.
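As a rough illustration of the three input modes, the sketch below builds only the reference-related request fields. The field names reference_text, lang, and reference_phones come from this page. The helper name build_reference_fields, the choice to reject text and phones together, and the "en" language-code format are assumptions made for the example. How the audio itself is uploaded is not covered here, so it is left out.

```python
from typing import Dict, Optional


def build_reference_fields(
    reference_text: Optional[str] = None,
    lang: Optional[str] = None,
    reference_phones: Optional[str] = None,
) -> Dict[str, str]:
    """Return request fields for one of the three documented input modes.

    Hypothetical helper: only the field names are taken from the docs.
    """
    if reference_text is not None and reference_phones is not None:
        # Assumption: this sketch treats text and phones as alternatives.
        raise ValueError("pass reference_text or reference_phones, not both")

    fields: Dict[str, str] = {}
    if lang is not None:
        # Assumption: lang is forwarded whenever supplied; the docs only
        # pair it explicitly with reference_text.
        fields["lang"] = lang

    if reference_phones is not None:
        # Direct IPA phone sequence: G2P is bypassed.
        fields["reference_phones"] = reference_phones
    elif reference_text is not None:
        if lang is None:
            raise ValueError("reference_text needs a language code (lang)")
        # Reference text: the API derives canonical phones via G2P.
        fields["reference_text"] = reference_text
    # Otherwise audio only: no reference fields, the API transcribes the audio.
    return fields
```

Calling build_reference_fields() with no arguments yields an empty dict, which corresponds to the audio-only mode. The aliases text and ipa could be substituted for the two key names if shorter fields are preferred.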
Model selection
For non-English speech, set the official public model selector: model=aurora-1
aurora-1 is an experimental public model selector designed for languages other than English. It is currently recommended for German, French, and Spanish, and is the only non-default model selector currently documented for public integrations.
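The selection rule can be sketched as a small helper that decides whether to send the model field at all. The value aurora-1 and the "leave model unset for English" behavior come from this page. The function name model_fields and the assumption that language codes look like "en", "en-US", or "de" are illustrative only.

```python
from typing import Dict

NON_ENGLISH_MODEL = "aurora-1"  # the only non-default selector documented here


def model_fields(lang: str) -> Dict[str, str]:
    """Return the model field to send for a language code, if any.

    Hypothetical helper; assumes codes such as "en", "en-US", or "de".
    """
    # Reduce "en-US" or "en_GB" to the primary subtag "en".
    primary = lang.strip().lower().replace("_", "-").split("-")[0]
    if primary == "en":
        # English: leave model unset so the standard analysis is used.
        return {}
    return {"model": NON_ENGLISH_MODEL}
```

Returning an empty dict for English, rather than an explicit default value, mirrors the guidance to leave model unset for standard English analysis.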