Submit a Transcription Job
POST /transcription
Submits a call recording for processing. Returns a job_id immediately; use it to poll for results.
Request Body
A publicly accessible URL pointing to the audio file to transcribe. You can use a pre-signed URL from cloud storage (S3, GCS, Azure Blob). The URL must remain accessible for at least 15 minutes after submission.
BCP-47 language code for the recording. This determines the speech recognition model. Examples: hi-IN, ta-IN, en-IN, te-IN, mr-IN.
Expected number of speakers in the recording for diarization. Accepted values: 1 to 10. For most call recordings, the default of 2 (agent + customer) is correct.
List of analysis features to run on the recording. Including more features increases processing time slightly. Available values:
- transcription: speaker-diarized, timestamped speech-to-text (always included)
- summary: a plain-language summary of the call (2-4 sentences)
- sentiment: sentiment scores per speaker and overall (-1 to 1 scale)
- intent: primary customer intent detected from the conversation
- qa_scoring: evaluate the call against a QA scorecard (requires qa_scorecard_id)
The ID of the QA scorecard to evaluate against. Required when qa_scoring is included in features. Scorecards are created and managed in the NeuronLens dashboard under QA → Scorecards.
Optional key-value pairs attached to the job. These are passed through unchanged to all response and webhook payloads, which is useful for correlating jobs with records in your own system. For example: {"crm_ticket_id": "TKT-4492", "agent_id": "agt_001"}.
Example Request
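The following is a minimal sketch of assembling a request body under the rules described above, not the official example. Only `features`, `qa_scorecard_id`, the feature names, and the sample metadata values come from this page. The key names `audio_url`, `language`, `num_speakers`, and `metadata`, the helper `build_transcription_request`, and the placeholder URL are assumptions made for illustration and may differ from the real field names.

```python
import json

# Feature names as listed in the Request Body section.
ALLOWED_FEATURES = {"transcription", "summary", "sentiment", "intent", "qa_scoring"}


def build_transcription_request(audio_url, language, num_speakers=2,
                                features=("transcription",),
                                qa_scorecard_id=None, metadata=None):
    """Validate inputs against the documented rules and return the body as a dict.

    Key names other than `features` and `qa_scorecard_id` are hypothetical.
    """
    if not 1 <= num_speakers <= 10:
        raise ValueError("number of speakers must be between 1 and 10")
    unknown = set(features) - ALLOWED_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    if "qa_scoring" in features and not qa_scorecard_id:
        raise ValueError("qa_scoring requires qa_scorecard_id")

    body = {
        "audio_url": audio_url,        # hypothetical key name
        "language": language,          # hypothetical key name
        "num_speakers": num_speakers,  # hypothetical key name
        "features": list(features),
    }
    if qa_scorecard_id is not None:
        body["qa_scorecard_id"] = qa_scorecard_id
    if metadata:
        body["metadata"] = metadata    # hypothetical key name
    return body


if __name__ == "__main__":
    body = build_transcription_request(
        "https://example.com/recordings/call.wav",  # placeholder URL
        "hi-IN",
        features=["transcription", "summary", "sentiment"],
        metadata={"crm_ticket_id": "TKT-4492", "agent_id": "agt_001"},
    )
    print(json.dumps(body, indent=2))
```

The resulting dict would be sent as the JSON body of POST /transcription, together with whatever authentication the API requires. Validating on the client side simply surfaces mistakes such as a missing scorecard ID before the request is made.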
Example Response
Processing time depends on audio duration and the features requested. A typical 3-minute call with all features enabled completes in under 60 seconds. Long recordings (30+ minutes) may take a few minutes. Poll GET /transcription/{job_id} or listen for a transcription.completed webhook event.
Get Job Status and Results
GET /transcription/{job_id}
Poll this endpoint to check the status of a submitted job and retrieve results once processing is complete.
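A polling loop for this endpoint might look like the sketch below. It relies only on the four status values and the error field described on this page. The `fetch_job` callable, the 5-second interval, and the 300-second timeout are assumptions for illustration; the HTTP call itself is injected so the sketch runs without network access.

```python
import time


def poll_job(fetch_job, job_id, interval=5.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_job(job_id) until the job is completed or failed.

    fetch_job is any callable that performs GET /transcription/{job_id}
    and returns the decoded JSON as a dict (hypothetical helper).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {job.get('error')}")
        # Still pending or processing: give up once the deadline has passed.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for a real HTTP call.
    responses = iter([
        {"status": "pending"},
        {"status": "processing"},
        {"status": "completed"},
    ])
    result = poll_job(lambda _id: next(responses), "job_123", sleep=lambda s: None)
    print(result["status"])
```

For long recordings, a longer timeout or the transcription.completed webhook is likely a better fit than tight polling.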
Response Fields
The unique identifier for this transcription job.
Current job status: pending (queued), processing (actively being analyzed), completed (results available), or failed (processing error; check the error field for details).
Array of transcript segments, each with speaker label and timing. Present only when status is completed.
A 2-4 sentence plain-language summary of the call. Present when summary was included in features. null otherwise.
Sentiment analysis results. Present when sentiment was included in features.
The primary customer intent detected from the conversation. Examples: loan_renewal_interest, complaint, payment_query, dnd_request. Present when intent was included in features. null otherwise.
QA evaluation results. Present when qa_scoring was included in features.
The metadata object you submitted with the job, returned unchanged.
ISO 8601 timestamp of when the job was submitted.
ISO 8601 timestamp of when processing finished. null if still in progress.
Example Response (Completed Job)
Supported Audio Formats
NeuronLens accepts the following audio formats:

| Format | Extension | Notes |
|---|---|---|
| WAV | .wav | Recommended for best accuracy. Uncompressed PCM preferred. |
| MP3 | .mp3 | Common for telephony recordings. |
| OGG | .ogg | Ogg Vorbis and Ogg Opus both supported. |
| FLAC | .flac | Lossless — good accuracy, larger file size. |
| M4A | .m4a | AAC audio in MPEG-4 container. |
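
As a client-side pre-check before submitting, the extensions in the table above can be matched against the URL path. This is only a sketch: the helper name `has_supported_extension` is hypothetical, and this page does not say whether the service identifies formats by extension or by file content.

```python
import os
from urllib.parse import urlparse

# Extensions taken from the Supported Audio Formats table.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".m4a"}


def has_supported_extension(audio_url):
    """Return True if the URL path ends in one of the listed extensions."""
    # Using the parsed path ignores the query string of pre-signed URLs.
    path = urlparse(audio_url).path
    ext = os.path.splitext(path)[1].lower()
    return ext in SUPPORTED_EXTENSIONS
```

A pre-signed URL such as `https://example.com/calls/a.wav?signature=...` passes the check because only the path component is inspected.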