Whisper Transcriptions

Tools provides a queue-based Whisper transcription service for media URLs and uploaded audio/video files.

This guide focuses on the public contract: what the feature does, how users interact with it, which endpoints exist, and what clients should expect in requests and responses.

What the feature does

Whisper in Tools can:

  • queue media for transcription
  • show live queue/job progress
  • store the finished transcript
  • optionally generate transcript analysis
  • optionally generate transcript translations
  • optionally attach estimated speaker labels when available
  • create a public transcript share page

Access models

Signed-in web/JWT API

The ordinary Whisper UI and authenticated API use:

  • signed-in web session auth, or
  • JWT bearer auth from POST /api/account/login

User permission requirements:

  • whisper.use for ordinary queue access
  • whisper.manage for full-queue/admin actions such as run-now and all-user visibility
  • provider_openai, required for non-admin users when transcript analysis or translations should run

Token-authenticated transcribe API

Tools also exposes a separate server-to-server transcription API for token-based integrations.

Auth requirements:

  • an active personal token with provider provider_whisper_api
  • recommended transport: Authorization: Bearer YOUR_API_TOKEN
  • token owner must have whisper.api
  • token owner must also have normal Whisper access (whisper.use)
  • admin users bypass ordinary permission checks

Legacy X-Api-Key or apikey transport may still be accepted for backward compatibility, but new integrations should use the Authorization header.
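
As a minimal sketch of the recommended transport, the helper below builds the request headers for a token-authenticated call. The function name and the Accept header are illustrative assumptions; only the Authorization: Bearer form comes from this guide.

```python
def build_auth_headers(api_token: str) -> dict:
    """Return headers for the token-authenticated transcribe API.

    Hypothetical helper: only the Bearer transport is documented,
    the Accept header is an assumption for JSON responses.
    """
    token = api_token.strip()
    if not token:
        raise ValueError("api_token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
```

Keeping header construction in one place makes it easier to move an older integration off the legacy X-Api-Key transport later.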

Queue behavior

Whisper jobs are processed asynchronously.

Job statuses:

  • queued
  • downloading
  • transcribing
  • finalizing
  • completed
  • failed
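
A client that polls jobs usually needs to know when to stop. The sketch below models the statuses listed above and treats completed and failed as terminal, which matches the callback contract later in this guide. The enum and function names are illustrative, and treating unknown status strings as non-terminal is an assumption.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    DOWNLOADING = "downloading"
    TRANSCRIBING = "transcribing"
    FINALIZING = "finalizing"
    COMPLETED = "completed"
    FAILED = "failed"


# Terminal states: no further progress is expected after these.
TERMINAL_STATUSES = frozenset({JobStatus.COMPLETED, JobStatus.FAILED})


def is_terminal(status: str) -> bool:
    """Return True when a job status string is completed or failed."""
    try:
        return JobStatus(status) in TERMINAL_STATUSES
    except ValueError:
        # Assumption: an unrecognized status is treated as still in progress.
        return False
```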

Jobs can also expose a queue origin:

  • queue_channel="web"
  • queue_channel="api"

Signed-in queue/detail views and authenticated /api/whisper/jobs* payloads can therefore show whether a job came from the regular user queue or the token-authenticated API queue.

Admin-owned jobs are prioritized ahead of non-admin jobs when queued work is claimed.

Web UI

/whisper

The signed-in queue UI lets users:

  • submit a media URL
  • upload a media file
  • choose model and language hints
  • set an optional title/label and free-text note
  • select analysis/translation language preferences
  • follow live queue progress
  • open job detail pages

/whisper/jobs/{jobId}

The signed-in detail page shows:

  • current status and progress
  • source title/description
  • runtime log
  • transcript
  • transcript analysis
  • transcript translations
  • speaker-aware transcript when available
  • public share status

Completed jobs can create a public transcript share page.

Public transcript share page

Completed transcripts can be exposed through a tokenized public page under:

  • /shared/whisper/transcript/{token}

The share page is intended for reading and sharing transcripts, not for queue administration.

For token-authenticated API submissions, Tools can now create that share automatically when the transcript completes successfully, and the callback payload includes the direct share URL.

Authenticated Whisper API (/api/whisper/*)

These endpoints use signed-in web/JWT auth, not the dedicated Whisper API token.

GET /api/whisper/status

Returns queue counters and capability flags.

Typical response shape:

{
  "ok": true,
  "summary": {
    "queued": 3,
    "processing": 1,
    "completed": 21,
    "failed": 2
  },
  "can_manage_all": false,
  "config": {
    "enabled": true,
    "default_model": "large",
    "upload_max_mb": 64,
    "upload_limit": {
      "configured_mb": 200,
      "php_upload_max_mb": 64,
      "php_post_max_mb": 128,
      "effective_max_mb": 64,
      "effective_max_label": "64 MB",
      "limited_by_php": true
    },
    "ytdlp_configured": true
  }
}

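upload_max_mb reflects the effective limit for uploaded Whisper media on the current host. The additive config.upload_limit object explains how that limit was derived: when PHP's upload or request-body limits are lower than Whisper's own configured cap, the lower value applies. In the example above, configured_mb is 200 but php_upload_max_mb is 64, so effective_max_mb is 64 and limited_by_php is true.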
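
A client can use these fields to reject oversized files before uploading. The sketch below reads the effective limit from a status payload shaped like the example above. The function names are illustrative, and converting megabytes as 1024 * 1024 bytes is an assumption, since this guide does not state which unit convention the server uses.

```python
from typing import Optional

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def effective_upload_limit_bytes(status_payload: dict) -> Optional[int]:
    """Return the effective upload limit in bytes, or None if not reported."""
    config = status_payload.get("config") or {}
    upload_limit = config.get("upload_limit") or {}
    # Prefer the detailed field, fall back to the top-level summary value.
    limit_mb = upload_limit.get("effective_max_mb", config.get("upload_max_mb"))
    if limit_mb is None:
        return None
    return int(limit_mb * BYTES_PER_MB)


def fits_upload_limit(size_bytes: int, status_payload: dict) -> bool:
    """Return True when a file of size_bytes should pass the reported limit."""
    limit = effective_upload_limit_bytes(status_payload)
    return limit is None or size_bytes <= limit
```

This is only a pre-check. The server remains the authority and can still answer with a 422 under media_file.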

GET /api/whisper/jobs?limit=100

Returns visible Whisper jobs for the authenticated user.

POST /api/whisper/jobs

Queues a new Whisper job.

Supported request styles:

  • JSON/form body with source_url
  • multipart/form-data with media_file

Important rules:

  • send either source_url or media_file, not both
  • if an uploaded file is too large for the current host, is only partially uploaded, or is blocked by temporary-storage or PHP upload errors, the endpoint returns a 422 validation error under media_file with a specific message rather than only the generic “failed to upload” wording

Example JSON body:

{
  "source_url": "https://example.com/audio.mp3",
  "source_label": "Interview with customer",
  "source_note": "Recorded support follow-up call.",
  "model": "large",
  "language": "sv",
  "analysis_language": "sv",
  "translation_target_languages": ["en"]
}
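
The either/or rule can be checked on the client before a request is sent. The sketch below is one way to do that and to build a URL-job body like the example above. Both function names are hypothetical, and rejecting a submission with neither source is an assumption about what the server expects.

```python
from typing import Optional


def source_error(source_url: Optional[str], media_file_path: Optional[str]) -> Optional[str]:
    """Return a client-side error message, or None when exactly one source is set."""
    if source_url and media_file_path:
        return "send either source_url or media_file, not both"
    if not source_url and not media_file_path:
        # Assumption: the server would also reject a job with no source.
        return "one of source_url or media_file is required"
    return None


def build_url_job_body(source_url: str, **metadata) -> dict:
    """Build the JSON body for a URL job, dropping optional fields left as None."""
    error = source_error(source_url, None)
    if error:
        raise ValueError(error)
    body = {"source_url": source_url}
    body.update({key: value for key, value in metadata.items() if value is not None})
    return body
```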

GET /api/whisper/jobs/{jobId}

Returns one visible Whisper job.

Additive job fields now include:

  • queue_channel
  • queue_channel_label
  • source_type
  • source_label
  • source_note
  • source_mime
  • source_size_bytes
  • source_duration_seconds
  • source_duration_human
  • stage_label
  • stage_detail
  • runtime_log[]
  • liveness
  • analysis
  • translations[]
  • diarization
  • share
  • callback (primarily relevant for API-queue jobs)
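
Because these fields are additive, a client should read them defensively. The sketch below builds a one-line summary for an operator or debug view and tolerates missing or null fields. The function name and output format are illustrative, and reading share.url from the job payload assumes it has the same shape as in the callback envelope.

```python
def summarize_job(job: dict) -> str:
    """Build a short operator-facing summary from a Whisper job payload."""
    channel = job.get("queue_channel_label") or job.get("queue_channel") or "unknown queue"
    stage = job.get("stage_label") or job.get("status") or "unknown"
    parts = [f"#{job.get('id', '?')}", channel, stage]
    # "share" may be absent or null until a share page has been created.
    share = job.get("share") or {}
    if share.get("url"):
        parts.append("shared")
    return " | ".join(parts)
```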

POST /api/whisper/jobs/{jobId}/analyze

Runs transcript analysis for a completed transcript.

Guardrails:

  • transcript must already exist
  • non-admin users must have OpenAI access

POST /api/whisper/jobs/{jobId}/cancel

Requests cooperative cancellation for an actively processing job.

POST /api/whisper/jobs/{jobId}/restart

Queues a failed/queued job for retry.

DELETE /api/whisper/jobs/{jobId}

Deletes a non-processing job.

POST /api/whisper/run-now

Admin/manager helper endpoint.

Request body can include:

{
  "limit": 1,
  "reset_failed": true
}

Token-authenticated transcribe API (/api/whisper/transcribe/*)

This is the dedicated server-to-server API. Results are delivered through callbacks.

GET /api/whisper/transcribe/status

Returns queue counters for the token-authenticated API queue channel.

GET /api/whisper/transcribe/jobs?limit=100

Returns visible jobs from the API queue channel.

GET /api/whisper/transcribe/jobs/{jobId}

Returns one visible API-queue job.

POST /api/whisper/transcribe

Queues a new token-authenticated Whisper job.

Required field:

  • callback_url

Supported submission styles:

  • URL jobs using source_url
  • multipart file jobs using media_file

Upload validation guidance:

  • when the uploaded file is larger than the current practical host limit, the API can now return 422 with errors.media_file[] explaining the effective Whisper upload limit
  • the same media_file validation path is also used for partial uploads, missing temp-folder failures, write failures, and other PHP upload transport errors before the job is queued

The token API accepts the same additive metadata as the ordinary queue endpoint, including:

  • source_label
  • source_note
  • model
  • language
  • analysis_language
  • translation_target_languages[]
  • disable_diarization

Example JSON body:

{
  "source_url": "https://example.com/audio.mp3",
  "callback_url": "https://api.example.test/whisper/callback",
  "source_label": "Customer interview",
  "source_note": "Transcribe and send the final result back to our integration.",
  "model": "large",
  "language": "en",
  "analysis_language": "en",
  "translation_target_languages": ["sv"]
}

Example success response:

{
  "ok": true,
  "message": "Whisper API job queued. A callback will be sent when the job reaches a terminal state.",
  "job": {
    "id": 123,
    "queue_channel": "api",
    "queue_channel_label": "API queue",
    "status": "queued",
    "callback": {
      "url": "https://api.example.test/whisper/callback",
      "status": "pending",
      "http_status": null,
      "last_attempt_at": null,
      "delivered_at": null,
      "error": null
    },
    "share": null
  }
}
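
Putting the pieces together, a submission could be prepared as in the sketch below, which uses only the Python standard library and builds the request without sending it. The function name and the base_url parameter are assumptions for illustration; the path, required fields, and Bearer header follow this guide.

```python
import json
import urllib.request


def build_transcribe_request(base_url: str, api_token: str, source_url: str,
                             callback_url: str, **options) -> urllib.request.Request:
    """Prepare a POST /api/whisper/transcribe request for a URL job."""
    body = {"source_url": source_url, "callback_url": callback_url}
    # Optional metadata such as model, language, or source_label.
    body.update({key: value for key, value in options.items() if value is not None})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/whisper/transcribe",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
```

The prepared request can then be sent with urllib.request.urlopen(request), and the job id read from the job.id field of the JSON response.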

Callback contract

When a token-authenticated Whisper API job reaches a terminal state (completed or failed), Tools sends one JSON POST request to the submitted callback_url.

Callback envelope:

{
  "ok": true,
  "event": "whisper.job.completed",
  "job": {
    "job_id": 123,
    "status": "completed",
    "status_label": "Completed",
    "queue_channel": "api",
    "queue_channel_label": "API queue",
    "source": "Customer interview",
    "model": "large",
    "language": "en",
    "job_url": "https://tools.example.test/whisper/jobs/123",
    "share_url": "https://tools.example.test/shared/whisper/transcript/example-token-redacted",
    "transcript_text": "...",
    "analysis_text": "...",
    "translations": [],
    "share": {
      "url": "https://tools.example.test/shared/whisper/transcript/example-token-redacted"
    }
  }
}

Failure callbacks use event="whisper.job.failed" and can include failure_error instead of transcript/share data.

Client guidance:

  • treat callbacks as asynchronous terminal-state notifications
  • store them idempotently by job.job_id
  • do not assume a share URL exists on failed jobs
  • do not assume transcript analysis/translations are always present for every account
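
The guidance above can be sketched as a small receiver that stores each callback once per job.job_id and treats everything except the job id as optional. The class is a hypothetical in-memory example, and reading failure_error from inside the job object is an assumption, since this guide does not specify where that field sits.

```python
import json
from typing import Optional


class CallbackStore:
    """In-memory, idempotent store for Whisper terminal-state callbacks."""

    def __init__(self) -> None:
        self._by_job_id = {}

    def handle(self, raw_body: bytes) -> bool:
        """Record a callback. Return True if it was new, False if a duplicate."""
        payload = json.loads(raw_body)
        job = payload["job"]
        job_id = job["job_id"]
        if job_id in self._by_job_id:
            return False  # already stored, ignore the repeat
        share = job.get("share") or {}
        self._by_job_id[job_id] = {
            "event": payload.get("event"),
            "status": job.get("status"),
            "transcript_text": job.get("transcript_text"),
            "analysis_text": job.get("analysis_text"),
            "translations": job.get("translations") or [],
            "share_url": job.get("share_url") or share.get("url"),
            # Assumption: failure_error is delivered inside the job object.
            "failure_error": job.get("failure_error"),
        }
        return True

    def get(self, job_id: int) -> Optional[dict]:
        """Return the stored record for a job, or None if none was received."""
        return self._by_job_id.get(job_id)
```

A production receiver would back this with a database table keyed on the job id, but the duplicate check and the optional-field handling stay the same.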

Error handling

Typical error classes:

  • 401 unauthenticated / token rejected
  • 403 missing permission
  • 404 job not found or not visible to the caller
  • 422 validation or business-rule failure
  • 429 throttled
  • 5xx temporary backend/provider failure
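
One way to act on these classes is to map each status code to a coarse client action, as in the sketch below. The action names are illustrative and not part of the API. The only retryable classes here are 429 and 5xx, in line with the list above.

```python
def classify_status(http_status: int) -> str:
    """Map an HTTP status code to a coarse client-side action."""
    if 200 <= http_status < 300:
        return "ok"
    if http_status == 401:
        return "reauthenticate"  # session expired or token rejected
    if http_status in (403, 404, 422):
        return "fix_request"  # retrying the same request will not help
    if http_status == 429 or 500 <= http_status < 600:
        return "retry"  # throttled or temporary backend/provider failure
    return "unexpected"
```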

Rate limiting

Whisper API routes use a general throttle:120,1 policy, that is, up to 120 requests per minute in Laravel-style throttle notation.

Clients should still implement normal backoff for repeated polling or transient failures.
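
A common way to implement such backoff is exponential delay with full jitter, sketched below. The base and cap values are assumptions chosen for illustration, not limits published by Tools, and the rng parameter exists only so the function can be tested deterministically.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Return a delay in seconds for a zero-based retry attempt.

    Exponential backoff with full jitter: the delay is drawn uniformly
    between 0 and min(cap, base * 2 ** attempt).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()
```

A polling loop would sleep for backoff_delay(attempt) after each 429 or 5xx response and reset attempt to zero after a success.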

Safe client recommendations

  • Prefer Authorization: Bearer YOUR_API_TOKEN
  • Treat callback_url as required for token-authenticated submissions
  • Expect jobs to finish asynchronously
  • Surface queue_channel and queue_channel_label in operator/debug UIs
  • Treat transcript_text as the primary result and speaker_aware_transcript as additive helper output
  • Treat share.url as public access and handle it carefully