Deepgram SDK: Transcribe, TTS, Analyze Audio/Text in Python

Build Scalable Transcription Pipelines with Sync/Async Clients

Initialize DeepgramClient for synchronous calls and AsyncDeepgramClient for concurrent ones, each with your API key. To transcribe audio at a URL, call client.listen.v1.media.transcribe_url(url, model="nova-3", smart_format=True, diarize=True, utterances=True, filler_words=True, language="en"). The structured result lives at response.results.channels[0].alternatives[0]: a transcript, a confidence score (e.g., 0.98), and a words list in which each entry carries the word, start/end times in seconds, confidence, and speaker. Response metadata reports duration, channel count, and model.

For local files, call transcribe_file(request=audio_bytes, model="nova-3", paragraphs=True, summarize="v2"), which adds paragraphs (speaker, start/end, sentences), a short AI summary, and a word count. With the async client, await asyncio.gather(transcribe_url(...), transcribe_file(...)) runs both requests concurrently, so total wall-clock time approaches that of the slowest request rather than the sum of all requests. A high-volume pipeline can therefore keep working instead of blocking on each call.
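The concurrent pattern described above can be sketched as follows. This is a minimal illustration, not the SDK's prescribed usage: the helper name transcribe_urls and the URL_OPTIONS dict are hypothetical, and passing the address as the keyword url= is an assumption about the client's signature. The client argument only needs to expose an awaitable client.listen.v1.media.transcribe_url, as AsyncDeepgramClient is described as doing.

```python
import asyncio

# Options taken from the text above. URL_OPTIONS itself is a hypothetical name.
URL_OPTIONS = {"model": "nova-3", "smart_format": True, "diarize": True}


async def transcribe_urls(client, urls, **options):
    """Transcribe several URLs concurrently; responses come back in input order.

    `client` is expected to behave like AsyncDeepgramClient, i.e. to expose an
    awaitable client.listen.v1.media.transcribe_url(...). The keyword name
    `url=` is an assumption about that method's signature.
    """
    tasks = [
        client.listen.v1.media.transcribe_url(url=url, **options)
        for url in urls
    ]
    # gather() runs all coroutines concurrently and preserves input order.
    return await asyncio.gather(*tasks)
```

With a real async client, this would be driven by something like asyncio.run(transcribe_urls(client, urls, **URL_OPTIONS)). If one request fails, gather raises that exception by default; passing return_exceptions=True to gather is one way to collect failures alongside successes.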

Read a file's raw bytes with open(path, "rb") as f: f.read(). Small helpers such as _get(obj, key) let the same code read a field whether the response arrives as a dict or as an object.

Generate and Compare TTS Voices Efficiently

Create speech with client.speak.v1.audio.generate(text, model="aura-2-asteria-en"), which returns a stream or generator of audio chunks. Combine them into bytes with b"".join(chunk for chunk in response) or response.stream.getvalue(), then save the result as an MP3. Switching voices only requires changing the model name: "aura-2-asteria-en" (female, warm), "aura-2-orion-en" (male, deep), and "aura-2-luna-en" (female, bright). Rendering the same short text, such as "Hello!", with each voice produces files of roughly 10-50 KB, which makes A/B testing or dynamic voice selection in an app straightforward. The same client thus covers the speech-output half of a voice AI loop after transcription.
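The voice comparison above could be sketched like this. The helper names collect_audio and compare_voices are hypothetical, and calling generate with the keywords text= and model= is an assumption about its signature. collect_audio accepts either response shape mentioned in the text: an iterable of chunks, or an object with a stream that supports getvalue().

```python
from pathlib import Path

# Voice model names as listed in the text above.
VOICES = ("aura-2-asteria-en", "aura-2-orion-en", "aura-2-luna-en")


def collect_audio(response):
    """Turn a TTS response into a single bytes object."""
    if isinstance(response, (bytes, bytearray)):
        return bytes(response)
    stream = getattr(response, "stream", None)
    if stream is not None and hasattr(stream, "getvalue"):
        return stream.getvalue()
    # Otherwise treat the response as an iterable of byte chunks.
    return b"".join(chunk for chunk in response)


def compare_voices(client, text, out_dir, voices=VOICES):
    """Render `text` once per voice; return {model name: file size in bytes}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    sizes = {}
    for model in voices:
        # Keyword names text= and model= are assumptions about the SDK call.
        response = client.speak.v1.audio.generate(text=text, model=model)
        audio = collect_audio(response)
        (out_dir / f"{model}.mp3").write_bytes(audio)
        sizes[model] = len(audio)
    return sizes
```

Returning sizes keyed by model name gives a quick sanity check that each voice produced output before anyone listens to the files.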

Extract Insights via Text Intelligence and Advanced Controls

Analyze text with client.read.v1.text.analyze({"text": review_text}, language="en", sentiment=True, topics=True, intents=True, summarize=True). The response provides results.sentiments.average (e.g., POSITIVE with a score of 0.99), per-segment sentiment, topics (e.g., "product_quality" with confidence 0.95), intents (e.g., "recommend" with confidence 0.92), and a summary.

For transcription, extra parameters give finer control: search=["spacewalk","mission"] reports hits with start, end, and confidence; replace=[{"find":"um","replace":"[hesitation]"}] substitutes terms in the transcript; keyterm=["spacewalk","NASA"] boosts recognition of those terms. Calling with_raw_response.transcribe_url(...) exposes response headers such as dg-request-id for debugging. For resilience, pass request_options={"timeout_in_seconds":30, "max_retries":2} and wrap calls in try/except ApiError so that 4xx/5xx responses are handled gracefully instead of crashing a real-time app.
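A hedged sketch of the error-handling pattern follows, applied to the text analysis call. Several details are assumptions rather than confirmed SDK facts: the import path for ApiError, the request= keyword for the text payload, and the helper name analyze_review. The fallback class exists only so the sketch runs where the SDK is not installed or exposes ApiError elsewhere; in real use, import ApiError from wherever your SDK version provides it.

```python
try:
    # Assumed import path; adjust to where your SDK version exposes ApiError.
    from deepgram.core.api_error import ApiError
except ImportError:
    class ApiError(Exception):
        """Stand-in so this sketch runs without the SDK installed."""

        def __init__(self, *, status_code=None, body=None):
            super().__init__(f"status_code: {status_code}, body: {body}")
            self.status_code = status_code
            self.body = body

# Timeout and retry settings quoted from the text above.
REQUEST_OPTIONS = {"timeout_in_seconds": 30, "max_retries": 2}


def analyze_review(client, review_text):
    """Run text analysis; return (response, None) on success or (None, error)."""
    try:
        response = client.read.v1.text.analyze(
            request={"text": review_text},  # keyword name is an assumption
            language="en",
            sentiment=True,
            topics=True,
            intents=True,
            summarize=True,
            request_options=REQUEST_OPTIONS,
        )
    except ApiError as err:
        # 4xx/5xx responses land here; the caller decides whether to retry or skip.
        return None, err
    return response, None
```

Returning the error instead of re-raising keeps a batch loop running when a single item fails, and the caller can still inspect err.status_code to tell a bad request from a server-side failure.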
