Lyria 3 Pro: Generate 3-Min Songs with Section Timestamps
Lyria 3 Pro adds precise control over full 3-minute songs: timestamps for intro, verse, chorus, and bridge sections, custom lyrics, BPM and key settings, and multimodal image and video inputs through the Gemini API.
Precise Structural Control Unlocks Full Songs
Lyria 3 Pro overcomes the main limitation of Lyria 3: 30-second clips that end abruptly without structure. It generates tracks of up to 3 minutes, and you define sections such as intro (0-10s), verse (10-30s), chorus, bridge, drop, build, solo, or outro with exact timestamps. You can also specify BPM (e.g., 90), key (e.g., A minor), and mood shifts (e.g., lo-fi hip-hop to high-energy peaks). With the structure spelled out, the model follows the instructions closely and produces dynamic compositions in which beats strip away for atmospheric synths before heavy bass drops, while staying coherent across sections.
Prompt example for quick generation: "Dynamic cool underground bar track that constantly shifts energy between chill vibes and high-energy peaks," paired with a BPM of 90 and a chosen key. For structured output, describe each segment's length and style; the result comes with a breakdown such as "intro: lo-fi hip-hop (0-10s)" that transitions smoothly into the builds.
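The section-by-section prompt described above can be assembled programmatically. The following is a minimal sketch, not the documented prompt format: the `Section` class, the `build_structured_prompt` helper, and the exact line layout are illustrative assumptions, and only the 3-minute ceiling, the BPM and key values, and the "intro: lo-fi hip-hop (0-10s)" style of line come from the text. It checks that sections are contiguous and fit within the limit before rendering them.

```python
from dataclasses import dataclass

MAX_TRACK_SECONDS = 180  # the 3-minute ceiling stated in the article


@dataclass(frozen=True)
class Section:
    """Hypothetical container for one song section (not an API type)."""
    name: str   # e.g. "intro", "verse", "chorus"
    start: int  # seconds from the start of the track
    end: int    # seconds from the start of the track
    style: str  # free-text mood or genre description


def build_structured_prompt(sections, bpm, key):
    """Render sections into one prompt string, rejecting gaps and overlaps."""
    if not sections:
        raise ValueError("at least one section is required")
    expected_start = 0
    lines = []
    for s in sections:
        if s.start != expected_start:
            raise ValueError(
                f"{s.name} starts at {s.start}s, expected {expected_start}s"
            )
        if s.end <= s.start:
            raise ValueError(f"{s.name} has non-positive length")
        lines.append(f"{s.name}: {s.style} ({s.start}-{s.end}s)")
        expected_start = s.end
    if expected_start > MAX_TRACK_SECONDS:
        raise ValueError(
            f"track is {expected_start}s, limit is {MAX_TRACK_SECONDS}s"
        )
    header = f"BPM {bpm}, key {key}, total length {expected_start}s."
    return "\n".join([header, *lines])


prompt = build_structured_prompt(
    [
        Section("intro", 0, 10, "lo-fi hip-hop"),
        Section("verse", 10, 30, "beats strip away for atmospheric synths"),
        Section("drop", 30, 50, "heavy bass"),
    ],
    bpm=90,
    key="A minor",
)
print(prompt)
```

Validating the timeline locally keeps malformed requests (a gap between sections, or a total beyond 3 minutes) from reaching the model, where the failure would be harder to diagnose.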
Custom Lyrics and Genre Flexibility
Input your own lyrics and assign them to specific sections (verse here, chorus there) to generate vocals, instrumentation, and full tracks in genres like pop, lo-fi, indie, hip-hop, or classical. Add mood descriptors, instruments, BPM, and key for tailored results. An example lyrics prompt produces a sung line such as "midnight city streets in the rhythm of this room," set against pop beats and builds.
This turns vague ideas into finished, professional-sounding songs, and specifying a language in the prompt may extend the same workflow to other languages. Trade-off: output is strongest with structured prompts and may need iteration for complex video inputs.
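Assigning lyrics to sections amounts to labeling each block of text before sending the prompt. The sketch below shows one way to do that; the `build_lyric_prompt` helper, the bracketed `[verse]` labels, and the header fields are assumptions made for illustration rather than a format the model is documented to require. The verse text is a placeholder, and the chorus line is the one quoted above.

```python
def build_lyric_prompt(genre, lyrics_by_section, bpm=None, key=None, language=None):
    """Assemble a lyrics prompt with one labeled block per section.

    lyrics_by_section is an ordered list of (section_name, lyrics) pairs.
    The layout is an illustrative assumption, not a documented format.
    """
    header = [f"Genre: {genre}"]
    if bpm is not None:
        header.append(f"BPM: {bpm}")
    if key is not None:
        header.append(f"Key: {key}")
    if language is not None:
        header.append(f"Language: {language}")

    blocks = []
    for section, text in lyrics_by_section:
        text = text.strip()
        if not text:
            raise ValueError(f"section {section!r} has no lyrics")
        blocks.append(f"[{section}]\n{text}")

    return "\n".join(header) + "\n\n" + "\n\n".join(blocks)


lyric_prompt = build_lyric_prompt(
    "pop",
    [
        ("verse", "(your verse lines here)"),
        ("chorus", "midnight city streets in the rhythm of this room"),
    ],
    bpm=90,
    key="A minor",
)
print(lyric_prompt)
```

Keeping the sections as an ordered list rather than a dictionary lets the same section name appear more than once, for example a chorus that repeats after each verse.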
Multimodal Inputs for Visual-Mood Matching
Feed in images to get mood-matched tracks: upload a dynamic image with the prompt "create a dynamic track inspired by this image," and Gemini's multimodal analysis of the visuals guides a fitting composition, generated quickly.
For videos, pipe the input through Gemini Flash: the prompt "Dynamic track inspired by this video" auto-generates a soundtrack whose energy follows the content (e.g., short clips get ambient scores). Works for nested videos but performs best on simpler inputs; complex scenes may require refined prompts.
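One reading of "pipe through Gemini Flash" is a two-step pipeline: a multimodal model first describes the video, and that description is then folded into the music prompt. The sketch below illustrates only that control flow under this assumption. `video_to_soundtrack`, `describe_video`, and `generate_music` are hypothetical names, the two callables stand in for real API calls that are not shown here, and the stubs exist only so the example runs without network access.

```python
from typing import Callable


def video_to_soundtrack(
    video_path: str,
    describe_video: Callable[[str, str], str],
    generate_music: Callable[[str], bytes],
) -> bytes:
    """Two-step sketch: describe the video, then prompt for music.

    describe_video(path, instruction) and generate_music(prompt) are
    hypothetical callables supplied by the caller, not real API functions.
    """
    instruction = "Describe the mood, pacing, and energy of this video."
    description = describe_video(video_path, instruction)
    music_prompt = (
        "Dynamic track inspired by this video. "
        f"Video description: {description}"
    )
    return generate_music(music_prompt)


# Stubs standing in for the real model calls, for demonstration only.
def fake_describe(path: str, instruction: str) -> str:
    return "slow pans over a quiet harbor"


def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


audio = video_to_soundtrack("clip.mp4", fake_describe, fake_generate)
print(audio.decode("utf-8"))
```

Passing the model calls in as parameters keeps the pipeline testable and makes it easy to refine the description instruction when a complex scene needs a more specific prompt.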
Access is available via the Gemini app (paid), Vertex AI (enterprise), or Google AI Studio and the Gemini API. You can build apps like the demo's five-tab interface (quick generate, structured composer, lyric studio, image-to-music, video soundtrack) to test all modes rapidly.