MiniMax CLI: Terminal AI for Text, Images, Video, Speech, Music
MiniMax CLI lets you generate text, images, video, speech, and music directly from the terminal or from AI agents. It supports streaming, multi-turn chat, vision, and web search, and works with both the global and the China (CN) API. It requires Node.js 18+ and a MiniMax token.
Multimodal Generation Capabilities
MiniMax CLI provides terminal access to MiniMax AI for text (multi-turn conversations, streaming responses, system prompts, JSON mode), images (text-to-image with aspect ratios and batching), video (asynchronous generation with progress tracking), speech (TTS with 30+ voices, speed controls, and streaming playback), music (text-to-music with optional lyrics), vision (image analysis and description), and web search. Dual-region support lets the same commands target either the global endpoint (api.minimax.io) or the China endpoint (api.minimaxi.com). Agents such as OpenClaw, Cursor, or Claude Code can call the CLI to use these features.
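As a rough illustration of the dual-region idea, the sketch below maps a region key to the hostnames quoted above. The region keys ("global", "cn") and the function name api_host are hypothetical and are not part of the mmx CLI; only the two hostnames come from the text.

```python
# Hostnames are taken from the text above. The region keys and the
# function name are hypothetical, not part of the mmx CLI itself.
API_HOSTS = {
    "global": "api.minimax.io",
    "cn": "api.minimaxi.com",
}


def api_host(region: str) -> str:
    """Return the API hostname for a region key ("global" or "cn")."""
    key = region.strip().lower()
    if key not in API_HOSTS:
        raise ValueError(
            f"unknown region {region!r}; expected one of {sorted(API_HOSTS)}"
        )
    return API_HOSTS[key]
```

Keeping the mapping in one table means a wrapper script only has to carry a single region setting rather than hard-coded URLs.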
Trade-offs: Async operations like video require polling for status; all features need a paid MiniMax token plan (global: platform.minimax.io/subscribe/token-plan; CN: platform.minimaxi.com/subscribe/token-plan).
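The polling trade-off can be sketched generically. This is a minimal loop under stated assumptions, not the CLI's own implementation: check_status is a hypothetical callable supplied by the caller, and the state names "success" and "failed" are assumed for illustration. The CLI may already handle polling internally through its progress tracking.

```python
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status() until it reports a terminal state or time runs out."""
    terminal = {"success", "failed"}  # assumed state names, not from the CLI docs
    waited = 0.0
    while True:
        status = check_status()
        if status in terminal:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing sleep as a parameter keeps the loop testable without real delays; a fixed interval is the simplest choice, and a backoff schedule could replace it for long-running jobs.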
Setup and Authentication
Install globally with bun install -g @minimaxi/cli or npm i -g @minimaxi/cli (Node.js 18+ required). Authenticate with mmx auth, which uses an OAuth browser flow, and sign out with mmx auth logout. Check quotas with mmx quota, change settings with mmx config set, and update the CLI with mmx update. The repository is written in TypeScript (99.8%), has 280 stars and 16 forks, and includes documentation such as AGENTS.md, SKILL.md, and ERRORS.md.
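Before installing, a setup script might verify the Node.js 18+ requirement. The helper below is a small sketch that parses the string printed by node --version (for example "v18.17.0"); the function name node_meets_minimum is hypothetical.

```python
import re


def node_meets_minimum(version_output: str, minimum_major: int = 18) -> bool:
    """Return True if a `node --version` string such as "v18.17.0" is new enough."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {version_output!r}")
    # Only the major version matters for an "18+" requirement.
    return int(match.group(1)) >= minimum_major
```

In practice the input could come from subprocess.run(["node", "--version"], capture_output=True, text=True).stdout.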
Practical Command Patterns
Pipe earlier turns into a chat: printf 'user:Hi\nassistant:Hey!\n' | mmx text "How are you?" (printf is used here because echo does not expand \n in every shell).
Generate images or videos in batches, for example: mmx image "A cat" "Logo".
Stream speech: mmx speech "Hello!" --stream | say (macOS), or pipe the stream to another program that reads from standard input.
Music with lyrics: mmx music "Upbeat pop" "[verse] La da dee, sunny day".
Vision: mmx vision "What breed?" < image.jpg.
Search: mmx search "MiniMax AI latest news".
Quick starts such as mmx text "What is MiniMax?" or mmx image "A cat in a spacesuit" return results directly in the terminal, subject to the quota of your token plan.
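The piped-history pattern above can also be driven from a script. The sketch below assumes the "role:message" line format inferred from the example and that mmx is on the PATH; format_history and run_text are hypothetical helper names, and only the mmx text invocation itself comes from the text.

```python
import subprocess
from typing import List, Tuple


def format_history(turns: List[Tuple[str, str]]) -> str:
    """Render (role, message) pairs as "role:message" lines, one per turn."""
    # Format inferred from the piped example; not a documented specification.
    return "".join(f"{role}:{message}\n" for role, message in turns)


def run_text(prompt: str, turns: List[Tuple[str, str]]) -> str:
    """Run `mmx text PROMPT` with the history on stdin and return its stdout."""
    result = subprocess.run(
        ["mmx", "text", prompt],
        input=format_history(turns),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, run_text("How are you?", [("user", "Hi"), ("assistant", "Hey!")]) would mirror the shell pipeline shown above. Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.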