Mistral Vibe Remote Agents Run Coding Tasks in the Cloud at 77.6% SWE-Bench

Mistral Vibe now runs coding agents remotely in isolated cloud sandboxes powered by Medium 3.5 (a 128B model scoring 77.6% on SWE-Bench Verified), enabling parallel long-running tasks, automatic GitHub PRs, and seamless local-to-cloud teleport without babysitting.

Cloud-Based Coding Agents Eliminate Developer Bottlenecks

Start Vibe sessions via CLI or Le Chat, then offload them to isolated cloud sandboxes that handle code writing, refactoring, tests, and CI debugging across your full codebase. Sessions run in parallel for multiple tasks, with real-time visibility into file diffs, tool calls, and progress. Teleport ongoing local sessions to the cloud to preserve history and state, freeing you to step away. Agents auto-open GitHub PRs upon completion for review, integrating with Linear/Jira for issues, Sentry for incidents, and Slack/Teams for notifications. Built on Mistral Workflows orchestration, this scales agentic coding from local terminals to production pipelines.
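The parallel-session model can be mimicked locally. The sketch below is purely illustrative, using Python's `concurrent.futures` with placeholder task functions; none of the names here are real Vibe commands or APIs:

```python
# Illustrative only: mimics running several coding tasks concurrently,
# the way Vibe runs sessions in parallel cloud sandboxes. run_task is
# a placeholder, not a real Vibe call.
from concurrent.futures import ThreadPoolExecutor

def run_task(task: str) -> str:
    """Placeholder for an offloaded agent session that ends in a PR."""
    return f"{task}: PR opened"

tasks = ["refactor-auth", "fix-flaky-tests", "debug-ci"]

# Each task runs independently, like separate sandboxed sessions.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_task, tasks))

for line in results:
    print(line)
```

`pool.map` preserves input order, so results line up with the task list even though the work itself runs concurrently.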

Medium 3.5 Delivers Production Coding at 77.6% SWE-Bench Verified

This 128B dense model with a 256k-token context window (~200k words) processes entire codebases in one pass, excelling at instruction-following, reasoning, and coding. It scores 77.6% on SWE-Bench Verified (resolving real GitHub issues from open-source repos), outpacing Devstral 2 and Qwen3.5 397B A17B, and 91.4 on the τ³-Telecom benchmark. Multimodal with a from-scratch vision encoder that handles variable image sizes, it supports configurable reasoning effort per API call: low for quick replies, high for multi-tool agent runs. Use it as the default in Vibe/Le Chat for reliable structured outputs in long-horizon tasks.
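The per-call effort setting can be sketched as a request payload. Note this is an assumption-laden sketch: the `reasoning_effort` field name and the `mistral-medium-3.5` model identifier are illustrative, not confirmed Mistral API details:

```python
# Illustrative sketch of choosing reasoning effort per API call.
# ASSUMPTIONS: the "reasoning_effort" parameter name and the model id
# "mistral-medium-3.5" are hypothetical, not confirmed API details.

def build_chat_request(prompt: str, effort: str = "low") -> dict:
    """Build a chat-completions payload with a per-call effort setting."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "mistral-medium-3.5",   # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,      # assumed parameter name
    }

# Quick reply: keep effort low.
quick = build_chat_request("Summarize this diff.", effort="low")

# Multi-tool agent run: raise effort for long-horizon reasoning.
agentic = build_chat_request("Fix the failing CI job.", effort="high")
```

Keeping the effort choice per request, rather than per model, lets the same deployment serve both fast chat replies and deliberate agent runs.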

Le Chat Work Mode Automates Multi-Step Workflows Transparently

Activate Work mode for agentic execution of general tasks like email/calendar triage or meeting prep, pulling context from docs and mailboxes via always-on connectors. The agent chains tools autonomously but shows every step, including tool calls and rationale, and seeks approval for sensitive actions based on permissions. Powered by the Medium 3.5 harness, it turns Le Chat into an execution backend, reducing manual tool selection for cross-app workflows.
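The approval gate described above can be sketched as a permissions check in front of each tool call. The tool names and the sensitivity map below are hypothetical, chosen only to illustrate the pattern:

```python
# Illustrative sketch of an approval-gated tool loop: the agent surfaces
# each proposed call and executes sensitive ones only after approval.
# Tool names and the SENSITIVE map are hypothetical.
SENSITIVE = {"send_email": True, "read_calendar": False}

def execute(tool: str, approved: bool = False) -> str:
    """Run a tool call, gating sensitive actions on explicit approval."""
    # Unknown tools default to sensitive, failing safe.
    if SENSITIVE.get(tool, True) and not approved:
        return f"{tool}: awaiting approval"
    return f"{tool}: executed"

plan = ["read_calendar", "send_email"]
log = [execute(tool) for tool in plan]
# read_calendar runs immediately; send_email waits for user approval.
```

Defaulting unknown tools to sensitive means a misconfigured permissions map blocks actions rather than silently executing them.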

Summarized by x-ai/grok-4.1-fast via openrouter

8434 input / 2044 output tokens in 19699ms

© 2026 Edge