DeepSeek V4 Tests: 3D Code Strong, SVG & QA Weak

DeepSeek's likely V4 model in Expert mode builds usable 3D floor plans and Pokeballs via Three.js, but it fails on panda SVGs, chess autoplay, and butterfly scenes, and it stalls midway on simple QA.

Expert Mode Delivers Bigger Outputs but Limits Concurrency

DeepSeek's new interface offers two modes: Expert for the most powerful generations (likely V4) and Instant for image prompts and multimodal tasks. Expert mode processes one prompt at a time without parallel threads, keeping compute focused on complex requests. Attaching an image automatically switches to Instant, confirming multimodal support. Use Expert for single, high-fidelity code outputs like full HTML files with Three.js; avoid it for batch testing due to the one-at-a-time restriction.

3D Generation Succeeds on Practical Layouts and Objects

For a 1585 square foot 3D floor plan with two rooms and two washrooms, Expert mode outputs a single runnable HTML file built with CSS, JavaScript, and Three.js. The result shows an accurate layout: visible bathrooms and bedrooms, fully navigable and usable. Similarly, a Three.js Pokeball prompt generates a polished, dark-blue-tinted sphere comparable in refinement to output from models like GPT-4o. These tests suggest DeepSeek V4 handles interactive 3D architecture and object modeling reliably: copy the HTML, open it in a browser, and interact immediately without tweaks.
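The video shows only the rendered results, not DeepSeek's source, so for scale: a single-file Three.js Pokeball of the kind described can be sketched in roughly forty lines. Below is a minimal hand-written version, not DeepSeek's output, assuming a CDN module import of three@0.160.0; colors and proportions are illustrative.

```html
<!DOCTYPE html>
<html>
<body style="margin:0">
<script type="module">
import * as THREE from 'https://unpkg.com/three@0.160.0/build/three.module.js';

const scene = new THREE.Scene();
scene.background = new THREE.Color(0x10142a); // dark-blue tint, as described in the test

const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
camera.position.set(0, 1, 3.5);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

// Pokeball = red top hemisphere + white bottom hemisphere + black band + button
const half = Math.PI / 2;
const top = new THREE.Mesh(
  new THREE.SphereGeometry(1, 48, 24, 0, Math.PI * 2, 0, half),
  new THREE.MeshStandardMaterial({ color: 0xcc1122 }));
const bottom = new THREE.Mesh(
  new THREE.SphereGeometry(1, 48, 24, 0, Math.PI * 2, half, half),
  new THREE.MeshStandardMaterial({ color: 0xffffff }));
const band = new THREE.Mesh(
  new THREE.TorusGeometry(1, 0.05, 16, 64),
  new THREE.MeshStandardMaterial({ color: 0x111111 }));
band.rotation.x = half; // lay the torus flat around the equator
const button = new THREE.Mesh(
  new THREE.CylinderGeometry(0.16, 0.16, 0.08, 32),
  new THREE.MeshStandardMaterial({ color: 0xeeeeee }));
button.rotation.x = half; // point the cylinder cap toward the camera
button.position.z = 1.0;  // sit half-embedded on the sphere's surface

const ball = new THREE.Group();
ball.add(top, bottom, band, button);
scene.add(ball);

scene.add(new THREE.AmbientLight(0xffffff, 0.4));
const sun = new THREE.DirectionalLight(0xffffff, 1.2);
sun.position.set(3, 5, 4);
scene.add(sun);

(function animate() {
  requestAnimationFrame(animate);
  ball.rotation.y += 0.01; // slow spin so the band and button stay visible
  renderer.render(scene, camera);
})();
</script>
</body>
</html>
```

Opening the file in any WebGL-capable browser is enough; no build step is needed, which matches the copy-and-run workflow described above.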

Creative SVGs, Complex Scenes, and Functionality Fall Short

An SVG prompt for a panda holding a burger produces disproportionate hands and low overall quality, lacking polish. A 3D chessboard with all pieces and autoplay for legal moves looks visually impressive, but autoplay fails entirely: pieces render, yet no opponent simulation or win detection works. A majestic 3D butterfly in a blue garden with camera controls resembles a distorted character (something like Gardevoir) more than an insect; basic movement functions but lacks detail and accuracy. The trade-off: strong visuals don't guarantee working interactions, so test functionality after generation.
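For context on the SVG test, the bar is not high: a recognizable panda-with-burger takes only a handful of primitives. Here is a minimal hand-drawn sketch of the idea, purely illustrative and not DeepSeek's output:

```html
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 200" width="200">
  <!-- ears -->
  <circle cx="60" cy="50" r="20" fill="#222"/>
  <circle cx="140" cy="50" r="20" fill="#222"/>
  <!-- head -->
  <circle cx="100" cy="85" r="50" fill="#fff" stroke="#222" stroke-width="3"/>
  <!-- eye patches and eyes -->
  <ellipse cx="80" cy="80" rx="12" ry="16" fill="#222"/>
  <ellipse cx="120" cy="80" rx="12" ry="16" fill="#222"/>
  <circle cx="82" cy="78" r="4" fill="#fff"/>
  <circle cx="118" cy="78" r="4" fill="#fff"/>
  <!-- nose -->
  <ellipse cx="100" cy="100" rx="8" ry="5" fill="#222"/>
  <!-- burger: top bun dome, patty, bottom bun -->
  <path d="M70 150 a30 18 0 0 1 60 0 z" fill="#e0a04a"/>
  <rect x="70" y="150" width="60" height="10" fill="#7a4a1e"/>
  <rect x="70" y="160" width="60" height="8" rx="4" fill="#e0a04a"/>
  <!-- paws holding the burger -->
  <circle cx="66" cy="158" r="10" fill="#222"/>
  <circle cx="134" cy="158" r="10" fill="#222"/>
</svg>
```

Proportion problems like the oversized hands the model produced show up exactly in elements like the paw circles here, where radius and placement must stay in ratio to the head.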

Reasoning Stalls on Simple QA, Hinting at Scale Limits

Basic question-answering gets stuck midway, failing to complete responses; these issues may be resolved in API versions, but they expose the limits of the current web interface. Overall, V4 shows promise over prior models but trails DeepSeek R1 in size and consistency, so wait for the full release before production use. Prioritize it for 3D code prototypes, where it outperforms on usability.

Video description
In this video, I'll be talking about DeepSeek's newly rolled-out model and updated interface, which many people believe could be DeepSeek V4. I tested it across several coding, SVG, 3D, and reasoning tasks to see how well it performs and whether it actually lives up to the hype.

Key Takeaways:
🚀 DeepSeek appears to be rolling out a brand-new model and interface, and it may be DeepSeek V4.
🧠 The new Expert mode seems to be the more powerful option, while Instant mode handles image prompts and multimodal tasks.
🏠 DeepSeek performed well on some generation tests, especially the 3D floor plan and the Three.js Pokeball.
🎨 Some creative outputs, like the panda SVG and butterfly scene, were noticeably weaker and had quality issues.
♟️ The chess board demo looked visually impressive, but the autoplay feature did not work properly.
🌲 The 3D Minecraft-style demo was promising, although the controls did not function correctly.
📉 On simpler question-answering tests, the model sometimes got stuck midway, showing that it still has limitations.
👍 Overall, the update looks promising, but it may not be as large or as strong as DeepSeek R1.
