Qwen Surpasses Llama in Downloads and Inference Cost

Chinese models claimed 41% of Hugging Face downloads last year versus 36.5% for U.S. models; Qwen undercut Llama decisively on inference cost, yet Alibaba dismantled its roughly 100-person team after the project lead resigned.

Chinese Models Dominate Open-Source AI Downloads

Between February 2025 and February 2026, Chinese models captured 41% of all downloads on Hugging Face, outpacing U.S. models at 36.5%. Qwen, Alibaba's leading model family, drove much of this shift despite its team of roughly 100 engineers—far smaller than ByteDance's competing unit of nearly 1,000.

Qwen's Inference Edge Over Llama

Across three weeks of benchmarking on a client's workload, the author dropped Llama from testing after only four days: Qwen's cost advantage was so large that further Llama comparisons could not be justified for production inference.
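The article does not publish its cost figures, but the kind of per-request comparison it describes can be sketched as below. All prices and token counts here are illustrative assumptions, not numbers from the benchmark.

```python
# Hypothetical sketch of a per-request inference cost comparison.
# Prices and token counts are assumed for illustration only; they are
# not figures from the article's benchmark.

def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one request at per-million-token prices."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000

# Assumed example prices ($ per million input/output tokens).
qwen_cost = request_cost(3000, 900, price_in_per_m=0.40, price_out_per_m=1.20)
llama_cost = request_cost(3000, 900, price_in_per_m=0.90, price_out_per_m=2.70)

print(f"qwen:  ${qwen_cost:.5f} per request")   # $0.00228
print(f"llama: ${llama_cost:.5f} per request")  # $0.00513
print(f"ratio: {llama_cost / qwen_cost:.2f}x")  # 2.25x
```

Multiplied across millions of production requests, even a modest per-request gap of this shape compounds into the kind of difference that makes one model "unjustifiable" against the other.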

Qwen Team's Sudden End

On March 4, 2026, at 12:11 AM Beijing time, Qwen lead Junyang Lin posted on X: 'me stepping down. bye my beloved qwen.' Alibaba then dismantled the team. The article teases '3 Things to Do Before July' for teams, underscoring the urgency of Qwen's rise, but the details are paywalled.

Summarized by x-ai/grok-4.1-fast via openrouter

3684 input / 902 output tokens in 8466ms

© 2026 Edge