Qwen Surpasses Llama in Downloads and Inference Cost
Chinese models claimed 41% of Hugging Face downloads last year, versus 36.5% for U.S. models. Qwen undercut Llama on inference cost, but Alibaba disbanded its ~100-person team after its lead resigned.
Chinese Models Dominate Open-Source AI Downloads
Between February 2025 and February 2026, Chinese models captured 41% of all downloads on Hugging Face, outpacing U.S. models at 36.5%. Qwen, Alibaba's leading model family, drove much of this shift despite its ~100-engineer team—far smaller than ByteDance's competing unit of nearly 1,000.
Qwen's Inference Edge Over Llama
During three weeks of benchmarking a client's workload, the author abandoned the Llama tests after just four days: Qwen's cost advantage proved insurmountable, making further Llama comparisons unjustifiable for production inference.
Qwen Team's Sudden End
On March 4, 2026, at 12:11 AM Beijing time, Qwen lead Junyang Lin posted on X: 'me stepping down. bye my beloved qwen.' Alibaba then dismantled the team. The article also teases '3 Things to Do Before July' for teams, underscoring the urgency created by Qwen's rise, but the details are paywalled.