Qwen Surpasses Llama in Downloads and Inference Cost

Chinese models claimed 41% of Hugging Face downloads last year versus 36.5% for U.S. models; Qwen undercut Llama decisively on inference cost, yet Alibaba dismantled its roughly 100-person team after the project lead resigned.

Chinese Models Dominate Open-Source AI Downloads

Between February 2025 and February 2026, Chinese models captured 41% of all downloads on Hugging Face, outpacing U.S. models at 36.5%. Qwen, Alibaba's leading model family, drove much of this shift despite its team of roughly 100 engineers—far smaller than ByteDance's competing unit of nearly 1,000.

Qwen's Inference Edge Over Llama

Across three weeks of benchmarking on a client's workload, the author dropped Llama from testing after only four days: Qwen's cost advantage was so large that further Llama comparisons could not be justified for production inference.
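The article does not publish its cost figures, but the kind of per-request comparison it describes can be sketched as below. All prices and token counts here are illustrative assumptions, not numbers from the benchmark.

```python
# Hypothetical sketch of a per-request inference cost comparison.
# Prices and token counts are assumed for illustration only; they are
# not figures from the article's benchmark.

def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one request at per-million-token prices."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000

# Assumed example prices ($ per million input/output tokens).
qwen_cost = request_cost(3000, 900, price_in_per_m=0.40, price_out_per_m=1.20)
llama_cost = request_cost(3000, 900, price_in_per_m=0.90, price_out_per_m=2.70)

print(f"qwen:  ${qwen_cost:.5f} per request")   # $0.00228
print(f"llama: ${llama_cost:.5f} per request")  # $0.00513
print(f"ratio: {llama_cost / qwen_cost:.2f}x")  # 2.25x
```

Multiplied across millions of production requests, even a modest per-request gap of this shape compounds into the kind of difference that makes one model "unjustifiable" against the other.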

Qwen Team's Sudden End

On March 4, 2026, at 12:11 AM Beijing time, Qwen lead Junyang Lin posted on X: 'me stepping down. bye my beloved qwen.' Alibaba then dismantled the team. The article teases '3 Things to Do Before July' for teams, underscoring the urgency of Qwen's rise, but the details are paywalled.

Summarized by x-ai/grok-4.1-fast via openrouter

3684 input / 902 output tokens in 8466ms

© 2026 Edge