Alibaba’s Qwen team has reported Chatbot Arena preview rankings for two Qwen3.7 variants: Qwen3.7 Max Preview placed 13th overall in the Text Arena, and Qwen3.7 Plus Preview placed 16th overall in the Vision Arena, according to the team’s post as captured on Thread Reader App.
Chatbot Arena rankings are determined by human preference votes in blind head-to-head comparisons, making them harder to inflate than single-benchmark scores. A 13th-place position in text puts Qwen3.7 Max alongside models from OpenAI, Anthropic, and Google. The separate top-20 vision position for Qwen3.7 Plus is notable because most labs optimize one capability at a time.
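To make the voting mechanism concrete, here is a minimal sketch of how blind pairwise votes can be turned into a ranking. It uses a plain Elo-style update as an illustration and is not the Arena's actual scoring pipeline; the model names, the starting rating of 1000, and the K-factor of 32 are all assumptions chosen for the example.

```python
from typing import Dict, Iterable, List, Tuple

# One vote: (model shown as A, model shown as B, outcome "a" | "b" | "tie").
Vote = Tuple[str, str, str]


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(
    ratings: Dict[str, float],
    votes: Iterable[Vote],
    k: float = 32.0,         # assumed step size, not an Arena parameter
    initial: float = 1000.0,  # assumed starting rating for unseen models
) -> Dict[str, float]:
    """Apply each vote in order, moving ratings toward the observed outcome."""
    outcome_to_score = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, outcome in votes:
        ra = ratings.setdefault(model_a, initial)
        rb = ratings.setdefault(model_b, initial)
        ea = expected_score(ra, rb)
        sa = outcome_to_score[outcome]
        # The two updates are equal and opposite, so total rating is conserved.
        ratings[model_a] = ra + k * (sa - ea)
        ratings[model_b] = rb - k * (sa - ea)
    return ratings


def leaderboard(ratings: Dict[str, float]) -> List[str]:
    """Model names sorted from highest to lowest rating."""
    return sorted(ratings, key=lambda name: ratings[name], reverse=True)


if __name__ == "__main__":
    # Hypothetical votes between placeholder models.
    sample_votes = [
        ("model_x", "model_y", "a"),
        ("model_y", "model_z", "tie"),
        ("model_x", "model_z", "a"),
    ]
    print(leaderboard(update_ratings({}, sample_votes)))
```

Because each rating moves only when a human prefers one anonymous answer over another, a model cannot climb by overfitting to a fixed test set, which is the property the ranking's credibility rests on.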
The “Preview” label signals that these are pre-release checkpoints, not final weights. Rankings often shift between preview and general release as labs tune specifically for human preference. An earlier Qwen generation, Qwen2.5, closed a meaningful gap with frontier models at a fraction of the training cost, and the Qwen3.7 rankings suggest that trajectory is continuing.
For teams building on open-weight models, Qwen3.7 is worth testing on your own workloads before the general release lands. If the preview rankings hold, any team currently defaulting to a heavier frontier model should, for cost reasons, benchmark Qwen3.7 Max as a direct alternative before committing to a Q3 contract.
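One way to run that kind of workload-specific comparison is a small blind pairwise harness, sketched below. Everything here is hypothetical: `candidate` and `baseline` stand in for whatever functions call your two models, and `judge` stands in for a human reviewer or scoring routine that sees two unlabeled answers and returns "first", "second", or "tie". No real model API is assumed.

```python
import random
from typing import Callable, Dict, Optional, Sequence

Model = Callable[[str], str]            # prompt -> answer (hypothetical wrapper)
Judge = Callable[[str, str, str], str]  # (prompt, first, second) -> verdict


def blind_pairwise_eval(
    prompts: Sequence[str],
    candidate: Model,
    baseline: Model,
    judge: Judge,
    seed: int = 0,
) -> Dict[str, int]:
    """Tally judge preferences with answer order shuffled per prompt."""
    rng = random.Random(seed)
    tally = {"candidate": 0, "baseline": 0, "tie": 0}
    for prompt in prompts:
        answers = [("candidate", candidate(prompt)), ("baseline", baseline(prompt))]
        rng.shuffle(answers)  # hide which model produced which answer
        (label_1, text_1), (label_2, text_2) = answers
        verdict = judge(prompt, text_1, text_2)
        if verdict == "first":
            tally[label_1] += 1
        elif verdict == "second":
            tally[label_2] += 1
        else:
            tally["tie"] += 1
    return tally


def win_rate(tally: Dict[str, int]) -> Optional[float]:
    """Candidate wins as a share of decided (non-tie) comparisons."""
    decided = tally["candidate"] + tally["baseline"]
    return tally["candidate"] / decided if decided else None


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end; replace with real calls.
    prompts = ["summarize the ticket", "draft a reply"]
    tally = blind_pairwise_eval(
        prompts,
        candidate=lambda p: p + " (detailed answer)",
        baseline=lambda p: p,
        judge=lambda p, a, b: "first" if len(a) > len(b) else "second",
    )
    print(tally, win_rate(tally))
```

Shuffling the answer order mirrors the blind setup the Arena uses, so position or brand bias in the judge does not favor either model. A few dozen prompts drawn from real traffic will usually say more about fit than a leaderboard position does.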
Reported by the Alibaba Qwen team on Thread Reader App (undated).