ai/qwen3.5

Verified Publisher

By Docker

Updated 6 days ago

397B MoE model with 17B activation for reasoning, coding, agents, and multimodal understanding


ai/qwen3.5 repository overview

Qwen3.5

Qwen3.5 represents a significant advancement in foundation models, delivering exceptional utility and performance through breakthroughs in multimodal learning, architectural efficiency, and global accessibility. The flagship Qwen3.5-397B-A17B model features 397 billion total parameters with 17 billion activated parameters using a sparse Mixture-of-Experts architecture, achieving state-of-the-art results across reasoning, coding, agents, and visual understanding tasks.

This model integrates unified vision-language capabilities through early fusion training on multimodal tokens, achieving cross-generational parity with text-focused Qwen3 models while surpassing previous Qwen3-VL models. The efficient hybrid architecture combines Gated Delta Networks with sparse Mixture-of-Experts to deliver high-throughput inference with minimal latency and cost overhead.
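To make the "10 routed + 1 shared active" expert layout concrete, the sketch below shows toy top-k router logic in NumPy. This is an illustration of sparse MoE routing in general, not Qwen3.5's actual implementation; only the expert counts (512 experts, 10 routed per token) are taken from this card.

```python
import numpy as np

def route_tokens(router_logits, k=10):
    """Toy top-k MoE router: each token is dispatched to its k
    highest-scoring experts (a shared expert would run for every
    token unconditionally). Gate weights are a softmax over the
    selected logits only."""
    idx = np.argsort(router_logits, axis=-1)[:, -k:]   # top-k expert ids per token
    top = np.take_along_axis(router_logits, idx, axis=-1)
    gates = np.exp(top - top.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)         # normalized gate weights
    return idx, gates

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 512))   # 4 tokens, 512 experts
idx, gates = route_tokens(logits)
# each token activates 10 of 512 routed experts (~2% of the pool),
# which is how 397B total parameters yield only 17B active per token
```

Because only the selected experts run per token, compute cost tracks the activated parameter count rather than the total.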

Qwen3.5 provides expanded multilingual support for 201 languages and dialects, enabling inclusive worldwide deployment with nuanced cultural and regional understanding. The model's reinforcement learning approach scales across million-agent environments with progressively complex task distributions for robust real-world adaptability.


Characteristics

| Attribute | Value |
|---|---|
| Provider | Qwen / Alibaba Cloud |
| Architecture | Mixture-of-Experts (512 experts, 10 routed + 1 shared active) |
| Total Parameters | 397B (17B activated) |
| Context Length | 262,144 tokens (extensible to 1,010,000) |
| Languages | 201 languages and dialects |
| Input modalities | Text, Image |
| Output modalities | Text |
| License | Apache 2.0 |

Using this model with Docker Model Runner

docker model run ai/qwen3.5

For more information, check out the Docker Model Runner docs.
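A typical local workflow is sketched below. The `docker model pull` and `docker model run` commands are standard Docker Model Runner usage; the `curl` call assumes host-side TCP access to the runner's OpenAI-compatible API is enabled on its default port (12434), which depends on your Docker configuration.

```shell
# Pull a specific quantized tag instead of the default
docker model pull ai/qwen3.5:9B-UD-Q4_K_XL

# One-shot prompt from the CLI
docker model run ai/qwen3.5 "Summarize the trade-offs of sparse MoE models."

# With TCP host access enabled (assumed default port 12434), the
# OpenAI-compatible chat endpoint can be called directly:
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ai/qwen3.5", "messages": [{"role": "user", "content": "Hello"}]}'
```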

Benchmarks

Benchmark Overview

Knowledge

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| MMLU-Pro | 87.4 | 89.5 | 89.8 | 85.7 | 87.1 | 87.8 |
| MMLU-Redux | 95.0 | 95.6 | 95.9 | 92.8 | 94.5 | 94.9 |
| SuperGPQA | 67.9 | 70.6 | 74.0 | 67.3 | 69.2 | 70.4 |
| C-Eval | 90.5 | 92.2 | 93.4 | 93.7 | 94.0 | 93.0 |

Instruction Following

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| IFEval | 94.8 | 90.9 | 93.5 | 93.4 | 93.9 | 92.6 |
| IFBench | 75.4 | 58.0 | 70.4 | 70.9 | 70.2 | 76.5 |
| MultiChallenge | 57.9 | 54.2 | 64.2 | 63.3 | 62.7 | 67.6 |

Long Context

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| AA-LCR | 72.7 | 74.0 | 70.7 | 68.7 | 70.0 | 68.7 |
| LongBench v2 | 54.5 | 64.4 | 68.2 | 60.6 | 61.0 | 63.2 |

STEM

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| GPQA | 92.4 | 87.0 | 91.9 | 87.4 | 87.6 | 88.4 |
| HLE | 35.5 | 30.8 | 37.5 | 30.2 | 30.1 | 28.7 |

Reasoning

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| LiveCodeBench v6 | 87.7 | 84.8 | 90.7 | 85.9 | 85.0 | 83.6 |
| HMMT Feb 25 | 99.4 | 92.9 | 97.3 | 98.0 | 95.4 | 94.8 |
| HMMT Nov 25 | 100.0 | 93.3 | 93.3 | 94.7 | 91.1 | 92.7 |
| IMOAnswerBench | 86.3 | 84.0 | 83.3 | 83.9 | 81.8 | 80.9 |
| AIME26 | 96.7 | 93.3 | 90.6 | 93.3 | 93.3 | 91.3 |

General Agent

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| BFCL-V4 | 63.1 | 77.5 | 72.5 | 67.7 | 68.3 | 72.9 |
| TAU2-Bench | 87.1 | 91.6 | 85.4 | 84.6 | 77.0 | 86.7 |
| VITA-Bench | 38.2 | 56.3 | 51.6 | 40.9 | 41.9 | 49.7 |
| DeepPlanning | 44.6 | 33.9 | 23.3 | 28.7 | 14.5 | 34.3 |
| Tool Decathlon | 43.8 | 43.5 | 36.4 | 18.8 | 27.8 | 38.3 |
| MCP-Mark | 57.5 | 42.3 | 53.9 | 33.5 | 29.5 | 46.1 |

Search Agent

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| HLE w/ tool | 45.5 | 43.4 | 45.8 | 49.8 | 50.2 | 48.3 |
| BrowseComp | 65.8 | 67.8 | 59.2 | 53.9 | -- | 69.0 |
| WideSearch | 76.8 | 76.4 | 68.0 | 57.9 | 72.7 | 74.0 |
| Seal-0 | 45.0 | 47.7 | 45.5 | 46.9 | 57.4 | 46.9 |

Multilingualism

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
|---|---|---|---|---|---|---|
| MMMLU | 89.5 | 90.1 | 90.6 | 84.4 | 86.0 | 88.5 |

Considerations

  • The model requires substantial computational resources; quantized versions (Q2_K_XL through Q8_0) are available for different hardware configurations
  • Native context length is 262K tokens; extended context up to 1M tokens may require additional configuration
  • This repository contains GGUF quantized versions optimized by Unsloth for efficient local inference
  • The model features a sparse MoE architecture activating only 17B of 397B parameters per token, balancing performance with efficiency
  • Multimodal capabilities support both text and image inputs through unified vision-language training
  • Vision capabilities require separate mmproj model files (included in repository)
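As a rough sanity check on quantized tag sizes, a GGUF file scales with parameter count times bits per weight. The ~10% overhead factor below is an assumption (covering embeddings, norms, and metadata kept at higher precision), not a published figure.

```python
def gguf_size_gb(params_b, bits_per_weight, overhead=1.1):
    """Rough GGUF file size in GB: parameters x bits/8, plus an
    assumed ~10% overhead for higher-precision tensors and metadata."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9 * overhead

# e.g. a 9B variant at ~4.5 bits/weight (Q4_K-class quantization)
print(round(gguf_size_gb(9, 4.5), 1))  # -> 5.6
```

This lines up with the 5.6 GB size listed for the `9B-UD-Q4_K_XL` tag; the full 397B model at the same bit width would be well over 200 GB, which is why the quantized variants matter for local use.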
Generated by

This model card was automatically generated using cagent-action. Want to learn more about Docker Model Runner? Check out the project repository: https://github.com/docker/model-runner.

Tag summary

Content type: Model

Digest: sha256:c6c5be279

Size: 5.6 GB

Last updated: 6 days ago

docker model pull ai/qwen3.5:9B-UD-Q4_K_XL

Pulls this week: 5,584