Model and Framework Support Matrix

MindIE SD currently integrates with the vLLM Omni and Cache DiT frameworks and with the Modelers community. In principle, MindIE SD can accelerate inference for any multimodal model; the tables below list the representative models and feature combinations that are currently supported.

Model support

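| Model | vLLM Omni | Cache DiT + diffusers | Modelers community |
| --- | --- | --- | --- |
| Stable Diffusion 1.5 | ✖️ | ✖️ | ✔️ |
| Stable Diffusion 2.1 | ✖️ | ✖️ | ✔️ |
| Stable Diffusion XL | ✖️ | ✖️ | ✔️ |
| Stable Diffusion XL_inpainting | ✖️ | ✖️ | ✔️ |
| Stable Diffusion XL_lighting | ✖️ | ✖️ | ✔️ |
| Stable Diffusion XL_controlnet | ✖️ | ✖️ | ✔️ |
| Stable Diffusion XL_prompt_weight | ✖️ | ✖️ | ✔️ |
| Stable Diffusion 3 | ✖️ | ✖️ | ✔️ |
| Stable Video Diffusion | ✖️ | ✖️ | ✔️ |
| Stable Audio Open v1.0 | ✖️ | ✖️ | ✔️ |
| OpenSora v1.2 | ✖️ | ✖️ | ✔️ |
| OpenSoraPlan v1.2 | ✖️ | ✖️ | ✔️ |
| OpenSoraPlan v1.3 | ✖️ | ✖️ | ✔️ |
| CogView3-Plus-3B | ✖️ | ✖️ | ✔️ |
| CogVideoX-2B | ✖️ | ✖️ | ✔️ |
| CogVideoX-5B | ✖️ | ✖️ | ✔️ |
| HunyuanDit | ✖️ | ✖️ | ✔️ |
| HunyuanVideo | ✖️ | ✖️ | ✔️ |
| HunyuanVideo-1.5 | ✖️ | ✖️ | ✔️ |
| Hunyuan3D-2.1 | ✖️ | ✖️ | ✔️ |
| Wan2.1 | ✖️ | ✖️ | ✔️ |
| Wan2.2 | ✖️ | ✖️ | ✔️ |
| FLUX.1-dev | ✔️ | ✔️ | ✔️ |
| FLUX.2-dev | ✖️ | ✔️ | ✔️ |
| Qwen-Image | ✔️ | ✖️ | ✔️ |
| Qwen-Image-Edit | ✔️ | ✖️ | ✔️ |
| Qwen-Image-Edit-2509 | ✔️ | ✖️ | ✔️ |
| Z-Image | ✖️ | ✖️ | ✔️ |
| Z-Image-Turbo | ✔️ | ✖️ | ✔️ |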
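
For scripted environment checks, the model support matrix can be encoded as a small lookup table. The sketch below is illustrative only: `SUPPORT`, `supported_paths`, and `models_for` are hypothetical names and not part of MindIE SD, and the dictionary covers only a handful of rows, with entries inferred from which models appear in the per-framework tables on this page.

```python
# Illustrative sketch only: these names are hypothetical, not a MindIE SD API.
VLLM_OMNI = "vLLM Omni"
CACHE_DIT = "Cache DiT + diffusers"
MODELERS = "Modelers community"

# Column order of the model support table.
PATH_ORDER = [VLLM_OMNI, CACHE_DIT, MODELERS]

# A subset of rows from the model support table (assumed, not exhaustive).
SUPPORT = {
    "Stable Diffusion 1.5": {MODELERS},
    "FLUX.1-dev": {VLLM_OMNI, CACHE_DIT, MODELERS},
    "FLUX.2-dev": {CACHE_DIT, MODELERS},
    "Qwen-Image": {VLLM_OMNI, MODELERS},
    "Z-Image": {MODELERS},
    "Z-Image-Turbo": {VLLM_OMNI, MODELERS},
}


def supported_paths(model: str) -> list[str]:
    """Return the integration paths listed for `model`, in table column order.

    Unknown models yield an empty list rather than raising.
    """
    paths = SUPPORT.get(model, set())
    return [p for p in PATH_ORDER if p in paths]


def models_for(path: str) -> list[str]:
    """Return the models (from the subset above) listed for one integration path."""
    return sorted(m for m, paths in SUPPORT.items() if path in paths)


if __name__ == "__main__":
    print(supported_paths("FLUX.2-dev"))
    print(models_for(VLLM_OMNI))
```

Keeping the column order in a single list means the lookup returns paths in the same order as the table, which makes the output easy to compare against the matrix by eye.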

vLLM Omni features and model performance

Model

Hardware

Cache

Parallelism

Sparse FA

Quantization

Fused operators

FLUX.1-dev

Atlas 800I A2 server

✖️

Qwen-Image

Atlas 800I A2 server

✖️

✖️

Qwen-Image-Edit

Atlas 800I A2 server

✖️

✖️

Qwen-Image-Edit-2509

Atlas 800I A2 server

✖️

✖️

Z-Image-Turbo

Atlas 800I A2 server

✖️

✖️

✖️

Note: The Atlas 800I A2 server defaults to the 313T compute, 64 GB memory configuration.

Cache DiT + diffusers features and model performance

Model

Hardware

Cache

Parallelism

Sparse FA

Quantization

Fused operators

FLUX.1-dev

Atlas 800I A2 server

✖️

FLUX.2-dev

Atlas 800I A2 server

✖️

✖️

✖️

Modelers community feature combinations and model performance

Model

Hardware

Cache

Parallelism

Sparse FA

Quantization

Fused operators

Notes

Stable Diffusion 1.5

  • Atlas 800I A2 server
  • Atlas 300I DUO inference card

✖️

✖️

None

Stable Diffusion 2.1

  • Atlas 800I A2 server
  • Atlas 300I DUO inference card

✖️

✖️

None

Stable Diffusion XL

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server
  • Atlas 300I DUO inference card

✖️

✖️

None

Stable Diffusion XL_inpainting

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

Functional integration complete

Stable Diffusion XL_lighting

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

Functional integration complete

Stable Diffusion XL_controlnet

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

Functional integration complete

Stable Diffusion XL_prompt_weight

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

Functional integration complete

Stable Diffusion 3

  • Atlas 800I A2 server
  • Atlas 300I DUO inference card

✖️

✖️

None

Stable Video Diffusion

Atlas 800I A2 server

✖️

✖️

None

Stable Audio Open v1.0

  • Atlas 800I A2 server
  • Atlas 300I DUO inference card

✖️

✖️

✖️

None

OpenSora v1.2

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

None

OpenSoraPlan v1.2

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

None

OpenSoraPlan v1.3

Atlas 800I A2 server

✖️

✖️

None

CogView3-Plus-3B

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

None

CogVideoX-2B

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

None

CogVideoX-5B

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

None

FLUX.1-dev

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

FLUX.2-dev

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

HunyuanDit

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

None

HunyuanVideo

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

HunyuanVideo-1.5

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

None

Hunyuan3D-2.1

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

Wan2.1

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

None

Wan2.2

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

None

Qwen-Image

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

Qwen-Image-Edit

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

Qwen-Image-Edit-2509

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

None

Z-Image

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

✖️

✖️

None

Z-Image-Turbo

  • Atlas 800I A2 server
  • Atlas 800I A3 supernode server

✖️

✖️

✖️

✖️

None

Note

  • The Atlas 300I DUO inference card defaults to the 280T compute, 48 GB memory configuration.
  • The Atlas 800I A2 server defaults to the 313T compute, 64 GB memory configuration.
  • The Atlas 800I A3 supernode server defaults to the 560T compute, 64 GB memory configuration.