Supported Models

Fleek provides optimized inference for leading open-source AI models across text, image, video, and 3D generation. All models run at $0.0025 per GPU-second with no per-token or per-prediction markup.
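
As a rough illustration of the flat rate, the sketch below turns GPU-seconds into an estimated charge. Only the $0.0025 rate comes from this page; the function name is hypothetical, and the sketch assumes billing is strictly proportional to GPU-seconds, with no minimum charge or rounding, which this page does not specify.

```python
from decimal import Decimal

# Flat rate quoted on this page, in USD per GPU-second.
RATE_PER_GPU_SECOND = Decimal("0.0025")


def estimate_cost(gpu_seconds):
    """Estimate the charge in USD for a job that uses `gpu_seconds` of GPU time.

    Assumption: cost scales linearly with GPU-seconds. Minimum charges,
    rounding, and billing granularity are not modeled here.
    """
    if gpu_seconds < 0:
        raise ValueError("gpu_seconds must be non-negative")
    # Convert via str() so float inputs such as 1.5 stay exact in Decimal.
    return RATE_PER_GPU_SECOND * Decimal(str(gpu_seconds))


if __name__ == "__main__":
    print(f"4 GPU-seconds: ${estimate_cost(4):.2f}")     # $0.01
    print(f"1 GPU-hour:    ${estimate_cost(3600):.2f}")  # $9.00
```

Under these assumptions, a request that holds a GPU for 4 seconds costs about one cent, and a full GPU-hour costs $9.00, regardless of how many tokens or images are produced.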

Large Language Models (LLMs)

Model	Provider	Parameters	Context (tokens)	Use Case
DeepSeek R1	DeepSeek	671B MoE (37B active)	256K	Advanced reasoning, math, code
Kimi K2.5	Moonshot AI	1T MoE (32B active)	256K	Multimodal, agentic workflows
GLM 4.7	Z.ai	355B MoE (32B active)	200K	Code generation (73.8% SWE-bench)
Qwen3-235B	Alibaba	235B MoE (22B active)	128K	Multilingual, general reasoning
Llama 70B	Meta	70B	128K	Production workloads
gpt-oss-120b	OpenAI	120B	32K	High throughput, clean IP

Coding-Optimized

Model	Provider	Parameters	Context (tokens)	Highlights
Qwen3 Coder 480B	Alibaba	480B MoE	256K	67% SWE-bench, Apache 2.0
Qwen3 Coder 30B A3B	Alibaba	30B MoE (3B active)	256K	Lightweight, fast inference

Image Generation

Model	Provider	Tier	Resolution	Features
FLUX.2	Black Forest Labs	Premium	2048×2048	Multi-reference editing, color control
Z-Image	Alibaba	Quality	2048×2048	DiT architecture, LoRA-friendly
Qwen Image 2512	Alibaba	Quality	2048×2048	Prompt enhancement, style transfer
SDXL Turbo	Stability AI	Fast	1024×1024	Real-time, 1-step generation

Image Editing

Model	Provider	Features
Qwen Edit 2511	Alibaba	Instruction-based editing, style transfer, object removal

Video Generation

Model	Provider	Duration	Resolution	Features
LTX-2 19B	Lightricks	Up to 20s	4K	Audio sync, lip sync, 50fps
Wan Move	Alibaba	Up to 15s	1080p	Image-to-video, motion control
Wan 2.2 14B	Alibaba	Up to 16s	1080p	Text-to-video, temporal consistency
SeedVR	ByteDance	Up to 10s	4K	Video upscaling, detail enhancement
Seedance 1.5	ByteDance	Up to 12s	1080p	Dance generation, beat sync

3D Generation

Model	Provider	Output Formats	Features
Hunyuan 3D-2	Tencent	GLB, OBJ, FBX	Text-to-3D, image-to-3D, game-ready
Trellis 2 4B	Microsoft	GLB, USDZ, OBJ	Fast, MIT licensed, Apple support

Model Selection Guide

For reasoning & analysis

→ DeepSeek R1 — Best-in-class chain-of-thought reasoning

For coding tasks

→ GLM 4.7 — Highest SWE-bench score (73.8%)
→ Qwen3 Coder 480B — Handles entire codebases

For multimodal workflows

→ Kimi K2.5 — Native image/screenshot analysis

For production scale

→ Llama 70B — Proven reliability, broad support

For image generation

→ FLUX.2 — Photorealistic quality
→ SDXL Turbo — Real-time generation

For video

→ LTX-2 19B — The only video model listed here with native audio sync
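
The guide above can also be written as a small lookup table, which may be handy when routing requests in application code. This is only a sketch: the task keys and the `recommend` function are hypothetical names chosen for illustration, and the values are the display names from this page, not confirmed API model identifiers.

```python
# Recommendations copied from the Model Selection Guide.
# Keys are hypothetical task labels; values are display names as shown
# on this page, listed in the order the guide gives them.
RECOMMENDED_MODELS = {
    "reasoning": ["DeepSeek R1"],
    "coding": ["GLM 4.7", "Qwen3 Coder 480B"],
    "multimodal": ["Kimi K2.5"],
    "production": ["Llama 70B"],
    "image": ["FLUX.2", "SDXL Turbo"],
    "video": ["LTX-2 19B"],
}


def recommend(task):
    """Return the guide's suggested models for a task label."""
    key = task.strip().lower()
    if key not in RECOMMENDED_MODELS:
        known = ", ".join(sorted(RECOMMENDED_MODELS))
        raise KeyError(f"unknown task {task!r}; expected one of: {known}")
    return RECOMMENDED_MODELS[key]


if __name__ == "__main__":
    for task in ("coding", "video"):
        print(task, "->", ", ".join(recommend(task)))
```

Keeping the mapping in one place means a new recommendation only requires changing the table, not the calling code.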

Coming Soon

We're constantly adding new models. Request a model on Discord or contact us.

Questions?

Browse models: fleek.sh/models
Pricing details: fleek.sh/docs/pricing
Discord: discord.gg/h6K2cJU4np