Sesterce

Inference

Quickly deploy a public or custom model to a dedicated inference endpoint.

New Deployment
Public models (Text)
Model                             Parameters   Size      Context Window
gte-Qwen2-1.5B-instruct           1.5 B        2 GB      128 K
Qwen2.5-7B-Instruct               7 B          4.7 GB    128 K
Qwen2.5-Coder-32B-Instruct        32 B         20 GB     128 K
Qwen2.5-Instruct-GPTQ-Int8        14 B         4.7 GB    128 K
Qwen3-14B                         14.2 B       9 GB      32 K
Qwen3-30B-A3B                     30 B         17 GB     128 K
Qwen3-32B                         32.8 B       33 GB     128 K
QwQ-32B                           32.5 B       65.5 GB   32 K
QwQ-32B-Preview                   32.5 B       65.5 GB   32 K
Gemma3-1B                         1 B          1 GB      128 K
Mistral-7B-Instruct-v0.3          7 B          4.1 GB    32 K
Mistral-Nemo-Instruct-2407        12.2 B       7.1 GB    128 K
Mistral-Small-Instruct-2409       7 B          4.7 GB    128 K
Mistral-Small-Instruct-2501       24 B         14 GB     128 K
DeepSeek-R1-Distill-Llama         70 B         4.7 GB    128 K
DeepSeek-R1-Distill-Qwen          32 B         20 GB     128 K
DeepSeek-R1-Distill-Qwen          14 B         9 GB      130 K
Llama-3.1-8B-Instruct             8 B          4.7 GB    128 K
Llama-3.2-1B-Instruct             1 B          1.3 GB    128 K
Llama-3.2-3B-Instruct             3.2 B        2 GB      128 K
Llama-3.3-70B-Instruct            70 B         43 GB     128 K
Phi-3.5-MoE-instruct              32 B         8.4 GB    128 K
phi-4                             14.7 B       8.4 GB    16 K
Marco-o1                          7.6 B        4.7 GB    128 K
Llama-3.1-Nemotron-70B-Instruct   70 B         43 GB     128 K
aya-expanse-32b                   32 B         20 GB     130 K