Platform Pricing
Free credits for all new users (worth 2 million tokens on Llama 3.3-70B)
CentML offers competitive pricing for GenAI model deployment, with flexible options for workloads ranging from small models to large-scale deployments.
Developer
Best performance - no hidden fees
- Free credits for all new users (worth 2 million tokens on Llama 3.3-70B)
- Full pay-as-you-go billing - per minute/per token
- On-demand dedicated endpoints - no rate limits
- Planner feature available for all deployments
- No daily limits
Enterprise
Custom solutions for scaling
- Custom pricing
- No rate limits
- Unlimited deployed models
- Dedicated and self-hosted deployments
- Guaranteed uptime SLA
- 24/7 tech support
- Plus all features from the Developer package
Platform Pricing Overview
Deployment costs are calculated on a credit-based billing system, where 1 CentML credit equals 1 USD.
You can buy credits through the Platform by going to your Account page.
Serverless Endpoint usage is billed according to the total number of tokens generated and processed.
Model | Credits per 1M Tokens | Recommended Usage |
---|---|---|
DeepSeek-R1 671B MoE | 3.99 | Coding assistance, logic-based Q&A, advanced reasoning, and chat |
Llama-3.3-70B-Instruct | 0.50 | General Q&A and Retrieval Augmented Generation |
Qwen2.5-Coder-32B-Instruct* | 0.80 | Fast coding assistance |
Qwen2.5-VL-7B-Instruct* | 0.15 | Object detection and description in images and video |
Llama-3.1-8B-Instruct* | 0.10 | Fast multi-language inference and translation |
Llama-3.2-3B-Instruct* | 0.06 | Faster multi-language inference and translation |
Phi-3.5-mini-instruct* | 0.12 | Fast, high-quality reasoning |
* Coming Soon
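
As a rough illustration of the per-token pricing above, the sketch below estimates the credits consumed by a serverless request. The function name `serverless_cost` and the dictionary layout are hypothetical; the rates are copied from the table, and the calculation assumes that processed (input) and generated (output) tokens are billed at the same per-model rate.

```python
# Rates copied from the pricing table (credits per 1M tokens); 1 credit = 1 USD.
# Only the models not marked "Coming Soon" are listed here.
CREDITS_PER_MILLION_TOKENS = {
    "DeepSeek-R1": 3.99,
    "Llama-3.3-70B-Instruct": 0.50,
}


def serverless_cost(model: str, total_tokens: int) -> float:
    """Estimate credits for `total_tokens` (processed plus generated tokens).

    Assumption: input and output tokens share one rate per model.
    """
    rate = CREDITS_PER_MILLION_TOKENS[model]
    return rate * total_tokens / 1_000_000


if __name__ == "__main__":
    # 2 million tokens on Llama 3.3-70B at the listed rate.
    print(serverless_cost("Llama-3.3-70B-Instruct", 2_000_000))
```

At the listed rate of 0.50 credits per 1M tokens, 2 million Llama 3.3-70B tokens come to 1 credit under these assumptions.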
Dedicated Deployments
Dedicated deployments are charged based on the type of hardware used and how long it runs, following a per-minute billing system.
Accelerators | Credits per hour |
---|---|
NVIDIA L4 - 24GB | 0.30 |
NVIDIA A10G - 24GB | 0.30 |
NVIDIA A100 - 40GB | 1.10 |
AWS Inf2 - 32GB | 2.00 |
NVIDIA H100 - 80GB | 2.50 |
NVIDIA H200 - 141GB | 2.60 |
GCP v6e TPU - 32GB | 2.70 |
AMD MI300X - 192GB | 14.00 |
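
The hourly rates above can be turned into a per-minute estimate along the following lines. This is a minimal sketch, not the platform's billing logic: the function name `dedicated_cost` is hypothetical, the listed rate is assumed to apply per accelerator, the per-minute rate is assumed to be the hourly rate divided by 60, and partial minutes are assumed to round up to a full minute.

```python
import math

# Rates copied from the table (credits per hour); 1 credit = 1 USD.
CREDITS_PER_HOUR = {
    "NVIDIA L4 - 24GB": 0.30,
    "NVIDIA A10G - 24GB": 0.30,
    "NVIDIA A100 - 40GB": 1.10,
    "AWS Inf2 - 32GB": 2.00,
    "NVIDIA H100 - 80GB": 2.50,
    "NVIDIA H200 - 141GB": 2.60,
    "GCP v6e TPU - 32GB": 2.70,
    "AMD MI300X - 192GB": 14.00,
}


def dedicated_cost(accelerator: str, runtime_seconds: int) -> float:
    """Estimate credits for one accelerator running for `runtime_seconds`.

    Assumptions: per-minute rate = hourly rate / 60, and any partial
    minute is billed as a full minute.
    """
    billed_minutes = math.ceil(runtime_seconds / 60)
    return CREDITS_PER_HOUR[accelerator] * billed_minutes / 60


if __name__ == "__main__":
    # One H100 running for 90 minutes.
    print(dedicated_cost("NVIDIA H100 - 80GB", 90 * 60))
```

Under these assumptions, a single H100 running for 90 minutes costs 3.75 credits.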
Customized Plans
Have specialized requirements or need larger-scale deployments? We offer customizable plans to suit enterprise needs. Contact us for details.