Platform Pricing Page

$10 in free credits for all new users (enough for 4 million tokens on Llama 3.1 405B)

CentML offers competitive pricing for GenAI model deployment, with flexible options for everything from small models to large-scale deployments.

Developer

Best performance – no hidden fees

$10 free credits for all new users
Full pay-as-you-go billing – per minute/per token
On-demand dedicated endpoints – no rate limits
Planner feature available for all deployments
No daily limits

Deploy Now

Enterprise

Custom solutions for scaling

Custom pricing
No rate limits
Unlimited deployed models
Dedicated and self-hosted deployments
Guaranteed uptime SLA
24/7 tech support
Plus all features from the Developer package

Platform Pricing Overview

Application deployments are billed through a credit-based system, where 1 CentML credit equals 1 USD. You can buy credits on the Platform from your Account page.

Serverless Endpoint usage is billed according to the total number of tokens generated and processed.

Model Size	Price per 1M Tokens	Examples
Small (1-4B)	$0.04	Smaller language models
Medium (7-11B)	$0.08	General-purpose models
Large (70-90B)	$0.50	Complex AI applications
X-Large (405B)	$2.50	High-demand, intensive LLMs
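
As a rough illustration of how the per-token rates above translate into credits, the following Python sketch estimates serverless costs. It assumes billing is strictly linear in the total token count, with no rounding, minimum charge, or separate input/output rates; the tier keys and function names are hypothetical and are not part of any CentML API.

```python
from decimal import Decimal

# Prices in credits (USD) per 1M tokens, copied from the table above.
# The tier keys are illustrative names, not official identifiers.
PRICE_PER_1M_TOKENS = {
    "small": Decimal("0.04"),
    "medium": Decimal("0.08"),
    "large": Decimal("0.50"),
    "x-large": Decimal("2.50"),
}


def serverless_cost(tier: str, total_tokens: int) -> Decimal:
    """Estimate credits for a total token count (generated plus processed).

    Assumes purely linear pricing, which is an assumption of this sketch.
    """
    return PRICE_PER_1M_TOKENS[tier] * Decimal(total_tokens) / Decimal(1_000_000)


def tokens_for_credits(tier: str, credits: Decimal) -> int:
    """Approximate number of tokens a credit balance covers at a tier's rate."""
    return int(credits / PRICE_PER_1M_TOKENS[tier] * 1_000_000)


if __name__ == "__main__":
    # The $10 new-user credit at the X-Large rate.
    print(tokens_for_credits("x-large", Decimal("10")))
    # Cost of 250,000 tokens on a Large model.
    print(serverless_cost("large", 250_000))
```

Under these assumptions, the $10 new-user credit covers 4 million tokens at the X-Large rate, which matches the figure quoted at the top of this page.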

Dedicated Deployments

Dedicated deployments are charged based on the type and duration of hardware used, following a per-minute billing system.

GPU	Credits per GPU hour
NVIDIA L4 – 24GB	0.30
NVIDIA A10G – 24GB	0.30
NVIDIA A100 – 40GB	1.10
NVIDIA H100 – 80GB	2.50
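
To show how the hourly rates combine with per-minute billing, here is a minimal Python sketch that prorates the hourly GPU rate by the number of minutes used. The GPU keys and the function name are hypothetical, and the sketch takes whole minutes as input because this page does not state how partial minutes are rounded.

```python
from decimal import Decimal

# Credits per GPU hour, copied from the table above.
# The keys are illustrative labels, not official hardware identifiers.
CREDITS_PER_GPU_HOUR = {
    "L4": Decimal("0.30"),
    "A10G": Decimal("0.30"),
    "A100-40GB": Decimal("1.10"),
    "H100-80GB": Decimal("2.50"),
}


def dedicated_cost(gpu: str, gpu_count: int, minutes: int) -> Decimal:
    """Estimate credits for running `gpu_count` GPUs for `minutes` minutes.

    Prorates the hourly rate per minute; rounding of partial minutes is
    not modeled because the source does not specify it.
    """
    hourly_rate = CREDITS_PER_GPU_HOUR[gpu]
    return hourly_rate * gpu_count * Decimal(minutes) / Decimal(60)


if __name__ == "__main__":
    # Two A100 GPUs for 90 minutes.
    print(dedicated_cost("A100-40GB", 2, 90))
```

For example, under these assumptions two A100 GPUs running for 90 minutes would cost 1.10 × 2 × 1.5 = 3.30 credits.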

Customized Plans

Have specialized requirements or need larger-scale deployments? We offer customizable plans to suit enterprise needs. Contact us for details.

Book a Demo