Platform Pricing

$10 in free credits for all new users (about 4 million tokens on Llama 3.1 405B)

CentML offers competitive pricing for GenAI model deployment, with flexible options for models and deployments of any size, from small to large-scale.

Developer

Best performance – no hidden fees

  • $10 free credits for all new users
  • Full pay-as-you-go billing – per minute/per token
  • On-demand dedicated endpoints – no rate limits
  • Planner feature available for all deployments
  • No daily limits

Enterprise

Custom solutions for scaling

  • Custom pricing
  • No rate limits
  • Unlimited deployed models
  • Dedicated and self-hosted deployments
  • Guaranteed uptime SLA
  • 24/7 tech support
  • Plus all features from the Developer package

Platform Pricing Overview

Deployments are billed through a credit-based system, where 1 CentML credit equals 1 USD. You can buy credits on the Platform from your Account page.

Serverless Endpoint usage is billed according to the total number of tokens generated and processed.

Model Size        Price per 1M Tokens    Examples
Small (1-4B)      $0.04                  Smaller language models
Medium (7-11B)    $0.08                  General-purpose models
Large (70-90B)    $0.50                  Complex AI applications
X-Large (405B)    $2.50                  High-demand, intensive LLMs
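
As a rough illustration of how the per-token prices above translate into credits, here is a minimal Python sketch. The function names and tier keys are hypothetical (they are not part of any CentML API), and the sketch assumes that processed (input) and generated (output) tokens are billed at the same rate, which the table implies but does not state explicitly.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the serverless pricing table.
# The tier keys are illustrative labels, not official identifiers.
PRICE_PER_MILLION = {
    "small": Decimal("0.04"),    # 1-4B
    "medium": Decimal("0.08"),   # 7-11B
    "large": Decimal("0.50"),    # 70-90B
    "x-large": Decimal("2.50"),  # 405B
}


def serverless_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in credits (1 credit = 1 USD) for serverless usage.

    Assumption: input and output tokens are billed at the same per-token rate.
    """
    total_tokens = input_tokens + output_tokens
    return PRICE_PER_MILLION[tier] * total_tokens / Decimal(1_000_000)


def tokens_for_budget(tier: str, budget_usd: str) -> int:
    """Estimate how many tokens a given credit budget covers on a tier."""
    return int(Decimal(budget_usd) / PRICE_PER_MILLION[tier] * 1_000_000)
```

For example, `tokens_for_budget("x-large", "10")` gives 4,000,000, which matches the 4 million tokens on Llama 3.1 405B quoted for the $10 free credits.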

Dedicated Deployments

Dedicated deployments are charged based on the type and duration of hardware used, following a per-minute billing system.

Accelerator              Credits per hour
NVIDIA L4 – 24GB         0.30
NVIDIA A10G – 24GB       0.30
NVIDIA A100 – 40GB       1.10
AWS Inf2 – 32GB          2.00
NVIDIA H100 – 80GB       2.50
NVIDIA H200 – 141GB      2.60
GCP v6e TPU – 32GB       2.70
AMD MI300X – 192GB       14.00
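
To show how an hourly rate combines with per-minute billing, here is a minimal Python sketch that prorates the rates listed above. The function name and dictionary keys are hypothetical, and the sketch assumes usage is metered in whole minutes and that the listed rate applies to a single deployment; how partial minutes are rounded is not specified here.

```python
from decimal import Decimal

# Credits per hour, copied from the accelerator pricing table.
# The keys are illustrative labels, not official identifiers.
CREDITS_PER_HOUR = {
    "NVIDIA L4": Decimal("0.30"),
    "NVIDIA A10G": Decimal("0.30"),
    "NVIDIA A100": Decimal("1.10"),
    "AWS Inf2": Decimal("2.00"),
    "NVIDIA H100": Decimal("2.50"),
    "NVIDIA H200": Decimal("2.60"),
    "GCP v6e TPU": Decimal("2.70"),
    "AMD MI300X": Decimal("14.00"),
}


def dedicated_cost(accelerator: str, minutes: int) -> Decimal:
    """Estimate credits used by a dedicated deployment running for `minutes`.

    Assumption: the hourly rate is prorated linearly per whole minute.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Multiply before dividing so common durations stay exact in Decimal.
    return CREDITS_PER_HOUR[accelerator] * minutes / 60
```

Under these assumptions, an NVIDIA H100 running for 90 minutes would use 2.50 × 90 / 60 = 3.75 credits.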

Customized Plans

Have specialized requirements or need larger-scale deployments? We offer customizable plans to suit enterprise needs. Contact us for details.