Llama 4 Maverick 17B (128E) Instruct
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
Overview
Llama 4 Maverick is a natively multimodal model developed by Meta. It uses a mixture-of-experts architecture with 128 experts, of which 17 billion parameters are activated per token. Released in April 2025, it is designed for text and image understanding across multiple languages. This instruction-tuned variant is intended for commercial and research applications that need multimodal input.
Tags
CentML Optimized
Chat
Dedicated
Serverless
VLM
API
# Send a single-turn chat completion request to the OpenAI-compatible endpoint.
curl -X POST "https://api.centml.com/openai/v1/chat/completions" \
  -H "Authorization: Bearer *******************************************" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    "stream": false
  }'
from openai import OpenAI

# Point the OpenAI SDK at the CentML OpenAI-compatible endpoint.
client = OpenAI(
    api_key="*******************************************",
    base_url="https://api.centml.com/openai/v1",
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    stream=False,
)
# Print the assistant's reply message.
print(completion.choices[0].message)
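The Python request above sets `stream=False`. As a hedged sketch of the streaming variant: with the OpenAI Python SDK, passing `stream=True` returns an iterable of chunks whose `choices[0].delta.content` carries text fragments (sometimes `None`). The helper names `join_deltas` and `stream_reply` are illustrative, not part of any SDK, and it is an assumption that this endpoint streams in the same chunk format as the OpenAI API. The last lines exercise the helper offline with stand-in objects instead of a live call.

```python
from types import SimpleNamespace

MODEL = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"


def join_deltas(chunks):
    """Concatenate the text fragments carried by streamed chat-completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # skip None or empty fragments
            parts.append(piece)
    return "".join(parts)


def stream_reply(client, prompt):
    """Request a streamed completion and return the assembled text.

    `client` is expected to be an `openai.OpenAI` instance configured as above
    (assumption: the endpoint honors `stream=True`).
    """
    stream = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return join_deltas(stream)


# Offline check with stand-in chunks shaped like the SDK's streamed objects.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])
    for text in ["Hel", None, "lo"]
]
fake_chunks.append(SimpleNamespace(choices=[]))
print(join_deltas(fake_chunks))
```

To run it against the API, call `stream_reply(client, "Hello!")` with the `client` constructed in the snippet above.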
import OpenAI from "openai";

// The Node SDK takes a single options object with camelCase keys.
const client = new OpenAI({
  apiKey: "*******************************************",
  baseURL: "https://api.centml.com/openai/v1",
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello!" },
    ],
    stream: false,
  });
  // Print the assistant's reply message.
  console.log(completion.choices[0].message);
}

main();
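The model is tagged VLM, but the requests above send text only. A minimal Python sketch of an image-plus-text message follows, assuming the endpoint accepts OpenAI-style `image_url` content parts with a base64 data URL (this page does not confirm that). `image_message` is a hypothetical helper, and the bytes used in the demo are a placeholder, not a real image.

```python
import base64


def image_message(prompt, image_bytes, mime="image/png"):
    """Build a user message holding a text part and an inline base64 image part.

    Assumption: the server accepts OpenAI-style `image_url` content parts.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").read().
msg = image_message("Describe this image.", b"abc")
print([part["type"] for part in msg["content"]])
```

The returned dictionary can be placed in the `messages` list passed to `client.chat.completions.create` in place of a plain-text user message.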