Author: ermek

A Technical Deep Dive into Pipeline Parallel Inference with CentML

Case Studies

With yesterday’s release of Llama-3.1-405B, we’re excited to announce that CentML’s recent contribution to vLLM, adding pipeline parallel inference support, […]
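As a rough illustration of the feature the teaser describes, pipeline parallel serving in vLLM is typically enabled through the `--pipeline-parallel-size` engine argument, which splits the model's layers into sequential stages (while `--tensor-parallel-size` shards each stage across GPUs). A minimal sketch, assuming a recent vLLM build that includes the pipeline-parallel support and access to the model weights:

```shell
# Hypothetical invocation, not from the article: serve a Llama-3.1 model
# with 2 pipeline stages of 8 tensor-parallel GPUs each (16 GPUs total).
vllm serve meta-llama/Llama-3.1-405B-Instruct \
  --tensor-parallel-size 8 \
  --pipeline-parallel-size 2
```

The exact model name and GPU counts here are placeholders; the deployment sizes in the full article may differ.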

Hardware Efficiency in the Era of LLM Deployments

Updates

How CServe makes LLM deployment easy, efficient, and scalable
