A Technical Deep Dive into Pipeline Parallel Inference with CentML

Case Studies

With yesterday’s release of Llama-3.1-405B, we’re excited to announce that CentML’s recent contribution to vLLM, adding pipeline parallel inference support, […]

Hardware Efficiency in the Era of LLM Deployments

Updates

How CServe can make LLM deployment easy, efficient and scalable

How to profile a Hugging Face model with DeepView

Updates

Hugging Face has become a leading platform for natural language processing (NLP) and machine learning (ML) enthusiasts. It provides a […]

Introducing DeepView: Visualize your neural network performance

Updates

Optimize PyTorch neural networks for peak performance and cost efficiency in your deep learning projects
