May 12, 2025
A team of researchers from CentML and the University of Toronto analyzed LLM parallelization methods and developed Seesaw, an LLM inference engine optimized for throughput-oriented tasks.

Updates
Hugging Face has become a leading platform for natural language processing (NLP) and machine learning (ML) enthusiasts. It provides a large repository of pre-trained models and tools for developing advanced applications. But before you start using an ML model, you need to profile it. Our open-source tool, DeepView, can help you analyze the behavior and […]
Updates
Optimize PyTorch neural networks for peak performance and cost efficiency in your deep learning projects
Case Studies
In partnership with CentML, Oracle has developed innovative solutions to meet the growing demand for high-performance NVIDIA GPUs for machine learning (ML) model training and inference. Utilizing CentML’s state-of-the-art ML optimization software and Oracle Cloud Infrastructure (OCI), the collaboration has achieved significant performance improvements for both training and inference tasks, specifically with the LLaMa-V2 and […]
Case Studies
A growing generative AI company partnered with CentML to accelerate their API-as-a-service and iterate with foundational models, all without using top-of-the-line NVIDIA GPUs like the A100. The challenge: a growing generative AI company realized that their modern Large Language Models (LLMs) needed powerful GPUs for pre-training, fine-tuning, and inference. They used the Hugging Face Trainer to […]
Updates
Jan 25, 2024
Optimize your PyTorch and ONNX models while streamlining the deep learning process
Updates
The New York Times has filed a lawsuit against OpenAI and Microsoft over alleged copyright infringement in AI model training. Generative AI is at a crossroads. The implications are clear: content creators recognize the value of their work, lawyers see an opportunity to challenge tech giants like Microsoft and Google, and enterprises are seeking […]
Updates
tl;dr: We’re excited to introduce CServe—an easy-to-deploy, highly efficient, and low-cost serving framework for LLMs to help you cut your operational costs in half while optimizing for both server-side throughput and client-side latency constraints. Generative AI is BIG. We’ve witnessed its power with ChatGPT. But if AI is the future that can benefit us all, […]
Updates
CentML is transforming access to cutting-edge AI training and deployment.
Events
Nov 23, 2023
Predicting and Optimizing Runtime Performance of Deep Learning Models
Updates
October 25, 2023 – CentML, a software platform that dramatically improves the performance and cost of deploying ML models, today announced the completion of a $27 million seed round. The round was led by Gradient Ventures, Google’s AI-focused venture fund, with participation from Radical Ventures, NVIDIA, Deloitte Ventures, and Thomson Reuters Ventures. As AI and […]