APLOS
Introducing ‘Tally’: A Novel Method of Efficient GPU Sharing for AI Workloads
Tally lets multiple AI tasks share the same GPU, improving infrastructure efficiency.