Architectural Support for Programming Languages and Operating Systems
Introducing ‘Tally’: A Novel Method of Efficient GPU Sharing for AI Workloads
Tally lets multiple AI tasks share a single GPU, improving infrastructure efficiency.