A high-throughput and memory-efficient inference and serving engine for LLMs
SCUDA is a GPU-over-IP bridge that allows GPUs on remote machines to be attached to CPU-only machines.