Course information
- Instructor: Minlan Yu.
- Semester: Fall 2024.
- Website: https://github.com/minlanyu/cs243-site/tree/fall2024
- Outline: Data parallelism and sharding, model parallelism and pipelining, parameter server and all-reduce, collective communication optimizations; LLM training, LLM serving, throughput-latency tradeoffs, distributed serving; NCCL as a service, flow scheduling, RDMA, congestion control, ethics; checkpointing, fault tolerance, diagnosis; data ingestion, LLM training in production, TPU, sustainable AI.
- Technologies: Python, C++/CUDA, PyTorch, vLLM, Amazon Web Services (AWS).
Project
Our final project integrated KV cache sparsification into vLLM in a performant and memory-efficient way. Details of the project can be found here.
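To give a rough idea of what KV cache sparsification means, here is a minimal, illustrative sketch of a heavy-hitter-style policy that keeps only the most-attended cached tokens. This is not the project's actual vLLM integration; the function name, tensor shapes, and `keep_ratio` parameter are assumptions made purely for illustration.

```python
import torch


def sparsify_kv_cache(keys, values, attn_scores, keep_ratio=0.25):
    """Illustrative KV cache sparsification (not the project's implementation).

    keys, values: [num_tokens, num_heads, head_dim] cached key/value tensors
    attn_scores:  [num_tokens] accumulated attention each cached token has received
    keep_ratio:   fraction of cached tokens to retain
    """
    num_tokens = keys.shape[0]
    keep = max(1, int(num_tokens * keep_ratio))
    # Retain the tokens that have received the most attention so far
    # (a heavy-hitter heuristic) and drop the rest to save GPU memory.
    top_idx = torch.topk(attn_scores, keep).indices.sort().values
    return keys[top_idx], values[top_idx], top_idx
```

The interesting part in practice is doing this without fragmenting vLLM's paged KV memory or adding latency to the decode loop, which is what the project focused on.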