Advanced Computer Networks

COMPSCI 2430

Course information

  • Instructor: Minlan Yu.

  • Semester: Fall 2024.

  • Website: https://github.com/minlanyu/cs243-site/tree/fall2024

  • Outline: Data parallelism and sharding, model parallelism and pipelining, parameter server and all-reduce, collective communication optimizations; LLM training, LLM serving, throughput-latency tradeoffs, distributed serving; NCCL as a service, flow scheduling, RDMA, congestion control, ethics; Checkpointing, fault tolerance, diagnosis; Data ingestion, LLM training in production, TPU, sustainable AI.

  • Technologies: Python, C++/CUDA, PyTorch, vLLM, Amazon Web Services (AWS).

Project

Our final project integrated KV cache sparsification into vLLM in a performant and memory-efficient way. Details of the project can be found here.