Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference
March 2025-October 2025 Mentors: Saurabh Goyal, Nipun Kwatra, and Ramachandran Ramjee
Scalable and efficient distributed training of GNNs
Recommended citation: D. Deshmukh, S. Goyal, N. Kwatra, and R. Ramjee, "Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference," arXiv preprint arXiv:2512.16391, 2025. https://arxiv.org/abs/2512.16391
