Mar 30, 2026 Gram Newton-Schulz: A Fast, Hardware-Aware Newton-Schulz Algorithm for Muon
Mar 16, 2026 Mamba-3 Part 2 - Methodological Deep Dive
Mar 16, 2026 Mamba-3 Part 1
Mar 05, 2026 FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling