Jan 31, 2026 FlashAttention-2 in Triton: From GPU Mental Models to Kernel Performance
Jan 16, 2026 Deriving the FlashAttention Backward Pass