GammaNet - Stable Feature-Space Decay in Linear RNNs
GammaNet - Allowing Kimi Delta Attention to decay along meaningful directions instead of meaningless coordinates.
I write about math, machine learning, and whatever else I find interesting.
GammaNet - Allowing Kimi Delta Attention to decay along meaningful directions instead of meaningless coordinates.
Bra-ket notation as a unifying language for linear maps, MLPs, and self-attention — making the key-value structure of deep learning explicit.
This is my first post. I put off writing this for too long.
Hi! I'm Lucas Sun. Welcome to my blog.