Tools mentioned by Yi Ma
Software and services Yi Ma has mentioned across podcast appearances.
CRATE
Author
“White-Box Transformers (CRATE): Multi-head self-attention emerges mathematically as gradient steps optimizing rate reduction objectives, while MLPs function as sparsification operators. This derivation eliminates dozens of hyperparameters and achieves linear time complexity versus quadratic in standard transformers, enabling principled architecture design rather than empirical search.”
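To make the quote more concrete, here is a minimal NumPy sketch of one CRATE-style layer: an attention-like step in which each head projects tokens onto a subspace basis and reuses that single projection as query, key, and value, followed by one ISTA-style sparsification step in place of an MLP. This is an illustration under stated assumptions, not the authors' implementation: the function names, the step size `step`, the sparsity weight `lam`, the random orthonormal bases, and the omission of the paper's scaling constants and normalization layers are all choices made here for brevity. The attention step below still builds an N-by-N matrix, so it does not illustrate the linear-time claim in the quote.

```python
import numpy as np

def softmax_columns(x):
    """Numerically stable softmax applied to each column of x."""
    x = x - x.max(axis=0, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=0, keepdims=True)

def subspace_attention(Z, U_heads):
    """Attention-like update: one projection per head serves as Q, K, and V.

    Z has shape (d, N) with tokens as columns; each U in U_heads has shape (d, p).
    """
    out = np.zeros_like(Z)
    for U in U_heads:
        V = U.T @ Z                    # project tokens onto the head's subspace, (p, N)
        A = softmax_columns(V.T @ V)   # token-to-token weights, (N, N), columns sum to 1
        out += U @ (V @ A)             # mix projected tokens, lift back to (d, N)
    return out

def ista_step(Z, D, step=0.1, lam=0.1):
    """One non-negative ISTA-style step toward a sparse code of Z in dictionary D."""
    residual = Z - D @ Z
    return np.maximum(Z + step * (D.T @ residual) - step * lam, 0.0)

def crate_like_layer(Z, U_heads, D):
    """Compression step (attention) followed by sparsification step (ISTA)."""
    Z_half = Z + subspace_attention(Z, U_heads)
    return ista_step(Z_half, D)

# Hypothetical toy sizes: d features, N tokens, K heads of dimension p.
rng = np.random.default_rng(0)
d, N, p, K = 8, 5, 4, 2
Z = rng.standard_normal((d, N))
U_heads = [np.linalg.qr(rng.standard_normal((d, p)))[0] for _ in range(K)]
D = np.linalg.qr(rng.standard_normal((d, d)))[0]

Z_next = crate_like_layer(Z, U_heads, D)
print(Z_next.shape)
```

The layer keeps the token matrix at the same shape, and the final ReLU-style thresholding leaves every entry non-negative, which is what pushes the representation toward sparsity.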