Scaling laws

tags: Machine learning, LLM, The Scaling Hypothesis

Scaling laws inform the training and scaling of the largest models.

Links to this note
- Knowledge Base Index
- Mixture of Experts
- Notes on: Attention Residuals by Kimi Team, Guangyu Chen, Yu Zhang, Jianlin Su et al. (2026)
- Notes on: MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention by MiniMax (2025)

Last changed 2026.04.07 | authored by Hugo Cisneros