Test-time compute

tags: Machine learning, LLM, Reinforcement learning

Links to this note

Notes on: DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning by Ziwei Zheng, Michael Yang, Jack Hong, Chenxiao Zhao, Guohai Xu, Le Yang, Chao Shen, Xing Yu (2025)
Chain-of-Thought reasoning
Knowledge Base Index
Notes on: DFlash: Block Diffusion for Flash Speculative Decoding by Jian Chen, Yesheng Liang, Zhijian Liu (2026)
Notes on: MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention by MiniMax (2025)
Notes on: Reinforcement Learning via Self-Distillation by Hübotter, J., Lübeck, F., Behric, L., Baumann, A., Bagatella, M., Marta, D., Hakimi, I., Shenfeld, I., Kleine Buening, T., Guestrin, C. & Krause, A. (2026)
Speculative Decoding

Last changed 2026.04.08 | authored by Hugo Cisneros

