Language modeling
Notes
Notes on: LoRA Learns Less and Forgets Less by Dan Biderman, Jacob Portes, Jose Javier Gonzalez Ortiz, Mansheej Paul, Philip Greengard, Connor Jennings, Daniel King, Sam Havens, Vitaliy Chiley, Jonathan Frankle, Cody Blakeney, John P. Cunningham (2024)
Notes on: Embarrassingly Simple Self-Distillation Improves Code Generation by Zhang, R., Bai, R. H., Zheng, H., Jaitly, N., Collobert, R., & Zhang, Y. (2026)