Hugo Cisneros

PPO

tags: Reinforcement learning, Algorithm, Machine learning

Links to this note

Knowledge Base Index
Token-level credit assignment in reasoning traces

Last changed 2026.04.09 | authored by Hugo Cisneros
