TRL v0.29.0 introduces trl-training: an agent-native training skill.
This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)
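As a rough illustration of what "executing a training workflow through the CLI" can look like from an agent's side, here is a minimal sketch that only assembles a `trl` command line. The helper `build_trl_command` is hypothetical and is not part of TRL or of the trl-training skill. The model and dataset names are placeholder assumptions, and method-specific arguments (for example, the reward setup GRPO needs) are left out.

```python
import shlex

# Subcommands matching the workflows named above (SFT, DPO, GRPO).
TRL_METHODS = ("sft", "dpo", "grpo")


def build_trl_command(method: str, model: str, dataset: str, output_dir: str) -> list[str]:
    """Hypothetical helper: assemble an argv list for a `trl <method>` run.

    Only arguments shared by the training scripts are included; anything
    method-specific is intentionally omitted from this sketch.
    """
    if method not in TRL_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    return [
        "trl", method,
        "--model_name_or_path", model,
        "--dataset_name", dataset,
        "--output_dir", output_dir,
    ]


# Placeholder model and dataset names, chosen only for illustration.
cmd = build_trl_command("sft", "Qwen/Qwen2.5-0.5B", "trl-lib/Capybara", "Qwen2.5-0.5B-SFT")
print(shlex.join(cmd))
```

Building an argv list rather than a shell string means the command can be handed to `subprocess.run(cmd)` without extra quoting concerns.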
We're excited to see what the community builds on top of this.
If you're working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗
smolagents v1.21.0 is here!
Now with improved safety in the local Python executor: dunder calls are blocked!
⚠️ Still, the local executor is not fully isolated: for untrusted code, use a remote executor instead (Docker, E2B, or Wasm).
✨ Many bug fixes for more reliable code.
https://github.com/huggingface/smolagents/releases/tag/v1.21.0
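To make the dunder point concrete, here is a minimal sketch of why double-underscore access is worth blocking and how a static check could spot it. This is not smolagents' actual implementation. The function `find_dunder_access` is a hypothetical name, and the approach (walking the AST) is an assumption chosen for illustration.

```python
import ast


def find_dunder_access(source: str) -> list[str]:
    """Return the dunder names used as attributes or bare names in `source`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute):
            name = node.attr
        elif isinstance(node, ast.Name):
            name = node.id
        else:
            continue
        # A "dunder" starts and ends with a double underscore.
        if len(name) > 4 and name.startswith("__") and name.endswith("__"):
            found.append(name)
    return found


# A classic sandbox-escape pattern: climb from a harmless object to every
# loaded class through dunder attributes.
snippet = "().__class__.__base__.__subclasses__()"
print(sorted(find_dunder_access(snippet)))

# Ordinary code does not trigger the check.
print(find_dunder_access("total = sum([1, 2, 3])"))
```

A static check like this is easy to sidestep (for example, with `getattr` and a string built at runtime), which is in line with the warning above: for untrusted code, a remote executor is the safer choice.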