TPJtdKLEAg
@yzomebjv
💥 We are so excited to introduce OTC-PO, the first RL framework for optimizing LLMs’ tool-use behavior in Tool-Integrated Reasoning. arXiv: https://t.co/BfyMZ6Z4zh Hugging Face: https://t.co/bLwjwjrPZK Simple, generalizable, plug-and-play (just a few lines of code) 🧠