# Learnings - Gotchas - For tool-calling SFT with assistant-only loss, trajectories that end on a `tool` turn do not teach stop behavior, so append a final content-only assistant turn after terminal tool confirmations to reduce post-answer extra tool calls. *(F014)* - Repeat penalties should key on `(method, argument)` over a short recent-call window (deque `maxlen=3`) so alternating reuse patterns like `A→B→A` are penalized while cross-method same-argument calls are not. *(F015)*