In a tool loop, you can't set response_format because the model needs to be able to think in plain English. But you still need the final response to be in the desired format, so we add response_format only on the last iteration.