-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Include command output when sending timeout to model #3576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include command output when sending timeout to model #3576
Conversation
…nied and Timeout variants - Changed SandboxErr::Denied and SandboxErr::Timeout to carry boxed ExecToolCallOutput. - Updated handling and matching of these variants throughout exec, codex, and error modules. - Improved error reporting by referencing structured output fields directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Let's have an integration test to test the whole behavior or at least make sure that the prefix and the timeout appears in the test.
assert_eq!(output.exit_code, 124); | ||
assert_eq!(output.stdout.text, "before\n"); | ||
assert!(output.stderr.text.is_empty()); | ||
assert_eq!(output.aggregated_output.text, "before\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is command timed out after {} milliseconds\n
in the putput?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not in the output. It gets formatted in right before the result is sent back to model in codex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe integration test would be helpful? to see what got sent to the model. just to make sure that the model knows it timed out.
/// Command timed out | ||
#[error("command timed out")] | ||
Timeout, | ||
Timeout { output: Box<ExecToolCallOutput> }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the error()
include fields from output
like Denied
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have an opinion. Don't even know when it's used.
Ok(raw_output) => { | ||
let stdout = raw_output.stdout.from_utf8_lossy(); | ||
let stderr = raw_output.stderr.from_utf8_lossy(); | ||
#[allow(unused_mut)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because of the #[cfg(target_family = "unix")]
below? If so, alternatively:
#[cfg(unix)]
let mut timed_out = raw_output.timed_out;
#[cfg(not(unix))]
let timed_out = raw_output.timed_out;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is because of the cfg, I think #[allow(unused_mut)] is a bit cleaner.
|
||
// if the command timed out, we can simply return this failure to the model | ||
if matches!(error, SandboxErr::Timeout) { | ||
if let SandboxErr::Timeout { output } = &error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the case where the command times out, but is not run in a sandbox (i.e., --dangerously-bypass-approvals-and-sandbox
) also covered by this case even though it isn't run in a "sandbox"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it all goes through the run_exec_with_events
who's result we're processing here.
… assert_network_blocked
Include a timed_out boolean in ExecToolCallOutput to indicate if a command timed out. Update output formatting to prepend a timeout message when timed_out is true, and simplify related function signatures. Add test coverage for this behavior. Fix related error construction and consumers.
Use output.exit_code instead of exit_code to correctly check the command exit status and ensure proper sandbox breach detection.
Being able to see the output helps the model decide how to handle the timeout.