feat: StreamableShell with exec_command and write_stdin tools #2574

bolinfest · 2025-08-22T02:53:20Z

This introduces a complementary set of tools, exec_command and write_stdin, which are designed to facilitate working with long-running processes in a token-efficient manner.

To test:

codex-rs$ just codex --cd .. --config experimental_use_exec_command_tool=true

The above alone is unlikely to convince the model to use the new tools, though, because of this bit in the base instructions:

codex/codex-rs/core/prompt.md

Lines 266 to 271 in e4c275d

    
    ## Shell commands

    When using the shell, you must adhere to the following guidelines:

    - When searching for text or files, prefer using `rg` or `rg --files` respectively because `rg` is much faster than alternatives like `grep`. (If the `rg` command is not found, then use alternatives.)
    - Read files in chunks with a max chunk size of 250 lines. Do not use python scripts to attempt to output larger chunks of a file. Command line output will be truncated after 10 kilobytes or 256 lines of output, regardless of the command used.

So you probably need to add:

--config experimental_instructions_file=/some/other_instructions.md

where /some/other_instructions.md suggests using exec_command and write_stdin instead of shell.

hanson-openai · 2025-08-22T03:44:26Z

codex-rs/core/src/exec_command/responses_api.rs

+    let mut properties = BTreeMap::<String, JsonSchema>::new();
+    properties.insert(
+        "session_id".to_string(),
+        JsonSchema::String {


looks like it was defined as an int elsewhere (also, we don't output a session ID from the exec outputs, so the model hallucinates one)

{"type":"function_call_output","call_id":"call_ZH0rYAapH2v1P5XNHOTGO7Fk","output":"failed to parse function arguments: invalid type: string \"e6d45cfc-81f2-44d6-b52b-6df1f8c3b9d0\", expected u32 at line 1 column 68"}

hanson-openai · 2025-08-22T03:48:37Z

codex-rs/core/src/exec_command/session_manager.rs

+                ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: text,


will probably work best if the output is either in the form:

    Wall time: {:.3g} seconds
    Process exited with code {code}  # or, if running: Process running with session ID {session_id}
    Warning: truncated output (original token count: {tokens})  # (if truncated)
    Output:
    {actual text output}

OR if it's more convenient to use json:

    {
      "wall_time": round(wall_time, 3),
      "exit_code": code,
      "session_id": session_id,  # should not be set if exit_code is set
      "original_token_count": count,  # only if truncated
      # NOTE: you'll want to dump the JSON w/o whitespace to reduce token usage
      "output": {"1":line1,"2":line2,...}
    }

Both exec_command and write_stdin can return the exact same shape of output.
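For illustration, the plain-text shape suggested above could be rendered roughly as follows. This is a minimal sketch, not the code in this PR: `ExecStatus` and `format_exec_output` are hypothetical names, the session ID is assumed to be a `u32` (matching the parse error quoted earlier), and `{:.3}` prints three decimal places because Rust's formatter has no direct equivalent of `{:.3g}`.

```rust
use std::time::Duration;

/// Whether the process has finished or is still running in a session.
/// Hypothetical type, named here only for illustration.
pub enum ExecStatus {
    Exited { code: i32 },
    Running { session_id: u32 },
}

/// Renders the suggested plain-text shape. Both tools could share it.
pub fn format_exec_output(
    wall_time: Duration,
    status: &ExecStatus,
    original_token_count: Option<u64>,
    output: &str,
) -> String {
    // Three decimal places, which approximates (but is not identical to) `.3g`.
    let mut text = format!("Wall time: {:.3} seconds\n", wall_time.as_secs_f64());
    match status {
        ExecStatus::Exited { code } => {
            text.push_str(&format!("Process exited with code {code}\n"));
        }
        ExecStatus::Running { session_id } => {
            // Emitting the ID gives the model a real value to pass back.
            text.push_str(&format!("Process running with session ID {session_id}\n"));
        }
    }
    // The warning line appears only when the output was truncated.
    if let Some(tokens) = original_token_count {
        text.push_str(&format!(
            "Warning: truncated output (original token count: {tokens})\n"
        ));
    }
    text.push_str("Output:\n");
    text.push_str(output);
    text
}
```

Making the status an enum means a response cannot carry both an exit code and a session ID, which mirrors the "should not be set if exit_code is set" note.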

hanson-openai · 2025-08-22T03:53:34Z

codex-rs/core/src/exec_command/session_manager.rs

+                // Cap by assuming 4 bytes per token (TODO: use a real tokenizer).
+                let cap_bytes_u64 = params.max_output_tokens.saturating_mul(4);


FYI: the semantics of the internal implementation are actually to collect all output within yield_time_ms (uncapped) and then truncate the middle tokens/characters after receiving all the output.

Otherwise the agent has no real way of "flushing out" outputs if a command has a ton of logspew
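The collect-then-truncate semantics described above could be sketched like this. It is only an illustration under assumptions: it uses a `std::sync::mpsc` channel as a stand-in for the tokio broadcast channel in this PR, and `collect_until_deadline` is a hypothetical name.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Drains every chunk that arrives before the deadline, without capping,
/// so that truncation can be applied afterwards to the complete buffer.
pub fn collect_until_deadline(rx: &Receiver<Vec<u8>>, yield_time: Duration) -> Vec<u8> {
    let deadline = Instant::now() + yield_time;
    let mut collected = Vec::new();
    loop {
        // Time left in the yield window; zero once the deadline has passed.
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break;
        }
        match rx.recv_timeout(remaining) {
            Ok(chunk) => collected.extend_from_slice(&chunk),
            // Deadline reached or the producer hung up: stop collecting.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    collected
}
```

Because nothing is dropped during collection, a later poll starts from fresh output rather than from a backlog of log spew.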

hanson-openai · 2025-08-22T03:57:00Z

codex-rs/core/src/exec_command/responses_api.rs

+
+    ResponsesApiTool {
+        name: WRITE_STDIN_TOOL_NAME.to_owned(),
+        description: r#"Write characters to the stdin of an existing exec_command session."#


reference description:

Write characters to an exec session's stdin. Returns all stdout+stderr received within yield_time_ms. Can write control characters (\u0003 for Ctrl-C), or an empty string to just poll stdout+stderr.
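As a rough sketch of the two behaviours this description names (writing characters versus polling with an empty string), a handler might classify its input as below. `StdinRequest`, `CTRL_C`, and `classify_stdin` are hypothetical names introduced for illustration and are not taken from this PR.

```rust
/// What a write_stdin call asks for, judged only by its `chars` argument.
#[derive(Debug, PartialEq)]
pub enum StdinRequest {
    /// Empty input: write nothing, just poll for new stdout+stderr.
    Poll,
    /// Non-empty input: bytes to write, control characters included.
    Write(Vec<u8>),
}

/// Ctrl-C is U+0003 (ETX). JSON spells it "\u0003"; Rust spells it "\u{0003}".
pub const CTRL_C: &str = "\u{0003}";

/// Maps the raw `chars` string onto a poll or a write request.
pub fn classify_stdin(chars: &str) -> StdinRequest {
    if chars.is_empty() {
        StdinRequest::Poll
    } else {
        StdinRequest::Write(chars.as_bytes().to_vec())
    }
}
```

Either way, the caller would still wait up to yield_time_ms and return whatever output arrived in that window.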

hanson-openai · 2025-08-22T22:30:59Z

codex-rs/core/src/exec_command/session_manager.rs

+                    // Skip missed messages; continue collecting fresh output.
+                }
+                Ok(Err(tokio::sync::broadcast::error::RecvError::Closed)) => break,
+                Err(_) => break, // timeout


In the Python version I found that read() may sometimes throw a BlockingIOError, in which case I would wait a bit and try again: https://docs.python.org/3/library/exceptions.html#BlockingIOError

Not sure if that's an idiosyncrasy of the Python path, or whether it's handled internally by the Rust standard libs somewhere?

hanson-openai · 2025-08-22T22:31:41Z

codex-rs/core/src/exec_command/session_manager.rs

+    // Split budget between head and tail. We prefer to keep whole lines to
+    // avoid producing partial numeric tokens like "73".


hanson-openai · 2025-08-22T22:32:34Z

codex-rs/core/src/exec_command/session_manager.rs

+    if prefix_end > 0 {
+        out.push_str(&s[..prefix_end]);
+    }
+    if suffix_start > prefix_end && suffix_start < s.len() {
+        out.push_str(&s[suffix_start..]);
+    }


would be good to put some message in the middle so it's more visible where the truncation happened, e.g.

…{truncated} tokens truncated…
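A minimal sketch of middle truncation with a visible marker, under stated assumptions: `truncate_middle` is a hypothetical helper, the budget is split evenly by bytes, and the removed token count is estimated at 4 bytes per token as in the TODO above. Unlike the code in this PR it does not try to keep whole lines, and the marker itself is not counted against the budget.

```rust
/// Keeps the head and tail of `s` within `max_bytes` and replaces the middle
/// with a visible marker. Token counts are estimated at 4 bytes per token.
pub fn truncate_middle(s: &str, max_bytes: usize) -> String {
    if s.len() <= max_bytes {
        return s.to_string();
    }
    let head_budget = max_bytes / 2;
    let tail_budget = max_bytes - head_budget;

    // Move the cut points onto UTF-8 character boundaries so slicing cannot panic.
    let mut prefix_end = head_budget;
    while !s.is_char_boundary(prefix_end) {
        prefix_end -= 1;
    }
    let mut suffix_start = s.len() - tail_budget;
    while !s.is_char_boundary(suffix_start) {
        suffix_start += 1;
    }

    let removed_bytes = suffix_start - prefix_end;
    let removed_tokens = (removed_bytes + 3) / 4; // rough estimate, rounded up
    format!(
        "{}…{} tokens truncated…{}",
        &s[..prefix_end],
        removed_tokens,
        &s[suffix_start..]
    )
}
```

Adjusting the cut points inward (head backwards, tail forwards) means the kept text never exceeds the budget, at the cost of dropping a few extra bytes around multi-byte characters.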

bolinfest · 2025-08-22T23:23:04Z

codex-rs/core/src/exec_command/session_manager.rs

+                    // Forward to broadcast; best-effort if there are subscribers.
+                    let _ = output_tx_clone.send(buf[..n].to_vec());
+                }
+                Err(_) => break,


@hanson-openai I think this is the read where you are worried about BlockingIOError. I'll add some logic.
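For reference, the closest Rust analogue to Python's BlockingIOError is an `io::Error` whose kind is `ErrorKind::WouldBlock`; a plain `Read::read` call returns it to the caller rather than retrying. Whether the reader in this PR can actually hit that case depends on how the underlying file descriptor is configured, which this thread does not say. A hedged sketch of bounded retry logic, with `read_with_retry` as a hypothetical helper name:

```rust
use std::io::{self, ErrorKind, Read};
use std::thread;
use std::time::Duration;

/// Reads once from `reader`, retrying a bounded number of times when the
/// read would block or is interrupted. Returns Ok(0) at end of stream.
pub fn read_with_retry<R: Read>(
    reader: &mut R,
    buf: &mut [u8],
    max_retries: u32,
    backoff: Duration,
) -> io::Result<usize> {
    let mut attempts = 0;
    loop {
        match reader.read(buf) {
            Ok(n) => return Ok(n),
            // Transient conditions: wait briefly, then try the read again.
            Err(e)
                if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::Interrupted)
                    && attempts < max_retries =>
            {
                attempts += 1;
                thread::sleep(backoff);
            }
            // Any other error, or retries exhausted: surface it to the caller.
            Err(e) => return Err(e),
        }
    }
}
```

Bounding the retries keeps a permanently blocked reader from spinning forever; the caller can still treat the final error as a reason to break out of the read loop.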

bolinfest · 2025-08-23T00:06:17Z

@hanson-openai addressed your feedback and updated tests, but I'll also ask someone from Codex to take a look

dylan-hurd-oai

Approving to unblock, will review more carefully tomorrow.

My main current thought is that an integration test would be really helpful here!

bolinfest · 2025-08-23T01:10:47Z

Did you miss the integration test in session_manager.rs?

bolinfest force-pushed the pr2574 branch from 7f0d2f4 to 25adff0 Compare August 22, 2025 03:18

hanson-openai reviewed Aug 22, 2025

bolinfest force-pushed the pr2574 branch 4 times, most recently from e502e00 to a906dca Compare August 22, 2025 07:43

bolinfest requested a review from hanson-openai August 22, 2025 07:44

bolinfest force-pushed the pr2574 branch from a906dca to c279878 Compare August 22, 2025 07:58

bolinfest marked this pull request as ready for review August 22, 2025 07:58

bolinfest force-pushed the pr2574 branch 3 times, most recently from fcfcf13 to 3b7acc3 Compare August 22, 2025 21:26

hanson-openai reviewed Aug 22, 2025

bolinfest mentioned this pull request Aug 22, 2025

feat: exec-command-mcp #2500

Closed

bolinfest commented Aug 22, 2025

bolinfest force-pushed the pr2574 branch from 3b7acc3 to 1e89c82 Compare August 23, 2025 00:05

bolinfest requested a review from dylan-hurd-oai August 23, 2025 00:06

feat: StreamableShell with exec_command and write_stdin tools

3b10445

bolinfest force-pushed the pr2574 branch from 1e89c82 to 3b10445 Compare August 23, 2025 00:14

bolinfest requested a review from hanson-openai August 23, 2025 00:15

dylan-hurd-oai approved these changes Aug 23, 2025

bolinfest merged commit e3b03ea into main Aug 23, 2025
30 checks passed

bolinfest deleted the pr2574 branch August 23, 2025 01:10

github-actions bot locked and limited conversation to collaborators Aug 23, 2025
