Skip to content

Jfm sync chan refactor #5289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

JackMordaunt
Copy link

@JackMordaunt JackMordaunt commented Jun 5, 2025

This PR addresses #5287: renaming select_raw to try_select_raw and fixes the TOCTOU by looping until the can_{send,recv} invariant holds.

I recommend reviewing by commit to see each change in isolation.

TODO: I think this deserves a test harness. Currently select_raw doesn't seem to have a test.

Tests added.

Format according to ols' odinfmt tool.
This fixes a flaw in the original implementation: the returned index is
actually useless to the caller.

This is because the index returned refers to the internal "candidate"
list. This list is dynamic, and may not have all of the input channels
(if they weren't ready according to chan.can_{recv,send}). That means
the index is not guaranteed to mean anything to the caller.

The fix introduced here is to return the index into the input slice
(recvs,sends) and an enum to specify which input slice that is.

If no selection was made, then (-1, .None) is returned to communicate as
much.
This follows the convention where non-blocking operations are prefixed
with "try" to indicate as much.

Since select_raw in it's current form doesn't block, it should be
try_select_raw, and allow select_raw to name a blocking implementation.
Fixes a TOCTOU where the channel could be used between the call to
can_{recv,send} and {recv,send} causing an unexpected blocking
operation.

To do this we use the non-blocking try_{recv,send} and retry the check
in a loop. This guarantees non-blocking select behaviour, at the cost of
spinning if the input channels are highly contended.

Signed-off-by: Jack Mordaunt <[email protected]>

/*
Determines what operations `Chan` supports.
*/
Direction :: enum {
Send = -1,
Both = 0,
Both = 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this formatting changed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my odinfmt from ols.

Can drop the formatting commit if the formatting is undesirable.

@@ -174,8 +180,15 @@ Example:
}
*/
@(require_results)
create_buffered :: proc($C: typeid/Chan($T), #any_int cap: int, allocator: runtime.Allocator) -> (c: C, err: runtime.Allocator_Error)
where size_of(T) <= int(max(u16)) {
create_buffered :: proc(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this formatting changed?

Comment on lines 1260 to 1277
if count == 0 {
return -1, .None
}

select_idx = rand.int_max(count) if count > 0 else 0
candidate_idx := rand.int_max(count) if count > 0 else 0

sel := candidates[candidate_idx]
if sel.is_recv {
status = .Recv
if !try_recv_raw(recvs[sel.idx], recv_out) {
continue
}
} else {
status = .Send
if !try_send_raw(sends[sel.idx], send_msgs[sel.idx]) {
continue
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be the only change to the code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two changes:

  • name changed to try_select_raw
  • toctou addressed by placing the body in a loop

Use a label to clarify the continue statements.
This makes the proc easier and safer to call by letting the caller nil
out messages to skip sends.
This is necessary because we need to allow the test guarantee against a
rare condition: where a third-party thread steals a value between the
validity checks can_{send,recv} and the channel operation
try_{send,recv}.
This test harness ensures consistent non-blocking semantics and
validates that we have solved the toctou condition.

The __global_context_for_test is a bit of a hack to fuse together the
test supplied proc and the executing logic in packaage chan.
@JackMordaunt
Copy link
Author

I've pushed commits that add a test harness and some polish.

To achieve the toctou test I introduced some global state (hidden behind a when ODIN_TEST clause) to allow the test logic to control the execution of try_select_raw more precisely.

@gingerBill please let me know if you want me to revert the initial formatting commit (d28e54a)

@JackMordaunt JackMordaunt requested a review from gingerBill June 8, 2025 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants