Skip to content

sampling: make top_n_sigma no-op at <=0 rather than <0 #13345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 6, 2025

Conversation

DocShotgun
Copy link
Contributor

This changes the behavior of the recently-added top_n_sigma sampler to a short-circuit no-op state at values <= 0 rather than < 0. The rationale for this change is as follows:

  • The current behavior of top_n_sigma == 0 is redundant as it is a more roundabout way to achieve greedy decoding, which already has other means of being specified, i.e. top_k == 1
  • top_n_sigma == 0 represents no-op rather than greedy decoding in other existing tooling (i.e. text-generation-webui, aphrodite-engine, koboldcpp, YALS), so this would keep the interface consistent for frontend developers

@CISC
Copy link
Collaborator

CISC commented May 6, 2025

Do you know the rationale for koboldcpp also checking for cur_p->size <= 1?

Well, looking closer I see why, so perhaps add that too?

* avoid running nsigma when only a single candidate remains

Co-authored-by: Sigbjørn Skjæret <[email protected]>
@CISC
Copy link
Collaborator

CISC commented May 6, 2025

Ouch, test-sampling fails, can you look into why?

@DocShotgun
Copy link
Contributor Author

Ouch, test-sampling fails, can you look into why?

I took a look at it, and far as I can tell, it's because this line:

test_top_n_sigma({0.1f, 0.2f, 0.3f, 0.4f}, {1.0f, 0.0f, 0.0f, 0.0f}, 0.00f);

explicitly checks for top_n_sigma == 0 leading to greedy decoding, and that behavior is changed by this PR.

If I change the test to check for no-op instead, it passes:

test_top_n_sigma({0.1f, 0.2f, 0.3f, 0.4f}, {0.4f, 0.3f, 0.2f, 0.1f}, 0.00f);

* adjust the sampling test to reflect top_n_sigma == 0 behaving as no-op rather than greedy decoding
@github-actions github-actions bot added the testing Everything test related label May 6, 2025
@CISC
Copy link
Collaborator

CISC commented May 6, 2025

If I change the test to check for no-op instead, it passes

Great, can you also add a comment explaining why it was changed?

@CISC CISC merged commit ffc7272 into ggml-org:master May 6, 2025
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Everything test related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants