Skip to content

gh-135647: fix random.vonmisesvariate() and random.lognormvariate() accept invalid parameters #135717

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

michaelzheng67
Copy link

@michaelzheng67 michaelzheng67 commented Jun 19, 2025

fix random.vonmisesvariate() and random.lognormvariate() accept invalid parameters. Throw ValueError if kappa < 0 and sigma < 0 for random.vonmisesvariate() and random.lognormvariate() respectively.

@python-cla-bot
Copy link

python-cla-bot bot commented Jun 19, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Jun 19, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Jun 19, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@rhettinger rhettinger self-assigned this Jun 19, 2025
@rhettinger
Copy link
Contributor

rhettinger commented Jun 19, 2025

Our past position of this was to not expend clock cycles defending against invalid inputs, leaving the user with garbage-in-garbage-out. In some cases, we have cheap checks for likely errors. In others, there is a preference for a fast code path.

In this case, error checks are being added despite not having any known user issues. ISTM this PR makes everyone pay for a problem that almost no one has.

Tim, @tim-one , do you want to decide this one?

@rhettinger rhettinger assigned tim-one and unassigned rhettinger Jun 19, 2025
@tim-one
Copy link
Member

tim-one commented Jun 19, 2025

I think my position on this should be obvious, given that I'm the author of:

Errors should never pass silently.
Unless explicitly silenced.

😄 There are cases where that's impractical, but not - to my eyes - these cases. For example, sorting implicitly requires that elements support a consistent, deterministic, total ordering. But there's no way to check for that. The sorting code nevertheless never assumes that __lt__ will return sane results. It remains defensive throughout, regardless of cost.

Here vonmisesvariate() already checks for kappa <= 1e-6 near the start, a very unlikely thing to succeed (if people want a uniform distribution, they wouldn't be calling this to begin with). Nesting a check for "less than 0.0?" inside that suite is essentially costless.

For lognormvariate(), I think there's a different error here: it calls normalvariate(), which neither documents nor checks that its sigma must be non-negative. But it should - negative sigma isn't in its domain. If that were repaired, lognormvariate() would inherit the appropriate sigma check.

@michaelzheng67
Copy link
Author

Just to add on, seems like other distributions also do an error check as well already. For e.g:

def gammavariate(self, alpha, beta):
       ...
        if alpha <= 0.0 or beta <= 0.0:
            raise ValueError('gammavariate: alpha and beta must be > 0.0')
       ...
def binomialvariate(self, n=1, p=0.5):
       ...
        if n < 0:
            raise ValueError("n must be non-negative")
       ...

@rhettinger rhettinger removed their request for review June 19, 2025 23:27
@rhettinger
Copy link
Contributor

Just to add on, seems like other distributions also do an error check as well already

Some do, some don't. Each case has its own cost/benefit analysis, a likelihood of erroneous user input weighed against cost of the additional code. A big slow function like binomialvariate doesn't is affected much by error checks. Other functions with tighter code would feel it more. With respect to the likelihood of erroneous input, a function like lognormvariate has two and half decades of deployment indicating that the risk is near zero.

Side note: Tim's position is less obvious than he claims. The issue is assigned to him and he could just commit it but hasn't. In earlier issues, he pushed for all the complexity of the __init_subclass__ logic just to save a single comparison. So there was a chance that he might care about adding a new check that has never proven necessary in practice. Anyway, the PR is assigned to him and he can decide what he thinks is best.

@tim-one
Copy link
Member

tim-one commented Jun 20, 2025

@rhettinger, aa I said, I don't believe this PR is ready to commit. In particular, focusing on lognormvariate() is missing the point: the real problem there is that normalvariate() doesn't check, or even document, that sigma >= 0.0 is a precondition. If it did, lognormvariate() could rely on it to validate sigma.

And as already explained, the idea that skipping the check in vonmisesvariate() was the result of some cost/benefit analysis is not credible, period. The code is already checking for a "very small" input, which is itself very unlikely, and nesting a check for < 0.0 inside that unlikely suite won't be noticed by anyone.

I don't recall the specifics of what else you may be talking about. For example, I did push for a major speedup in shuflle() via cutting needless work that would have left the code very straightforward and easy to follow, but it didn't tickle your fancy for reasons I never grasped. So your zeal for chopping cycles isn't as straightforward as you'd have us believe either 😉.

I sometimes go to extreme lengths to save cycles, and you already know (or should know) that. But rarely (not never) at the expense of letting errors pass silently.

And I don't care that nobody ever complained about these specific cases. Most users apply blind trust to library functions, and if they do stumble into finding out that a library returned garbage for a bad argument, most never speak up.

IBM's RANDU is widely considered to be one of the most ill-conceived random number generators ever designed, and was described as "truly horrible" by Donald Knuth,

That didn't stop it from being one of the most widely used PRNGs in its time, and only God knows how many bad decisions were based on its poor quality.

I'm not a fan of catastrophe-driven development. I think it's our duty to pro-actively do what we can to prevent disasters. Validating input constraints is just very basic best practice.

In any case, yes, I intend to merge this PR - but not before it's ready to merge.

@tim-one
Copy link
Member

tim-one commented Jun 20, 2025

@rhettinger, I've tried, but I have no memory of ever talking about __init_subclass__ with you in any context (or with anyone else, for that matter). Refresh my memory?

About shuffle(), my apologies for getting that one wrong. You were in fact agreeable, but nothing came of it. I noticed at the time that it was closed as "Not Planned", but didn't notice that you weren't the one who closed it. I see now that you didn't, and it just fell through the cracks. I simply assumed at the time that you closed it, and moved on.

@bedevere-app
Copy link

bedevere-app bot commented Jun 21, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants