
[SPARK-52563][PS] Fix var naming bug in _assert_pandas_almost_equal #51253


Closed
wants to merge 3 commits into from


petern48
Contributor

What changes were proposed in this pull request?

Small bug fix: the wrong variable names were used.

Why are the changes needed?

The nested comparison function reads lval and rval from the enclosing scope instead of using its own parameters, val1 and val2.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

No

@petern48
Contributor Author

First PR to Spark. I would appreciate some help with creating the Jira issue. It doesn't seem like I can create one myself or create a Jira account for Apache.

@xinrong-meng
Member

Thanks for the contribution!
Would you mind creating a JIRA ticket and linking the number to SPARK-XXXXX in your PR title? Please see https://spark.apache.org/contributing.html

@petern48 petern48 changed the title [SPARK-XXXXX][PS] Fix var naming bug in _assert_pandas_almost_equal [SPARK-52563][PS] Fix var naming bug in _assert_pandas_almost_equal Jun 24, 2025
@petern48
Contributor Author

Thanks for the contribution! Would you mind creating a JIRA ticket and linking the number to SPARK-XXXXX in your PR title? Please see https://spark.apache.org/contributing.html

Yep, just did. I only recently realized I could request an account so I can create tickets.

Contributor

@allisonwang-db allisonwang-db left a comment


Thanks for the fix! Would be great to add a test case for it.

@petern48
Contributor Author

petern48 commented Jul 8, 2025

@allisonwang-db I don't really see a good test we could write here. Do you? This is a nested function (so more like a lambda) inside a private function (_assert_pandas_almost_equal). See here. To me, it doesn't make sense to write a test at this level.

Interestingly, this PR shouldn't change the behavior of the code at all: the enclosing-scope variables lval and rval always happen to hold the same values that are passed in as the intended arguments val1 and val2. This change is more of a typo fix to protect against errors if someone changes the code later.
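To illustrate why the old code still gave correct results, here is a minimal, self-contained sketch of the pattern. It is not the PySpark source: run_comparisons, compare_buggy, compare_fixed, and the tolerance values are hypothetical names and numbers chosen for the example. The buggy variant ignores its parameters and reads lval and rval from the enclosing scope, which only matters once a caller passes something other than the loop variables.

```python
import math


def run_comparisons(left, right):
    """Hypothetical stand-in for the outer function; not the PySpark code."""

    def compare_buggy(val1, val2):
        # Bug pattern: ignores val1/val2 and reads lval/rval from the enclosing scope.
        return math.isclose(lval, rval, rel_tol=1e-5, abs_tol=1e-8)

    def compare_fixed(val1, val2):
        # Fixed pattern: uses only its own parameters.
        return math.isclose(val1, val2, rel_tol=1e-5, abs_tol=1e-8)

    results = []
    for lval, rval in zip(left, right):
        # Call shape from the thread: the loop variables are passed straight
        # through, so both versions agree.
        same_args = (compare_buggy(lval, rval), compare_fixed(lval, rval))
        # A hypothetical later edit that passes different arguments exposes
        # the difference between the two versions.
        other_args = (compare_buggy(lval, rval + 1.0), compare_fixed(lval, rval + 1.0))
        results.append((same_args, other_args))
    return results


outcome = run_comparisons([1.0], [1.0])
# outcome == [((True, True), (True, False))]
```

In the first pair both versions return True. In the second pair the buggy version still returns True because it compares the loop variables rather than the values it was given, which is the kind of latent error the rename guards against.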

@allisonwang-db
Contributor

Looks like the inner function is used in this case:

for lval, rval in zip(left[lcol].dropna(), right[rcol].dropna()):
    if not compare_vals_approx(lval, rval):

@petern48
Contributor Author

petern48 commented Jul 8, 2025

Looks like the inner function is used in this case:

Yes, it certainly is used, but my point was that it's still nested inside the _assert_pandas_almost_equal() function, so I can't access it from a test without unnesting the function. Instead, I added some tests to cover other error types for _assert_pandas_almost_equal() that were previously missing.
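The point about nesting can be sketched as follows. This is an illustrative example under stated assumptions, not the PySpark implementation or the tests added in this PR: assert_close, its error messages, and the error_message helper are hypothetical. It shows that a nested function is not reachable as an attribute of the outer function, so a test can only exercise it indirectly through the errors the outer function raises.

```python
def assert_close(left, right, atol=1e-8):
    """Hypothetical outer function; the names and messages are illustrative only."""

    def compare_vals_approx(val1, val2):
        # Nested helper: it exists only while assert_close is running.
        return abs(val1 - val2) <= atol

    if len(left) != len(right):
        raise AssertionError("length mismatch")
    for lval, rval in zip(left, right):
        if not compare_vals_approx(lval, rval):
            raise AssertionError(f"values differ: {lval} != {rval}")


# The nested helper is not exposed on the outer function object.
inner_reachable = hasattr(assert_close, "compare_vals_approx")


def error_message(left, right):
    """Return the AssertionError text raised by assert_close, or None if it passes."""
    try:
        assert_close(left, right)
    except AssertionError as exc:
        return str(exc)
    return None
```

Testing at the level of the outer function, one case per error path, covers the nested comparison without having to un-nest it.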

@allisonwang-db What do you think?

@HyukjinKwon
Member

cc @xinrong-meng

Contributor

@allisonwang-db allisonwang-db left a comment


@petern48 Thanks for fixing the bug and adding the test cases!

@HyukjinKwon
Member

Merged to master.
