[CORE] Let LocalSparkContext clear active context in beforeAll #51284

Open: wants to merge 1 commit into master


@price-qian price-qian commented Jun 25, 2025

What changes were proposed in this pull request?

This change ensures that no active SparkContext remains from previous test runs. It prevents potential resource leaks by clearing any lingering context in beforeAll, before test execution begins. Because the call lives in the LocalSparkContext trait, it stabilizes every test suite that mixes in the trait.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Ran all unit tests on the 3.5 branch; all passed.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the CORE label Jun 25, 2025
@sarutak
Member

sarutak commented Jun 26, 2025

@price-qian Thank you for opening this PR.
Unfortunately, I don't think this change is needed, because resetSparkContext in afterEach already calls clearActiveContext indirectly. Would you mind closing this PR?

Also, please note:

  • Give the PR title a proper tag such as [SPARK-XXXXX] ([MINOR] is acceptable if the change is very small, such as a typo fix).
  • Activate your GitHub Actions to run CI. See this error message.
  • Follow our contribution guide.
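
The reviewer's point, that per-test cleanup already clears the active context, can be illustrated with a toy model. This is a sketch under stated assumptions, not Spark's code: `FakeContext`, `LocalFixture`, `resetContext`, and `CleanupChainDemo` are hypothetical stand-ins, and the only behavior modeled is that stopping a context frees the single "active" slot.

```scala
// Toy model only: FakeContext stands in for SparkContext. None of these
// names are Spark's real classes or test utilities.
final class FakeContext(val name: String) {
  FakeContext.register(this) // constructing a context marks it active

  // Stopping the context frees the active slot as a side effect.
  def stop(): Unit = FakeContext.clearActive()
}

object FakeContext {
  private var active: Option[FakeContext] = None

  def activeName: Option[String] = active.map(_.name)

  def clearActive(): Unit = { active = None }

  private def register(ctx: FakeContext): Unit = {
    require(active.isEmpty, "only one FakeContext may be active at a time")
    active = Some(ctx)
  }
}

// Hypothetical fixture modeling a shared test trait with per-test cleanup.
trait LocalFixture {
  var ctx: FakeContext = null

  def resetContext(): Unit = {
    if (ctx != null) ctx.stop() // stop() clears the active slot indirectly
    ctx = null
  }

  def afterEach(): Unit = resetContext()

  // Runs a test body and always performs the per-test cleanup afterwards.
  def runTest(body: => Unit): Unit =
    try body finally afterEach()
}

object CleanupChainDemo {
  // Returns the active context name seen during the test and after cleanup.
  def run(): (Option[String], Option[String]) = {
    val suite = new LocalFixture {}
    var duringTest: Option[String] = None
    suite.runTest {
      suite.ctx = new FakeContext("t1")
      duringTest = FakeContext.activeName
    }
    (duringTest, FakeContext.activeName)
  }
}
```

In this model, `CleanupChainDemo.run()` sees the context as active inside the test body and finds the slot empty afterwards, which is the behavior the reviewer describes for suites that use the trait. It says nothing about suites that bypass the trait.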

@price-qian
Author

Thanks @sarutak for the review. Tests that mix in this trait clean up their SparkContext properly. The problem arises when other tests do not mix it in and forget to clean up their SparkContext. Run alone, they all pass, but when run together, a lingering SparkContext can break a later test that calls new SparkContext, because SparkContext doesn't allow multiple active instances. This PR makes the shared trait more defensive. I can adjust based on your notes too, but I wanted to share this thought with you first.
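
The scenario described above can be sketched with a toy model. It is not a reproduction against Spark itself: `ToyContext`, `LeakDemo`, `leakySuite`, and `traitSuite` are hypothetical names, and the `defensive` flag only models the idea of clearing a leftover active context before a suite starts.

```scala
import scala.util.Try

// Toy stand-in for SparkContext: at most one instance may be active at a time.
final class ToyContext(val owner: String) {
  ToyContext.activate(this)
  def stop(): Unit = ToyContext.clearActive()
}

object ToyContext {
  private var active: Option[ToyContext] = None

  def clearActive(): Unit = { active = None }

  private def activate(ctx: ToyContext): Unit = {
    if (active.nonEmpty)
      throw new IllegalStateException(s"context from ${active.get.owner} is still active")
    active = Some(ctx)
  }
}

object LeakDemo {
  // A suite that does not use the shared trait and forgets to stop its context.
  def leakySuite(): Unit = {
    val leaked = new ToyContext("LeakySuite")
  }

  // A suite that uses the shared trait. `defensive` models the proposed
  // beforeAll hook that clears any leftover active context first.
  def traitSuite(defensive: Boolean): Try[String] = Try {
    if (defensive) ToyContext.clearActive()
    val ctx = new ToyContext("TraitSuite")
    try "passed" finally ctx.stop() // the trait's usual per-test cleanup
  }

  // Runs the leaky suite first, then the trait-based suite.
  def run(defensive: Boolean): Try[String] = {
    ToyContext.clearActive() // start each demo from a clean slate
    leakySuite()
    traitSuite(defensive)
  }
}
```

In this model, `LeakDemo.run(defensive = false)` fails because the leaked context still occupies the active slot, while `LeakDemo.run(defensive = true)` succeeds. The toy `clearActive` only frees the slot; whether the leaked context's own resources are released is outside this sketch.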

@price-qian
Author

I've activated my GitHub Actions, but I don't see a way to rerun this workflow.

@sarutak
Member

sarutak commented Jun 30, 2025

@price-qian
I understand the issue you describe, but I have never encountered it when running tests. Could you show how to reproduce it, so that we can judge whether the suggested solution is the best approach?
Also, please file this issue in JIRA and describe the problem and the solution more precisely.

> I've activated my Github Actions but I don't see a way to rerun this workflow.

If you configured it correctly, GitHub Actions should run after the next push.
