Skip to content

v2.3: validator: Add --wait-for-exit flag to exit subcommand (backport of #6780) #6908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 11, 2025

Conversation

mergify[bot]
Copy link

@mergify mergify bot commented Jul 10, 2025

Problem

Copied from #6233

Currently, agave-validator exit will return immediately when the AdminRpc function call returns, even though the actual validator process might still be running. This isn't ideal if someone is trying to do something like:

agave-validator exit && ./restart_validator_script.sh

Summary of Changes

  • Add a new pid method to AdminRpc interface
  • Add a new --wait-for-exit flag option to exit subcommand
    • It works by querying the PID from a running validator, initiating exit (same as before) and then using waiting until the returned PID is no longer alive

Previous Attempt

This change was previously introduced in #6233; however, there was a minor issue discoverd that affects upgrade scenarios. Namely, if a "new" agave-validator binary (one that calls pid()) is used to exit an "old" agave-validator (one that predates this change), it would error out since the old bin wouldn't recognize the pid() request.

This is mitigated by calling pid() BUT only checking the result if using --wait-for-exit. Since the new flag is opt-in, it seems more reasonable to expect that someone who opted in to a new feature would keep an eye on it. Once a version with this change is widely adopted, pid() will be available and more direct error handling will work just fine

Since this is re-adding a reverted commit, I left the original two commits as-is and added a third commit for the change in behavior. Hopefully this makes things easier for folks who might have previously reviewed this change


This is an automatic backport of pull request #6780 done by Mergify.

agave-validator exit currents returns immediately after the AdminRpc
call returns. However, the running validator has not exited at this
point and may continue to tear itself down for multiple seconds

The exit subcommand now has an optional flag, --wait-for-exit, that
queries the PID from the validator and loops until that PID has fully
terminated. Use of this flag means that a caller can be sure the
running validator is dead when agave-validator exit returns

(cherry picked from commit 6bcd5ba)
@mergify mergify bot requested a review from a team as a code owner July 10, 2025 04:07
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 35.21127% with 46 lines in your changes missing coverage. Please review.

Project coverage is 82.8%. Comparing base (9963221) to head (3971515).

Additional details and impacted files
@@            Coverage Diff            @@
##             v2.3    #6908     +/-   ##
=========================================
- Coverage    82.8%    82.8%   -0.1%     
=========================================
  Files         850      850             
  Lines      379824   379883     +59     
=========================================
+ Hits       314710   314754     +44     
- Misses      65114    65129     +15     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@steviez steviez requested a review from brooksprumo July 10, 2025 18:31
@steviez steviez merged commit 0a75c9c into v2.3 Jul 11, 2025
46 checks passed
@steviez steviez deleted the mergify/bp/v2.3/pr-6780 branch July 11, 2025 04:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants