Solve sync timeout without additional parameter #285

Merged
merged 7 commits into from
May 20, 2025

Conversation

shikokuchuo
Member

Closes #275. Supersedes #284.

Solves this issue a different way, without requiring an additional parameter. An extra parameter has two drawbacks: it's not going to be obvious what value to set if the default fails, and setting it too wide will potentially waste resources in the rare case there is an issue.

Instead, an informative 'time elapsed' message is output to the console every 10 seconds. The user can then choose to continue waiting or interrupt.
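
As a rough illustration of the behaviour described above (not mirai's actual implementation), a wait loop with periodic progress messages and no overall timeout might look like the following Python sketch. The function name `wait_with_progress`, the `report` callback, and the message wording are all hypothetical; only the 10-second interval comes from this PR.

```python
import threading
import time


def wait_with_progress(done, interval=10.0, report=print):
    """Block until `done` is set, reporting elapsed time every `interval` seconds.

    Hypothetical helper for illustration only. There is deliberately no
    overall timeout: the loop ends when the event is set, or when the
    user interrupts it (KeyboardInterrupt).
    """
    start = time.monotonic()
    # Event.wait returns True once the event is set and False on timeout,
    # so each timeout produces exactly one progress message.
    while not done.wait(timeout=interval):
        elapsed = time.monotonic() - start
        report(f"still waiting for sync: {elapsed:.0f}s elapsed")
    return time.monotonic() - start


if __name__ == "__main__":
    synced = threading.Event()
    # Simulate the other side completing the sync after 0.25 seconds.
    threading.Timer(0.25, synced.set).start()
    wait_with_progress(synced, interval=0.1)
```

The design choice this is meant to show: the caller never has to guess a timeout value, and the decision to give up is left to whoever is watching the messages.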

@shikokuchuo shikokuchuo merged commit 29dc0c6 into main May 20, 2025
12 checks passed
@shikokuchuo shikokuchuo deleted the sync-timeout branch May 20, 2025 15:31
@shikokuchuo
Member Author

@wlandau FYI. I've tested that this doesn't affect crew, but I want to draw your attention to one change: an initial sync timeout no longer causes mirai to error. Instead, it'll keep trying.

@wlandau

wlandau commented May 21, 2025

Thanks, I like the retries. The retry messages are great in semi-interactive scenarios, but I do foresee a (small) possibility of an infinite wait in fully automated workflows.

@shikokuchuo
Member Author

Thanks. The sync is not designed to be a point of failure though. It's always local between host/dispatcher (and only when auto-launched, host/daemons or dispatcher/daemons, so not in crew). I'd consider the possibility if there's a demonstrated failure mode.

Also, in reality, for workflows like targets pipelines that take a substantial amount of time, it'd be highly unlikely that you'd simply leave it to run without checking the logs (or be notified) that the workflow has actually started.

@wlandau

wlandau commented May 22, 2025

> I'd consider the possibility if there's a demonstrated failure mode.

True, it's rare to fail at daemons().

> Also, in reality, for workflows like targets pipelines that take a substantial amount of time, it'd be highly unlikely that you'd simply leave it to run without checking the logs (or be notified) that the workflow has actually started.

For the research-oriented pipelines my group and I run, yes. However, I regularly hear about fully automated targets pipelines that run on a daily schedule (similar to the intended use case of Airflow). I wouldn't expect users to check for runaway pipelines every day in that case.

Successfully merging this pull request may close these issues.

initial sync with dispatcher timed out after 10s
2 participants