-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Update Grok to use the new Matcher#setTimeout #139405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Note: the tests compile in this commit, but they do not all pass.
Note: the tests compile in this commit, but they do not all pass.
but MatcherWatchdogTests is no longer valuable
|
Pinging @elastic/es-data-management (Team:Data Management) |
|
Hi @joegallo, I've created a changelog YAML for you. |
dakrone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for simplifying this, I left some comments but nothing major.
| } | ||
|
|
||
| if (result == Matcher.INTERRUPTED) { | ||
| throw new RuntimeException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use an ElasticsearchException here instead, since it's the ES version of RuntimeException?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grok doesn't have visibility to org.elasticsearch.server, so it can't see that class. We could change that, but I don't want to do that without chatting up a wider audience. I DRYed it in e821c7e to make it just one place and not three, though, so at least the badness is a little more contained.
| matcherWatchdog.unregister(matcher); | ||
| } | ||
| if (result == Matcher.INTERRUPTED) { | ||
| throw new RuntimeException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here for exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DRYed in e821c7e.
| BiConsumer<Long, Runnable> scheduler | ||
| ) { | ||
| return new Default(interval, maxExecutionTime, relativeTimeSupplier, scheduler); | ||
| static MatcherWatchdog newInstance(long maxExecutionTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're touching this anyway, can we changed maxExecutionTime in include the units (millis) in its name? Perhaps maxExecutionMillis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍, I went with maxExecutionTimeMillis in 19cc207.
| if (maxExecutionTime > 0) { | ||
| matcher.setTimeout(maxExecutionTime * NSEC_PER_MSEC); | ||
| } else { | ||
| matcher.setTimeout(-1); // disable timeouts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we pass in 0 as the time, what is expected by the caller? Is it expected that we would always time out, or that we'd never time out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm, I don't really have a strong opinion about it. I'll poke around for some prior art so that I can claim we're doing something similar to something else common in Elasticsearch.
Do you prefer that 0 means no timeout? I'm fine with just making it do that if that's what you prefer. (edit: wait, that is the current logic)
| long intervalMillis = IngestSettings.GROK_WATCHDOG_INTERVAL.get(settings).getMillis(); | ||
| long maxExecutionTimeMillis = IngestSettings.GROK_WATCHDOG_INTERVAL.get(settings).getMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of this PR, we should mark this setting as deprecated because it no longer does anything, right? Can you update the PR to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's fair. I'll handle that in the morning.
| private Default(long interval, long maxExecutionTime, LongSupplier relativeTimeSupplier, BiConsumer<Long, Runnable> scheduler) { | ||
| this.interval = interval; | ||
|
|
||
| private Default(long maxExecutionTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here about renaming to include units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also covered in 19cc207.
Follow up to #139075
More recent versions of
jruby/jonihave the ability to specify a timeout for aMatcher'ssearch(see jruby/joni#78). This is quite a bit easier for us to use than the current scheme of thread registration and monitoring inMatcherWatchdog.Default, and it's faster, too (since there's no synchronization overhead of registering/unregistering).7c81f85 is just my humble opinion, it's surprising to me that it wasn't added back in #31024, but it wasn't. 🤷
4eabe3e is bug which was introduced by me in #95426. 💀
Personally, I am of the opinion that the
MatcherWatchdoginterface is entirely obsolete and should be removed. However, there's a fairly intricate third implementation of the interface over inx-pack/plugin/text-structurethat I'm not qualified to touch, so I resisted the urge to refactor all the way up through removing the interface completely. As it stands, though, the two first-class default implementations really aren't using the methods of the interface in very meaningful ways anymore, but they still do follow the right shape.