Description
Elasticsearch Version
8.15
Installed Plugins
No response
Java Version
bundled
OS Version
any
Problem Description
When executing a sufficiently complicated dismax query, it's possible that the iteration over impacts seemingly gets "stuck".
A CPU thread gets taken hostage at 100% and iterates forever. Task cancellation does nothing, as the thread is stuck in a busy loop that does no IO.
(Here, "stuck" means running for hours and hours, requiring a server restart to stop.)
In particular, it gets stuck in this loop:
private int advanceImpacts(int target) throws IOException {
  if (target > upTo) {
    moveToNextBlock(target);
  }
  while (true) {
    if (maxScore >= minScore) {
      return target;
    }
    if (upTo == NO_MORE_DOCS) {
      return NO_MORE_DOCS;
    }
    target = upTo + 1;
    moveToNextBlock(target);
  }
}
Then, in moveToNextBlock, this executes the ES812ScoreSkipReader impacts check, which possibly sets the target adversely, resulting in a loop.
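To make the suspected failure mode concrete, here is a toy Java sketch of the same control flow. It is not Lucene code, and the root cause has not been confirmed. It assumes every block is non-competitive (maxScore < minScore) and that the block boundary comes from a pluggable function standing in for advanceShallow. The class name, the 128-doc block size, the 1000-doc segment, and the iteration cap are all hypothetical and exist only for illustration. The point is that the loop makes progress only if each new upTo lands at or beyond the target it was asked for.

```java
import java.util.function.IntUnaryOperator;

/** Toy model of the skip loop; not Lucene code. All names here are hypothetical. */
class AdvanceImpactsSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Well-behaved stand-in for advanceShallow: last doc ID of the 128-doc block containing target. */
    static int wellBehavedBlockEnd(int target) {
        final int maxDoc = 1000; // assumed segment size
        if (target >= maxDoc) {
            return NO_MORE_DOCS;
        }
        return Math.min((target / 128 + 1) * 128 - 1, maxDoc - 1);
    }

    /**
     * Runs the same control flow as the quoted loop, assuming every block is
     * non-competitive, so the loop always tries to skip to the next block.
     * Returns true if NO_MORE_DOCS is reached within maxIterations, false otherwise.
     */
    static boolean terminates(IntUnaryOperator blockEnd, int target, int maxIterations) {
        int upTo = blockEnd.applyAsInt(target);
        for (int i = 0; i < maxIterations; i++) {
            if (upTo == NO_MORE_DOCS) {
                return true;
            }
            target = upTo + 1;
            upTo = blockEnd.applyAsInt(target);
        }
        return false; // cap hit: the uncapped loop would still be spinning
    }

    public static void main(String[] args) {
        // Boundaries that always advance: the loop reaches NO_MORE_DOCS.
        System.out.println(terminates(AdvanceImpactsSketch::wellBehavedBlockEnd, 0, 1_000_000));
        // Hypothetical misbehaving boundary: always reports 127, even for targets past it.
        System.out.println(terminates(target -> 127, 0, 1_000_000));
    }
}
```

With the well-behaved boundary the sketch prints true. With the boundary pinned at 127 it prints false, because target is reset to 128 on every pass and upTo never moves. Whether the real scorers can return such a stale boundary is exactly what needs to be reproduced.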
Steps to Reproduce
@softwaredoug discovered this, I will defer to him.
Logs (if relevant)
/tmp/jstack.4.log:"elasticsearch[eck-elasticsearch-es-default-5][search_worker][T#2]" #78 [163] daemon prio=5 os_prio=0 cpu=771958.29ms elapsed=13461.64s tid=0x00007f5f28013750 nid=163 runnable [0x00007f5e5d1fd000]
/tmp/jstack.4.log- java.lang.Thread.State: RUNNABLE
/tmp/jstack.4.log- at org.apache.lucene.search.DisjunctionScoreBlockBoundaryPropagator.advanceShallow([email protected]/DisjunctionScoreBlockBoundaryPropagator.java:79)
/tmp/jstack.4.log- at org.apache.lucene.search.DisjunctionMaxScorer.advanceShallow([email protected]/DisjunctionMaxScorer.java:79)
/tmp/jstack.4.log- at org.apache.lucene.search.ConjunctionScorer.advanceShallow([email protected]/ConjunctionScorer.java:80)
/tmp/jstack.4.log- at org.apache.lucene.search.ReqOptSumScorer.advanceShallow([email protected]/ReqOptSumScorer.java:274)
/tmp/jstack.4.log- at org.apache.lucene.search.ReqOptSumScorer$1.moveToNextBlock([email protected]/ReqOptSumScorer.java:82)
/tmp/jstack.4.log- at org.apache.lucene.search.ReqOptSumScorer$1.advanceImpacts([email protected]/ReqOptSumScorer.java:106)
/tmp/jstack.4.log- at org.apache.lucene.search.ReqOptSumScorer$1.advanceInternal([email protected]/ReqOptSumScorer.java:129)
/tmp/jstack.4.log- at org.apache.lucene.search.ReqOptSumScorer$1.nextDoc([email protected]/ReqOptSumScorer.java:112)
/tmp/jstack.4.log- at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange([email protected]/Weight.java:298)
/tmp/jstack.4.log- at org.apache.lucene.search.Weight$DefaultBulkScorer.score([email protected]/Weight.java:236)
/tmp/jstack.4.log- at org.elasticsearch.search.internal.CancellableBulkScorer.score([email protected]/CancellableBulkScorer.java:45)
/tmp/jstack.4.log- at org.apache.lucene.search.BulkScorer.score([email protected]/BulkScorer.java:38)
/tmp/jstack.4.log- at org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf([email protected]/ContextIndexSearcher.java:436)
/tmp/jstack.4.log- at org.elasticsearch.search.internal.ContextIndexSearcher.search([email protected]/ContextIndexSearcher.java:365)
/tmp/jstack.4.log- at org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3([email protected]/ContextIndexSearcher.java:350)
/tmp/jstack.4.log- at org.elasticsearch.search.internal.ContextIndexSearcher$$Lambda/0x00007f5fec574000.call([email protected]/Unknown Source)
/tmp/jstack.4.log- at org.apache.lucene.search.TaskExecutor$TaskGroup.lambda$createTask$0([email protected]/TaskExecutor.java:117)
/tmp/jstack.4.log- at org.apache.lucene.search.TaskExecutor$TaskGroup$$Lambda/0x00007f5fec519c80.call([email protected]/Unknown Source)
/tmp/jstack.4.log- at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:317)