Commit 895aed0

Update the intervals query docs (#111808) (#111822)
Since apache/lucene-solr#620, intervals disjunctions are automatically rewritten to handle cases where minimization can miss valid matches. This change updates the documentation to reflect that behaviour: users no longer need to manually pull intervals disjunctions up to the top level.
1 parent eedf52c commit 895aed0
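
To make the commit description concrete, here is a minimal Python sketch of why minimization inside a nested `any_of` used to hide the `the big bad wolf` match, and why expanding the disjunction to the top level recovers it. It is a toy model under stated assumptions, not Lucene's or Elasticsearch's implementation: intervals are plain `(start, end)` token positions, and every function name (`term`, `phrase`, `minimize`, `any_of`, `ordered_no_gaps`, `nested`, `rewritten`) is hypothetical.

```python
from itertools import product


def term(tokens, word):
    """Intervals (start, end) covering each occurrence of a single word."""
    return [(i, i) for i, t in enumerate(tokens) if t == word]


def phrase(tokens, words):
    """Intervals where the given words occur consecutively and in order."""
    n = len(words)
    return [(i, i + n - 1) for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == words]


def minimize(intervals):
    """Keep only minimal intervals: drop any that strictly contains another."""
    unique = set(intervals)

    def contains(a, b):
        return a != b and a[0] <= b[0] and b[1] <= a[1]

    return sorted(a for a in unique if not any(contains(a, b) for b in unique))


def any_of(*sources):
    """Toy disjunction: union of all sources, then minimized."""
    return minimize([iv for src in sources for iv in src])


def ordered_no_gaps(*sources):
    """Toy all_of with ordered=true and max_gaps=0."""
    out = []
    for combo in product(*sources):
        # Each interval must start immediately after the previous one ends.
        if all(b[0] == a[1] + 1 for a, b in zip(combo, combo[1:])):
            out.append((combo[0][0], combo[-1][1]))
    return minimize(out)


def nested(tokens):
    """Disjunction nested inside all_of, minimized before combination."""
    middle = any_of(term(tokens, "big"), phrase(tokens, ["big", "bad"]))
    return ordered_no_gaps(term(tokens, "the"), middle, term(tokens, "wolf"))


def rewritten(tokens):
    """Same query with the disjunction pulled up to the top level."""
    the, wolf = term(tokens, "the"), term(tokens, "wolf")
    return any_of(
        ordered_no_gaps(the, term(tokens, "big"), wolf),
        ordered_no_gaps(the, phrase(tokens, ["big", "bad"]), wolf),
    )


doc5 = "the big bad wolf".split()
doc6 = "the big wolf".split()
print(nested(doc5), rewritten(doc5))  # [] [(0, 3)]
print(nested(doc6), rewritten(doc6))  # [(0, 2)] [(0, 2)]
```

In the nested form, the `big bad` interval `(1, 2)` contains the `big` interval `(1, 1)` and is discarded before `wolf` is considered, so nothing lines up with zero gaps. In the expanded form each alternative is combined with its neighbours first, so both documents match, which is what the new YAML test in this commit asserts.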

2 files changed: +31 -65 lines changed

docs/reference/query-dsl/intervals-query.asciidoc

Lines changed: 0 additions & 65 deletions
@@ -397,68 +397,3 @@ This query does *not* match a document containing the phrase `hot porridge is
 salty porridge`, because the intervals returned by the match query for `hot
 porridge` only cover the initial two terms in this document, and these do not
 overlap the intervals covering `salty`.
-
-Another restriction to be aware of is the case of `any_of` rules that contain
-sub-rules which overlap. In particular, if one of the rules is a strict
-prefix of the other, then the longer rule can never match, which can
-cause surprises when used in combination with `max_gaps`. Consider the
-following query, searching for `the` immediately followed by `big` or `big bad`,
-immediately followed by `wolf`:
-
-[source,console]
---------------------------------------------------
-POST _search
-{
-  "query": {
-    "intervals" : {
-      "my_text" : {
-        "all_of" : {
-          "intervals" : [
-            { "match" : { "query" : "the" } },
-            { "any_of" : {
-                "intervals" : [
-                    { "match" : { "query" : "big" } },
-                    { "match" : { "query" : "big bad" } }
-                ] } },
-            { "match" : { "query" : "wolf" } }
-          ],
-          "max_gaps" : 0,
-          "ordered" : true
-        }
-      }
-    }
-  }
-}
---------------------------------------------------
-
-Counter-intuitively, this query does *not* match the document `the big bad
-wolf`, because the `any_of` rule in the middle only produces intervals
-for `big` - intervals for `big bad` being longer than those for `big`, while
-starting at the same position, and so being minimized away. In these cases,
-it's better to rewrite the query so that all of the options are explicitly
-laid out at the top level:
-
-[source,console]
---------------------------------------------------
-POST _search
-{
-  "query": {
-    "intervals" : {
-      "my_text" : {
-        "any_of" : {
-          "intervals" : [
-            { "match" : {
-                "query" : "the big bad wolf",
-                "ordered" : true,
-                "max_gaps" : 0 } },
-            { "match" : {
-                "query" : "the big wolf",
-                "ordered" : true,
-                "max_gaps" : 0 } }
-          ]
-        }
-      }
-    }
-  }
-}
---------------------------------------------------

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search/230_interval_query.yml

Lines changed: 31 additions & 0 deletions
@@ -21,6 +21,10 @@ setup:
           - '{"text" : "Baby its cold there outside"}'
           - '{"index": {"_index": "test", "_id": "4"}}'
           - '{"text" : "Outside it is cold and wet"}'
+          - '{"index": {"_index": "test", "_id": "5"}}'
+          - '{"text" : "the big bad wolf"}'
+          - '{"index": {"_index": "test", "_id": "6"}}'
+          - '{"text" : "the big wolf"}'

 ---
 "Test ordered matching":
@@ -444,4 +448,31 @@ setup:
                   prefix: out
   - match: { hits.total.value: 3 }

+---
+"Test rewrite disjunctions":
+  - do:
+      search:
+        index: test
+        body:
+          query:
+            intervals:
+              text:
+                all_of:
+                  intervals:
+                    - "match":
+                        "query": "the"
+                    - "any_of":
+                        "intervals":
+                          - "match":
+                              "query": "big"
+                          - "match":
+                              "query": "big bad"
+                    - "match":
+                        "query": "wolf"
+                  max_gaps: 0
+                  ordered: true
+
+  - match: { hits.total.value: 2 }
+  - match: { hits.hits.0._id: "6" }
+  - match: { hits.hits.1._id: "5" }
