lookup join first draft #123719

Closed
wants to merge 231 commits into from
d8c77e3
lookup join first draft
georgewallace Feb 28, 2025
f5c5961
fixing link
georgewallace Feb 28, 2025
9c92e58
correcting image
georgewallace Feb 28, 2025
224656e
updates
georgewallace Feb 28, 2025
e3d94a1
fixing build error
georgewallace Feb 28, 2025
a6dec83
still fixing
georgewallace Feb 28, 2025
13b73a6
updates
georgewallace Feb 28, 2025
1714289
providing example
georgewallace Feb 28, 2025
47a8bf6
fixing image
georgewallace Feb 28, 2025
c64bfb9
fixing example
georgewallace Mar 3, 2025
1632577
Apply suggestions from code review
georgewallace Mar 4, 2025
8ae5d05
fixes
georgewallace Mar 4, 2025
6ffccb5
corrections based on alex's feedback
georgewallace Mar 4, 2025
999472d
fixes
georgewallace Mar 4, 2025
b5fd81f
updates
georgewallace Mar 4, 2025
74d627a
Apply suggestions from code review
georgewallace Mar 5, 2025
140bb1f
Apply suggestions from code review
georgewallace Mar 11, 2025
2810e9d
[Inference API] Auto-propagate product origin for every subclass of E…
timgrein Mar 3, 2025
6d5022a
Adding support for specifying embedding type to Jina AI service setti…
ymao1 Mar 3, 2025
fb2dd02
Query rules: Check if MGET request failed when we retrieve query rule…
ioanatia Mar 3, 2025
7aca7e2
ESQL - remove some dead code (#123720)
not-napoleon Mar 3, 2025
fb28191
Enable a sparse doc values index for `@timestamp` in time-series indi…
jordan-powers Mar 3, 2025
050fc6c
Grant read access to the config dir (#123882)
rjernst Mar 3, 2025
bd9f6c8
[ML] Improve EIS authorization to perform requests on a periodic basi…
jonathan-buttner Mar 3, 2025
332a503
Speed up stored field spec management (#122645)
nik9000 Mar 3, 2025
7b28714
Update Flatten Graph Docs to Include a Real Flattened Graph 9.x (#123…
john-wagster Mar 3, 2025
4238b4c
remove duplicate paths in FileAccessTree (#123776)
jdconrad Mar 3, 2025
56f69f8
Explicit import layout for packages begin with io (#123854)
ywangd Mar 3, 2025
f1127d4
Comment on disabling compression when using HTTPS (#123877)
DaveCTurner Mar 3, 2025
f98dc57
Relax profiles assertions (#123923)
dnhatn Mar 3, 2025
67080c9
Mute org.elasticsearch.smoketest.MlWithSecurityIT test {yaml=ml/data_…
elasticsearchmachine Mar 4, 2025
61e484d
Fix custom authz engine for multi-project (#123937)
tvernum Mar 4, 2025
b915c21
Bump up httpcore version (#123932)
joegallo Mar 4, 2025
201992e
Unmute CrossClusterQueryWithPartialResultsIT tests (#123943)
dnhatn Mar 4, 2025
7e86684
Skip internal logging in compat check (#123940)
rjernst Mar 4, 2025
8b67e4d
[Test] More robust unique allocationId randomization (#123941)
ywangd Mar 4, 2025
738e5f1
Add temporary workaround for IntelliJ editorconfig bug (#123954)
nicktindall Mar 4, 2025
176015c
ESQL: Ensure non-zero row size in `EstimatesRowSize` (#122762)
kanoshiou Mar 4, 2025
dbf074a
Add missing APM entitlements (#123462)
ldematte Mar 4, 2025
c8f9a9a
[Entitlements] Add URLConnection instrumentation for ftp, http and ht…
ldematte Mar 4, 2025
960bd64
Add `@UpdateForV9` for `ReferenceDocs` (#123928)
DaveCTurner Mar 4, 2025
697870d
Mute org.elasticsearch.indices.recovery.IndexRecoveryIT testSourceThr…
elasticsearchmachine Mar 4, 2025
aa66099
[DOCS] Update Elasticsearch docs README for v9 (#123902)
leemthompo Mar 4, 2025
e093838
[Gradle] Fix and simplify disabling assertions in test tasks (#123038)
breskeby Mar 4, 2025
a8606c8
Align `TransportVersion#bestKnownVersion` with 8.x (#123801)
DaveCTurner Mar 4, 2025
95fea35
[DOCS] Delete old asciidoc preview URL action (#123909)
leemthompo Mar 4, 2025
82b7a66
Unmute `TimeSeriesDataStreamsIT.testSearchableSnapshotAction` (#123973)
nielsbauman Mar 4, 2025
f01a9e8
Mute org.elasticsearch.xpack.esql.heap_attack.HeapAttackIT testLookup…
elasticsearchmachine Mar 4, 2025
38f083b
Add test for deprecated index settings in N-2 indices (#122493)
cbuescher Mar 4, 2025
5f67b42
Failure store access - selector-aware role building (#122715)
n1v0lg Mar 4, 2025
90ea040
Unmute `SemanticInferenceMetadataFieldsRecoveryTests#testSnapshotReco…
jimczi Mar 4, 2025
4ced718
Drop `TLS_RSA` ciphers from default cipher suites for JDK 24 (#123600)
n1v0lg Mar 4, 2025
c729072
Bump versions after 8.17.3 release
elasticsearchmachine Mar 4, 2025
33f4f76
Prune changelogs after 8.17.3 release
elasticsearchmachine Mar 4, 2025
3f31a03
Split ESQL functions/operators docs files (#123904)
craigtaverner Mar 4, 2025
81dfe0b
Bump versions after 8.16.5 release
elasticsearchmachine Mar 4, 2025
44182e2
[ML] Retry on streaming errors (#123076)
prwhelan Mar 4, 2025
9ffe225
Make NotEntitledException inherit from SecurityException for compatib…
ldematte Mar 4, 2025
91d5ac1
missing file entitlement used by google-http-client for oauth2 (#123985)
ldematte Mar 4, 2025
d274230
Remove trappy timeouts from `IndicesAliasesRequest` (#123987)
DaveCTurner Mar 4, 2025
b2a48a6
Add some logging to better see what's up with testCancelFailedSearchW…
smalyshev Mar 4, 2025
7a32762
Remove skip with semantic_text tests in ES|QL (#123948)
dnhatn Mar 4, 2025
5155441
Update docker.elastic.co/wolfi/chainguard-base:latest Docker digest t…
elastic-renovate-prod[bot] Mar 4, 2025
7d72171
Fix timestamp range query optimization for indices with doc values sk…
jordan-powers Mar 4, 2025
f40f814
Remove obsolete EIS feature flag class (#123716)
demjened Mar 4, 2025
cde9b0e
Add coordinating object to track bytes (#122460)
Tim-Brooks Mar 4, 2025
b43d7af
Add inbound_network entitlement to repository-hdfs plugin (#123907)
mark-vieira Mar 4, 2025
63e3abd
Add build artifact containing json file of all wire compatible versio…
mark-vieira Mar 4, 2025
fc4d6a4
[main] [ML] Use latest results index for new Anomaly Detection jobs (…
davidkyle Mar 4, 2025
908bd6b
Much faster indices lookup on metadata (#123749)
original-brownbear Mar 4, 2025
c8dfd31
[CI] Fix the lucene compatibility tests in intake (#124034)
breskeby Mar 4, 2025
95867f6
Add jdk.management.agent module to server boot layer on start (#123938)
mark-vieira Mar 4, 2025
2f2022a
Simplify check to split bulk request (#124035)
Tim-Brooks Mar 4, 2025
7424b45
Mute org.elasticsearch.entitlement.runtime.policy.FileAccessTreeTests…
elasticsearchmachine Mar 4, 2025
81b0555
Remove duplicate exclusive paths (#124023)
prdoyle Mar 4, 2025
0b22b49
Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClie…
elasticsearchmachine Mar 4, 2025
0abd223
Mute org.elasticsearch.smoketest.MlWithSecurityIT test {yaml=ml/3rd_p…
elasticsearchmachine Mar 4, 2025
c6bb176
Use MultiProjectPendingException more consistently (#123955)
ywangd Mar 5, 2025
7573f66
Change constructor to private for ProjectMetadata (#124060)
ywangd Mar 5, 2025
73f42e9
[Entitlements] Add URLConnection instrumentation for file protocol (#…
ldematte Mar 5, 2025
ab147e0
Reapply "Update Gradle wrapper to 8.13 (#122421)" (#123889) (#123896)
breskeby Mar 5, 2025
ed9f3ac
Explicitly pass project ID in simulate pipeline request (#124033)
nielsbauman Mar 5, 2025
e762fbd
Use singleton instance for default project-id (#123677)
ywangd Mar 5, 2025
950335b
[profiling] Take care of @UpdateForV9 (#123977)
rockdaboot Mar 5, 2025
adac826
Remove code to handle pre-7.6 TCP headers (#123899)
thecoop Mar 5, 2025
7c61d06
[DOCS] Update API ref link in docs README (#124069)
leemthompo Mar 5, 2025
987ecb1
Fix configuration cache compatibility issues (#124073)
breskeby Mar 5, 2025
60122a1
Move some security APIs to using promises in place of callbacks (#123…
original-brownbear Mar 5, 2025
4bc2c5f
Refactor FieldCapabilities creation by adding a proper builder object…
GalLalouche Mar 5, 2025
677829e
Simplify Lucene60 and Luene62 codec constructors (#124054)
javanna Mar 5, 2025
4d80506
add wiz and aws security hub new full posture data streams to kibana_…
maxcold Mar 5, 2025
e325dcd
Collapse 8.16.1 transport versions (#124003)
thecoop Mar 5, 2025
a7ace03
Re-remove min compatible version from SearchRequest (#123859)
thecoop Mar 5, 2025
0a71c40
ESQL: Fix ShapeGeometryFieldMapperTests (and rename) (#122871)
GalLalouche Mar 5, 2025
50d0b8d
Add note to servicenow connector ref (#124101)
leemthompo Mar 5, 2025
f3082a2
Remove synthetic recovery source feature flag. (#122615)
martijnvg Mar 5, 2025
b6f58b3
Mute org.elasticsearch.test.apmintegration.MetricsApmIT testApmIntegr…
elasticsearchmachine Mar 5, 2025
93581ee
Collapse transport versions for 8.17.0 (#124005)
thecoop Mar 5, 2025
e0dea94
Remove 7.11 and 7.12 transport versions (#124024)
thecoop Mar 5, 2025
6dcb345
[ML] Remove deprecated routes for ml trained models APIs (#124019)
davidkyle Mar 5, 2025
1e2fabc
Update node-settings.md (#123997)
shainaraskas Mar 5, 2025
c7fb1df
ESQL: Use a must boolean statement when pushing down to Lucene when s…
astefan Mar 5, 2025
d5667ea
[Stack Monitoring] [REVERT] Update stack monitoring templates for Sta…
consulthys Mar 5, 2025
441c0af
Revert "Mute org.elasticsearch.test.apmintegration.MetricsApmIT testA…
ldematte Mar 5, 2025
baf9c54
Generate compatible versions artifact in distributions dir (#124119)
mark-vieira Mar 5, 2025
c507cba
Mute org.elasticsearch.xpack.esql.plugin.MatchOperatorIT testScoring_…
elasticsearchmachine Mar 5, 2025
8e67c6f
Cleanup RegisteredDomainProcessorTests (#124118)
joegallo Mar 5, 2025
6c110ce
Adjust exception thrown when unable to load hunspell dict (#123743)
benwtrent Mar 5, 2025
6dacdb0
Use FallbackSyntheticSourceBlockLoader for boolean and date fields (#…
lkts Mar 5, 2025
d4aac83
Mute org.elasticsearch.search.query.QueryPhaseTimeoutTests testScorer…
elasticsearchmachine Mar 5, 2025
2a23d0d
Mute org.elasticsearch.search.query.QueryPhaseTimeoutTests testScorer…
elasticsearchmachine Mar 5, 2025
96a9946
Remove matched text from chunks (#123607)
Mikep86 Mar 5, 2025
a37291e
Add index mode to get data stream API (#122486)
dakrone Mar 5, 2025
a04781c
Update XPackPlugin for project awareness (#124087)
tvernum Mar 5, 2025
b9fa1fb
Introduce allow_partial_results setting in ES|QL (#122890)
dnhatn Mar 5, 2025
324f96a
Cleanup RegisteredDomainProcessor (#124123)
joegallo Mar 5, 2025
f981f1e
Mute org.elasticsearch.compute.data.BlockMultiValuedTests testToMask …
elasticsearchmachine Mar 5, 2025
1aabc15
Filter out JNA Cleaner thread from test leak detection (#114668) (#12…
rjernst Mar 5, 2025
84b5d67
Mute org.elasticsearch.smoketest.MlWithSecurityIT test {yaml=ml/start…
elasticsearchmachine Mar 6, 2025
fd5d23c
Limit the log line length for s3 deletion error (#123953)
ywangd Mar 6, 2025
87e0ef6
Resharding - Adding shards to an existing index (#121082)
ankikuma Mar 6, 2025
4e20ff4
Remove unused method from AutoCreateIndex (#124176)
tvernum Mar 6, 2025
0f5c99c
Drop cluster scope method from RoleMappingMetadata (#124174)
tvernum Mar 6, 2025
914f2cd
Remove workaround from TransportHandshakerTests (#123587)
tvernum Mar 6, 2025
78830f0
Re-instate watch count check with busy/wait (#123979)
lukewhiting Mar 6, 2025
ef2c8d6
Enhance memory accounting for document expansion and introduce max do…
fcofdez Mar 6, 2025
f980407
Avoid serializing empty _source fields in mappings. (#122606)
martijnvg Mar 6, 2025
4b32753
Use IDENTITY constants when creating stats in TrainedModelsStatsActio…
iverase Mar 6, 2025
8c92341
Make downsampling project-aware (#124000)
nielsbauman Mar 6, 2025
85cdb72
Make the test deterministic (#124204)
astefan Mar 6, 2025
ecc19b0
[ML] Update inference api rest spec (#124151)
jonathan-buttner Mar 6, 2025
3bd0052
[DOCS] Clarify support for doc_values (#124047)
marciw Mar 6, 2025
436320a
Mute org.elasticsearch.search.fieldcaps.FieldCapabilitiesIT testReloc…
elasticsearchmachine Mar 6, 2025
469c7e1
Updating description of stream API (#124209)
jonathan-buttner Mar 6, 2025
05bc1c6
ESQL: Revive some more of inlinestats functionality (#123589)
astefan Mar 6, 2025
6e06838
[DOCS] Update DOCS README.md backporting guidance (#124228)
leemthompo Mar 6, 2025
a28cfbc
Remove several remaining Core/Infra UpdateForV9 references (#124197)
thecoop Mar 6, 2025
6a7a21e
[Entitlements] MailToURLConnection instrumentation (#123829)
ldematte Mar 6, 2025
3f09ff0
Entitle inference to access AWS credentials (#123750)
prdoyle Mar 6, 2025
1c9c8e9
[ML] Refactor OpenAI request managers (#124144)
jonathan-buttner Mar 6, 2025
9afb36e
[Inference API] Fix output stream ordering in InferenceActionProxy (#…
timgrein Mar 6, 2025
d52ac36
Refactor RegisteredDomainProcessorTests (#124175)
joegallo Mar 6, 2025
4b9fc46
[DOCS] fix external links (#124248)
colleenmcginnis Mar 6, 2025
9926903
Remove seemingly fixed test mute (#124207)
cbuescher Mar 6, 2025
24d9ca0
[DOCS] Document source-related restrictions (#124011)
kkrik-es Mar 6, 2025
7e50bc1
Have create index return a bad request on poor formatting (#123761)
benwtrent Mar 6, 2025
7481411
Avoid hoarding cluster state references during rollover (#124107)
nielsbauman Mar 6, 2025
cb92185
Make enrich project-aware (#124099)
nielsbauman Mar 6, 2025
09e6f22
Retry ILM async action after reindexing data stream (#124149)
parkertimmins Mar 6, 2025
0688eca
[Deprecation] Update URL (#124259)
prwhelan Mar 6, 2025
4f34202
[CI] Disable multi-project tests in non-snapshot build tests (#124261)
brianseeders Mar 6, 2025
c0cb3f5
Cheaper handling of skipped shard iterators in AbstractSearchAsyncAct…
original-brownbear Mar 6, 2025
ef37ad3
Add annotation for watcher deprecated setting (#124278)
samxbr Mar 7, 2025
b752e61
Require project-id for waitForPersistentTasksCondition (#124180)
ywangd Mar 7, 2025
3dd09ea
Revert "Add temporary workaround for IntelliJ editorconfig bug (#1239…
nicktindall Mar 7, 2025
26a5cd4
Migrate getProject().index calls to lookup index (#124178)
tvernum Mar 7, 2025
beae3db
[Entitlements] Allow read access to a plugin's directory (#124111)
ldematte Mar 7, 2025
67c2088
Use valid REST version when determining capabilities (#123864)
thecoop Mar 7, 2025
3fe8211
Remove 7.13 and 7.14 transport versions (#124115)
thecoop Mar 7, 2025
c3b1f9a
Fix some lazy rollover code (#124153)
nielsbauman Mar 7, 2025
a0120be
Add tests for non-fatal errors in data node request sender (#124203)
idegtiarenko Mar 7, 2025
4cb8c7a
Mute org.elasticsearch.smoketest.MlWithSecurityIT test {yaml=ml/3rd_p…
elasticsearchmachine Mar 7, 2025
b57ab72
[Entitlements] Fix AbstractDelegateHttpsURLConnection "this" paramete…
ldematte Mar 7, 2025
6b358f4
Do not retry CBE (#124300)
idegtiarenko Mar 7, 2025
24b82b1
Fix NPE in AdaptiveAllocationsScalarService for null nodes (serverles…
jan-elastic Mar 7, 2025
ced8687
Remove 7.16 transport version (#124194)
thecoop Mar 7, 2025
2febbd2
Change setting's deprecation message wording (#120718)
alexey-ivanov-es Mar 7, 2025
3936ba8
Do not let ShardBulkInferenceActionFilter unwrap / rewrap ESException…
tteofili Mar 7, 2025
faf4b99
Upgrade httpclient to 5.3.3 for build-tools-internal (#124018)
joegallo Mar 7, 2025
0889142
[spotless] Remove extra imports (#124354)
arteam Mar 7, 2025
57ab5c4
Added optional parameters to QSTR ES|QL function (#121787)
svilen-mihaylov-elastic Mar 7, 2025
88b094a
Drop unused `prefix` and `suffix` from string collection utils (#124353)
DaveCTurner Mar 7, 2025
17baa2e
Don't use dot product similarity in SemanticInferenceMetadataFieldsRe…
Mikep86 Mar 7, 2025
aae70ac
Set cause on create index request in create from action (#124363)
parkertimmins Mar 7, 2025
684131c
Use `safeAwait` in `indexRandom` (#124362)
DaveCTurner Mar 7, 2025
2bf626b
[Docs] Fix cross-repo links to Beats docs (#124360)
kilfoyle Mar 7, 2025
69fadc7
Add bit vector support to semantic text (#123187)
Mikep86 Mar 7, 2025
431ca2d
Introduce `BoundedDelimitedStringCollector` (#124303)
DaveCTurner Mar 7, 2025
06d38f8
Mute org.elasticsearch.xpack.inference.mapper.SemanticInferenceMetada…
elasticsearchmachine Mar 7, 2025
d0966ac
Mute org.elasticsearch.xpack.inference.mapper.SemanticInferenceMetada…
elasticsearchmachine Mar 7, 2025
90d13fe
Mute org.elasticsearch.xpack.inference.mapper.SemanticInferenceMetada…
elasticsearchmachine Mar 7, 2025
59b4cfd
Refactor `SnapshotInfo` dataflow in finalization (#124336)
DaveCTurner Mar 7, 2025
02c9644
Mute org.elasticsearch.env.NodeEnvironmentTests testIndexCompatibilit…
elasticsearchmachine Mar 7, 2025
c175607
Handle empty input inference (#123763)
Samiul-TheSoccerFan Mar 8, 2025
5982720
[Entitlements] Use the correct format for the `EntitlementInstrumente…
ldematte Mar 8, 2025
d74dc02
DateProcessor refactoring (#124349)
joegallo Mar 8, 2025
9fead41
IngestDocument readability improvements (#124322)
joegallo Mar 8, 2025
0194727
Add exclusive access files for security module (#123676)
rjernst Mar 8, 2025
a4e32bf
[Entitlements] Add support for IT testing always allowed actions (#12…
ldematte Mar 8, 2025
86b65f3
Make NotEntitledException inherit from AccessControlException for com…
ldematte Mar 8, 2025
3f96715
Mute org.elasticsearch.entitlement.runtime.policy.PolicyManagerTests …
elasticsearchmachine Mar 8, 2025
2da66b8
Fix test - wait for other threads before throwing the exception (#124…
smalyshev Mar 8, 2025
7179480
Revert "missing file entitlement used by google-http-client for oauth…
ldematte Mar 8, 2025
9edf1c2
remove addess to home/.aws for repository-s3 (#124190)
ldematte Mar 8, 2025
6f65aa8
Mute org.elasticsearch.entitlement.runtime.policy.FileAccessTreeTests…
elasticsearchmachine Mar 8, 2025
ce611c6
Mute org.elasticsearch.xpack.restart.FullClusterRestartIT testWatcher…
elasticsearchmachine Mar 9, 2025
5870b3e
[Entitlements] Add URLConnection instrumentation for jar protocol (#1…
ldematte Mar 9, 2025
1d74dda
More debug logging in realms authenticator (#124342)
n1v0lg Mar 9, 2025
795dc15
Mute org.elasticsearch.indices.stats.IndexStatsIT testFilterCacheStat…
elasticsearchmachine Mar 9, 2025
374f484
Remove some overhead from TransportService message handling (#124428)
original-brownbear Mar 9, 2025
d7262fb
Make some constant SubscribableListener instances cheaper (#124452)
original-brownbear Mar 9, 2025
27e3d30
Cleanup dead code in o.e.search and o.e.a.search (#124445)
original-brownbear Mar 10, 2025
18b002a
[CI] Fix lucene compatibility tests in periodic builds (#124458)
breskeby Mar 10, 2025
d1ce796
Run `TransportExplainLifecycleAction` on local node (#122885)
nielsbauman Mar 10, 2025
624cfb8
Document `getMinTransportVersion` including exceptions (#124192)
DaveCTurner Mar 10, 2025
8a232a7
Deduplicate created objects when deserializing InternalAggregations i…
iverase Mar 10, 2025
e5637b0
Fix external URI images (#124350)
charlotte-hoblik Mar 10, 2025
e700036
Remove test usages of `DataStream#getDefaultBackingIndexName` in ILM …
gmarouli Mar 10, 2025
c8442b8
Give Kibana user 'all' permissions for .entity_analytics.* indices (#…
hop-dev Mar 10, 2025
1c8e0e6
Avoid reading unnecessary dimension values when downsampling (#124451)
martijnvg Mar 10, 2025
8184026
Remove 7.15 transport versions (#124193)
thecoop Mar 10, 2025
80c1b86
Fix entitlement checks for relative links (#124133)
mosche Mar 10, 2025
7e1936a
Remove SearchOperationListenerExecutor abstraction (#124298)
original-brownbear Mar 10, 2025
e090bde
Remove 7.17 transport versions (#124196)
thecoop Mar 10, 2025
195b5bb
[Tests] Simplify classpath for analytics javaRestTests (#124274)
breskeby Mar 10, 2025
d4065d0
ESQL: Unmute and fix BlockMultiValuedTests.testToMask (#124339)
ivancea Mar 10, 2025
b95fc4c
TSDB: Remove test compatibility for untested (#124113)
nik9000 Mar 10, 2025
514be2c
Fix FileAccessTreeTests#testDuplicateExclusivePaths to work on window…
ldematte Mar 10, 2025
a93865c
fix file tests to work across multiple invocations (#124412)
ldematte Mar 10, 2025
6611ee2
Ignore ordering in policy manager exclulsive tests (#124488)
rjernst Mar 10, 2025
e601fa8
Speed up block serialization (#124394)
dnhatn Mar 10, 2025
64b1704
Address test issue in QueryPhaseTimeoutTests (#124327)
javanna Mar 10, 2025
75bfcbf
Docs and simplifications to support for Lucene ancient versions (#124…
javanna Mar 10, 2025
4721d1c
[TEST] Remove BlockMultiValuedTests from muted tests list
javanna Mar 10, 2025
b708b32
Fix concurrency issue in ScriptSortBuilder (#123757)
javanna Mar 10, 2025
d97b414
Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClie…
elasticsearchmachine Mar 10, 2025
fab4b65
[ML] Modify test case to update running job (#124287)
edsavage Mar 10, 2025
353e161
ESQL: Lazy collection copying during node transform (#124424)
costin Mar 10, 2025
5126d57
Introduce IndexReshardingMetadata (#121360)
bcully Mar 11, 2025
6292e95
stuff
georgewallace Mar 5, 2025
fa6c1d3
fixing rebase
georgewallace Mar 11, 2025
updates
georgewallace committed Mar 3, 2025
commit 224656e5574b5ebd7ce6c5ac618231c82991fff3
4 changes: 4 additions & 0 deletions docs/reference/elasticsearch/index-settings/index-modules.md
@@ -72,6 +72,10 @@ Index mode supports the following values:
`standard`
: Standard indexing with default settings.

`lookup`
: Index that can be used for lookup joins in ES|QL. Limited to 1 shard.


`time_series`
: *(data streams only)* Index mode optimized for storage of metrics. For more information, see [Time series index settings](time-series.md).

6 changes: 3 additions & 3 deletions docs/reference/query-languages/esql/esql-commands.md
@@ -6,7 +6,6 @@ mapped_pages:

# {{esql}} commands [esql-commands]


## Source commands [esql-source-commands]

An {{esql}} source command produces a table, typically with data from {{es}}. An {{esql}} query must start with a source command.
@@ -685,21 +684,22 @@ TBD

**Examples**

TBD
**IP Threat correlation**: This query would allow you to see if any source IPs match known malicious addresses.

```esql
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.IP
```

**Host metadata correlation**: This query pulls in environment or ownership details for each host to correlate your metrics data.

```esql
FROM system_metrics
| LOOKUP JOIN host_inventory ON host.name
| LOOKUP JOIN employees ON host.name
```

TBD
**Service ownership mapping**: This query would show logs with the owning team or escalation information for faster triage and incident response.

```esql
FROM app_logs
7 changes: 7 additions & 0 deletions docs/reference/query-languages/esql/esql-enrich-data.md
@@ -15,6 +15,13 @@ For example, you can use `ENRICH` to:
* Add product information to retail orders based on product IDs
* Supplement contact information based on an email address

[`ENRICH`](/reference/query-languages/esql/esql-commands.md#esql-enrich) is similar to [`LOOKUP JOIN`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) in that both help you join data together. You should use `ENRICH` when:

* Enrichment data doesn't change frequently
* You can accept index-time overhead
* You are working with structured enrichment patterns
* You can accept having multiple matches combined into multi-values
* You can accept being limited to predefined match fields
Contributor


The matching logic of "which lookup/enrich document matches a given input row" is also more lenient for ENRICH compared to LOOKUP JOIN. For instance, for LOOKUP JOIN, multivalued join keys like ["foo", "bar"] don't match anything if they occur in the main or lookup index; for ENRICH, there'll be a match if any value for the main index' join key is contained in the lookup index' join key.

I'll gather more precise information for this, we can maybe add this in a follow-up PR.


### How the `ENRICH` command works [esql-how-enrich-works]

35 changes: 30 additions & 5 deletions docs/reference/query-languages/esql/esql-lookup-join.md
@@ -1,38 +1,63 @@
---
navigation_title: "LOOKUP JOIN"
navigation_title: "Correlate data with LOOKUP JOIN"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-enrich-data.html
---

# LOOKUP JOIN [esql-lookup-join]

The {{esql}} [`LOOKUP join`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) processing command combines, at query-time, data from one or more source indexes with field-value combinations found in an input table.
The {{esql}} [`LOOKUP JOIN`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) processing command combines, at query time, data from one or more source indices with field-value combinations found in an input table. Teams often have data scattered across multiple indices, such as logs, IPs, user IDs, hosts, and employees. Without a direct way to enrich or correlate each event with reference data, root-cause analysis, security checks, and operational insights become time-consuming.

For example, you can use `LOOKUP JOIN` to:

* Pull in environment or ownership details for each host to enrich your metrics data
* Pull in environment or ownership details for each host to correlate your metrics data.
* Quickly see if any source IPs match known malicious addresses.
* Tag logs with the owning team or escalation info for faster triage and incident response.

[`LOOKUP JOIN`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) is similar to [`ENRICH`](/reference/query-languages/esql/esql-commands.md#esql-enrich) in that both help you join data together. You should use `LOOKUP JOIN` when:

### How the `LOOKUP JOIN` command works [esql-how-lookup-join-works]
* Enrichment data changes frequently
* You want to avoid index-time processing
* You are working with regular indices
* You need to preserve distinct matches
* You need to match on any field in a lookup index

## How the `LOOKUP JOIN` command works [esql-how-lookup-join-works]

The `LOOKUP JOIN` command adds new columns to a table, with data from {{es}} indices. It requires a few special components:

:::{image} ../../../images/esql-lookup-join.png
:alt: esql lookup join
:::

::::{tip}
`LOOKUP JOIN` does not guarantee the output to be in any particular order. If a certain order is required, users should use a [`SORT`](/reference/query-languages/esql/esql-commands.md#esql-sort) somewhere after the `LOOKUP JOIN`.

::::
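
As a minimal sketch of the tip above, the following query sorts after the join. It assumes the hypothetical `firewall_logs` and `threat_list` indices from the command examples, that both contain a `source.ip` field, and that `firewall_logs` has an `@timestamp` field. None of these names are confirmed by this page.

```esql
// Assumption: firewall_logs and threat_list both contain a source.ip field
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.ip
// Sort after the join, because LOOKUP JOIN does not guarantee output order
| SORT @timestamp DESC
```

Placing `SORT` before the `LOOKUP JOIN` would not help here, since the join itself may reorder rows.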

$$$esql-source-index$$$

Source index
: An index which stores enrich data that the `LOOKUP` command can add to input tables. You can create and manage these indices just like a regular {{es}} index. You can use multiple source indices in an enrich policy. You also can use the same source index in multiple enrich policies.


### Prerequisites [esql-enrich-prereqs]
## Prerequisites [esql-enrich-prereqs]

To use `LOOKUP JOIN`, you must have:

* Compatible data types: the join key and the join field in the lookup index must generally have the same data type, up to widening of data types. For example, `short` and `byte` are considered equal to `integer`. Text fields can be used on the left-hand side if and only if there is an exact subfield whose name is suffixed with `.keyword`.
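
When the types do not line up, one possible approach is to convert the join key before the join. The sketch below is illustrative only: it assumes a hypothetical `app_logs` index where `status_code` is mapped as `keyword`, and a hypothetical lookup index `status_codes` where `status_code` is mapped as `integer`.

```esql
FROM app_logs
// Assumption: status_code is keyword in app_logs but integer in status_codes,
// so convert it to an integer before joining
| EVAL status_code = TO_INTEGER(status_code)
| LOOKUP JOIN status_codes ON status_code
```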

## Limitations

The following are the current limitations of `LOOKUP JOIN`:

* `LOOKUP JOIN` succeeds if the left and right join fields are both of type `KEYWORD`, or if the left field is of type `TEXT` and the right field is of type `KEYWORD`.
* Indices in [lookup](elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting) mode are always single-sharded.
* Cross-cluster search is unsupported. Both source and lookup indices must be local.
* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards, aliases, and data streams are not supported.
* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match an existing field in the query. You may need a `RENAME` or `EVAL` to achieve this.
* The query will circuit break if you fetch too much data in a single page. A large heap is needed to manage results of multiple megabytes.
* This limit is per page of data, which is about 10,000 rows.
Contributor


I don't know if pages of data are a well-defined concept (at least user-facing). I also don't know if we can say this is about 10k rows. (@nik9000 may have more precise ideas on this.)

Contributor Author


It would be good to know the exact numbers here. I took a stab based on what I read, but I'm unsure what the limit is. Also, will they get a specific message?

Contributor


I'll check in with Nik to get better, but still precise, wording here.

Contributor


Also, I don't think they will get a specific error message, probably just a generic circuit breaker exception.

Contributor

@alex-spies alex-spies Mar 5, 2025


Suggestion:

The query will circuit break if there are too many matching documents in the lookup index, or if the documents are too large.
More precisely, `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large amount of heap space is needed if the matching documents from the lookup index for a batch are multiple megabytes or larger.
This is roughly the same as for `ENRICH`.

@nik9000 , could you please keep me honest here?

Member


I like Alex's proposal though I might say "about" instead of "normally". It is a bit fuzzy, but that's what you get when you don't describe pages to users precisely.

One thing that we could add is that larger nodes will allow bigger fetches.

This is fairly temporary - we should switch to a stream of results at some point.

* Matching many rows per incoming row will count against this limit.
* This limit is approximately the same as for [`ENRICH`](/reference/query-languages/esql/esql-commands.md#esql-enrich).
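
The match-field limitation above can be worked around by renaming the field before the join. This is a sketch under stated assumptions: the hypothetical `firewall_logs` index stores the address in `source.ip`, while the hypothetical lookup index `threat_list` stores it in `ip_address`.

```esql
FROM firewall_logs
// Assumption: the lookup index names the field ip_address, not source.ip,
// so rename the source field to match before joining
| RENAME source.ip AS ip_address
| LOOKUP JOIN threat_list ON ip_address
```

An `EVAL ip_address = source.ip` would work similarly while keeping the original column alongside the new one.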