Skip to content

keyword search in ESQL is too slow compared to KQL/DSL #104517

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
getkub opened this issue Jan 18, 2024 · 8 comments
Open

keyword search in ESQL is too slow compared to KQL/DSL #104517

getkub opened this issue Jan 18, 2024 · 8 comments
Assignees
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@getkub
Copy link

getkub commented Jan 18, 2024

Elasticsearch Version

8.11.4

Installed Plugins

No response

Java Version

bundled

OS Version

Kubernetes

Problem Description

in KQL Searching a keyword is much faster. But in ESQL there is no 'keyword' search function. But only wildcard and it is too slow

KQL is much faster on large datasets (due to keyword search ability) and ESQL is very poor and even time-out in most cases

Steps to Reproduce

Eg

# Sample data
{"employer_domain": "COM", "employer_country": "UNITED STATES"}
{"employer_domain": "ORG", "employer_country": "UNITED STATES"}
{"employer_domain": "COM", "employer_country": "UNITED KINGDOM"}
{"employer_domain": "COM", "employer_country": "THE UNITED COUNTRY"}
# KQL

employer_domain:"COM" AND employer_country:UNITED
# ESQL

| where employer_domain=="COM" AND employer_country LIKE "*UNITED*"

I couldn't find an ESQL command to just put the UNITED keyword as I've to put *UNITED*

Suggestions

Some options like

  • Have a new command like CONTAINS or SEARCH so it dedicately searches for keyword?
  • Uplift LIKE command to put keywords. So | where employer_country LIKE " UNITED " (there is space before after UNITED)
@getkub getkub added >bug needs:triage Requires assignment of a team area label labels Jan 18, 2024
@mbudge
Copy link

mbudge commented Jan 18, 2024

Can't see the wildcards in the KQL query.

@getkub
Copy link
Author

getkub commented Jan 18, 2024

Can't see the wildcards in the KQL query.

I couldn't find an ESQL command to just put the UNITED keyword as I've to put *UNITED*

@demjened demjened added the :Analytics/ES|QL AKA ESQL label Jan 18, 2024
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Jan 18, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@demjened
Copy link
Contributor

demjened commented Jan 18, 2024

@getkub Have you tried this with the LIKE keyword (i.e. WHERE employer_country LIKE "*UNITED*"), and if so, do you get the same performance? See https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-commands.html#esql-where.
(I assigned this ticket to :Analytics/ES|QL)

@getkub
Copy link
Author

getkub commented Jan 19, 2024

@getkub Have you tried this with the LIKE keyword (i.e. WHERE employer_country LIKE "*UNITED*"), and if so, do you get the same performance? See https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-commands.html#esql-where. (I assigned this ticket to :Analytics/ES|QL)

Sorry , i forgot mention in the search. Yes , I did put the LIKE and is having bad performance

@fang-xing-esql
Copy link
Member

@getkub There are efforts to bring some search functionalities into ES|QL recently, and users have the options to run KQL, match and query string queries through ES|QL now. The query in the description can be rewritten like below by leveraging KQL, match or query string functions, if they suffer poor performance with like or ==, these search functions could be alternatives.

where KQL("employer_domain: COM") AND KQL("employer_country: UNITED")

where employer_domain:"COM" AND employer_country:"UNITED"

Please find more details and example here.

@nik9000
Copy link
Member

nik9000 commented Apr 22, 2025

If you do:

{"employer_domain": "COM", "employer_country": "UNITED STATES"}
{"employer_domain": "ORG", "employer_country": "UNITED STATES"}
{"employer_domain": "COM", "employer_country": "UNITED KINGDOM"}
{"employer_domain": "COM", "employer_country": "THE UNITED COUNTRY"}

If it's slow you are likely getting text fields. If you query employer_domain.raw == "COM" then it should be fast. We're working on removing the need for .raw in #126641 - but that's 9.1.0. Pushing the LIKE on text fields is harder.

Easier is using the mapping to make those keyword fields. It really should be pretty fast that way.

FWIW the KQL and : operators are different from ==, especially when applied to text fields. KQL and : are match style queries. == is single valued equality.

@getkub
Copy link
Author

getkub commented Apr 22, 2025

If you do:
...
If it's slow you are likely getting text fields. If you query employer_domain.raw == "COM" then it should be fast. We're working on removing the need for .raw in #126641 - but that's 9.1.0. Pushing the LIKE on text fields is harder.

FWIW the KQL and : operators are different from ==, especially when applied to text fields. KQL and : are match style queries. == is single valued equality.

Thanks for that. This will be great solution. We will wait for 9.1.x to upgrade as don't want to change all the detection rules to .raw.
I couldn't underst the == and single valued equality mean as per your comment? So we won't be able to do employer_country == "UNITED STATES" or you saying it cannot replicate the 'match_all' kind of concept?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
Development

No branches or pull requests

6 participants