-
Notifications
You must be signed in to change notification settings - Fork 25.2k
[ES|QL] Add suggested_cast #127139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ES|QL] Add suggested_cast #127139
Conversation
fd0b69d
to
b2e2546
Compare
b2e2546
to
cee8c04
Compare
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Hi @limotova, I've created a changelog YAML for you. |
Would it make sense to also add a suggestion to cast in the error message? |
Pinging @elastic/kibana-esql (ES|QL-ui) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add an example response in the PR description? It helps folks scanning these things.
...sql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/FieldExtractorTestCase.java
Outdated
Show resolved
Hide resolved
I was wondering to myself "should this return a list?" and I think the answer is "no". At this point there's always one best choice. |
return null; | ||
} | ||
if (originalTypes.contains(DATE_NANOS) && originalTypes.contains(DATETIME) && originalTypes.size() == 2) { | ||
return DATETIME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not DATE_NANOS
? This is orthogonal to aggregate_metric_double
, but I thought we want to auto-cast (not even suggest) to nanos and error out if the conversion is not possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's still some up in the air about this one. DATE_NANOS would leave a lot of stuff out of range. I feel like it's pretty low commitment to put this here though.
} | ||
} | ||
if (allNumeric) { | ||
return AGGREGATE_METRIC_DOUBLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an option to use one of the underlying metrics (e.g. max) in a mixed query? Or, we'd have to convert the e.g. double to aggregate_metric_double first, and then use the max from all of them (original and converted)? That'd be one expensive operation to read all values - it may be fine, if there's no better way to mix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. Casting is only something we can do with one value at a time but these underlying metrics are things you'd get from an agg. I think. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the underlying metrics would be accessed in specific aggregations.
Maybe when we implement implicit casting of aggregate_metric_double
in aggregations instead of going double
-> aggregate_metric_double
-> submetric
(double/int), we can do aggregate_metric_double
-> submetric
(double/int), and have no conversion for doubles/ints (or if the other mapping is a long or something convert that to the submetric type)?
(The count submetric is an int but the others are doubles)
How can I cast to Is going to be
|
I'm cool with it. I wish we generated that error message lazily, but otherwise it's cool. |
Sorry for the drive by, but wanted to let you know: there's ongoing work (#123678) that will implicitly cast to date_nanos in this case, contrary to the suggestion, which may also become obsolete. |
Yeah, that exists. I figure folks will do @limotova, what do you think of an integration test that sets up indices with the conflicting types, does a Not sure we could do it for x - but maybe. Parameterized testing would make it easy to write. It probably takes a half second per combo... We have about 30 types. That'd be 900 cases.... so 450 seconds which is way too slow. But maybe I'm off? Still worth doing for some known combos I think. Ooh - it's probably half that because we don't care about order! Still 225 seconds is a lot. |
Sure this works for me too! As long as the cast type is valid! |
lgtm.. delegating to Nik to approve, since this is really ES|QL territory. |
It sounds like I should change the datetime + date_nanos combo to suggest
I can set up the integration test but I think it might be worth mentioning too that we might have 3+ types in a single request, so if we really wanted to go crazy there could be a lot more cases... |
return null; | ||
} | ||
if (originalTypes.contains(DATE_NANOS) && originalTypes.contains(DATETIME) && originalTypes.size() == 2) { | ||
return DATETIME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's still some up in the air about this one. DATE_NANOS would leave a lot of stuff out of range. I feel like it's pretty low commitment to put this here though.
Adds a
suggested_cast
for unsupported fields based onoriginal_types
for specific cases:
datetime
anddate_nanos
, suggestsdate_nanos
aggregate_metric_double
and other numerics, suggestsaggregate_metric_double
keyword
in all other casesFor example, a field that is mapped to
aggregate_metric_double
in one index, butdouble
in another will have a response like:A field that is completely unsupported will have a response like: