[ES|QL] Date nanos implicit casting in union types #123678

fang-xing-esql · 2025-02-28T05:06:14Z

The goal of this PR is to support union typed fields that reference both date and date_nanos fields without explicit casting. Union typed(mixed with date and date_nanos) fields are casted to a common data type during Analyzer phase automatically. numeric or string typed fields are not within the scope of this PR.

Today, queries that reference union(mixed with date and date_nanos) typed fields like below either return nulls or fail, because they reference union typed fields without explicit casting. After this PR, they should not fail or return null.

mapping
curl -u elastic:password -X PUT "localhost:9200/sample_data_1?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "client.ip": {"type": "ip"},
      "message": {"type": "text"},
      "test_date": {"type": "date"},
      "event.duration": {"type": "long"}
    }
  }
}
'
curl -u elastic:password -X PUT "localhost:9200/sample_data_2?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "client.ip": {"type": "ip"},
      "message": {"type": "keyword"},
      "test_date": {"type": "date_nanos"},
      "event.duration": {"type": "integer"}
    }
  }
}
'
queries
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "FROM sample_data*"
}
'
       @timestamp       |   client.ip   |event.duration |    message    |   test_date   
------------------------+---------------+---------------+---------------+---------------
2023-10-23T12:15:03.360Z|172.21.2.162   |null           |null           |null           
2023-10-23T12:27:28.948Z|172.21.2.113   |null           |null           |null           
2023-10-23T13:33:34.937Z|172.21.0.5     |null           |null           |null           
2023-10-23T13:51:54.732Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:52:55.015Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:53:55.832Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:55:01.543Z|172.21.3.15    |null           |null           |null           
2023-10-23T12:15:03.360Z|172.22.2.162   |null           |null           |null           
2023-10-23T12:27:28.948Z|172.22.2.113   |null           |null           |null           
2023-10-23T13:33:34.937Z|172.22.0.5     |null           |null           |null           
2023-10-23T13:51:54.732Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:52:55.015Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:53:55.832Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:55:01.543Z|172.22.3.15    |null           |null           |null           

+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "FROM sample_data* | sort test_date"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "verification_exception",
        "reason" : "Found 1 problem\nline 1:26: Cannot use field [test_date] due to ambiguities being mapped as [2] incompatible types: [date_nanos] in [sample_data_2], [datetime] in [sample_data_1]"
      }
    ],
    "type" : "verification_exception",
    "reason" : "Found 1 problem\nline 1:26: Cannot use field [test_date] due to ambiguities being mapped as [2] incompatible types: [date_nanos] in [sample_data_2], [datetime] in [sample_data_1]"
  },
  "status" : 400
}

A field can be referenced in nearly all the processing commands, and this PR should cover all the possible places a (union typed)field can appear:

elasticsearchmachine · 2025-02-28T05:07:40Z

Hi @fang-xing-esql, I've created a changelog YAML for you.

fang-xing-esql · 2025-02-28T13:11:51Z

@elasticmachine update branch

costin · 2025-03-20T01:20:45Z

Fang, can you please double check the status of this PR? Is it ready for review or still working on it?

fang-xing-esql · 2025-03-20T12:12:52Z

Fang, can you please double check the status of this PR? Is it ready for review or still working on it?

I'm trying to make enrich work with union typed fields, and this is the last piece that I want to figure out before making this ready for review.

alex-spies · 2025-04-11T10:42:28Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+                        for (Expression e : newProjections) { // Add groupings
+                            newAggs.add(Expressions.attribute(e));
+                        }
+                        return agg.with(evalForInvalidMappedField(agg.source(), agg.child(), aliases), newProjections, newAggs);


Because we inject an EVAL here, there's a chance that we'll accidentally inject EVALs multiple times. Which could lead to broken plans, when an upstream EVAL already replaced the multi-typed field attribute by a reference attribute of the same name. Or, put the other way around, since we're replacing a field by another one with the same name, downstream plans may have invalid references now.

This could happen in case that the reference resolution manages to resolve the whole query plan, with multiple references to multi-typed fields, before we start adding Evals.

alex-spies · 2025-04-11T10:49:04Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+                        List<? extends NamedExpression> origAggs = agg.aggregates();
+                        List<NamedExpression> newAggs = new ArrayList<>(origAggs.size());
+                        for (int i = 0; i < origAggs.size() - newProjections.size(); i++) { // Add aggregate functions
+                            newAggs.add(origAggs.get(i));


Normally, the way to update the expressions of a plan is to transform the expressions and use AttributeMap.resolve with an AttributeMap<Attribute> that contains which attribute should be replaced by which other attribute.

A date in the valid date-nanos range should be acceptable to parse. This issue was showing up in a difference in behaviour between Elasticsearch ingest and the CsvTests. This fix makes CsvTests behave the same as Elasticearch itself.

craigtaverner

Ran out of time, so I'll post the comments so far.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

...sql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/FieldExtractorTestCase.java

x-pack/plugin/esql/qa/testFixtures/src/main/resources/data/employees_incompatible.csv

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java

fang-xing-esql · 2025-04-22T13:42:47Z

Thanks for reviewing and brainstorming @craigtaverner @alex-spies ! Most of the comments have been addressed, here is a summary of the list of things that have been discussed:

The scope of this PR is reduced to do implicit casting for union typed fields with date and date_nanos only. It makes sense to take numeric types out, as there are a lot of questions on date_nanos already, date_nanos can be a pilot, so that we can collect feedbacks before supporting the other data types.
We don't do double casting, if there is explicit casting on union typed fields with date and date_nanos coded in the query, they won't be cast to date_nanos implicitly.
The implicit casting for union typed fields with date and date_nanos is taken out of the existing ImplicitCasting rule, a new rule is added after ResolveUnionTypes, to do implicit casting for the union typed fields(without explicit casting) that are not resolved by ResolveUnionTypes. We are not always able to replace a union typed field with a conversion function in place, sometimes an eval child plan needs to be added for the conversion functions, we need to traverse the plan tree, we can do it inside the ResolveUnionTypes rule, if it is preferred, however it will add some complexity to it, that's the main reason that I keep it as a separate rule in ImplicitCastingForUnionTypedFields.
Enrich doesn't support date_nanos as a match field(checked in resolveEnrich), it is out of the scope of this PR.
Lookup join does not work with date_nanos join keys yet, it is out of the scope of this PR, join is also a binary plan, in the future if we want to support it, we need some special handlings for binary plans when traversing and transforming the plan.

There are some comments that haven't been addressed yet:

Add a warning message when we do implicit casting to date_nanos, I think this is a good idea, I haven't added it as I want to clarify with the other folks whether casting to date_nanos is the final decision, as date_nanos have a smaller range of valid values compared to date. I'll reach out to relevant folks to collect feedbacks on which data type to cast to.
Performance. Filter and aggregations are the most interesting ones. Adding implicit casting may affect performance as it may prevent predicates to be pushed down into Lucene, and adding conversion functions to union typed fields add another layer of processing for operators that are not pushed down into Lucene(today we don't push down aggregations in a lot of cases). Although these queries that reference union typed fields fail at Verifier today, and allowing them to run is a step forward, if we can improve their performance by pushing down more into Lucene and reduce an extra layer of processing, that will be ideal. I'll investigate more about the possibility of unwrapping the conversion functions and pushing down more at data nodes.

craigtaverner

This looks much nicer to me! I have a few comments and questions, but otherwise I think it is good.

craigtaverner · 2025-04-23T13:27:00Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

@@ -180,7 +187,9 @@ public class Analyzer extends ParameterizedRuleExecutor<LogicalPlan, AnalyzerCon
             */
            new ImplicitCasting(),
            new ResolveRefs(),
-            new ResolveUnionTypes()  // Must be after ResolveRefs, so union types can be found
+            new ResolveUnionTypes(),  // Must be after ResolveRefs, so union types can be found
+            // Must be after ResolveUnionTypes, if there is explicit casting on the union typed fields, implicit casting won't be added


Nice solution!

craigtaverner · 2025-04-23T13:37:18Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+            // MultiTypeEsField, and don't be double cast here.
+            Map<FieldAttribute, Alias> invalidMappedFieldCasted = new HashMap<>();
+            LogicalPlan transformedPlan = plan.transformUp(LogicalPlan.class, p -> {
+                if (p instanceof UnaryPlan == false && p instanceof LookupJoin == false) {


So we allow through UnaryPlan and LookupJoin, but later we do nothing for LookupJoin. Is this a placeholder for a further update supporting LookupJoin?

I'll add a comment here for LookupJoin, and remove the instanceof check, I was attempting to make date_nanos fields work as join keys, however it didn't workout easily for me. Thank you for creating a separate issue for it!

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

craigtaverner · 2025-04-23T13:49:11Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+        private static Expression castInvalidMappedField(DataType targetType, FieldAttribute fa) {
+            Source source = fa.source();
+            return switch (targetType) {
+                case DATETIME -> new ToDatetime(source, fa); // in case we decided to use DATE as a common type instead of DATE_NANOS


Nice as an example, but is effectively dead code. I wonder what are the chances we'll change our mind on this one?

It feels like we're making a decision prioritizing the LogsDb plans. But could we be causing trouble for other users? Or is the thinking that other users need to add an explicit cast? That seems a reasonable assumption.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

craigtaverner · 2025-04-23T14:38:46Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+            // if so add a project with eval, so that a not null value can be returned for a union typed field
+            if (logicalPlan.resolved()) {
+                List<Attribute> output = logicalPlan.output();
+                Map<FieldAttribute, Alias> newAliases = Maps.newHashMapWithExpectedSize(output.size());


This code is very similar to code above, but that also took into account previous/existing field mappings. Why is that not needed here?

The goal of castInvalidMappedFieldInFinalOutput is to add implicit casting to mixed date/date_nanos fields, that are not referenced by the query(there was no chance to do implicit casting for them if they are not referenced in any commands), however they are in the returned resultsets. For example, from index*. It only checks the top level plan's output, it doesn't traverse the entire tree, so it doesn't have a map to keep track of the casted fields map across all plans/subplans.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Eval.java

alex-spies · 2025-04-25T14:42:55Z

Discussed offline with @fang-xing-esql and @craigtaverner.

I think the proposed solution should work, it's just a little complex given that it needs to replicate a) some logic present in InsertFieldExtraction (inserting EVALs for implicitly cast fields just before they're used) and b) for passing on fields that users never explicitly dropped or kept into the final output correctly.

I suggested one more approach which we agreed to try before going forward with the Eval-injection approach. The alternative would turn invalid mapped fields of date/nanos type into multi typed fields per default, before any resolution step (multi typed fields in field attributes trigger automatic conversion on field extraction). I threw together a draft as a commit on top of this PR, which seems to still run most tests fine (all existing union type tests + most tests from this PR).

This approach still needs to special case for the situation where users specifically don't want to convert into nanos, but dates, i.e. when the query has datetime_nanos_field::datetime in some expression and the classical union type mechanism should kick in.

…nos implicitly

fang-xing-esql added 3 commits February 25, 2025 20:17

implicit casting for date nanos

42ce293

Merge branch 'main' into date-nanos-implicit-casting

e006e07

implicit casting for union typed field in logical plan

5436afb

fang-xing-esql added >enhancement :Analytics/ES|QL AKA ESQL labels Feb 28, 2025

elasticsearchmachine added the v9.1.0 label Feb 28, 2025

Update docs/changelog/123678.yaml

08d395c

github-actions bot temporarily deployed to docs-preview February 28, 2025 05:08 Destroyed

Merge branch 'main' into date-nanos-implicit-casting

93615fb

github-actions bot deployed to docs-preview February 28, 2025 13:12 View deployment

more implicit casting for union typed field in logical plan

3552ae3

github-actions bot deployed to docs-preview March 6, 2025 05:02 View deployment

Merge branch 'main' into date-nanos-implicit-casting

811c5a3

github-actions bot deployed to docs-preview March 6, 2025 05:15 View deployment

fang-xing-esql added 3 commits March 14, 2025 09:24

Merge branch 'main' into date-nanos-implicit-casting

3fb26ed

Merge branch 'main' into date-nanos-implicit-casting

b49a48d

refactor

29f7afe

github-actions bot deployed to docs-preview March 18, 2025 04:15 View deployment

Merge branch 'main' into date-nanos-implicit-casting

ed8ce9b

github-actions bot deployed to docs-preview March 18, 2025 15:08 View deployment

support aggregations with bucket and more tests

c13e83c

github-actions bot deployed to docs-preview March 19, 2025 11:49 View deployment

correct typo

e26d25d

github-actions bot deployed to docs-preview March 19, 2025 12:39 View deployment

Merge branch 'main' into date-nanos-implicit-casting

3b36978

github-actions bot deployed to docs-preview March 19, 2025 17:00 View deployment

alex-spies reviewed Apr 11, 2025

View reviewed changes

github-actions bot deployed to docs-preview April 11, 2025 14:05 View deployment

craigtaverner reviewed Apr 11, 2025

View reviewed changes

craigtaverner mentioned this pull request Apr 17, 2025

Date nanos implicit casting take2 #127002

Closed

fang-xing-esql added 2 commits April 21, 2025 10:19

Merge branch 'main' into date-nanos-implicit-casting

76f19f3

refactor and remove implicit casting for numeric types

279d368

github-actions bot deployed to docs-preview April 21, 2025 16:00 View deployment

fang-xing-esql changed the title ~~[ES|QL] Date nanos implicit casting~~ [ES|QL] Date nanos implicit casting in union types Apr 21, 2025

update tests

4fb446b

github-actions bot deployed to docs-preview April 21, 2025 18:12 View deployment

fix tests

00b6229

github-actions bot deployed to docs-preview April 21, 2025 20:14 View deployment

craigtaverner mentioned this pull request Apr 23, 2025

LOOKUP JOIN on date_nanos fields finds zero matches #127249

Open

craigtaverner approved these changes Apr 23, 2025

View reviewed changes

alex-spies mentioned this pull request Apr 24, 2025

[ES|QL] Add suggested_cast #127139

Merged

push down binary comparisons on date and date nanos union type fields

2d68f85

github-actions bot deployed to docs-preview April 25, 2025 01:16 View deployment

fang-xing-esql added the test-release Trigger CI checks against release build label Apr 25, 2025

Merge branch 'main' into date-nanos-implicit-casting

39acff6

github-actions bot deployed to docs-preview April 25, 2025 02:26 View deployment

Merge branch 'main' into date-nanos-implicit-casting

8c04bcc

github-actions bot deployed to docs-preview April 25, 2025 12:55 View deployment

Merge branch 'main' into date-nanos-implicit-casting

4284c16

github-actions bot deployed to docs-preview April 25, 2025 15:45 View deployment

fang-xing-esql added 2 commits May 6, 2025 13:27

Merge branch 'main' into date-nanos-implicit-casting

56dd224

modify testSuggestedCast as date_nanos and date are casted to date_na…

46e3c95

…nos implicitly

github-actions bot deployed to docs-preview May 6, 2025 18:19 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ES|QL] Date nanos implicit casting in union types #123678

[ES|QL] Date nanos implicit casting in union types #123678

fang-xing-esql commented Feb 28, 2025 •

edited

Loading

elasticsearchmachine commented Feb 28, 2025

fang-xing-esql commented Feb 28, 2025

costin commented Mar 20, 2025

fang-xing-esql commented Mar 20, 2025

alex-spies Apr 11, 2025

alex-spies Apr 11, 2025

craigtaverner left a comment

fang-xing-esql commented Apr 22, 2025 •

edited

Loading

craigtaverner left a comment

craigtaverner Apr 23, 2025

craigtaverner Apr 23, 2025

fang-xing-esql Apr 24, 2025

craigtaverner Apr 23, 2025

craigtaverner Apr 23, 2025

craigtaverner Apr 23, 2025

fang-xing-esql Apr 24, 2025

alex-spies commented Apr 25, 2025 •

edited

Loading

[ES|QL] Date nanos implicit casting in union types #123678

Are you sure you want to change the base?

[ES|QL] Date nanos implicit casting in union types #123678

Conversation

fang-xing-esql commented Feb 28, 2025 • edited Loading

elasticsearchmachine commented Feb 28, 2025

fang-xing-esql commented Feb 28, 2025

costin commented Mar 20, 2025

fang-xing-esql commented Mar 20, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

craigtaverner left a comment

Choose a reason for hiding this comment

fang-xing-esql commented Apr 22, 2025 • edited Loading

craigtaverner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alex-spies commented Apr 25, 2025 • edited Loading

fang-xing-esql commented Feb 28, 2025 •

edited

Loading

fang-xing-esql commented Apr 22, 2025 •

edited

Loading

alex-spies commented Apr 25, 2025 •

edited

Loading