Transform using geotile_grid pivot with missing_bucket=true triggers NullPointerException #126591

Open
elemask opened this issue Apr 10, 2025 · 1 comment
Labels
>bug :ml/Transform Transform Team:ML Meta label for the ML team

Comments

elemask commented Apr 10, 2025

Elasticsearch Version

8.4.3

Installed Plugins

No response

Java Version

bundled

OS Version

Ubuntu 22.04

Problem Description

A transform that pivots on a geotile_grid throws a NullPointerException when the missing_bucket option is set to true.

The same composite aggregation, run directly, works fine and correctly generates a bucket whose key contains a null value (as the value to aggregate on is missing). I believe this is a bug in how the transform code handles a composite bucket whose key contains a null value.
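To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the Elasticsearch source: `NullKeySketch`, `unsafeValue`, `safeValue`, and `extract` are hypothetical names, and the null guard is only one possible way to handle a missing key. It shows that calling `toString()` on a null key value throws, while a guarded conversion passes the null through.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullKeySketch {
    // Hypothetical stand-in for a bucket key extractor (not the real class).
    // Calling toString() on a null key throws NullPointerException.
    static String unsafeValue(Object key) {
        return key.toString();
    }

    // Assumed fix for illustration only: keep a missing key as null.
    static String safeValue(Object key) {
        return key == null ? null : key.toString();
    }

    // Converts every entry of a composite bucket key using the guarded variant.
    static Map<String, String> extract(Map<String, Object> bucketKey) {
        Map<String, String> out = new LinkedHashMap<>();
        bucketKey.forEach((name, key) -> out.put(name, safeValue(key)));
        return out;
    }

    public static void main(String[] args) {
        // Mirrors the bucket key {"user": "bob", "pings": null} from the report.
        Map<String, Object> bucketKey = new LinkedHashMap<>();
        bucketKey.put("user", "bob");
        bucketKey.put("pings", null);

        try {
            unsafeValue(bucketKey.get("pings"));
        } catch (NullPointerException e) {
            System.out.println("unguarded conversion threw NullPointerException");
        }
        System.out.println(extract(bucketKey));
    }
}
```

Whether the transform should emit a null field or omit the field entirely for such a bucket is a design decision for the maintainers; the sketch only demonstrates the null check.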

Steps to Reproduce

  1. Create an index with 3 documents:
PUT /test
{
  "mappings": {
    "properties": {
      "username": {
        "type": "keyword"
      },
      "date": {
        "type": "date"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

PUT /test/_doc/1
{
  "username": "bob",
  "date": "2025-12-25",
  "location": "50.62096405029297,5.5580315589904785"
}

PUT /test/_doc/2
{
  "username": "bob",
  "date": "2024-12-25",
  "location": "50.620975494384766,5.558044910430908"
}

PUT /test/_doc/3
{
  "username": "bob",
  "date": "2023-12-25"
}
  2. Test the composite aggregation on the username + location fields:
GET /test/_search?filter_path=aggregations.pivot.buckets
{
  "query": {
    "match_all": {}
  },
  "size": 0, 
  "aggs": {
    "pivot": {
      "composite": {
        "sources": [
          {
            "user": {
              "terms": {
                "field": "username"
              }
            }
          },
          {
            "pings": {
              "geotile_grid": {
                "field": "location",
                "precision": 1,
                "missing_bucket": true
              }
            }
          }
        ]
      },
      "aggs": {
        "seen_first": {
          "min": {
            "field": "date"
          }
        }
      }
    }
  }
}

The result is as expected: 2 buckets, one matching the 2 documents with a geo_point and one matching the document without a geo_point.

{
  "aggregations": {
    "pivot": {
      "buckets": [
        {
          "key": {
            "user": "bob",
            "pings": null
          },
          "doc_count": 1,
          "seen_first": {
            "value": 1703462400000,
            "value_as_string": "2023-12-25T00:00:00.000Z"
          }
        },
        {
          "key": {
            "user": "bob",
            "pings": "1/1/0"
          },
          "doc_count": 2,
          "seen_first": {
            "value": 1735084800000,
            "value_as_string": "2024-12-25T00:00:00.000Z"
          }
        }
      ]
    }
  }
}
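As a side check on the "1/1/0" key in this result, the sketch below computes a tile key from the standard Web Mercator tiling formula, assuming the geotile key is formatted as zoom/x/y and that the "lat,lon" string order applies to the location values. `GeoTileKeySketch` and `tileKey` are hypothetical names for illustration, not Elasticsearch APIs. Both documents with a location fall into the same tile at precision 1.

```java
public class GeoTileKeySketch {
    // Standard Web Mercator tile computation; returns "zoom/x/y".
    static String tileKey(double lat, double lon, int zoom) {
        int tiles = 1 << zoom; // number of tiles per axis at this zoom
        int x = (int) Math.floor((lon + 180.0) / 360.0 * tiles);
        double latRad = Math.toRadians(lat);
        double merc = Math.log(Math.tan(latRad) + 1.0 / Math.cos(latRad));
        int y = (int) Math.floor((1.0 - merc / Math.PI) / 2.0 * tiles);
        // Clamp to the valid tile range for edge coordinates.
        x = Math.max(0, Math.min(tiles - 1, x));
        y = Math.max(0, Math.min(tiles - 1, y));
        return zoom + "/" + x + "/" + y;
    }

    public static void main(String[] args) {
        // Locations of documents 1 and 2 from the reproduction steps.
        System.out.println(tileKey(50.62096405029297, 5.5580315589904785, 1)); // 1/1/0
        System.out.println(tileKey(50.620975494384766, 5.558044910430908, 1)); // 1/1/0
    }
}
```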
  3. Run a transform preview with the equivalent aggregations:
GET /_transform/_preview
{
  "source": {
    "index": "test"
  },
  "pivot": {
    "group_by": {
      "user": {
        "terms": {
          "field": "username"
        }
      },
      "pings": {
        "geotile_grid": {
          "field": "location",
          "precision": 1,
          "missing_bucket": true
        }
      }
    },
    "aggregations": {
      "seen_first": {
        "min": {
          "field": "date"
        }
      }
    }
  }
}

This results in an HTTP 500 error:

{
  "error": {
    "root_cause": [
      {
        "type": "null_pointer_exception",
        "reason": """Cannot invoke "Object.toString()" because "key" is null"""
      }
    ],
    "type": "null_pointer_exception",
    "reason": """Cannot invoke "Object.toString()" because "key" is null"""
  },
  "status": 500
}

Expected results: 2 buckets

Logs (if relevant)

Server logs contain the following backtrace:

java.lang.NullPointerException: Cannot invoke "Object.toString()" because "key" is null
    at [email protected]/org.elasticsearch.xpack.transform.transforms.pivot.AggregationResultUtils$GeoTileBucketKeyExtractor.value(AggregationResultUtils.java:486)
    at [email protected]/org.elasticsearch.xpack.transform.transforms.pivot.AggregationResultUtils.lambda$extractCompositeAggregationResults$0(AggregationResultUtils.java:123)
    at java.base/java.util.LinkedHashMap.forEach(LinkedHashMap.java:986)
    at [email protected]/org.elasticsearch.xpack.transform.transforms.pivot.AggregationResultUtils.lambda$extractCompositeAggregationResults$2(AggregationResultUtils.java:116)
[...]
@elemask added the >bug and needs:triage labels Apr 10, 2025
@iverase added the :ml/Transform label and removed the needs:triage label Apr 10, 2025
@elasticsearchmachine added the Team:ML label Apr 10, 2025
elasticsearchmachine (Collaborator) commented

Pinging @elastic/ml-core (Team:ML)
