[TSDS] Reindexing seems to ignore routing_path derived from mappings #125607

Open
lucabelluccini opened this issue Mar 25, 2025 · 6 comments
Labels: >bug, :Data Management/Data streams, :StorageEngine/TSDB, Team:Data Management, Team:StorageEngine

Comments

@lucabelluccini
Contributor

lucabelluccini commented Mar 25, 2025

Elasticsearch Version

8.17.4

Installed Plugins

No response

Java Version

bundled

OS Version

Not relevant

Problem Description

When attempting to reindex a TSDS data stream into another, the reindex doesn't start because of the error:

[index.mode=time_series] requires a non-empty [index.routing_path]

Attempting to reproduce, I noticed it triggers the error only when the routing_path on the destination index is not defined explicitly but deduced from the TSDS dimensions in the mappings.
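
As a rough illustration of what "deduced from the TSDS dimensions" means here, the toy Java sketch below collects the names of fields flagged as dimensions and uses them as the routing path. It is not Elasticsearch code: the `RoutingPathSketch` class, the `Field` type, and the `total` field are hypothetical names chosen for the example, and only `cluster.id` comes from this report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoutingPathSketch {
    /** Hypothetical, simplified view of a mapped field (not an Elasticsearch type). */
    static final class Field {
        final String type;
        final boolean timeSeriesDimension;

        Field(String type, boolean timeSeriesDimension) {
            this.type = type;
            this.timeSeriesDimension = timeSeriesDimension;
        }
    }

    /** Collects the names of fields flagged as dimensions, in mapping order. */
    static List<String> deriveRoutingPath(Map<String, Field> mappings) {
        List<String> routingPath = new ArrayList<>();
        for (Map.Entry<String, Field> entry : mappings.entrySet()) {
            if (entry.getValue().timeSeriesDimension) {
                routingPath.add(entry.getKey());
            }
        }
        return routingPath;
    }

    public static void main(String[] args) {
        Map<String, Field> mappings = new LinkedHashMap<>();
        mappings.put("@timestamp", new Field("date", false));
        mappings.put("cluster.id", new Field("keyword", true)); // the only dimension
        mappings.put("total", new Field("long", false));        // hypothetical metric field
        System.out.println(deriveRoutingPath(mappings));        // prints [cluster.id]
    }
}
```

In this model the template never states `routing_path`; the value only exists once the mappings have been inspected, which is the situation the report describes.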

Steps to Reproduce

  1. Spawn an ECH deployment on 8.17.4 (older versions show the same issue)

  2. GET _index_template/metrics-fleet_server.agent_status as we want to reindex the data streams metrics-fleet_server.agent_status...

  3. We create a slightly modified index template reusing the existing index template

PUT _index_template/metrics-fleet_server.agent_status-storedsource
{
  "index_patterns": [
    "metrics-fleet_server.agent_status-storedsource*" # MORE SPECIFIC
  ],
  "template": {
    "settings": {
      "index": {
        "mode": "time_series",
        "mapping.source.mode": "stored" # WE FORCE STORED (but this is not relevant for the bug)
      },
      "time_series": { # WE SET THIS (but it is not relevant for the bug)
        "end_time": "2026-12-31T00:00:00.000Z",
        "start_time": "2023-09-01T00:00:00.000Z"
      }
    },
    "mappings": {
      "_meta": {
        "package": {
          "name": "fleet_server"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "_source": {
        "mode": "STORED" # WE FORCE STORED (but this is not relevant for the bug)
      }
    }
  },
  "composed_of": [
    "metrics@tsdb-settings",
    "metrics-fleet_server.agent_status@package",
    "metrics@custom",
    "metrics-fleet_server.agent_status@custom",
    "ecs@mappings",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "priority": 250, # WE INCREASE THE PRIORITY
  "_meta": {
    "package": {
      "name": "fleet_server"
    },
    "managed_by": "fleet",
    "managed": true
  },
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "ignore_missing_component_templates": [
    "metrics@custom",
    "metrics-fleet_server.agent_status@custom"
  ]
}
  4. We trigger the reindexing:
POST _reindex
{
  "source": {
    "index": "metrics-fleet_server.agent_status-default"
  },
  "dest": {
    "index": "metrics-fleet_server.agent_status-storedsource",
    "op_type": "create"
  }
}
  5. The response is:
# Response
#{
#  "error": {
#    "root_cause": [
#      {
#        "type": "illegal_argument_exception",
#        "reason": "[index.mode=time_series] requires a non-empty [index.routing_path]"
#      }
#    ],
#    "type": "illegal_argument_exception",
#    "reason": "[index.mode=time_series] requires a non-empty [index.routing_path]"
#  },
#  "status": 400
#}
  6. If we trigger the creation of the DS manually, it gets created:
PUT _data_stream/metrics-fleet_server.agent_status-storedsource
  7. It has routing_path (GET metrics-fleet_server.agent_status-storedsource/_settings)
# Response
#{
#  ".ds-metrics-fleet_server.agent_status-storedsource-2025.03.25-000001": {
#    "settings": {
#      "index": {
#        "mapping": {
#          "total_fields": {
#            "limit": "1000",
#            "ignore_dynamic_beyond_limit": "true"
#          },
#          "source": {
#            "mode": "stored"
#          }
#        },
#        "hidden": "true",
#        "time_series": {
#          "end_time": "2026-12-31T00:00:00.000Z",
#          "start_time": "2023-09-01T00:00:00.000Z"
#        },
#        "provided_name": ".ds-metrics-fleet_server.agent_status-storedsource-2025.03.25-000001",
#        "final_pipeline": ".fleet_final_pipeline-1",
#        "creation_date": "1742927336776",
#        "number_of_replicas": "1",
#        "routing_path": [
#          "cluster.id" ### <<< HERE
#        ],
#        "uuid": "ZoZZhCW0SrO9Y3uXSAzdgg",
#        "version": {
#          "created": "8521000"
#        },
#        "lifecycle": {
#          "name": "metrics"
#        },
#        "mode": "time_series",
#        "routing": {
#          "allocation": {
#            "include": {
#              "_tier_preference": "data_hot"
#            }
#          }
#        },
#        "number_of_shards": "1",
#        "default_pipeline": "metrics-fleet_server.agent_status-1.6.0"
#      }
#    }
#  }
#}
  8. Re-executing the reindex fails again with the same error

  9. Update the index template to explicitly define the routing_path:

PUT _index_template/metrics-fleet_server.agent_status-storedsource
{
  "index_patterns": [
    "metrics-fleet_server.agent_status-storedsource*"
  ],
  "template": {
    "settings": {
      "index": {
        "mode": "time_series",
        "mapping.source.mode": "stored"
      },
      "routing_path": [
        "cluster.id" # ADDED THIS
      ],
      "time_series": {
        "end_time": "2026-12-31T00:00:00.000Z",
        "start_time": "2023-09-01T00:00:00.000Z"
      }
    },
    "mappings": {
      "_meta": {
        "package": {
          "name": "fleet_server"
        },
        "managed_by": "fleet",
        "managed": true
      },
      "_source": {
        "mode": "STORED"
      }
    }
  },
  "composed_of": [
    "metrics@tsdb-settings",
    "metrics-fleet_server.agent_status@package",
    "metrics@custom",
    "metrics-fleet_server.agent_status@custom",
    "ecs@mappings",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "priority": 250,
  "_meta": {
    "package": {
      "name": "fleet_server"
    },
    "managed_by": "fleet",
    "managed": true
  },
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "ignore_missing_component_templates": [
    "metrics@custom",
    "metrics-fleet_server.agent_status@custom"
  ]
}
  10. Re-trigger the reindexing:
POST _reindex
{
  "source": {
    "index": "metrics-fleet_server.agent_status-default"
  },
  "dest": {
    "index": "metrics-fleet_server.agent_status-storedsource",
    "op_type": "create"
  }
}
  11. It works, which is odd: the destination index is still there from the manual DS creation at step 6, and I didn't even need to delete it.
# Response
#{
#  "took": 157,
#  "timed_out": false,
#  "total": 353,
#  "updated": 0,
#  "created": 353,
#  "deleted": 0,
#  "batches": 1,
#  "version_conflicts": 0,
#  "noops": 0,
#  "retries": {
#    "bulk": 0,
#    "search": 0
#  },
#  "throttled_millis": 0,
#  "requests_per_second": -1,
#  "throttled_until_millis": 0,
#  "failures": []
#}

Logs (if relevant)

No response

@lucabelluccini lucabelluccini added >bug needs:triage Requires assignment of a team area label Team:StorageEngine labels Mar 25, 2025
@martijnvg martijnvg added the :StorageEngine/TSDB You know, for Metrics label Mar 25, 2025
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine and removed needs:triage Requires assignment of a team area label labels Mar 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@lkts
Contributor

lkts commented Mar 26, 2025

In step 4 does the destination data stream exist? I suspect that metrics-fleet_server.agent_status-storedsource gets treated as an index when calling _reindex.

@lucabelluccini
Contributor Author

lucabelluccini commented Mar 26, 2025

Hello @lkts

At step (4), the data stream doesn't exist, but I would expect it to be created automatically (I have the rights, and the index template would generate a data stream).

But ignoring this, you can see that at step (6) I create the destination data stream manually (7) and still the reindex fails (8).

Only adding the routing_path at Index Template level (9) makes it succeed.

To me, it is not normal that the behavior differs depending on whether the routing_path is explicit or derived from the mappings. Once the index gets "materialized", routing_path is there, regardless of where it comes from.

@lkts
Contributor

lkts commented Apr 7, 2025

This happens because the reindexer tries to build the settings before the index is created:

Settings settings = MetadataIndexTemplateService.resolveSettings(projectMetadata, template);

This crucially is different from

because the former does not take into account IndexSettingProviders.

It's an issue with reindexing that is not specific to TSDS. TSDS just has a setting validation that triggers it. I am not sure if these two code paths can be unified so let's ask somebody who knows this.
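
The difference between the two code paths can be pictured with a self-contained toy model. This is a sketch under stated assumptions, not Elasticsearch's actual classes: `SettingsResolutionSketch`, `SettingProvider`, `resolveFromTemplate`, and `resolveWithProviders` are hypothetical names, settings are modeled as plain string maps, and the rule that explicit template values override provider values is an assumption of the sketch. Only the error message and the `cluster.id` routing path come from this issue.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SettingsResolutionSketch {
    /** Hypothetical stand-in for a provider that contributes extra settings. */
    interface SettingProvider {
        Map<String, String> additionalSettings(Map<String, String> templateSettings);
    }

    /** Path A: only what the template states explicitly (providers are ignored). */
    static Map<String, String> resolveFromTemplate(Map<String, String> templateSettings) {
        return new HashMap<>(templateSettings);
    }

    /** Path B: provider contributions first, then explicit template values on top. */
    static Map<String, String> resolveWithProviders(Map<String, String> templateSettings,
                                                    List<SettingProvider> providers) {
        Map<String, String> resolved = new HashMap<>();
        for (SettingProvider provider : providers) {
            resolved.putAll(provider.additionalSettings(templateSettings));
        }
        resolved.putAll(templateSettings);
        return resolved;
    }

    /** Mimics the validation that surfaces in the issue. */
    static void validate(Map<String, String> settings) {
        boolean timeSeries = "time_series".equals(settings.get("index.mode"));
        if (timeSeries && settings.getOrDefault("index.routing_path", "").isEmpty()) {
            throw new IllegalArgumentException(
                "[index.mode=time_series] requires a non-empty [index.routing_path]");
        }
    }

    public static void main(String[] args) {
        Map<String, String> template = Map.of("index.mode", "time_series");
        // Toy provider: supplies a routing path for time series templates.
        SettingProvider dimensionsProvider = settings -> {
            if ("time_series".equals(settings.get("index.mode"))) {
                return Map.of("index.routing_path", "cluster.id");
            }
            return Map.of();
        };

        validate(resolveWithProviders(template, List.of(dimensionsProvider))); // passes
        try {
            validate(resolveFromTemplate(template)); // no provider, no routing path
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, adding `index.routing_path` to the template itself makes Path A pass as well, which matches the workaround at step 9 of the report.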

@lkts lkts added the :Data Management/Data streams Data streams and their lifecycles label Apr 7, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Apr 7, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@lkts
Contributor

lkts commented Apr 7, 2025

In theory this validation can be bypassed by replacing this

with

settings.get(IndexSettings.MODE.get())

Not sure if it's appropriate.
