Failed document handler #95534

Closed
felixbarny opened this issue Apr 25, 2023 · 7 comments
Assignees
Labels
:Data Management/Data streams Data streams and their lifecycles >feature Team:Data Management Meta label for data/management team

Comments

@felixbarny
Member

felixbarny commented Apr 25, 2023

Instead of dropping documents that have failed ingestion due to an exception during pipeline execution or indexing, it should be possible to store the failed document.

High-level options where to store the failed documents:

  1. In a dedicated data stream
  2. Within the same data stream, in dedicated failure backing indices
  3. Within the same index, storing only the _source, along with an indicator that these are failed documents
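To make the trade-offs concrete, here is a minimal, purely illustrative sketch of the idea behind options 1 and 2: on a pipeline or indexing failure, the raw `_source` is preserved in a dedicated failure store together with error metadata instead of being dropped. All names here (`index_doc`, the `::failures` suffix, the in-memory `store` dict) are invented for this example and are not part of any Elasticsearch API; the "mapping conflict" is simulated by requiring a parseable timestamp.

```python
# Hypothetical client-side sketch of a "failure store": documents that fail
# ingestion are kept (with error metadata) rather than silently dropped.
import json
import time

FAILURE_SUFFIX = "::failures"  # invented naming convention for this sketch


def index_doc(store: dict, data_stream: str, doc: dict) -> bool:
    """Try to 'index' a document; on failure, preserve the original
    _source plus error details in a dedicated failure store."""
    try:
        # Stand-in for pipeline execution + indexing; requiring that
        # 'timestamp' parses as a float simulates a data-type conflict.
        float(doc["timestamp"])
        store.setdefault(data_stream, []).append(doc)
        return True
    except (KeyError, ValueError, TypeError) as exc:
        failure_record = {
            "@timestamp": time.time(),
            "error": {"type": type(exc).__name__, "message": str(exc)},
            "document": {"source": json.dumps(doc)},  # keep the raw _source
        }
        store.setdefault(data_stream + FAILURE_SUFFIX, []).append(failure_record)
        return False


store = {}
index_doc(store, "logs-app", {"timestamp": "1700000000", "msg": "ok"})
index_doc(store, "logs-app", {"timestamp": "not-a-number", "msg": "bad"})
```

After the two calls, the valid document lands in `logs-app` and the failed one is recoverable from `logs-app::failures`, including the exception type and the untouched source.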
@felixbarny felixbarny added the :Data Management/Data streams Data streams and their lifecycles label Apr 25, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Apr 25, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@lachlann562

This would be extremely helpful. We recently discovered we were losing a huge number of messages because of inconsistent data types, and we couldn't find any easy way to identify these messages or recover the data associated with them. The absence of log records is a huge risk when you are reliant on ELK as the "source of truth".

@ruflin
Contributor

ruflin commented May 31, 2024

@lachlann562 Glad to hear this will be useful for you. To also expose the failures in the UI as soon as the feature lands, we are working on a dataset quality page. A related issue can be found here: elastic/kibana#184572

@flash1293
Contributor

@felixbarny can this be closed?

@wolframhaussig

> @felixbarny can this be closed?

Why should this be closed? It would be extremely helpful and would reduce the effort of hunting for missing data across a long pipeline of components (app->openshift->ingest pipeline root->ingest pipeline app->index).

@felixbarny
Member Author

Because this has been delivered. I'll defer to @mattc58 to check if we can close this out or if there's anything left.

@dakrone
Member

dakrone commented Apr 28, 2025

Yes, I believe this can be closed as resolved by #126973.

@dakrone dakrone closed this as completed Apr 28, 2025