Error Handling for "Ignoring Failure for this processor" in JSON parsing during conflicts in index template #126451

Open
doom160 opened this issue Apr 8, 2025 · 1 comment
Labels
>bug
:Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
Team:Data Management (Meta label for data/management team)

Comments

doom160 commented Apr 8, 2025

Elasticsearch Version

8.15.5

Installed Plugins

No response

Java Version

bundled

OS Version

Elastic cloud

Problem Description

I am currently shipping Kubernetes container logs to Elastic Cloud 8.15.x, and I want to use the JSON processor in an ingest pipeline to decode them. Container logs come in many shapes and sizes, and it is very unlikely that I can make everyone log in the same format, so there are cases where the same field arrives with conflicting types, e.g.

{"process": "audit"}
{"process":{ "id":123}}

Logs from two different applications may emit the same field in different formats, which results in the event being dropped.
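
To make the conflict concrete, here is a minimal sketch in plain Python (not Elasticsearch code; the helper name `decoded_type` is made up for illustration). It decodes the two sample lines above and reports the type of `process` in each.

```python
import json

# The two sample log lines from the description above.
samples = ['{"process": "audit"}', '{"process":{ "id":123}}']


def decoded_type(line: str) -> str:
    """Decode one log line and return the Python type name of its `process` value."""
    return type(json.loads(line)["process"]).__name__


types = [decoded_type(s) for s in samples]
print(types)  # ['str', 'dict']

# Both lines are valid JSON; only the type of `process` differs between them.
conflict = len(set(types)) > 1
print("conflict:", conflict)  # conflict: True
```

Both lines decode without error, which suggests that the rejection in the log below comes from the field mapping at index time rather than from the JSON decoding step itself.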

{"log.level":"warn","@timestamp":"2025-04-08T03:02:20.683Z","message":"Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2025, time.April, 8, 3, 2, 5, 778206559, time.UTC), Meta:{\"input_id\":\"filestream-container-logs-089e3666-e252-484b-b7a4-a85b5c12876a-kubernetes-59b5cb54-ec6e-4a30-9013-1e1db3f366ae.elastic-agent\",\"raw_index\":\"logs-kubernetes.container_log
...
Private:(*input_logfile.updateOp)(0xc001cac2d0), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:5019] failed to parse field [message_json.process] of type [keyword] in document with id 'UolZE5YBC9Dr76AKSGUv'. Preview of field's value: '{pid=4955}'\",\"caused_by\":{\"type\":\"illegal_state_exception\",\"reason\":\"Can't get text on a START_OBJECT at 1:5005\"}}, dropping event!","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"filestream-default","type":"filestream"},"log":{"source":"filestream-default"},"log.logger":"elasticsearch","log.origin":{"file.line":446,"file.name":"elasticsearch/client.go"},"service.name":"filebeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}

I was hoping that when I turned on "ignore failure for this processor", processing would simply continue when there is a conflict with the index template: the event would not be dropped, the JSON decoding would be skipped, and the original content would be kept under message.

What would be a good way to handle this kind of problem?

Steps to Reproduce

  1. Create an ingest pipeline with a JSON processor that decodes the message field into another field
  2. Send two documents with conflicting types to the same index:
     message: "{"process": "audit"}"
     message: "{"process":{ "id":123}}"
  3. Go to the Discover page and search for message: "Cannot index event*"
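
For step 1, a pipeline along the following lines should reproduce the setup. This is a sketch under assumptions: the pipeline name `decode-container-logs` is hypothetical, and the target field `message_json` is inferred from the field name in the error log above. The script only builds and prints the request body; it does not contact a cluster.

```python
import json

# Hypothetical pipeline name, chosen for illustration only.
PIPELINE_NAME = "decode-container-logs"


def build_pipeline() -> dict:
    """Build an ingest pipeline body with a single json processor."""
    return {
        "description": "Decode JSON container logs from the message field",
        "processors": [
            {
                "json": {
                    "field": "message",
                    # Assumed target field, taken from the error log above.
                    "target_field": "message_json",
                    # Equivalent of "ignore failure for this processor".
                    "ignore_failure": True,
                }
            }
        ],
    }


body = build_pipeline()
print(f"PUT _ingest/pipeline/{PIPELINE_NAME}")
print(json.dumps(body, indent=2))
```

The printed body can be pasted into Kibana Dev Tools after the `PUT` line.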

Logs (if relevant)

No response

doom160 added the >bug and needs:triage labels on Apr 8, 2025
nielsbauman added the :Data Management/Ingest Node and Team:Data Management labels and removed the needs:triage label on Apr 8, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)
