This repository was archived by the owner on Oct 10, 2023. It is now read-only.

ProducerMetricsInterceptor fails with IllegalStateException and/or IllegalAccessException #194

@gmcrobert

Issue Description

When producing messages, the following exception is sometimes seen in the Kafka pod logs:

java.lang.IllegalStateException: Queue full
	at java.base/java.util.AbstractQueue.add(AbstractQueue.java:98)
	at java.base/java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:326)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsQueue.add(ProducerMetricsQueue.java:31)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsInterceptor.intercept(ProducerMetricsInterceptor.java:103)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsInterceptor.intercept(ProducerMetricsInterceptor.java:30)
	at com.ibm.eventstreams.interceptors.framework.FlowRequestResponse.interceptResponse(FlowRequestResponse.java:125)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain.lambda$new$4(InterceptorChain.java:134)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain$$Lambda$665/0x0000000099a13e60.handle(Unknown Source)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain.lambda$intercept$2(InterceptorChain.java:111)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain$$Lambda$1767/0x000000004801a980.accept(Unknown Source)

This problem occurs because the Java thread that removes metrics from the ArrayBlockingQueue terminates after receiving an exception. Once that thread has terminated, the queue is never drained, so it fills up and every subsequent produce request receives the exception above. The exception significantly degrades produce performance and prevents any further producer metrics from being captured.
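The failure mode described above can be reproduced in isolation with a small sketch. This is not the Event Streams code: `MetricsQueueSketch`, `addOrThrow`, `addOrDrop`, and `startConsumer` are hypothetical names chosen for illustration. It shows that `ArrayBlockingQueue.add` throws `IllegalStateException: Queue full` once nothing drains the queue, and one possible mitigation, assuming that dropping a metric is acceptable: use `offer` on the producer side and keep the consumer loop alive when handling a single metric fails.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical stand-in for a bounded metrics queue; names are illustrative only.
class MetricsQueueSketch {
    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    MetricsQueueSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // add() throws IllegalStateException("Queue full") when the queue is at capacity.
    void addOrThrow(String metric) {
        queue.add(metric);
    }

    // offer() returns false instead of throwing, so the caller's request path is unaffected.
    boolean addOrDrop(String metric) {
        boolean accepted = queue.offer(metric);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    long droppedCount() {
        return dropped.get();
    }

    // Consumer loop that survives a failure in the handler; only an interrupt stops it.
    Thread startConsumer(Consumer<String> handler) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    handler.accept(queue.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the flag and exit the loop
                } catch (RuntimeException e) {
                    // A real implementation would log here; the loop keeps draining.
                }
            }
        }, "metrics-consumer-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        MetricsQueueSketch q = new MetricsQueueSketch(2);
        q.addOrThrow("m1");
        q.addOrThrow("m2");
        try {
            q.addOrThrow("m3"); // no consumer is running, so the queue is full
        } catch (IllegalStateException e) {
            System.out.println("add failed: " + e.getMessage());
        }
        System.out.println("offer accepted: " + q.addOrDrop("m3"));
    }
}
```

Whether dropping metrics or blocking the producer is the better trade-off depends on the deployment; the sketch only illustrates why a terminated consumer thread turns into a failure on every `add` call.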

The problem is also sometimes seen together with an IllegalAccessException raised while trying to set a final field through reflection. This happens when the producer acks setting is 0 or 1. Once the IllegalAccessException has occurred, it is followed by the IllegalStateExceptions.
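For readers unfamiliar with this class of error, the following minimal sketch shows how writing to a final field through reflection raises an IllegalAccessException in plain Java. The issue does not say which field or class is involved, so `FinalFieldSketch`, `Sample`, and `trySet` are purely hypothetical and are not taken from the interceptor code.

```java
import java.lang.reflect.Field;

// Hypothetical illustration; the class and field names are not from the interceptor.
class FinalFieldSketch {
    static final class Sample {
        final int fixed = 1;   // final instance field
        int mutable = 1;       // ordinary instance field
    }

    // Attempts a reflective write and reports what happened.
    static String trySet(Object target, String fieldName, Object value) {
        try {
            Field f = target.getClass().getDeclaredField(fieldName);
            f.set(target, value); // no setAccessible(true), so final fields are rejected
            return "set";
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Sample s = new Sample();
        System.out.println("mutable: " + trySet(s, "mutable", 2));
        System.out.println("fixed:   " + trySet(s, "fixed", 2));
    }
}
```

If such an exception escapes from the thread that drains the metrics queue, it would be consistent with the sequence described above, where the IllegalAccessException appears first and the "Queue full" errors follow.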

Once the queue is full, the only way to restore normal service is to delete the Kafka pods that are logging the exceptions and allow fresh pods to start.

Environment

  • IBM Event Streams Version: 10.4.0
  • Operating system: Any supported OCP release
