KAFKA-19070: Adding task number to user provided client id to ensure each consumer has a unique client ID to avoid metric registration conflicts. #19341
base: trunk
Conversation
@C0urante can you please review this?
@dajac can you please review this?
Hi @mimaison, can you please review this? Thanks.
A label of 'needs-attention' was automatically added to this PR in order to raise the |
Thanks for the PR! Overall I agree that we should ensure metrics don't collide so operators can rely on them to be accurate. On the other hand, the client ID can be used for authorization and quotas, so if we don't use the exact value provided by the user we may break existing systems. I'm leaning towards considering this a bug and automatically injecting the task ID, as proposed in this PR. @gharris1727 WDYT?
Hi @mimaison, thanks for reviewing the PR. There is an alternative way of ensuring a unique client-id for each task, as follows. The configuration we provide through the POST/PUT API is the connector-level config. However, the task-level config is generated by the Connect framework via the method at kafka/connect/api/src/main/java/org/apache/kafka/connect/connector/Connector.java Line 124 in 2a7457f
Since we want each task to have a unique client-id, we can modify the value at this point (inside taskConfigs(...)) by appending the task number. Please let me know if this direction sounds good to you, and I’ll proceed with the change.
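A minimal sketch of that alternative, not the actual Connect code: it assumes the user's value arrives under a plain `client.id` key in the connector config and that tasks are numbered from 0. The class name `TaskClientIdSketch` and this standalone `taskConfigs` signature are hypothetical, and the thread does not confirm that the framework reads the client ID from the task-level config.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a connector's taskConfigs(...) logic.
final class TaskClientIdSketch {
    // Assumed key, for illustration only.
    static final String CLIENT_ID_KEY = "client.id";

    // Builds one config map per task, appending the task number to a
    // user-provided client.id so that no two tasks share the same value.
    static List<Map<String, String>> taskConfigs(Map<String, String> connectorConfig, int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>();
        String baseClientId = connectorConfig.get(CLIENT_ID_KEY);
        for (int task = 0; task < maxTasks; task++) {
            // Copy so each task gets an independent map.
            Map<String, String> taskConfig = new HashMap<>(connectorConfig);
            if (baseClientId != null) {
                taskConfig.put(CLIENT_ID_KEY, baseClientId + "-" + task);
            }
            configs.add(taskConfig);
        }
        return configs;
    }
}
```

When no `client.id` is set, the sketch leaves the task configs unchanged, so any default naming would still apply.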
My point is that whatever mechanism you use, this can potentially break some existing deployments. A bunch of connectors only run a single task (for example Debezium), and if users set a specific client-id this would modify it. It may be worth discussing it on the dev list to see what people think. Can you send an email to the dev mailing list explaining the issue and asking whether this change requires a KIP or not? Thanks
Hi @mimaison, I'm a bit unsure how this would break existing deployments, since client.id is primarily used for logging, metrics, and monitoring. It doesn’t affect Kafka's core delivery semantics the way group.id or partition assignment does. From what I understand, the only change here would be in how the client.id appears in logs and metrics. Also, similar changes have been made in the past: for example, we updated the client ID format for consumers of the management topic from --statuses to -statuses, and a few versions ago we modified the default client ID of internal sink consumers. That said, I agree it's best to bring this to the dev mailing list. I'll send an email shortly summarizing the context and asking whether this requires a KIP. Thanks again!
What
This PR updates the behavior of client.id assignment when a user provides a custom value via override configs in Kafka Connect.
Why
Currently, if a user overrides the client.id, all tasks or consumers inherit the same client ID. While this doesn't cause an immediate failure in Kafka, it leads to the following issues:
Metrics and logs are merged or overwritten, making observability inaccurate.
Quotas and throttling may be applied incorrectly.
Debugging becomes harder due to lack of per-task identity.
Kafka clients register their metrics under the client.id, so client.id should be unique per client instance for proper tracking and diagnostics.
How
This PR appends the task number to the user-provided client.id to ensure uniqueness across tasks.
For example:
User provides: client.id=my-custom-client
Final client.id used by task 2: my-custom-client-2
This approach:
Respects the user’s original intent in naming
Guarantees unique client.id values per task
Improves metrics, logging, and debugging consistency
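The suffixing described above can be sketched as follows. This is an illustration, not the code in this PR: the class `ClientIdSuffix` and both method names are hypothetical, and numbering tasks from 0 is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper illustrating the "<client.id>-<task number>" scheme.
final class ClientIdSuffix {
    // Appends the task number to the user-provided client.id.
    static String withTaskNumber(String userClientId, int taskNumber) {
        if (userClientId == null || userClientId.isEmpty()) {
            throw new IllegalArgumentException("client.id must be non-empty");
        }
        return userClientId + "-" + taskNumber;
    }

    // Collects the client IDs for tasks 0..taskCount-1; the set size equals
    // taskCount because every suffix is distinct.
    static Set<String> idsForTasks(String userClientId, int taskCount) {
        Set<String> ids = new LinkedHashSet<>();
        for (int task = 0; task < taskCount; task++) {
            ids.add(withTaskNumber(userClientId, task));
        }
        return ids;
    }
}
```

With the example above, `withTaskNumber("my-custom-client", 2)` yields `my-custom-client-2`. Without the suffix, every task would contribute the same string and the set would collapse to a single entry, which is the collision this PR aims to avoid.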