PubSub: Message ordering no longer being honored #1889

@turneand

Environment details

  1. OS type and version: Windows/Unix
  2. Java version: 17
  3. version(s): 1.125.11 (works), 1.125.12 (broken), 1.126.2 (broken)

Steps to reproduce

  1. Create a new Pub/Sub topic and subscription with message ordering enabled (and exactly-once delivery).
  2. Publish 100 messages in a loop, each a simple text message containing the loop counter, all with the ordering key "defabc".
  3. Start a single subscriber to receive the messages.
  4. The messages should be received in order (and were in 1.125.11), but are now received only "mostly" in order.

Code example

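import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

var project = "your-project-id"; // placeholder: the original report left this value blank
var topicName = TopicName.of(project, "andrew_test");
var publisher = Publisher.newBuilder(topicName).setEnableMessageOrdering(true).build();

// Publish 100 messages that all share one ordering key.
try {
    for (int i = 0; i < 100; i++) {
        var data = ByteString.copyFromUtf8("hello #" + i);
        var message = PubsubMessage.newBuilder().setData(data).setOrderingKey("defabc").build();
        publisher.publish(message);
    }
} finally {
    publisher.shutdown();
    publisher.awaitTermination(1, TimeUnit.MINUTES);
}

// Receive with a single subscriber, printing and acking each message as it arrives.
var subscriptionName = ProjectSubscriptionName.of(project, "andrew-test-sub");
var subscriber = Subscriber.newBuilder(subscriptionName, (PubsubMessage message, AckReplyConsumer consumer) -> {
    System.out.println("Id: " + message.getMessageId() + " Data: " + message.getData().toStringUtf8());
    consumer.ack();
}).build();

// Let the subscriber run for 30 seconds, then stop it.
try {
    subscriber.startAsync().awaitRunning();
    subscriber.awaitTerminated(30, TimeUnit.SECONDS);
} catch (TimeoutException e) {
    subscriber.stopAsync();
}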
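
To make "mostly in order" measurable, the printed payloads can be checked mechanically. The following is only a sketch and is not part of the original reproduction: the class name OrderCheck and both helper methods are hypothetical, and it assumes payloads of the form "hello #<counter>" as published above.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderCheck {
    // Parses the counter out of a payload such as "hello #42" (format assumed from the publisher loop).
    static int counterOf(String payload) {
        return Integer.parseInt(payload.substring(payload.lastIndexOf('#') + 1).trim());
    }

    // Returns the positions at which a payload's counter is lower than the
    // highest counter seen earlier, i.e. where a message arrived late.
    static List<Integer> outOfOrderPositions(List<String> payloads) {
        List<Integer> positions = new ArrayList<>();
        int highest = Integer.MIN_VALUE;
        for (int i = 0; i < payloads.size(); i++) {
            int counter = counterOf(payloads.get(i));
            if (counter < highest) {
                positions.add(i);
            } else {
                highest = counter;
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical sample of received payloads, not real subscriber output.
        List<String> received = List.of("hello #0", "hello #1", "hello #3", "hello #2", "hello #4");
        System.out.println(outOfOrderPositions(received)); // prints [3]
    }
}
```

In the reproduction, the message callback could append each payload to a thread-safe list and pass it to outOfOrderPositions after the subscriber stops. An empty result would indicate that ordering was preserved.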

Stack trace

This also seems to make the following exception occur more frequently:

io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Some acknowledgement ids in the request were sent out of order.

External references such as API reference guides

n/a

Any additional information below

The above code snippet works fine in 1.125.11, but in 1.125.12 and 1.126.2 (the latest at the time of filing) messages are delivered out of order.

I believe the root cause is the change from #1778 where, under #1807, the LinkedHashSet was changed to a ConcurrentHashMap. When the notifyAckSuccess method is called, it iterates over outstandingReceipts to populate outstandingBatch. A LinkedHashSet iterates in insertion order, whereas a ConcurrentHashMap iterates in hash-based order, so after this change the batch is no longer built in the order the receipts arrived.
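
The difference in iteration order can be seen in isolation with a small, self-contained sketch. It does not use the library's actual classes or fields: the class name IterationOrderDemo and the "ack-N" keys are hypothetical stand-ins, and a LinkedHashMap stands in for the LinkedHashSet so that both collections can share one helper (both preserve insertion order).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IterationOrderDemo {
    // Inserts n hypothetical ack IDs ("ack-0" .. "ack-(n-1)") into the given map
    // and returns the keys in the order the map iterates over them.
    static List<String> iterationOrder(Map<String, Boolean> map, int n) {
        for (int i = 0; i < n; i++) {
            map.put("ack-" + i, Boolean.TRUE);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order: ack-0, ack-1, ack-2, ...
        System.out.println("LinkedHashMap:     " + iterationOrder(new LinkedHashMap<>(), 20));
        // ConcurrentHashMap makes no ordering guarantee; its order depends on key hashes.
        System.out.println("ConcurrentHashMap: " + iterationOrder(new ConcurrentHashMap<>(), 20));
    }
}
```

The first line always lists the keys in insertion order. The second line contains the same keys, but in whatever order the hash table yields them, which in practice rarely matches insertion order for keys like these.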

I can mitigate all of these issues by reducing max-outstanding-element-count to 1, but then we lose the ability to process messages in parallel, as well as the benefits of batching.

Metadata

Labels

api: pubsub (Issues related to the googleapis/java-pubsub API.)
priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
