Skip to content

Add WeightedNonceGenerator #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

ankit27755
Copy link
Owner

@ankit27755 ankit27755 commented Apr 30, 2025


DeputyDev generated PR summary:


Size XXL: This PR changes include 1000+ lines and should take approximately 6+ hours to review, potentially spread across multiple sessions


The pull request titled "Add WeightedNonceGenerator" introduces several new components and modifications, mainly to support a new type of nonce generator called WeightedNonceGenerator. Here's a summary of what the PR does:

  1. New Classes and Files:

    • CircularQueue: A class representing a thread-safe circular queue to store integer IDs.
    • IdPool: Manages a pool of IDs using CircularQueue.
    • PartitionIdTracker: Tracks generated IDs and manages partitions for ID generation.
    • IdUtils: Utility class for ID-related functions, such as converting seconds to DateTime.
    • Configuration Classes: Several new configuration classes (DefaultNamespaceConfig, IdGeneratorConfig, NamespaceConfig, PartitionRange, WeightedIdConfig, WeightedPartition) are introduced to manage ID generation settings.
    • DistributedIdGenerator: An ID generator that supports distributed ID generation.
    • Formatters and Nonce Generators: Includes a new formatter SecondPrecisionIdFormatter and nonce generators (PartitionAwareNonceGenerator, WeightedNonceGenerator).
  2. Modifications:

    • IdGeneratorBase: Updated to support different types of nonce generators, including weighted and partition-aware generators.
    • NonceGeneratorBase: Modified to include a method for generating IDs for a specific partition and a method to convert IdInfo to Id.
    • RandomNonceGenerator: Adjusted to fit into the new architecture, especially regarding ID generation and validation.
  3. Tests:

    • New tests for PartitionAwareNonceGenerator and WeightedNonceGenerator to ensure the new functionality works as expected.
    • Performance tests (WeightedIdGeneratorPerfTest) are included to benchmark the new ID generation methods.
  4. Performance Results:

    • JSON files (testGenerateWithBenchmark.json) provide benchmarking results for PartitionAwareNonceGenerator and WeightedNonceGenerator.

Overall, this PR adds a weighted ID generation mechanism, allowing for more flexible and configurable ID generation strategies, particularly useful in distributed systems where load balancing across partitions is crucial.


DeputyDev generated PR summary until 33e4742

import java.util.function.Function;


@SuppressWarnings("unused")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PERFORMANCE_OPTIMIZATION: The performance benchmark results show that WeightedNonceGenerator is approximately 22% slower than PartitionAwareNonceGenerator (1004ms vs 823.6ms average time). This significant performance regression should be addressed, as ID generation is likely a high-volume operation in the system.

// Consider optimizing the core weighted partition selection algorithm:
// 1. Create a more efficient data structure for partition lookup
// 2. Consider using a precomputed array for faster partition lookups
// 3. Implement a binary search for partition selection instead of using RangeMap
// Example optimization approach:
private int[] partitionLookupArray;
private int[] partitionOffsets;

private void initializePartitionLookup() {
    partitionLookupArray = new int[maxShardWeight];
    partitionOffsets = new int[idGeneratorConfig.getPartitionCount()];
    // Fill lookup array with corresponding partition for each weight point
    // This allows O(1) partition lookup vs the current O(log n) lookup with RangeMap
}

@Override
protected int getTargetPartitionId() {
val randomNum = getSECURE_RANDOM().nextInt(maxShardWeight);
val partitionRange = Objects.requireNonNull(partitionRangeMap.getEntry(randomNum)).getValue();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODE_MAINTAINABILTIY: In the getTargetPartitionId method of WeightedNonceGenerator, there's a call to Objects.requireNonNull that will throw a NullPointerException if the partition range is not found. This would cause a runtime failure instead of handling the error gracefully. The code should check if the entry exists and implement a fallback mechanism for this critical operation.

@Override
protected int getTargetPartitionId() {
    val randomNum = getSECURE_RANDOM().nextInt(maxShardWeight);
    val partitionRangeEntry = partitionRangeMap.getEntry(randomNum);
    
    if (partitionRangeEntry == null) {
        log.error("No partition found for random number: {}", randomNum);
        // Fallback to a default partition or random selection
        return getSECURE_RANDOM().nextInt(getIdGeneratorConfig().getPartitionCount());
    }
    
    val partitionRange = partitionRangeEntry.getValue();
    int rangeSize = partitionRange.getEnd() - partitionRange.getStart() + 1;
    
    if (rangeSize <= 0) {
        log.error("Invalid partition range: {} to {}", partitionRange.getStart(), partitionRange.getEnd());
        return getSECURE_RANDOM().nextInt(getIdGeneratorConfig().getPartitionCount());
    }
    
    return getSECURE_RANDOM().nextInt(rangeSize) + partitionRange.getStart();
}

final MetricRegistry metricRegistry,
final Clock clock) {
super(nodeId, idGeneratorConfig, partitionResolverSupplier, metricRegistry, clock);
Preconditions.checkNotNull(getIdGeneratorConfig().getWeightedIdConfig());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERROR: Constructor parameter mismatch. The WeightedNonceGenerator constructor calls super(nodeId, idGeneratorConfig, partitionResolverSupplier, metricRegistry, clock) which resolves to PartitionAwareNonceGenerator's constructor, but this doesn't pass the idFormatter. The protected constructor in PartitionAwareNonceGenerator sets a default formatter which may be different from the one provided to WeightedNonceGenerator. This could cause inconsistent behavior when WeightedNonceGenerator is instantiated with a specific formatter.

public WeightedNonceGenerator(final int nodeId,
                                     final IdGeneratorConfig idGeneratorConfig,
                                     final Function<String, Integer> partitionResolverSupplier,
                                     final IdFormatter idFormatter,
                                     final MetricRegistry metricRegistry,
                                     final Clock clock) {
        super(nodeId, idGeneratorConfig, partitionResolverSupplier, idFormatter, metricRegistry, clock);
        Preconditions.checkNotNull(getIdGeneratorConfig().getWeightedIdConfig());
        partitionRangeMap = createWeightRangeMap(getIdGeneratorConfig().getWeightedIdConfig());
    }

private final IdGeneratorConfig idGeneratorConfig;
private final MetricRegistry metricRegistry;
private final Clock clock;
HashSet<Integer> timeKeys = new HashSet<>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PERFORMANCE_OPTIMIZATION: The HashSet<Integer> timeKeys field in PartitionAwareNonceGenerator is declared but never used. This creates unnecessary memory allocation without providing any benefit. Unused fields should be removed to reduce memory consumption and improve code clarity.

// Remove this unused field
// HashSet<Integer> timeKeys = new HashSet<>();

private static final String QUEUE_FULL_METRIC_STRING = "idGenerator.queueFull.forPrefix.";
private static final String UNUSED_INDICES_METRIC_STRING = "idGenerator.unusedIds.forPrefix.";

// private final Meter queueFullMeter;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODE_MAINTAINABILTIY: The CircularQueue class contains multiple commented out lines of metric instrumentation code. This creates confusion about whether these metrics should be active or not. Either these metrics should be properly implemented or the commented code should be removed entirely to avoid maintainability issues.

//        this.queueFullMeter = metricRegistry.meter(QUEUE_FULL_METRIC_STRING + namespace);
//        this.unusedDataMeter = metricRegistry.meter(UNUSED_INDICES_METRIC_STRING + namespace);

// Uncomment and use these meters or remove the commented code and constants


import java.util.Optional;

public class IdPool {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODE_MAINTAINABILTIY: The IdPool class is missing JavaDocs for all methods and constructor, making it difficult to understand the purpose and usage of this class without diving into the implementation. Clear documentation would improve maintainability by helping developers understand the class's purpose and behavior at a glance.

public class IdPool {
    /** List of IDs for the specific IdPool */
    private final CircularQueue queue;

    /**
     * Creates an IdPool with the specified size and metrics registry.
     * 
     * @param size Size of the circular queue to maintain IDs
     * @param metricRegistry Metrics registry for monitoring queue behavior
     * @param namespace Namespace used for metrics reporting
     */
    public IdPool(int size, final MetricRegistry metricRegistry, final String namespace) {
        this.queue = new CircularQueue(size, metricRegistry, namespace);
    }

    /**
     * Stores a new ID in the pool.
     * 
     * @param id The ID to store
     */
    public void setId(int id) {
        queue.setId(id);
    }

    /**
     * Retrieves the next available ID from the pool.
     * 
     * @return Optional containing the next ID if available, empty Optional otherwise
     */
    public Optional<Integer> getNextId() {
        return queue.getNextId();
    }

    /**
     * Resets the pool state, clearing all stored IDs.
     */
    public void reset() {
        queue.reset();
    }
}

val idCountMap = getIdCountMap(allIdsList, partitionResolver);
int maxShardWeight = 1000;
for (WeightedPartition partition: config.getWeightedIdConfig().getPartitions()) {
val expectedIdCount = ((double) partition.getWeight() / maxShardWeight) * ((double) allIdsList.size() / (partition.getPartitionRange().getEnd()-partition.getPartitionRange().getStart()+1));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODE_MAINTAINABILTIY: The checkDistribution method in WeightedNonceGeneratorTest doesn't validate if the partition range is valid (start <= end) before performing division. If the partition range is improperly configured, this could lead to runtime errors. The method should include validation for partition ranges to ensure robustness.

protected void checkDistribution(List<Id> allIdsList, Function<String, Integer> partitionResolver, IdGeneratorConfig config) {
    val idCountMap = getIdCountMap(allIdsList, partitionResolver);
    int maxShardWeight = 1000;
    for (WeightedPartition partition: config.getWeightedIdConfig().getPartitions()) {
        int partitionRangeSize = partition.getPartitionRange().getEnd() - partition.getPartitionRange().getStart() + 1;
        
        if (partitionRangeSize <= 0) {
            Assertions.fail("Invalid partition range: start=" + partition.getPartitionRange().getStart() + 
                          ", end=" + partition.getPartitionRange().getEnd());
            return;
        }
        
        val expectedIdCount = ((double) partition.getWeight() / maxShardWeight) * 
                           ((double) allIdsList.size() / partitionRangeSize);
        int idCountForPartition = 0;
        
        for (int partitionId = partition.getPartitionRange().getStart(); 
             partitionId <= partition.getPartitionRange().getEnd(); 
             partitionId++) {
            // Rest of the method...
        }
    }
}

Copy link

DeputyDev has completed a review of your pull request for commit 33e4742.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant