[RFC] Deployment Scheme: One Keeper for Each ClickHouse Shard #1743

Open
@cangyin

Description

In a typical ClickHouse + Keeper setup, there is one 'default' Keeper cluster of 3 or 5 Keeper servers, which serves the ClickHouse DDL task queue.

Replicated MergeTree table engines store metadata, typically a large amount of it, in Keeper for data replication. For simplicity, this load can be put onto the 'default' Keeper cluster. As more replicated tables are added and the ClickHouse cluster grows, auxiliary Keeper clusters should be added to absorb the extra load, leaving the 'default' Keeper cluster to handle the DDL queue only.

The problems here are:

  • One replicated table does not care about (neither reads nor writes) the metadata of other tables.
  • One shard of a replicated table does not care about (neither reads nor writes) the metadata of other shards.

This translates into bottlenecks:

  • Keeper requests from one replicated table have to wait for in-flight requests [1] from other tables to finish.
  • Keeper requests from one shard of a replicated table have to wait for requests from other shards to finish.

So we can deploy a single standalone Keeper server for each ClickHouse shard, in addition to the 'default' Keeper cluster.
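As a rough illustration of what this could look like on the ClickHouse side, the sketch below renders an `auxiliary_zookeepers` config fragment for the replicas of one shard. The hostname scheme, the auxiliary name `shard_keeper`, and port 9181 are assumptions made for this example, not part of the proposal.

```python
import xml.etree.ElementTree as ET


def shard_keeper_config(shard_index: int, keeper_port: int = 9181) -> str:
    """Render an auxiliary_zookeepers fragment for the replicas of one shard."""
    # Hypothetical hostname scheme: one standalone Keeper per shard.
    host = f"keeper-shard-{shard_index}.example.internal"

    root = ET.Element("clickhouse")
    aux = ET.SubElement(root, "auxiliary_zookeepers")
    # "shard_keeper" is an assumed name; every shard's replicas use the same
    # name, but it resolves to that shard's own Keeper server.
    keeper = ET.SubElement(aux, "shard_keeper")
    node = ET.SubElement(keeper, "node")
    ET.SubElement(node, "host").text = host
    ET.SubElement(node, "port").text = str(keeper_port)

    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(shard_keeper_config(2))
```

With a fragment like this in place, a replicated table could point at the shard-Keeper through the auxiliary name in its Keeper path, for example `ReplicatedMergeTree('shard_keeper:/clickhouse/tables/{shard}/t', '{replica}')`, while the 'default' Keeper cluster keeps serving the DDL queue.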

For example, a ClickHouse cluster consisting of 5 shards needs 5 shard-Keeper servers plus 3 ddl-Keeper servers, 8 Keeper servers in total.
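The server count above can be checked with a small helper. The function and parameter names are made up for this sketch, and the `keepers_per_shard=3` comparison is an assumed alternative (a 3-node auxiliary cluster per shard) that the proposal itself does not spell out.

```python
def total_keeper_servers(
    num_shards: int,
    ddl_ensemble_size: int = 3,
    keepers_per_shard: int = 1,
) -> int:
    """Count Keeper servers: per-shard Keepers plus the 'default' DDL ensemble."""
    return num_shards * keepers_per_shard + ddl_ensemble_size


if __name__ == "__main__":
    # Proposed scheme: 5 standalone shard-Keepers + 3 ddl-Keepers.
    print(total_keeper_servers(5))
    # Assumed alternative for comparison: a 3-node Keeper cluster per shard.
    print(total_keeper_servers(5, keepers_per_shard=3))
```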

This comes with some benefits:

  • Shard-Keepers are standalone servers, which saves resources compared to Keeper clusters.
  • Shard-Keepers do not need a quorum to work. The example ClickHouse cluster is still partially writable even if 4 out of 5 shard-Keepers are down, whereas a 5-node Keeper cluster cannot survive such a loss: it loses quorum as soon as 3 of its nodes are down.
  • Potentially better Keeper performance, since requests from one shard no longer queue behind requests from other shards.
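The availability argument in the second point can be sketched as follows. This is a simplified model that assumes a Keeper ensemble needs a strict majority of its nodes alive, and that a shard stays writable exactly when its own shard-Keeper is up. The function names are hypothetical.

```python
def ensemble_has_quorum(size: int, down: int) -> bool:
    """A Keeper ensemble needs a strict majority of its nodes alive."""
    alive = size - down
    return alive > size // 2


def writable_shard_fraction(num_shards: int, shard_keepers_down: int) -> float:
    """Fraction of shards still writable with one standalone Keeper per shard."""
    return (num_shards - shard_keepers_down) / num_shards


if __name__ == "__main__":
    # Shared 5-node ensemble: fine with 2 nodes down, unavailable with 3 or more.
    for down in range(6):
        print(down, ensemble_has_quorum(5, down))
    # Per-shard Keepers: with 4 of 5 down, one shard remains writable.
    print(writable_shard_fraction(5, 4))
```

The trade-off in this model is that a standalone shard-Keeper has no redundancy of its own: while it is down, that shard's replicated tables cannot accept writes, even though the other shards are unaffected.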

Footnotes

  1. ClickHouse Metric ZooKeeperRequest
