Adding List and Range based partitioning #196

Open: wants to merge 3 commits into base: main

@tingold tingold commented May 28, 2025

Basics:

This PR adds support in PgDog for list- and range-based partitioning, in addition to the existing hash method. Conceptually, the code checks the configuration to see whether the partition key's value is assigned to a given shard. If the key value is not found, the query is routed to Shard::All.
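
The routing rule described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: only Shard::All is named in this description, so the Direct variant and the route helper are hypothetical names chosen for the example.

```rust
/// Where a query should be sent. `All` is named in the PR description;
/// `Direct` is a hypothetical variant used here for illustration only.
#[derive(Debug, PartialEq)]
enum Shard {
    Direct(usize),
    All,
}

/// Route a key using any lookup that maps a key value to a shard number.
/// When the lookup finds nothing, fall back to all shards.
/// (`route` is a hypothetical helper, not a function from the PR.)
fn route<F>(key: i64, lookup: F) -> Shard
where
    F: Fn(i64) -> Option<usize>,
{
    match lookup(key) {
        Some(shard) => Shard::Direct(shard),
        None => Shard::All,
    }
}
```

Any list or range lookup can be passed in as the closure, so the fallback to all shards stays in one place.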

Summary of changes

  • Added pgdog/src/config/shards.rs, which holds most of the config structs and implementation logic.
  • Updated pgdog/src/config/mod.rs to integrate the new config into the ShardedTable struct.
  • Updated pgdog/src/frontend/sharding/values.rs to integrate into Routing and Operators.
  • Updated pgdog/src/frontend/sharding/operator.rs to integrate into Routing and Operators.
  • Updated pgdog/src/frontend/sharding/context.rs to integrate into Routing and Operators.
  • Updated pgdog/src/frontend/sharding/context_builder.rs to integrate into Routing and Operators.

Supported Datatypes

Currently only bigint is supported. Ideally, list and range sharding will also support floats, strings, and datetimes in the future.

Config changes

Three fields have been added to the ShardedTables configuration. Because sharding_method defaults to hash, previously valid configs should remain compatible.

  • sharding_method: Controls which sharding method is used. Defaults to hash; the other options are list and range.
  • shard_list_map: Maps each shard (0, 1, etc.) to a list of values. If the partitioned column's value appears in a shard's list, the query is sent to that shard. Lists may overlap, in which case the first matching shard found is used.
  • shard_range_map: Maps each shard to a range of values rather than to discrete values (as with the list).
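
As a rough sketch of the backward-compatible default for sharding_method, the setting could be resolved as below. The ShardingMethod enum and resolve_method function are hypothetical names for illustration and are not taken from the PR; the actual config parsing may differ.

```rust
use std::str::FromStr;

/// Hypothetical enum for the three values `sharding_method` accepts.
#[derive(Debug, Default, PartialEq, Clone, Copy)]
enum ShardingMethod {
    /// Hash stays the default, so configs without the field keep working.
    #[default]
    Hash,
    List,
    Range,
}

impl FromStr for ShardingMethod {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "hash" => Ok(Self::Hash),
            "list" => Ok(Self::List),
            "range" => Ok(Self::Range),
            other => Err(format!("unknown sharding_method: {other}")),
        }
    }
}

/// Resolve the optional setting; an absent value falls back to the default.
fn resolve_method(setting: Option<&str>) -> Result<ShardingMethod, String> {
    match setting {
        Some(value) => value.parse(),
        None => Ok(ShardingMethod::default()),
    }
}
```

Under this sketch, a config that never mentions sharding_method resolves to hash, which matches the compatibility claim above.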

Example Configs:

Range based sharding:

[[sharded_tables]]
database = "emb"
name = "globe24"
column = "partkey"
data_type = "bigint"
sharding_method = "range"
shard_range_map = { "0" = { start = 0, end = 1000 }, "1" = { start = 1000, end = 2000 }, "2" = { start = 2000, no_max = true } }
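
To illustrate how a key might be matched against the ranges in the example above, here is a minimal sketch. It makes assumptions that this description does not confirm: ranges are treated as half-open (start <= key < end), no_max = true is modeled as a missing upper bound, and entries are checked in ascending shard order. ShardRange, example_ranges, and find_range_shard are hypothetical names.

```rust
/// Hypothetical mirror of one `shard_range_map` entry.
struct ShardRange {
    shard: usize,
    start: i64,
    /// `None` models `no_max = true` (no upper bound).
    end: Option<i64>,
}

impl ShardRange {
    /// ASSUMPTION: half-open interval, start <= key < end.
    fn contains(&self, key: i64) -> bool {
        key >= self.start && self.end.map_or(true, |end| key < end)
    }
}

/// The three ranges from the example config.
fn example_ranges() -> Vec<ShardRange> {
    vec![
        ShardRange { shard: 0, start: 0, end: Some(1000) },
        ShardRange { shard: 1, start: 1000, end: Some(2000) },
        ShardRange { shard: 2, start: 2000, end: None },
    ]
}

/// Return the first shard whose range contains the key, if any.
fn find_range_shard(ranges: &[ShardRange], key: i64) -> Option<usize> {
    ranges.iter().find(|r| r.contains(key)).map(|r| r.shard)
}
```

Under these assumptions, key 1000 resolves to shard 1, and a negative key matches no range and would therefore be routed to all shards. If the end bound were inclusive instead, key 1000 would fall in both shard 0 and shard 1, and the result would depend on lookup order.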

List based sharding:

[[sharded_tables]]
database = "emb"
name = "globe24"
column = "partkey"
data_type = "bigint"
sharding_method = "list"

[sharded_tables.shard_list_map]
"0" = { values = [64, 79, 90, 101, 109, 112, 127, 147, 150, 151, 153, 156, 157, 179, 180, 182, 193, 194, 195, 196, 197, 198, 199, 200, 201, 203, 204, 205, 206, 207, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 229, 233, 234, 235, 240, 241, 242, 244, 245, 247, 251, 252, 295, 297, 298, 301, 302, 303, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 323, 324, 325, 326, 327, 329, 331, 332, 333, 334, 335, 336, 338, 344, 346, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 370, 376, 378, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 400, 401, 402, 403, 404, 405, 406, 407, 409, 413, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 466, 472, 474, 484, 485, 496, 1007, 1018] }
"1" = { values = [422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 438, 439, 440, 441, 442, 443, 446, 447, 462, 463, 474, 484, 485, 487, 489, 490, 491, 493, 494, 495, 496, 498, 504, 532, 543, 565, 572, 573, 579, 582, 583, 585, 588, 589, 591, 592, 593, 594, 595, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 613, 617, 619, 624, 625, 626, 627, 628, 629, 630, 636, 637, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 804, 805, 808, 816, 817, 820, 821, 823, 829, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 850, 856, 858, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 882, 888, 890, 917, 960, 961, 964, 965, 976] }
"2" = { values = [64, 68, 69, 71, 77, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 101, 256, 258, 259, 260, 261, 263, 264, 269, 272, 273, 274, 637, 694, 700, 701, 702, 703, 711, 713, 715, 716, 717, 718, 719, 721, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 778, 779, 782, 783, 794, 795, 798, 799, 800, 801, 804, 805, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 842, 864, 866, 872, 874, 875, 878, 879, 890, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 936, 937, 938, 939, 940, 941, 942, 944, 945, 946, 948, 949, 951, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 978, 984, 986, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1003, 1004, 1005, 1006, 1007, 1008, 1010, 1016, 1018] }

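The lists in this example overlap (64 appears under both shard 0 and shard 2), so the "first one found" rule matters. A minimal sketch of that lookup follows, assuming shards are checked in ascending order, which this description does not state. Only a few values from each list are reproduced, and example_lists and find_list_shard are hypothetical names.

```rust
/// Hypothetical mirror of `shard_list_map`: (shard number, values), kept in
/// ascending shard order. Only the first few values of each list in the
/// example config are reproduced here.
fn example_lists() -> Vec<(usize, Vec<i64>)> {
    vec![
        (0, vec![64, 79, 90, 101]),
        (1, vec![422, 423, 424, 425]),
        (2, vec![64, 68, 69, 71]),
    ]
}

/// Return the first shard whose list contains the key, if any.
/// ASSUMPTION: "first" means the lowest shard number.
fn find_list_shard(lists: &[(usize, Vec<i64>)], key: i64) -> Option<usize> {
    lists
        .iter()
        .find(|(_, values)| values.contains(&key))
        .map(|(shard, _)| *shard)
}
```

With this ordering, key 64 resolves to shard 0 even though shard 2 also lists it, and a key that appears in no list returns nothing, which corresponds to routing the query to all shards.
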