Skip to content

add an efficient labels serializer #474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
move to the separate module, rename metamethod
  • Loading branch information
ftelnov committed Nov 3, 2023
commit 961ec61d854152986826bf6d0dcc8cef6eb49213
74 changes: 3 additions & 71 deletions metrics/api.lua
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ local Gauge = require('metrics.collectors.gauge')
local Histogram = require('metrics.collectors.histogram')
local Summary = require('metrics.collectors.summary')

local serializers = require("metrics.serializers")

local registry = rawget(_G, '__metrics_registry')
if not registry then
registry = Registry.new()
Expand Down Expand Up @@ -125,76 +127,6 @@ local function set_global_labels(label_pairs)
registry:set_labels(label_pairs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, squash your commits into the single one with a name like api: support custom labels serialization

end

--- Prepares a serializer for label pairs with given keys.
---
--- `make_key`, which is used during every metric-related operation, is not very efficient itself.
--- To mitigate it, one could add his own serialization implementation.
--- It is done via passing `__metrics_make_key` callback to the label pairs table.
---
--- This function gives you ready-to-use serializer, so you don't have to create one yourself.
---
--- BEWARE! If keys of the `label_pairs` somehow change between serialization turns, it would raise error mostlikely.
--- We cover internal cases already, for example, "le" key is always added for the histograms.
---
--- @class LabelsSerializer
--- @field wrap function(label_pairs: table): table Wraps given `label_pairs` with an efficient serialization.
--- @field serialize function(label_pairs: table): string Serialize given `label_pairs` into the key.
--- Exposed so you can write your own serializers on top of it.
---
--- @param labels_keys string[] Label keys for the further use.
--- @return LabelsSerializer
local function labels_serializer(labels_keys)
-- we always add keys that are needed for metrics' internals.
local __labels_keys = { "le" }
-- used to protect label_pairs from altering with unexpected keys.
local keys_index = { le = true }

-- keep only unique labels
for _, key in ipairs(labels_keys) do
if not keys_index[key] then
table.insert(__labels_keys, key)
keys_index[key] = true
end
end
table.sort(__labels_keys)

local function serialize(label_pairs)
local result = ""
for _, label in ipairs(__labels_keys) do
local value = label_pairs[label]
if value ~= nil then
if result ~= "" then
result = result .. '\t'
end
result = result .. label .. '\t' .. value
end
end
return result
end

local pairs_metatable = {
__index = {
__metrics_make_key = function(self)
return serialize(self)
end
},
-- It protects pairs from being altered with unexpected labels.
__newindex = function(table, key, value)
if not keys_index[key] then
error(('Label "%s" is unexpected'):format(key), 2)
end
rawset(table, key, value)
end
}

return {
wrap = function(label_pairs)
return setmetatable(label_pairs, pairs_metatable)
end,
serialize = serialize
}
end

return {
registry = registry,
collectors = collectors,
Expand All @@ -210,5 +142,5 @@ return {
unregister_callback = unregister_callback,
invoke_callbacks = invoke_callbacks,
set_global_labels = set_global_labels,
labels_serializer = labels_serializer
labels_serializer = serializers.basic_labels_serializer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
labels_serializer = serializers.basic_labels_serializer
labels_serializer = serializers.basic_labels_serializer,

}
12 changes: 4 additions & 8 deletions metrics/collectors/shared.lua
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
local clock = require('clock')
local fiber = require('fiber')
local log = require('log')
local serializers = require("metrics.serializers")

local Shared = {}

Expand Down Expand Up @@ -48,15 +49,10 @@ function Shared.make_key(label_pairs)
return ""
end
-- `label_pairs` provides its own serialization scheme, it must be used instead of default one.
if label_pairs.__metrics_make_key then
return label_pairs:__metrics_make_key()
if label_pairs.__metrics_serialize then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's not a metamethod -- it's just some internal API (even though object is internal by itself). Since it is not a metamethod, I think you shouldn't use double underscore in the name: use _metrics_serialize (or simply serialize even).

return label_pairs:__metrics_serialize()
end
local parts = {}
for k, v in pairs(label_pairs) do
table.insert(parts, k .. '\t' .. v)
end
table.sort(parts)
return table.concat(parts, '\t')
return serializers.default_labels_serializer(label_pairs)
end

function Shared:remove(label_pairs)
Expand Down
86 changes: 86 additions & 0 deletions metrics/serializers.lua
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
--- Default slow algorithm, polluted with sorting.
--- It is used when nothing is known about `label_pairs`.
local function default_labels_serializer(label_pairs)
local parts = {}
for k, v in pairs(label_pairs) do
table.insert(parts, k .. '\t' .. v)
end
table.sort(parts)
return table.concat(parts, '\t')
end


--- Prepares a serializer for label pairs with given keys.
---
--- `make_key`, which is used during every metric-related operation, is not very efficient itself.
--- To mitigate it, one could add his own serialization implementation.
--- It is done via passing `__metrics_serialize` callback to the label pairs table.
---
--- This function gives you ready-to-use serializer, so you don't have to create one yourself.
---
--- BEWARE! If keys of the `label_pairs` somehow change between serialization turns, it would raise error mostlikely.
--- We cover internal cases already, for example, "le" key is always added for the histograms.
---
--- @class LabelsSerializer
--- @field wrap function(label_pairs: table): table Wraps given `label_pairs` with an efficient serialization.
--- @field serialize function(label_pairs: table): string Serialize given `label_pairs` into the key.
--- Exposed so you can write your own serializers on top of it.
---
--- @param labels_keys string[] Label keys for the further use.
--- @return LabelsSerializer
local function basic_labels_serializer(labels_keys)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming is confusing here: default_labels_serializer is an actual serializer and basic_labels_serializer is a serializer constructor/builder/factory. Please, use name like build_labels_serializer or similar.

-- we always add keys that are needed for metrics' internals.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is not internals -- it is histogram labeling approach (based on Prometheus design).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have similar label -- quantile -- for summary collectors, which are not covered here, but should be.

local __labels_keys = { "le" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also seems non-optimized: we will have a key that works only for a histogram. Maybe it's worth to have a different factories for counter/gauge, histogram and summary?

-- used to protect label_pairs from altering with unexpected keys.
local keys_index = { le = true }

-- keep only unique labels
for _, key in ipairs(labels_keys) do
if not keys_index[key] then
table.insert(__labels_keys, key)
keys_index[key] = true
end
end
table.sort(__labels_keys)

local function serialize(label_pairs)
local result = ""
for _, label in ipairs(__labels_keys) do
local value = label_pairs[label]
if value ~= nil then
if result ~= "" then
result = result .. '\t'
end
result = result .. label .. '\t' .. value
end
end
return result
end

local pairs_metatable = {
__index = {
__metrics_serialize = function(self)
return serialize(self)
end
},
-- It protects pairs from being altered with unexpected labels.
__newindex = function(table, key, value)
if not keys_index[key] then
error(('Label "%s" is unexpected'):format(key), 2)
end
rawset(table, key, value)
end
}

return {
wrap = function(label_pairs)
return setmetatable(label_pairs, pairs_metatable)
end,
serialize = serialize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
serialize = serialize
serialize = serialize,

}
end

return {
default_labels_serializer = default_labels_serializer,
basic_labels_serializer = basic_labels_serializer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
basic_labels_serializer = basic_labels_serializer
basic_labels_serializer = basic_labels_serializer,

}
2 changes: 1 addition & 1 deletion test/metrics_test.lua
Original file line number Diff line number Diff line change
Expand Up @@ -291,7 +291,7 @@ g.test_labels_serializer_consistent = function()

t.assert_equals(actual, shared.make_key(label_pairs))
t.assert_equals(actual, shared.make_key(wrapped))
t.assert_not_equals(wrapped.__metrics_make_key, nil)
t.assert_not_equals(wrapped.__metrics_serialize, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a unit test, so it's better to move it to somewhere else (maybe a new file). We also need to have an integration test with a full example:

  • build a custom serializer,
  • set it through public API,
  • observe some values,
  • collect the metrics.

It also would be nice ti write an example (at least in tests) for a custom serializer considering default metrics, especially since they are enabled by default in Tarantool 2.11.1 and newer. As a reference, you may refer to Telegraf labels setup here: https://www.tarantool.io/en/doc/latest/book/monitoring/grafana_dashboard/#prepare-a-monitoring-stack (code block starting with [[inputs.http]]).

We should also do an integration test with global labels setup (AFAIU, one should cover it in serializer too).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it seems that I have misunderstood the general approach: this PR introduces a serializer for each label pairs instead of replacing a serializer for each metrics. In that case, default metrics won't be covered. On the other hand, the question about global labels still needs to be answered.


-- trying to set unexpected label.
t.assert_error_msg_contains('Label "new_label" is unexpected', function() wrapped.new_label = "123456" end)
Expand Down