Description
Describe the bug
I am using grafana-operator to deploy Grafana alert rules.
The alert rule groups are synced to the Grafana instances every 10 minutes (the default). Sometimes the alert rules are updated in a way that changes their fingerprint. This resets the alert rule state, which in turn causes a lot of noise: alerts that were already firing trigger new notifications once the pending period has elapsed.
I tried to debug the issue by checking for constant drift between the alert rules in the Grafana operator CRs and those in the Grafana instances themselves, but I cannot find anything wrong. I also checked the alert_rule_version table in the Grafana DB. The only columns that differ between the alert rule versions are:
- id (makes sense)
- parent_version (makes sense)
- version (makes sense)
- created (makes sense)
- rule_group_idx (not sure what that is, some ID for the whole rule group that changes?)
How can I debug this further? Any ideas why the alert rules are updated constantly?
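For anyone who wants to repeat the column comparison, here is a minimal sketch. It assumes two version rows have already been exported from alert_rule_version as Python dicts; the helper name changed_columns is hypothetical, and the rows are abbreviated to a few columns with values copied from the table under "Additional context".

```python
def changed_columns(older, newer):
    """Return {column: (old_value, new_value)} for every column that differs."""
    return {
        col: (older.get(col), newer.get(col))
        for col in sorted(set(older) | set(newer))
        if older.get(col) != newer.get(col)
    }


# Abbreviated copies of versions 23 and 24 of the same rule (assumed export format).
v23 = {
    "id": 664230,
    "parent_version": 22,
    "version": 23,
    "created": "2025-03-14 13:04:15",
    "interval_seconds": 600,
    "for": 600000000000,
    "rule_group_idx": 0,
    "is_paused": False,
}
v24 = {
    "id": 664238,
    "parent_version": 23,
    "version": 24,
    "created": "2025-03-14 13:04:16",
    "interval_seconds": 600,
    "for": 600000000000,
    "rule_group_idx": 1,
    "is_paused": False,
}

diff = changed_columns(v23, v24)
for column, (old, new) in diff.items():
    print(f"{column}: {old!r} -> {new!r}")
```

Apart from the bookkeeping columns (id, parent_version, version, created), only rule_group_idx shows up in the diff.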
Version
Grafana operator: v5.17.0
Grafana: v11.5.2
To Reproduce
Steps to reproduce the behavior:
- Create alert rule groups containing multiple alert rules with the Grafana operator
- Check alert rule history in Grafana to see them being updated constantly
Expected behavior
Alert rules are only updated when they actually change.
Additional context
Output of the Grafana alert_rule_version table for one specific alert rule that has this problem:
id | rule_org_id | rule_uid | rule_namespace_uid | rule_group | parent_version | restored_from | version | created | title | condition | data | interval_seconds | no_data_state | exec_err_state | for | annotations | labels | rule_group_idx | is_paused | notification_settings | record | metadata
--------+-------------+------------------------------+--------------------------------------+--------------+----------------+---------------+---------+---------------------+------------------------------+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+---------------+----------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------+----------------+-----------+---------------------------+--------+-----------------------------------------------------------------------------------------------------------------
664168 | 1 | example-dev-k8s-mem-pressure | 477b7cd3-2fee-4393-a40f-be43fc80ef4c | example-dev | 21 | 0 | 22 | 2025-03-14 12:54:15 | KubernetesNodeMemoryPressure | B | [{"refId":"A","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"prometheus-example-dev","model":{"datasource":{"type":"prometheus","uid":"prometheus-example-dev"},"editorMode":"code","expr":"kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"}","instant":true,"intervalMs":1000,"legendFormat":"__auto","maxDataPoints":43200,"range":false,"refId":"A"}},{"refId":"B","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"__expr__","model":{"conditions":[{"evaluator":{"params":[0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["C"]},"reducer":{"params":[],"type":"last"},"type":"query"}],"datasource":{"type":"__expr__","uid":"__expr__"},"expression":"A","intervalMs":1000,"maxDataPoints":43200,"refId":"B","type":"threshold"}}] | 600 | NoData | Error | 600000000000 | {"description":"Node {{ $labels.node }} has MemoryPressure condition\nVALUE = {{ $value }}\nLABELS = {{ $labels }}","summary":"Kubernetes memory pressure (node {{ $labels.node }})"} | {"project":"example","severity":"critical","stage":"dev"} | 1 | f | [{"receiver":"opsgenie"}] | | {"editor_settings":{"simplified_query_and_expressions_section":false,"simplified_notifications_section":false}}
664230 | 1 | example-dev-k8s-mem-pressure | 477b7cd3-2fee-4393-a40f-be43fc80ef4c | example-dev | 22 | 0 | 23 | 2025-03-14 13:04:15 | KubernetesNodeMemoryPressure | B | [{"refId":"A","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"prometheus-example-dev","model":{"datasource":{"type":"prometheus","uid":"prometheus-example-dev"},"editorMode":"code","expr":"kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"}","instant":true,"intervalMs":1000,"legendFormat":"__auto","maxDataPoints":43200,"range":false,"refId":"A"}},{"refId":"B","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"__expr__","model":{"conditions":[{"evaluator":{"params":[0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["C"]},"reducer":{"params":[],"type":"last"},"type":"query"}],"datasource":{"type":"__expr__","uid":"__expr__"},"expression":"A","intervalMs":1000,"maxDataPoints":43200,"refId":"B","type":"threshold"}}] | 600 | NoData | Error | 600000000000 | {"description":"Node {{ $labels.node }} has MemoryPressure condition\nVALUE = {{ $value }}\nLABELS = {{ $labels }}","summary":"Kubernetes memory pressure (node {{ $labels.node }})"} | {"project":"example","severity":"critical","stage":"dev"} | 0 | f | [{"receiver":"opsgenie"}] | | {"editor_settings":{"simplified_query_and_expressions_section":false,"simplified_notifications_section":false}}
664238 | 1 | example-dev-k8s-mem-pressure | 477b7cd3-2fee-4393-a40f-be43fc80ef4c | example-dev | 23 | 0 | 24 | 2025-03-14 13:04:16 | KubernetesNodeMemoryPressure | B | [{"refId":"A","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"prometheus-example-dev","model":{"datasource":{"type":"prometheus","uid":"prometheus-example-dev"},"editorMode":"code","expr":"kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"}","instant":true,"intervalMs":1000,"legendFormat":"__auto","maxDataPoints":43200,"range":false,"refId":"A"}},{"refId":"B","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"__expr__","model":{"conditions":[{"evaluator":{"params":[0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["C"]},"reducer":{"params":[],"type":"last"},"type":"query"}],"datasource":{"type":"__expr__","uid":"__expr__"},"expression":"A","intervalMs":1000,"maxDataPoints":43200,"refId":"B","type":"threshold"}}] | 600 | NoData | Error | 600000000000 | {"description":"Node {{ $labels.node }} has MemoryPressure condition\nVALUE = {{ $value }}\nLABELS = {{ $labels }}","summary":"Kubernetes memory pressure (node {{ $labels.node }})"} | {"project":"example","severity":"critical","stage":"dev"} | 1 | f | [{"receiver":"opsgenie"}] | | {"editor_settings":{"simplified_query_and_expressions_section":false,"simplified_notifications_section":false}}
664300 | 1 | example-dev-k8s-mem-pressure | 477b7cd3-2fee-4393-a40f-be43fc80ef4c | example-dev | 24 | 0 | 25 | 2025-03-14 13:14:16 | KubernetesNodeMemoryPressure | B | [{"refId":"A","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"prometheus-example-dev","model":{"datasource":{"type":"prometheus","uid":"prometheus-example-dev"},"editorMode":"code","expr":"kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"}","instant":true,"intervalMs":1000,"legendFormat":"__auto","maxDataPoints":43200,"range":false,"refId":"A"}},{"refId":"B","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"__expr__","model":{"conditions":[{"evaluator":{"params":[0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["C"]},"reducer":{"params":[],"type":"last"},"type":"query"}],"datasource":{"type":"__expr__","uid":"__expr__"},"expression":"A","intervalMs":1000,"maxDataPoints":43200,"refId":"B","type":"threshold"}}] | 600 | NoData | Error | 600000000000 | {"description":"Node {{ $labels.node }} has MemoryPressure condition\nVALUE = {{ $value }}\nLABELS = {{ $labels }}","summary":"Kubernetes memory pressure (node {{ $labels.node }})"} | {"project":"example","severity":"critical","stage":"dev"} | 0 | f | [{"receiver":"opsgenie"}] | | {"editor_settings":{"simplified_query_and_expressions_section":false,"simplified_notifications_section":false}}
664308 | 1 | example-dev-k8s-mem-pressure | 477b7cd3-2fee-4393-a40f-be43fc80ef4c | example-dev | 25 | 0 | 26 | 2025-03-14 13:14:17 | KubernetesNodeMemoryPressure | B | [{"refId":"A","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"prometheus-example-dev","model":{"datasource":{"type":"prometheus","uid":"prometheus-example-dev"},"editorMode":"code","expr":"kube_node_status_condition{condition=\"MemoryPressure\",status=\"true\"}","instant":true,"intervalMs":1000,"legendFormat":"__auto","maxDataPoints":43200,"range":false,"refId":"A"}},{"refId":"B","queryType":"","relativeTimeRange":{"from":600,"to":0},"datasourceUid":"__expr__","model":{"conditions":[{"evaluator":{"params":[0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["C"]},"reducer":{"params":[],"type":"last"},"type":"query"}],"datasource":{"type":"__expr__","uid":"__expr__"},"expression":"A","intervalMs":1000,"maxDataPoints":43200,"refId":"B","type":"threshold"}}] | 600 | NoData | Error | 600000000000 | {"description":"Node {{ $labels.node }} has MemoryPressure condition\nVALUE = {{ $value }}\nLABELS = {{ $labels }}","summary":"Kubernetes memory pressure (node {{ $labels.node }})"} | {"project":"example","severity":"critical","stage":"dev"} | 1 | f | [{"receiver":"opsgenie"}] | | {"editor_settings":{"simplified_query_and_expressions_section":false,"simplified_notifications_section":false}}
(5 rows)
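Looking only at version, created, and rule_group_idx in the rows above, the writes seem to arrive in pairs about one second apart, roughly every 10 minutes: the first write of a pair sets rule_group_idx to 0 and the second sets it back to 1. The following sketch groups the versions into such bursts. The values are copied from the table; the helper name group_into_bursts and the 5-second gap threshold are arbitrary assumptions for illustration.

```python
from datetime import datetime, timedelta

# (version, created, rule_group_idx) copied from the five rows above.
rows = [
    (22, "2025-03-14 12:54:15", 1),
    (23, "2025-03-14 13:04:15", 0),
    (24, "2025-03-14 13:04:16", 1),
    (25, "2025-03-14 13:14:16", 0),
    (26, "2025-03-14 13:14:17", 1),
]


def group_into_bursts(rows, max_gap=timedelta(seconds=5)):
    """Group consecutive versions whose timestamps are at most max_gap apart."""
    bursts = []
    previous = None
    for version, created, idx in rows:
        timestamp = datetime.strptime(created, "%Y-%m-%d %H:%M:%S")
        # Start a new burst on the first row or after a long pause.
        if previous is None or timestamp - previous > max_gap:
            bursts.append([])
        bursts[-1].append((version, idx))
        previous = timestamp
    return bursts


bursts = group_into_bursts(rows)
for burst in bursts:
    print(burst)
```

Version 22 ends up alone only because the table starts there; its partner write, if there is one, would be version 21, which is not shown.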