patching command framework `infrahub db patch plan/apply/revert` #6311

ajtmccarty · 2025-04-16T00:54:09Z

new patching related commands and framework that allow applying and reverting patches in an idempotent manner. these are the new commands

infrahub db patch plan takes in a python-style path (infrahub.patches.this_patch) and writes a "patch plan" to the given directory (default infrahub-patches). the plan is a series of files that just save the exact changes to be made as JSON
infrahub db patch apply takes in a path to a directory where a patch plan exists and runs the patch plan against the database. the patch plan is run using idempotent code that ensures no duplicate nodes/edges are created. this command also updates some of the files in the patch plan directory with the IDs of created/deleted nodes so that they can be used in a revert and so that we know what has been added/deleted in case the command fails partway through
infrahub db patch revert takes in a path to a directory where a patch plan exists and reverts the database to its pre-patch state. this code is also idempotent. this command updates the same files as the apply command, removing the previously added/deleted elements as they are un-added and un-deleted

follow-up PR(s) will have code for actual patches for removing duplicated edges and nodes

here's an example of an instance of the PatchQuery class from the next PR

backend/infrahub/core/migrations/graph/patch_0001.py

codspeed-hq · 2025-04-18T23:02:52Z

CodSpeed Performance Report

Merging #6311 will not alter performance

_{Comparing ajtm-04152025-duplication-patch (6261a39) with stable (a987255)}

Summary

✅ 10 untouched benchmarks

ajtmccarty · 2025-04-21T21:34:24Z

backend/infrahub/patch/edge_adder.py

+        for e_to_add in edges_to_add:
+            all_prop_keys |= set(e_to_add.after_props.keys())
+
+        cypher_variable_map = "{" + ",".join([f"{p}: edge_to_add.after_props.{p}" for p in all_prop_keys]) + "}"


this is necessary so that we can correctly use MERGE in the cypher query below to avoid adding duplicate edges.

This should be a comment in the code

upon further review, this workaround is no longer necessary now that the Adder components use CREATE instead of MERGE in their cypher queries. So I've deleted it from here and the PatchPlanVertexAdder

ajtmccarty · 2025-04-21T21:35:32Z

backend/infrahub/patch/edge_adder.py

+            abstract_id = result.get("abstract_id")
+            concrete_id = result.get("db_id")
+            abstract_to_concrete_id_map[abstract_id] = concrete_id
+        return abstract_to_concrete_id_map


this dictionary is to track the database IDs of edges added as part of this patch so that we know which edges to remove if the patch is reverted and can track which edges have been added already if the patch fails partway through

This comment could be part of the function docstring contains so the reader knows what/why dict[str, str] is returned

good suggestion. Added docstrings for PatchPlanEdgeAdder.execute and PatchPlanVertexAdder.execute

ajtmccarty · 2025-04-21T21:36:44Z

backend/infrahub/patch/models.py

+    edges_to_add: list[EdgeToAdd] = field(default_factory=list)
+    edges_to_update: list[EdgeToUpdate] = field(default_factory=list)
+    edges_to_delete: list[EdgeToDelete] = field(default_factory=list)
+    added_node_db_id_map: dict[str, str] = field(default_factory=dict)


this is a dictionary for tracking the database IDs of added vertices/edges in case the patch is reverted and the entities must be deleted

ajtmccarty · 2025-04-21T21:52:29Z

backend/infrahub/patch/queries/base.py

+from ..models import PatchPlan
+
+
+class PatchQuery(ABC):


this is the class that we must implement when creating a new patch. The actual work will be in the plan method that must return a PatchPlan instance to record exactly what changes to make and how to reverse them

ajtmccarty · 2025-04-21T21:55:30Z

backend/tests/db_snapshot.py

I put this class here b/c it seems like the kind of thing that might be useful for other tests

LucasG0 · 2025-04-23T12:44:55Z

backend/tests/db_snapshot.py

+    db_id: str
+    from_db_id: str
+    to_db_id: str
+    edge_type: set


Should edge_type be str here? It makes me wonder, is mypy enabled on these test files?

typing is disabled on all of the other directories under backend.tests, but not this file, so I'm not sure why mypy didn't catch it. I tried deleting all the [[tool.mypy.overrides]] entries for backend.tests paths and running mypy backend/tests/db_snapshot.py still didn't pick anything up. either way, I fixed this type hint. good catch

LucasG0 · 2025-04-23T12:49:13Z

backend/infrahub/cli/patch.py

+    await patch_runner.revert(patch_plan_directory=patch_plan_dir)
+    rprint(f"{SUCCESS_BADGE} Patch plan successfully reverted")
+
+    await db.close()


It could be nice to have a plan_and_apply_patch command to make it simpler to use?

good suggestion. I added an optional --apply flag to the infrahub db patch plan command, which seems more efficient than adding another command that combines them

LucasG0 · 2025-04-23T12:56:57Z

backend/infrahub/patch/models.py

+@dataclass
+class VertexToAdd:
+    labels: list[str]
+    after_props: dict[str, str | int | bool | None]


Maybe an alias like PropertyType could be used instead of multiple str | int | bool | None. I am wondering if float should be there too?

made this change and included float

ogenstad · 2025-04-23T12:38:46Z

backend/infrahub/cli/patch.py

+    plan_reader = PatchPlanReader()
+    return PatchRunner(
+        plan_writer=plan_writer,
+        plan_reader=plan_reader,


Not really an issue but it would feel more consistent to not assign the plan_writer and plan_reader variables before the return as we don't use them and don't assign anything else in this function.

good point. made this change

ogenstad · 2025-04-23T12:42:33Z

backend/infrahub/cli/patch.py

+    for _, cls in inspect.getmembers(patch_module, inspect.isclass):
+        if issubclass(cls, PatchQuery) and cls is not PatchQuery:
+            patch_query_class = cls
+            break


Will this always be from files that we control? De we need to raise an error if there are two classes inheriting from PatchQuery?

good point. added handling for this case and improved the error message

ogenstad · 2025-04-23T12:46:56Z

backend/infrahub/cli/patch.py

+    logging.getLogger("infrahub").setLevel(logging.WARNING)
+    logging.getLogger("neo4j").setLevel(logging.ERROR)
+    logging.getLogger("prefect").setLevel(logging.ERROR)
+    config.load_and_exit(config_file_name=config_file)


Perhaps only highlighted by this PR but we have this same setup in a number of the cli commands, might be that we should use a function for these lines or look at what we're doing in infrahubctl with CONFIG_PARAM

that sound like a good idea. I'm not going to do it right now

ogenstad · 2025-04-23T12:54:34Z

backend/infrahub/patch/edge_adder.py

+        for result in results:
+            abstract_id = result.get("abstract_id")
+            concrete_id = result.get("db_id")
+            abstract_to_concrete_id_map[abstract_id] = concrete_id


The use of .get() here will it always return a value or can abstract_id ever be None?

it should always have a value b/c the EdgeToAdd object has a required identifier field that uses a string UUID by default

ogenstad · 2025-04-23T13:01:27Z

backend/infrahub/patch/edge_deleter.py

+    async def execute(self, edges_to_delete: list[EdgeToDelete]) -> AsyncGenerator[set[str], None]:
+        for i in range(0, len(edges_to_delete), self.batch_size_limit):
+            edges_slice = edges_to_delete[i : i + self.batch_size_limit]
+            ids_to_delete = [e.db_id for e in edges_slice]


We should be able to avoid creating the extra list and just have:

ids_to_delete = [e.db_id for e in edges_to_delete[i : i + self.batch_size_limit]]

good point. made the change here and in PatchPlanVertexDeleter

LucasG0 · 2025-04-23T13:24:00Z

backend/infrahub/patch/vertex_adder.py

+            "cypher_variable_map": cypher_variable_map,
+            "id_func_name": self.db.get_id_function_name(),
+        }
+        results = await self.db.execute_query(


If this query is executed with 100 vertices to add, if for some reason it fails while only 60 vertices were created, would this function raise an error, implying these 60 created vertices ids would not later be added to the list of created vertices? If so, running this query in a transaction that would be rollback in case of failure would fix this?

very good question. thanks for reading through and really understanding what is going on here. I think that you are right and a transaction will be helpful here, so I've added it (and in the same place in PatchPlanEdgeAdder). I think there is still a possible failure point where the data written to the database will not match what we write to the file (say if the commit part of the transaction fails), but that failure space is much smaller now

If the transaction commit fails, my understanding is that an error would be raised so nothing would be written to the file?

LucasG0 · 2025-04-23T13:27:52Z

backend/infrahub/patch/runner.py

+        await self._revert_added_edges(patch_plan=patch_plan, patch_plan_directory=patch_plan_directory)
+        await self._revert_added_vertices(patch_plan=patch_plan, patch_plan_directory=patch_plan_directory)
+        vertices_to_update = []
+        for vertex_update_to_revert in patch_plan.vertices_to_update:


list comprehension could be used here, and on similar patterns below

good suggestion. I've made the change and it is more readable this way

LucasG0 · 2025-04-24T14:08:38Z

Nice one!

github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Apr 16, 2025

LucasG0 reviewed Apr 16, 2025

View reviewed changes

backend/infrahub/core/migrations/graph/patch_0001.py Outdated Show resolved Hide resolved

ajtmccarty added 6 commits April 21, 2025 13:32

WIP patch for removing dup kind-migrated nodes following a merge

eee2ad7

WIP patch framework

3c12ffa

patch commands and framework

3d18abc

add restore method

a69972a

update tests for reverting a patch

1e25251

generate cli docs

7e66f61

ajtmccarty force-pushed the ajtm-04152025-duplication-patch branch from 603e562 to 7e66f61 Compare April 21, 2025 20:32

github-actions bot added the type/documentation Improvements or additions to documentation label Apr 21, 2025

ajtmccarty added 2 commits April 21, 2025 14:14

update docs

6da34db

remove actual patch from this branch

5fb73e3

ajtmccarty changed the title ~~WIP patch for removing dup kind-migrated nodes following a merge~~ patching command framework infrahub db patch plan/apply/revert Apr 21, 2025

ajtmccarty added 3 commits April 21, 2025 14:48

consolidate file names for patch plans

87632ad

make elementId function use dynamic

b27d48c

one more docstring

f00e74f

ajtmccarty commented Apr 21, 2025

View reviewed changes

ajtmccarty marked this pull request as ready for review April 21, 2025 22:16

ajtmccarty requested review from a team as code owners April 21, 2025 22:16

ajtmccarty marked this pull request as draft April 22, 2025 00:21

ajtmccarty added 2 commits April 22, 2025 14:58

refactor to track deleted IDs in a file and add failure tests

7300e4a

delete edges first during revert

d4884a9

ajtmccarty marked this pull request as ready for review April 22, 2025 23:02

move call order around

fc01260

LucasG0 reviewed Apr 23, 2025

View reviewed changes

ogenstad approved these changes Apr 23, 2025

View reviewed changes

LucasG0 reviewed Apr 23, 2025

View reviewed changes

ajtmccarty added 4 commits April 23, 2025 08:50

small changes for PR feedback

57e0669

do vertex/edge adds in transactions

35dee9e

generate docs again

c4b3db0

fix for reverting deleted edge linked to deleted vertex

6261a39

ajtmccarty mentioned this pull request Apr 23, 2025

deduplication patches #6340

Merged

LucasG0 approved these changes Apr 24, 2025

View reviewed changes

ajtmccarty merged commit debb8a9 into stable Apr 24, 2025
35 checks passed

ajtmccarty deleted the ajtm-04152025-duplication-patch branch April 24, 2025 14:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

patching command framework `infrahub db patch plan/apply/revert` #6311

patching command framework `infrahub db patch plan/apply/revert` #6311

ajtmccarty commented Apr 16, 2025 •

edited

Loading

codspeed-hq bot commented Apr 18, 2025 •

edited

Loading

ajtmccarty Apr 21, 2025

LucasG0 Apr 23, 2025

ajtmccarty Apr 23, 2025

ajtmccarty Apr 21, 2025 •

edited

Loading

LucasG0 Apr 23, 2025

ajtmccarty Apr 23, 2025

ajtmccarty Apr 21, 2025

ajtmccarty Apr 21, 2025

ajtmccarty Apr 21, 2025

LucasG0 Apr 23, 2025

ajtmccarty Apr 23, 2025

LucasG0 Apr 23, 2025

ajtmccarty Apr 23, 2025

LucasG0 Apr 23, 2025

ajtmccarty Apr 23, 2025

ogenstad Apr 23, 2025

ajtmccarty Apr 23, 2025

ogenstad Apr 23, 2025

ajtmccarty Apr 23, 2025

ogenstad Apr 23, 2025

ajtmccarty Apr 23, 2025

ogenstad Apr 23, 2025

ajtmccarty Apr 23, 2025

ogenstad Apr 23, 2025

ajtmccarty Apr 23, 2025

LucasG0 Apr 23, 2025 •

edited

Loading

ajtmccarty Apr 23, 2025

LucasG0 Apr 24, 2025

LucasG0 Apr 23, 2025 •

edited

Loading

ajtmccarty Apr 23, 2025

LucasG0 commented Apr 24, 2025

patching command framework infrahub db patch plan/apply/revert #6311

patching command framework infrahub db patch plan/apply/revert #6311

Conversation

ajtmccarty commented Apr 16, 2025 • edited Loading

codspeed-hq bot commented Apr 18, 2025 • edited Loading

Merging #6311 will not alter performance

Summary

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ajtmccarty Apr 21, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

LucasG0 Apr 23, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

LucasG0 Apr 23, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

LucasG0 commented Apr 24, 2025

patching command framework `infrahub db patch plan/apply/revert` #6311

patching command framework `infrahub db patch plan/apply/revert` #6311

ajtmccarty commented Apr 16, 2025 •

edited

Loading

codspeed-hq bot commented Apr 18, 2025 •

edited

Loading

ajtmccarty Apr 21, 2025 •

edited

Loading

LucasG0 Apr 23, 2025 •

edited

Loading

LucasG0 Apr 23, 2025 •

edited

Loading