Skip to content

update NodeListGetRelationshipQuery to be more efficient #6387

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 1, 2025

Conversation

ajtmccarty
Copy link
Contributor

@ajtmccarty ajtmccarty commented Apr 30, 2025

IFC-1489

Improve performance of NodeListGetRelationshipQuery. here they are:

  • exclude deleted paths within the cypher query instead of returning them from the database to analyze them in Python before discarding them
  • only return the data actually required. instead of returning 3 nodes, 2 edges, and a string for each row, only return the 4 strings that we need. and exclude duplicate rows before returning
  • divide the query up into 3 subqueries for inbound/outbound/bidirectional relationships and try to exclude deleted relationships as early and quickly as possible

@github-actions github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Apr 30, 2025
Copy link

codspeed-hq bot commented Apr 30, 2025

CodSpeed Performance Report

Merging #6387 will not alter performance

Comparing ajtm-04302025-node-rel-query-update (aece5b1) with stable (1f4588d)

Summary

✅ 10 untouched benchmarks

@ajtmccarty ajtmccarty marked this pull request as ready for review April 30, 2025 20:08
@ajtmccarty ajtmccarty requested a review from a team as a code owner April 30, 2025 20:08
CALL {
WITH n
MATCH (n)<-[:IS_RELATED]-(rel:Relationship)<-[:IS_RELATED]-(peer)
WHERE ($relationship_identifiers IS NULL OR rel.name in $relationship_identifiers)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we differentiate the relationship_identifiers for each direction or does it require a larger refactoring ?
we could have
relationship_identifiers_inbound and relationship_identifiers_outbound
the idea is to have the option to only query parent and not children for example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this without too much refactoring

AND all(r IN relationships(paths_bidir) WHERE (%(filters)s))
AND n.uuid <> peer.uuid
RETURN n, rel, peer, r1, r2, "bidirectional" as direction
CALL {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to remember that we'll need to convert this one for 1.3, maybe we can open an issue and tag all impacted PRs on it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Fatih already made this issue to go over all the CALL subqueries before the 1.3 release. I'm not sure which issue it is though

@dgarros dgarros merged commit 448731f into stable May 1, 2025
35 checks passed
@dgarros dgarros deleted the ajtm-04302025-node-rel-query-update branch May 1, 2025 05:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
group/backend Issue related to the backend (API Server, Git Agent)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants