Skip to content

fix memory leak cherry pick #9893

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ziggie1984
Copy link
Collaborator

@ziggie1984 ziggie1984 commented Jun 3, 2025

I cherry picked the following PRs from Elle's bigger PR:

cherry picked from dd5c6d1 .. 95d34be (25 commits + 3 own commits)

#9681
#9680
#9685
#9694
#9690
#9695
#9704

The fix of the mem leak is in commit: c68a19c (master branch)

Copy link
Contributor

coderabbitai bot commented Jun 3, 2025

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)
  • llm-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@saubyk saubyk added this to the v0.19.1 milestone Jun 3, 2025
@ziggie1984 ziggie1984 changed the base branch from master to v0.9.1-beta-rc1-branch June 3, 2025 18:43
@guggero guggero changed the base branch from v0.9.1-beta-rc1-branch to 0-19-1-branch June 3, 2025 18:44
@ziggie1984 ziggie1984 marked this pull request as ready for review June 3, 2025 18:46
@ziggie1984 ziggie1984 self-assigned this Jun 3, 2025
@guggero guggero self-requested a review June 3, 2025 18:48
@ziggie1984 ziggie1984 force-pushed the fix-memory-leak-cherry-pick branch from 62d5800 to 4e81d8f Compare June 3, 2025 18:50
Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending green CI 🎉

@ziggie1984 ziggie1984 force-pushed the fix-memory-leak-cherry-pick branch from 4e81d8f to 4f69df4 Compare June 3, 2025 19:36
@ziggie1984 ziggie1984 requested a review from ellemouton June 3, 2025 19:39
@guggero guggero self-requested a review June 3, 2025 19:40
@ziggie1984
Copy link
Collaborator Author

ziggie1984 commented Jun 3, 2025

We still need to think about a good release note for all those PRs ...

Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we're missing dd5c6d1 from the very first PR. I think cherry-picking a range doesn't include the first commit mentioned.

Although that commit only changes some unit tests, so it's not critical to include, I'd rather avoid needing to rebase this again.

But other than that I confirmed only the commits from the mentioned (already reviewed and merged to master) PRs and the two comment commits 93ee6e2 and 4f69df4 are added.

We still need to think about a good release note for all those PRs ...

My proposal:

- [Fixed a memory leak by threading through parent context](https://github.com/lightningnetwork/lnd/pull/9893).

@guggero
Copy link
Collaborator

guggero commented Jun 4, 2025

I verified the "check commits" CI step locally, each commit compiles just fine.

ellemouton added 14 commits June 4, 2025 10:54
We want `context.TODO()` to be high signal in the code-base. It should
signal clearly that work is required to thread parent context through to
the call-site. So to keep the signal-to-noise ratio high, we remove any
context.TODO() calls from tests since these will never need to be
replace by a parent context.
We want `context.TODO()` to be high signal in the code-base. It should
signal clearly that work is required to thread parent context through to
the call-site. So to keep the signal-to-noise ratio high, we remove any
context.TODO() calls from tests since these will never need to be
replace by a parent context.

After this commit, there is only a single context.TODO() left in the
code-base.
In preparation for starting to thread a single parent context through
LND, we update the main `server.Start` method to take a context so that
it can later pass it to any subsytem's Start method it calls. We also
pass the context to `newServer` since it makes some calls that will
eventually reach the DB (for example the graph db).
Pass the parent LND context to the gossiper, let it derive a child
context that gets cancelled on Stop. Pass the context through to any
methods that will eventually thread it through to any graph DB calls.

One `context.TODO()` is added here - this will be removed in the next
commit.

NOTE: for any internal methods that the context gets passed to, if those
methods already listen on the gossiper's `quit` channel, then then don't
need to also listen on the passed context's Done() channel because the
quit channel is closed at the same time that the context is cancelled.
And remove a context.TODO() that was added in the previous commit.
The `GossiperSyncer` makes various calls to the `ChannelGraphTimeSeries`
interface which threads through to the graph DB. So in preparation for
threading context through to all the methods on that interface, we
update the GossipSyncer accordingly by passing contexts through.

Two `context.TODO()`s are added in this commit. They will be removed in
the upcoming commits.
Here, we remove one context.TODO() by threading a context through to the
SyncManager.
With this, we move a context.TODO() out of the gossiper and into the
brontide package - this will be removed in a future PR which focuses on
threading contexts through that code.
Since the ChannelGraphBootstrapper implementation makes a call to the
graph DB.
For any method that takes a context that has a select that listens on
the systems quit channel, we should also listen on the ctx since we
should not need to worry about if this context is derived internally or
externally.
This commit cleans up the graph test code by removing unused kvdb type
parameters from the `createTextVertex` and `createLightningNode` helper
methods. We also pass in the testing parameter now so that we dont need
to check the error each time we call `createTestVertex`.
Remove the kvdb.Backend parameter from the `createChannelEdge` helper.
This is all in preparation for having the unit tests run against any DB
backend.
Remove unused kvdb.Backend param from `randEdgePolicy` and
`newEdgePolicy` test helpers.
ellemouton and others added 14 commits June 4, 2025 10:54
Replace all tests calls to the private `forEachNode` method on the
`KVStore` with the exported ForEachNode method. This is in preparation
for having the tests run against an abstract DB backend.
The `GraphSource` interface in the `autopilot` package is directly
implemented by the `graphdb.KVStore` and so we will eventually thread
contexts through to this interface. So in this commit, we start updating
the autopilot system to thread contexts through in preparation for
passing the context through to any calls made to the GraphSource.

Two context.TODOs are added here which will be addressed in follow up
commits.
Remove one context.TODO and add one more.
Continue threading context through the autopilot system and remove the
remaining context.TODOs.
Later on we will create an interface for the persisted graph data. We
want this interface to be as small and as neat as possible. In
preparation for this, we remove this unused `Wipe` method.
In preparation for creating a clean interface for the graph store, we
want to hide anything that is DB specific from the exposed methods on
the interface. Currently the `ForEachNodeChannel` and the
`FetchOtherNode` methods of the `KVStore` expose a `kvdb.RTx` parameter
which is bbolt specific. There is only one call-site of
`ForEachNodeChannel` actually makes use of the passed `kvdb.RTx`
parameter, and that is in the `establishPersistentConnections` method of
the `server` which then passes the tx parameter to `FetchOtherNode`.

So to clean-up the interface such that the `kvdb.RTx` is no longer
exposed: we instead create one new method called
`ForEachSourceNodeChannel` which can be used to replace the above
mentioned call-site. So as of this commit, all the remaining call-site
of `ForEachNodeChannel` pass in a nil param for `kvdb.RTx` - meaning we
can remove the parameter in a future commit.
Unexport the KVStore `FetchOtherNode` and `ForEachNodeChannelTx` methods
so that fewer exposed methods are leaking implementation details.
Replace all calls to bbolt specific methods on the KVStore to instead
use exported methods on the KVStore that are more db-agnostic.
Since we have not removed all call-sites that make use of this
parameter, we can remove it. This helps hide DB-specific details from
the interface we will introduce for the graph store.
We highlight why we do not use the returned cancel method of the
context guard.
@ziggie1984 ziggie1984 force-pushed the fix-memory-leak-cherry-pick branch from 4f69df4 to 7477a91 Compare June 4, 2025 08:59
@guggero guggero linked an issue Jun 4, 2025 that may be closed by this pull request
Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. The only other place where we don't cancel is context is here,

lnd/peer/brontide.go

Lines 4165 to 4166 in aec16ee

ctx, _ := p.cg.Create(context.Background())
feeRate := defaultFeePerKw.FeePerVByte()

which is fine as it's low-frequency calls, likely once in a channel lifetime.

I guess the plan is to merge this into 0-19-1-branch, then the master is rebased on top of 0-19-1-branch? I tried to rebase master on top of this PR and found a few conflicts, mostly due to release notes.

Also ran ./scripts/check-commit.sh locally and it passed, don't understand why Testing 9aab68a6f graph/db: unexport various methods that expose kvdb.RTx failed.

@guggero
Copy link
Collaborator

guggero commented Jun 4, 2025

I guess the plan is to merge this into 0-19-1-branch, then the master is rebased on top of 0-19-1-branch? I tried to rebase master on top of this PR and found a few conflicts, mostly due to release notes.

No, no need to do that. All the PRs that are in this one have already been merged to master.

don't understand why Testing 9aab68a graph/db: unexport various methods that expose kvdb.RTx failed.

This is normal for large PRs, it's running out of space (but the error sometimes isn't visible in the log).

Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 🙏

@guggero guggero merged commit 4b3fb11 into lightningnetwork:0-19-1-branch Jun 4, 2025
35 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug]: Possible Memory Leak Gossip context.AfterFunc() / v0.19.0-beta
5 participants