-
Notifications
You must be signed in to change notification settings - Fork 2.2k
fix memory leak cherry pick #9893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix memory leak cherry pick #9893
Conversation
Important Review skippedAuto reviews are limited to specific labels. 🏷️ Labels to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
62d5800
to
4e81d8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending green CI 🎉
4e81d8f
to
4f69df4
Compare
We still need to think about a good release note for all those PRs ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we're missing dd5c6d1 from the very first PR. I think cherry-picking a range doesn't include the first commit mentioned.
Although that commit only changes some unit tests, so it's not critical to include, I'd rather avoid needing to rebase this again.
But other than that I confirmed only the commits from the mentioned (already reviewed and merged to master) PRs and the two comment commits 93ee6e2 and 4f69df4 are added.
We still need to think about a good release note for all those PRs ...
My proposal:
- [Fixed a memory leak by threading through parent context](https://github.com/lightningnetwork/lnd/pull/9893).
I verified the "check commits" CI step locally, each commit compiles just fine. |
We want `context.TODO()` to be high signal in the code-base. It should signal clearly that work is required to thread parent context through to the call-site. So to keep the signal-to-noise ratio high, we remove any context.TODO() calls from tests since these will never need to be replace by a parent context.
We want `context.TODO()` to be high signal in the code-base. It should signal clearly that work is required to thread parent context through to the call-site. So to keep the signal-to-noise ratio high, we remove any context.TODO() calls from tests since these will never need to be replace by a parent context. After this commit, there is only a single context.TODO() left in the code-base.
In preparation for starting to thread a single parent context through LND, we update the main `server.Start` method to take a context so that it can later pass it to any subsytem's Start method it calls. We also pass the context to `newServer` since it makes some calls that will eventually reach the DB (for example the graph db).
Pass the parent LND context to the gossiper, let it derive a child context that gets cancelled on Stop. Pass the context through to any methods that will eventually thread it through to any graph DB calls. One `context.TODO()` is added here - this will be removed in the next commit. NOTE: for any internal methods that the context gets passed to, if those methods already listen on the gossiper's `quit` channel, then then don't need to also listen on the passed context's Done() channel because the quit channel is closed at the same time that the context is cancelled.
And remove a context.TODO() that was added in the previous commit.
The `GossiperSyncer` makes various calls to the `ChannelGraphTimeSeries` interface which threads through to the graph DB. So in preparation for threading context through to all the methods on that interface, we update the GossipSyncer accordingly by passing contexts through. Two `context.TODO()`s are added in this commit. They will be removed in the upcoming commits.
Here, we remove one context.TODO() by threading a context through to the SyncManager.
With this, we move a context.TODO() out of the gossiper and into the brontide package - this will be removed in a future PR which focuses on threading contexts through that code.
Since the ChannelGraphBootstrapper implementation makes a call to the graph DB.
For any method that takes a context that has a select that listens on the systems quit channel, we should also listen on the ctx since we should not need to worry about if this context is derived internally or externally.
This commit cleans up the graph test code by removing unused kvdb type parameters from the `createTextVertex` and `createLightningNode` helper methods. We also pass in the testing parameter now so that we dont need to check the error each time we call `createTestVertex`.
Remove the kvdb.Backend parameter from the `createChannelEdge` helper. This is all in preparation for having the unit tests run against any DB backend.
Remove unused kvdb.Backend param from `randEdgePolicy` and `newEdgePolicy` test helpers.
Replace all tests calls to the private `forEachNode` method on the `KVStore` with the exported ForEachNode method. This is in preparation for having the tests run against an abstract DB backend.
The `GraphSource` interface in the `autopilot` package is directly implemented by the `graphdb.KVStore` and so we will eventually thread contexts through to this interface. So in this commit, we start updating the autopilot system to thread contexts through in preparation for passing the context through to any calls made to the GraphSource. Two context.TODOs are added here which will be addressed in follow up commits.
Remove one context.TODO and add one more.
Continue threading context through the autopilot system and remove the remaining context.TODOs.
Later on we will create an interface for the persisted graph data. We want this interface to be as small and as neat as possible. In preparation for this, we remove this unused `Wipe` method.
In preparation for creating a clean interface for the graph store, we want to hide anything that is DB specific from the exposed methods on the interface. Currently the `ForEachNodeChannel` and the `FetchOtherNode` methods of the `KVStore` expose a `kvdb.RTx` parameter which is bbolt specific. There is only one call-site of `ForEachNodeChannel` actually makes use of the passed `kvdb.RTx` parameter, and that is in the `establishPersistentConnections` method of the `server` which then passes the tx parameter to `FetchOtherNode`. So to clean-up the interface such that the `kvdb.RTx` is no longer exposed: we instead create one new method called `ForEachSourceNodeChannel` which can be used to replace the above mentioned call-site. So as of this commit, all the remaining call-site of `ForEachNodeChannel` pass in a nil param for `kvdb.RTx` - meaning we can remove the parameter in a future commit.
Unexport the KVStore `FetchOtherNode` and `ForEachNodeChannelTx` methods so that fewer exposed methods are leaking implementation details.
Replace all calls to bbolt specific methods on the KVStore to instead use exported methods on the KVStore that are more db-agnostic.
Since we have not removed all call-sites that make use of this parameter, we can remove it. This helps hide DB-specific details from the interface we will introduce for the graph store.
We highlight why we do not use the returned cancel method of the context guard.
4f69df4
to
7477a91
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. The only other place where we don't cancel is context is here,
Lines 4165 to 4166 in aec16ee
ctx, _ := p.cg.Create(context.Background()) | |
feeRate := defaultFeePerKw.FeePerVByte() |
which is fine as it's low-frequency calls, likely once in a channel lifetime.
I guess the plan is to merge this into 0-19-1-branch
, then the master is rebased on top of 0-19-1-branch
? I tried to rebase master on top of this PR and found a few conflicts, mostly due to release notes.
Also ran ./scripts/check-commit.sh
locally and it passed, don't understand why Testing 9aab68a6f graph/db: unexport various methods that expose kvdb.RTx
failed.
No, no need to do that. All the PRs that are in this one have already been merged to master.
This is normal for large PRs, it's running out of space (but the error sometimes isn't visible in the log). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 🙏
4b3fb11
into
lightningnetwork:0-19-1-branch
I cherry picked the following PRs from Elle's bigger PR:
cherry picked from dd5c6d1 .. 95d34be (25 commits + 3 own commits)
#9681
#9680
#9685
#9694
#9690
#9695
#9704
The fix of the mem leak is in commit: c68a19c (master branch)