-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Improve config file robustness #6895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Improve config file robustness #6895
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves the robustness of configuration file handling by implementing a recovery mechanism when loading the config fails, and by enhancing file operations with additional architecture-specific safeguards.
- Adds recovery logic in NodeDB.cpp to attempt loading from a temporary file if the primary config load fails.
- Updates SafeFile.cpp to conditionally remove the file for non-fullAtomic writes on ARCH_NRF52.
- Refines FSCommon.cpp's renameFile function to support multiple architectures and to fallback to a copy–remove strategy.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
src/mesh/NodeDB.cpp | Implements temporary file recovery for config load failures. |
src/SafeFile.cpp | Adjusts file open and close behavior for atomic and non-atomic writes. |
src/FSCommon.cpp | Improves file renaming with additional locking and fallback handling. |
@@ -1049,6 +1049,29 @@ LoadFileResult NodeDB::loadProto(const char *filename, size_t protoSize, size_t | |||
} else { | |||
LOG_ERROR("Could not open / read %s", filename); | |||
} | |||
|
|||
if (state != LoadFileResult::LOAD_SUCCESS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider deleting the temporary file (.tmp) if the recovery decode fails, to prevent persistent attempts to load a corrupted or incomplete file in subsequent runs.
Copilot uses AI. Check for mistakes.
if (!result) { | ||
if (copyFile(pathFrom, pathTo)) { | ||
concurrency::LockGuard g(spiLock); | ||
FSCom.remove(pathFrom); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the fallback branch after a failed FSCom.rename, consider adding error handling or logging for FSCom.remove(pathFrom) in case the removal fails after a successful copy, to improve diagnosability.
FSCom.remove(pathFrom); | |
if (!FSCom.remove(pathFrom)) { | |
LOG_ERROR("Failed to remove source file %s after successful copy", pathFrom); | |
} |
Copilot uses AI. Check for mistakes.
Possibly fixes #6430.
My heltec wsl v3 running on battery forgot its config (name, hops count, probably something else). It was running on USB power backed by 12V battery, and once it got low the protection circuit kicked in and put it into a boot loop by providing then removing power every couple seconds that lasted a while.
I do not have the skill to properly debug it in all scenarios (the amount of ifdefs is frightening) so will appreciate your review.