-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lldb] Provide lr value in faulting frame on arm64 #138805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] Provide lr value in faulting frame on arm64 #138805
Conversation
When a frameless function faults or is interrupted asynchronously, the UnwindPlan MAY have no register location rule for the return address register (lr on arm64); the value is simply live in the lr register when it was interrupted, and the frame below this on the stack -- e.g. sigtramp on a Unix system -- has the full register context, including that register. RegisterContextUnwind::SavedLocationForRegister, when asked to find the caller's pc value, will first see if there is a pc register location. If there isn't, on a Return Address Register architecture like arm/mips/riscv, we rewrite the register request from "pc" to "RA register", and search for a location. On frame 0 (the live frame) and an interrupted frame, the UnwindPlan may have no register location rule for the RA Reg, that is valid. A frameless function that never calls another may simply keep the return address in the live register the whole way. Our instruction emulation unwind plans explicitly add a rule (see Pavel's May 2024 change llvm#91321 ), but an UnwindPlan sourced from debug_frame may not. I've got a case where this exactly happens - clang debug_frame for arm64 where there is no register location for the lr in a frameless function. There is a fault in the middle of this frameless function and we only get the lr value from the fault handler below this frame if lr has a register location of `IsSame`, in line with Pavel's 2024 change. Similar to how we see a request of the RA Reg from frame 0 after failing to find an unwind location for the pc register, the same style of special casing is needed when this is a function that was interrupted. Without this change, we can find the pc of the frame that was executing when it was interrupted, but we need $lr to find its caller, and we don't descend down to the trap handler to get that value, truncating the stack. rdar://145614545
@llvm/pr-subscribers-lldb Author: Jason Molenda (jasonmolenda) ChangesWhen a frameless function faults or is interrupted asynchronously, the UnwindPlan MAY have no register location rule for the return address register (lr on arm64); the value is simply live in the lr register when it was interrupted, and the frame below this on the stack -- e.g. sigtramp on a Unix system -- has the full register context, including that register. RegisterContextUnwind::SavedLocationForRegister, when asked to find the caller's pc value, will first see if there is a pc register location. If there isn't, on a Return Address Register architecture like arm/mips/riscv, we rewrite the register request from "pc" to "RA register", and search for a location. On frame 0 (the live frame) and an interrupted frame, the UnwindPlan may have no register location rule for the RA Reg, that is valid. A frameless function that never calls another may simply keep the return address in the live register the whole way. Our instruction emulation unwind plans explicitly add a rule (see Pavel's May 2024 change #91321 ), but an UnwindPlan sourced from debug_frame may not. I've got a case where this exactly happens - clang debug_frame for arm64 where there is no register location for the lr in a frameless function. There is a fault in the middle of this frameless function and we only get the lr value from the fault handler below this frame if lr has a register location of Similar to how we see a request of the RA Reg from frame 0 after failing to find an unwind location for the pc register, the same style of special casing is needed when this is a function that was interrupted. Without this change, we can find the pc of the frame that was executing when it was interrupted, but we need $lr to find its caller, and we don't descend down to the trap handler to get that value, truncating the stack. rdar://145614545 Full diff: https://github.com/llvm/llvm-project/pull/138805.diff 1 Files Affected:
diff --git a/lldb/source/Target/RegisterContextUnwind.cpp b/lldb/source/Target/RegisterContextUnwind.cpp
index 3ed49e12476dd..23a86bee2518b 100644
--- a/lldb/source/Target/RegisterContextUnwind.cpp
+++ b/lldb/source/Target/RegisterContextUnwind.cpp
@@ -1377,6 +1377,7 @@ RegisterContextUnwind::SavedLocationForRegister(
}
}
+ // Check if the active_row has a register location listed.
if (regnum.IsValid() &&
active_row->GetRegisterInfo(regnum.GetAsKind(unwindplan_registerkind),
unwindplan_regloc)) {
@@ -1390,11 +1391,10 @@ RegisterContextUnwind::SavedLocationForRegister(
// This is frame 0 and we're retrieving the PC and it's saved in a Return
// Address register and it hasn't been saved anywhere yet -- that is,
// it's still live in the actual register. Handle this specially.
-
if (!have_unwindplan_regloc && return_address_reg.IsValid() &&
- IsFrameZero()) {
- if (return_address_reg.GetAsKind(eRegisterKindLLDB) !=
- LLDB_INVALID_REGNUM) {
+ return_address_reg.GetAsKind(eRegisterKindLLDB) !=
+ LLDB_INVALID_REGNUM) {
+ if (IsFrameZero()) {
lldb_private::UnwindLLDB::ConcreteRegisterLocation new_regloc;
new_regloc.type = UnwindLLDB::ConcreteRegisterLocation::
eRegisterInLiveRegisterContext;
@@ -1408,6 +1408,17 @@ RegisterContextUnwind::SavedLocationForRegister(
return_address_reg.GetAsKind(eRegisterKindLLDB),
return_address_reg.GetAsKind(eRegisterKindLLDB));
return UnwindLLDB::RegisterSearchResult::eRegisterFound;
+ } else if (BehavesLikeZerothFrame()) {
+ // This function was interrupted asynchronously -- it faulted,
+ // an async interrupt, a timer fired, a debugger expression etc.
+ // The caller's pc is in the Return Address register, but the
+ // UnwindPlan for this function may have no location rule for
+ // the RA reg.
+ // This means that the caller's return address is in the RA reg
+ // when the function was interrupted--descend down one stack frame
+ // to retrieve it from the trap handler's saved context.
+ unwindplan_regloc.SetSame();
+ have_unwindplan_regloc = true;
}
}
|
While working on this bug, I started musing about how we could switch the logic for how to handle registers more into UnwindPlans, wrote a little discourse with the idea. https://discourse.llvm.org/t/unhappiness-with-the-lldb-unwinder-register-passing-up-the-stack-interrup-trap-sigtramp-frames/86058 It's not something I'll be doing in the near term, but I know I'd like to get back to this. |
Also while initially working on this issue I found I could fix it in two places in |
to be clear, when I mention a big chunk of |
I am also not thrilled with the GetAsKind method for register numbering returning LLDB_INVALID_REGNUM, the code reads so poorly in this method because of it. But returning an optional would make other code more verbose unless I carried optionals through a lot further. I haven't been annoyed enough to try doing that yet. |
This makes sense to me, but is there any way to write a test case for it? It sounds like something similar to this test could work. In fact, given that this test is disabled on darwin, I'm wondering if this patch does not fix it by any chance? |
FTR that test is skipped on darwin because _sigtramp in libc on aarch64 doesn't have any hand-supplied eh_frame right now :/ I have a test case put together, with a few functions written in assembly with hand-added cfi directives. It's probably darwin specific assembly, obviously aarch64 only, but it exercises exactly the bug I am fixing. will wrap it up into a proper test in a tiny bit (hopefully tonight) |
via the UnwindPlan. And use the trap handler flag for zeroth frame so we can properly unwind a frameless function that was interrupted while in the trap handler function.
I'm thinking I'll write an API test that instruction steps through the whole binary and backtraces at each point, making sure it can get every function between the current one & main at all points, just for fun extra stress testing. I was testing it like that by hand as I wrote the test input files. |
to trap() and break_to_debugger() for fun, and add unwind instructions for the epilogue of trap(). When stopped in `break_to_debugger`, ``` * frame #0: 0x00000001000003e8 a.out`break_to_debugger + 4 frame swiftlang#1: 0x00000001000003d8 a.out`trap + 16 frame swiftlang#2: 0x00000001000003c0 a.out`to_be_interrupted + 20 frame swiftlang#3: 0x0000000100000398 a.out`main + 32 ``` Normally we can't provide a volatile register (e.g. x0) up the stack. And we can provide a non-volatile register (e.g. x20) up the stack. I added an IsSame rule for trap() and break_to_debugger() for x0, so it CAN be passed up the stack. And I added an Undefined rule for x20 to trap() so it CAN'T be provided up the stack. If a user selects `to_be_interrupted` and does `register read`, they'll get x0 and they won't get x20. From `main`, they will not get x0 or x20 (x0 because `to_be_interrupted` doesn't have an IsSame rule).
Also fix one bug in the by-hand unwind state in the assembly file.
ways in these hand-written UnwindPlans.
this compiling on linux? I think maybe the only difference is that the symbol names are prefixed with _ on darwin?
OK finished writing the test case. It starts at main() and instruction steps through the test program, checking that it can backtrace all the way to main() (including all the frames in the middle) at each instruction. I also added tests for the non-ABI compliant fetching of x0 mid-stack and x20 being unfetchable mid-stack, with the hand-written eh_frame instructions in the assembly file. It has nothing to do with the frameless function that faults that I'm trying to fix here, but it was interesting that it works as I wanted so I'll add a test on it. I have the test marked as skipUnlessDarwin but I'm not sure there's much to keep it from working on Linux. Maybe that I had to prefix the symbol names with _ on dawin? Might be easiest to update the .c file to call either to_be_interrupted or _to_be_interrupted depending on the OS if that's it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to com up with this test case. Pavel already signed off on the implementation so I think this is good to merge.
C_SOURCES := main.c | ||
|
||
interrupt-and-trap-funcs.o: interrupt-and-trap-funcs.s | ||
$(CC) $(CFLAGS) -c -o interrupt-and-trap-funcs.o $(SRCDIR)/interrupt-and-trap-funcs.s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this pass a triple? I assume this works locally because you're on an arm64 host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ach, that reminds me I need to skip this test unless arm64.
Darwin. There's a slight syntax diff for the ADRP + ADD pair and the label name. And the function names in Darwin are prepended with an _ and not on linux. clang on darwin runs the .s file through the preprocessor, but it doesn't seem to on linux. I renamed interrupt-and-trap-funcs.s to interrupt-and-trap-funcs.c and run it through the preprocessor then assemble it in Makefile now. The linux version would probably run on other AArch64 systems like FreeBSD but I haven't checked those.
because it can run on darwin or linux aarch64.
✅ With the latest revision this PR passed the C/C++ code formatter. |
Minor syntax adaptations to the assembly file to build on an aarch64 unbuntu 24.04, test updated and will run on linux or darwin now. |
Add two more checks for non-ABI compliant register availability from the custom UnwindPlans.
Ah, lldb-server doesn't secretly migrate the pc past a builtin_debugtrap() BRK instruction on linux like debugserver does, the test will inf loop never advancing past the BRK. |
where it does not advance past this BRK instruction automatically. Also add a limit on instruction stepping, so if this does get stuck in a loop it won't run forever. I haven't tested this on a linux system yet but in by-hand running on the binary, I saw the lldb-server behavior w.r.t builtin_debugtrap and this should be necessary.
.s file -- the Test python file also hardcodes the names of the functions and I don't want to prepend _ on linux to those names.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/22097 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/17444 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/15782 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/195/builds/8813 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/197/builds/5139 Here is the relevant piece of the build log for the reference
|
I had two commits to address bots. The first was my skipIf to run the test only darwin/linux only for arm64/aarch64 acted like an or so it could run on either of those. Second, the aarch64-ubuntu bot failed when I had it my assembly file named ".c" and did For now, I changed the test to only run on Darwin. |
This test is failing on the LLDB Incremental bot (arm64), which is running an older set of tools (Xcode 15.2) and OS (macOS 14.1) and the CFI directives must not be emitted correctly by either the tools or the OS. I will need to reproduce how this is compiling on that older setup and see what the issue is. Reverting for now so the bots are not blocked. This reverts commit e897cb1.
I reverted it entirely when the lldb incremental arm64 bot failed the test. On that older Xcode/OS (Xcode 15.2, macOS 14.1) there was something wrong with the CFI directives and the backtrace out of |
Re-landing this patch with small tweaks to address CI bot failures as it was run on many different configurations. I think the test may run on aarch64 Linux systems now. When a frameless function faults or is interrupted asynchronously, the UnwindPlan MAY have no register location rule for the return address register (lr on arm64); the value is simply live in the lr register when it was interrupted, and the frame below this on the stack -- e.g. sigtramp on a Unix system -- has the full register context, including that register. RegisterContextUnwind::SavedLocationForRegister, when asked to find the caller's pc value, will first see if there is a pc register location. If there isn't, on a Return Address Register architecture like arm/mips/riscv, we rewrite the register request from "pc" to "RA register", and search for a location. On frame 0 (the live frame) and an interrupted frame, the UnwindPlan may have no register location rule for the RA Reg, that is valid. A frameless function that never calls another may simply keep the return address in the live register the whole way. Our instruction emulation unwind plans explicitly add a rule (see Pavel's May 2024 change #91321 ), but an UnwindPlan sourced from debug_frame may not. I've got a case where this exactly happens - clang debug_frame for arm64 where there is no register location for the lr in a frameless function. There is a fault in the middle of this frameless function and we only get the lr value from the fault handler below this frame if lr has a register location of `IsSame`, in line with Pavel's 2024 change. Similar to how we see a request of the RA Reg from frame 0 after failing to find an unwind location for the pc register, the same style of special casing is needed when this is a function that was interrupted. Without this change, we can find the pc of the frame that was executing when it was interrupted, but we need $lr to find its caller, and we don't descend down to the trap handler to get that value, truncating the stack. rdar://145614545
Re-landing this patch with small tweaks to address CI bot failures as it was run on many different configurations. I think the test may run on aarch64 Linux systems now. When a frameless function faults or is interrupted asynchronously, the UnwindPlan MAY have no register location rule for the return address register (lr on arm64); the value is simply live in the lr register when it was interrupted, and the frame below this on the stack -- e.g. sigtramp on a Unix system -- has the full register context, including that register. RegisterContextUnwind::SavedLocationForRegister, when asked to find the caller's pc value, will first see if there is a pc register location. If there isn't, on a Return Address Register architecture like arm/mips/riscv, we rewrite the register request from "pc" to "RA register", and search for a location. On frame 0 (the live frame) and an interrupted frame, the UnwindPlan may have no register location rule for the RA Reg, that is valid. A frameless function that never calls another may simply keep the return address in the live register the whole way. Our instruction emulation unwind plans explicitly add a rule (see Pavel's May 2024 change llvm#91321 ), but an UnwindPlan sourced from debug_frame may not. I've got a case where this exactly happens - clang debug_frame for arm64 where there is no register location for the lr in a frameless function. There is a fault in the middle of this frameless function and we only get the lr value from the fault handler below this frame if lr has a register location of `IsSame`, in line with Pavel's 2024 change. Similar to how we see a request of the RA Reg from frame 0 after failing to find an unwind location for the pc register, the same style of special casing is needed when this is a function that was interrupted. Without this change, we can find the pc of the frame that was executing when it was interrupted, but we need $lr to find its caller, and we don't descend down to the trap handler to get that value, truncating the stack. rdar://145614545 (cherry picked from commit b957cc0)
test_common is force-included into every compilation, which causes problems when we're compiling assembly code, as we were in llvm#138805. This avoids that as we can include the header only when it's needed.
I looked at why the test fails on linux. It turns out it's due to a bunch of reasons, but most of them are unrelated to the problem at hand. #139545 and #139550 fix the surrounding issues, and c290e55 is enough to fix the test itself. I needed to change the compile command because, for some reason the compiler was not finding the temporary file. I didn't look at what causes the difference because I think this is more robust and idiomatic. I also needed to remove the Interestingly, the mac debug server does not step onto the breakpoint instruction at all. It skips over it when stepping over the previous instruction (i.e. that single step operation actually steps two instructions). This is most likely also some kind of a bug.
Oh... FWIW, we're in the same situation on aarch64 linux, but here we hard code the plan into lldb in |
When a frameless function faults or is interrupted asynchronously, the UnwindPlan MAY have no register location rule for the return address register (lr on arm64); the value is simply live in the lr register when it was interrupted, and the frame below this on the stack -- e.g. sigtramp on a Unix system -- has the full register context, including that register.
RegisterContextUnwind::SavedLocationForRegister, when asked to find the caller's pc value, will first see if there is a pc register location. If there isn't, on a Return Address Register architecture like arm/mips/riscv, we rewrite the register request from "pc" to "RA register", and search for a location.
On frame 0 (the live frame) and an interrupted frame, the UnwindPlan may have no register location rule for the RA Reg, that is valid. A frameless function that never calls another may simply keep the return address in the live register the whole way. Our instruction emulation unwind plans explicitly add a rule (see Pavel's May 2024 change #91321 ), but an UnwindPlan sourced from debug_frame may not.
I've got a case where this exactly happens - clang debug_frame for arm64 where there is no register location for the lr in a frameless function. There is a fault in the middle of this frameless function and we only get the lr value from the fault handler below this frame if lr has a register location of
IsSame
, in line with Pavel's 2024 change.Similar to how we see a request of the RA Reg from frame 0 after failing to find an unwind location for the pc register, the same style of special casing is needed when this is a function that was interrupted.
Without this change, we can find the pc of the frame that was executing when it was interrupted, but we need $lr to find its caller, and we don't descend down to the trap handler to get that value, truncating the stack.
rdar://145614545