[lldb] Change lldb's breakpoint handling behavior

lldb today has two rules: When a thread stops at a BreakpointSite, we set the thread's StopReason to be "breakpoint hit" (regardless if we've actually hit the breakpoint, or if we've merely stopped *at* the breakpoint instruction/point and haven't tripped it yet). And second, when resuming a process, any thread sitting at a BreakpointSite is silently stepped over the BreakpointSite -- because we've already flagged the breakpoint hit when we stopped there originally. In this patch, I change lldb to only set a thread's stop reason to breakpoint-hit when we've actually executed the instruction/triggered the breakpoint. When we resume, we only silently step past a BreakpointSite that we've registered as hit. We preserve this state across inferior function calls that the user may do while stopped, etc. Also, when a user adds a new breakpoint at $pc while stopped, or changes $pc to be the address of a BreakpointSite, we will silently step past that breakpoint when the process resumes. This is purely a UX call, I don't think there's any person who wants to set a breakpoint at $pc and then hit it immediately on resuming. One non-intuitive UX from this change, but I'm convinced it is necessary: If you're stopped at a BreakpointSite that has not yet executed, you `stepi`, you will hit the breakpoint and the pc will not yet advance. This thread has not completed its stepi, and the thread plan is still on the stack. If you then `continue` the thread, lldb will now stop and say, "instruction step completed", one instruction past the BreakpointSite. You can continue a second time to resume execution. I discussed this with Jim, and trying to paper over this behavior will lead to more complicated scenarios behaving non-intuitively. And mostly it's the testsuite that was trying to instruction step past a breakpoint and getting thrown off -- and I changed those tests to expect the new behavior. The bugs driving this change are all from lldb dropping the real stop reason for a thread and setting it to breakpoint-hit when that was not the case. Jim hit one where we have an aarch64 watchpoint that triggers one instruction before a BreakpointSite. On this arch we are notified of the watchpoint hit after the instruction has been unrolled -- we disable the watchpoint, instruction step, re-enable the watchpoint and collect the new value. But now we're on a BreakpointSite so the watchpoint-hit stop reason is lost. Another was reported by ZequanWu in https://discourse.llvm.org/t/lldb-unable-to-break-at-start/78282 we attach to/launch a process with the pc at a BreakpointSite and misbehave. Caroline Tice mentioned it is also a problem they've had with putting a breakpoint on _dl_debug_state. The change to each Process plugin that does execution control is that 1. If we've stopped at a BreakpointSite (whether we hit it or not), we call Thread::SetThreadStoppedAtBreakpointSite(pc) to record the state at the point when the thread stopped. (so we can detect newly-added breakpoints, or when the pc is changed to an instruction that is a BreakpointSite) 2. When we have actually hit a breakpoint, and it is enabled for this thread, we call Thread::SetThreadHitBreakpointAtAddr(pc) so we know that it should be silently stepped past when we resume execution. When resuming, we silently step over a breakpoint if we've hit it, or if it is newly added (or the pc was changed to an existing BreakpointSite). The biggest set of changes is to StopInfoMachException where we translate a Mach Exception into a stop reason. The Mach exception codes differ in a few places depending on the target (unambiguously), and I didn't want to duplicate the new code for each target so I've tested what mach exceptions we get for each action on each target, and reorganized StopInfoMachException::CreateStopReasonWithMachException to document these possible values, and handle them without specializing based on the target arch. rdar://123942164
llvm · jasonmolenda · Jul 20, 2024 · Jun 21, 2024 · Jun 21, 2024 · Jun 21, 2024
commit 9b541e6a035635e26c6a24eca022de8552fa4c17
@@ -128,6 +128,11 @@ class Thread : public std::enable_shared_from_this<Thread>,
         register_backup_sp; // You need to restore the registers, of course...
     uint32_t current_inlined_depth;
     lldb::addr_t current_inlined_pc;
+    lldb::addr_t
+        hit_bp_at_addr; // Set to the address of a breakpoint that we have hit.
+    lldb::addr_t bpsite_at_stop_pc; // Set to the address of a breakpoint
+                                    // instruction that we have not yet hit, but
+                                    // will hit when we resume.
   };
 
   /// Constructor
@@ -377,6 +382,19 @@ class Thread : public std::enable_shared_from_this<Thread>,
 
   virtual void SetQueueLibdispatchQueueAddress(lldb::addr_t dispatch_queue_t) {}
 
+  /// When a thread has executed/trapped a breakpoint, set the address of that
+  /// breakpoint so we know it has been hit already, and should be silently
+  /// stepped past on resume.
+  void SetThreadHitBreakpointAtAddr(lldb::addr_t pc) { m_hit_bp_at_addr = pc; }
+
+  /// When a thread stops at a breakpoint instruction/address, but has not yet
+  /// executed/triggered it, record that so we can detect when a user adds a
+  /// breakpoint (or changes a thread to a breakpoint site) and we need to
+  /// silently step past that when resuming.
+  void SetThreadStoppedAtBreakpointSite(lldb::addr_t pc) {
+    m_bpsite_at_stop_pc = pc;
+  }
+
   /// Whether this Thread already has all the Queue information cached or not
   ///
   /// A Thread may be associated with a libdispatch work Queue at a given
@@ -1311,6 +1329,17 @@ class Thread : public std::enable_shared_from_this<Thread>,
   bool m_should_run_before_public_stop;  // If this thread has "stop others" 
                                          // private work to do, then it will
                                          // set this.
+  lldb::addr_t m_hit_bp_at_addr;    // If this thread originally stopped at a
+                                    // breakpoint instruction, AND HIT IT,
+                                    // record the address of that breakpoint.
+                                    // LLDB_INVALID_ADDRESS if this thread did
+                                    // not stop at a breakpoint insn, or did not
+                                    // hit it yet.
+  lldb::addr_t m_bpsite_at_stop_pc; // If this thread originally stopped at a
+                                    // breakpoint site, record the address of
+                                    // that breakpoint site.
+                                    // LLDB_INVALID_ADDRESS if this thread did
+                                    // not stop at a breakpoint site.
   const uint32_t m_index_id; ///< A unique 1 based index assigned to each thread
                              /// for easy UI/command line access.
   lldb::RegisterContextSP m_reg_context_sp; ///< The register context for this

@@ -488,38 +488,6 @@ const char *StopInfoMachException::GetDescription() {
   return m_description.c_str();
 }
 
-static StopInfoSP GetStopInfoForHardwareBP(Thread &thread, Target *target,
-                                           uint32_t exc_data_count,
-                                           uint64_t exc_sub_code,
-                                           uint64_t exc_sub_sub_code) {
-  // Try hardware watchpoint.
-  if (target) {
-    // The exc_sub_code indicates the data break address.
-    WatchpointResourceSP wp_rsrc_sp =
-        target->GetProcessSP()->GetWatchpointResourceList().FindByAddress(
-            (addr_t)exc_sub_code);
-    if (wp_rsrc_sp && wp_rsrc_sp->GetNumberOfConstituents() > 0) {
-      return StopInfo::CreateStopReasonWithWatchpointID(
-          thread, wp_rsrc_sp->GetConstituentAtIndex(0)->GetID());
-    }
-  }
-
-  // Try hardware breakpoint.
-  ProcessSP process_sp(thread.GetProcess());
-  if (process_sp) {
-    // The exc_sub_code indicates the data break address.
-    lldb::BreakpointSiteSP bp_sp =
-        process_sp->GetBreakpointSiteList().FindByAddress(
-            (lldb::addr_t)exc_sub_code);
-    if (bp_sp && bp_sp->IsEnabled()) {
-      return StopInfo::CreateStopReasonWithBreakpointSiteID(thread,
-                                                            bp_sp->GetID());
-    }
-  }
-
-  return nullptr;
-}
-
 #if defined(__APPLE__)
 const char *
 StopInfoMachException::MachException::Name(exception_type_t exc_type) {
@@ -633,171 +601,143 @@ StopInfoSP StopInfoMachException::CreateStopReasonWithMachException(
     }
     break;
 
+  // [exc_type, exc_code, exc_sub_code, exc_sub_sub_code]
+  //
+  // Instruction step:
+  //   [6, 1, 0]
+  //   Intel KDP [6, 3, ??]
+  //   armv7 [6, 0x102, <stop-pc>]  Same as software breakpoint!
+  //
+  // Software breakpoint:
+  //   x86 [6, 2, 0]
+  //   Intel KDP [6, 2, <bp-addr + 1>]
+  //   arm64 [6, 1, <bp-addr>]
+  //   armv7 [6, 0x102, <bp-addr>]  Same as instruction step!
+  //
+  // Hardware breakpoint:
+  //   x86 [6, 1, <bp-addr>, 0]
+  //   x86/Rosetta not implemented, see software breakpoint
+  //   arm64 [6, 1, <bp-addr>]
+  //   armv7 not implemented, see software breakpoint
+  //
+  // Hardware watchpoint:
+  //   x86 [6, 1, <accessed-addr>, 0] (both Intel hw and Rosetta)
+  //   arm64 [6, 0x102, <accessed-addr>, 0]
+  //   armv7 [6, 0x102, <accessed-addr>, 0]
+  //
+  // arm64 BRK instruction (imm arg not reflected in the ME)
+  //   [ 6, 1, <addr-of-BRK-insn>]
+  //
+  // In order of codes mach exceptions:
+  //   [6, 1, 0] - instruction step
+  //   [6, 1, <bp-addr>] - hardware breakpoint or watchpoint
+  //
+  //   [6, 2, 0] - software breakpoint
+  //   [6, 2, <bp-addr + 1>] - software breakpoint
+  //
+  //   [6, 3] - instruction step
+  //
+  //   [6, 0x102, <stop-pc>] armv7 instruction step
+  //   [6, 0x102, <bp-addr>] armv7 software breakpoint
+  //   [6, 0x102, <accessed-addr>, 0] arm64/armv7 watchpoint
   case 6: // EXC_BREAKPOINT
   {
-    bool is_actual_breakpoint = false;
-    bool is_trace_if_actual_breakpoint_missing = false;
-    switch (cpu) {
-    case llvm::Triple::x86:
-    case llvm::Triple::x86_64:
-      if (exc_code == 1) // EXC_I386_SGL
-      {
-        if (!exc_sub_code) {
-          // This looks like a plain trap.
-          // Have to check if there is a breakpoint here as well.  When you
-          // single-step onto a trap, the single step stops you not to trap.
-          // Since we also do that check below, let's just use that logic.
-          is_actual_breakpoint = true;
-          is_trace_if_actual_breakpoint_missing = true;
-        } else {
-          if (StopInfoSP stop_info =
-                  GetStopInfoForHardwareBP(thread, target, exc_data_count,
-                                           exc_sub_code, exc_sub_sub_code))
-            return stop_info;
-        }
-      } else if (exc_code == 2 || // EXC_I386_BPT
-                 exc_code == 3)   // EXC_I386_BPTFLT
-      {
-        // KDP returns EXC_I386_BPTFLT for trace breakpoints
-        if (exc_code == 3)
-          is_trace_if_actual_breakpoint_missing = true;
-
-        is_actual_breakpoint = true;
-        if (!pc_already_adjusted)
-          pc_decrement = 1;
-      }
-      break;
+    bool stopped_by_hitting_breakpoint = false;
+    bool stopped_by_completing_stepi = false;
+    bool stopped_watchpoint = false;
+    std::optional<addr_t> value;
+
+    // exc_code 1
+    if (exc_code == 1 && exc_sub_code == 0)
+      stopped_by_completing_stepi = true;
+    if (exc_code == 1 && exc_sub_code != 0) {
+      stopped_by_hitting_breakpoint = true;
+      stopped_watchpoint = true;
+      value = exc_sub_code;
+    }
 
-    case llvm::Triple::arm:
-    case llvm::Triple::thumb:
-      if (exc_code == 0x102) // EXC_ARM_DA_DEBUG
-      {
-        // LWP_TODO: We need to find the WatchpointResource that matches
-        // the address, and evaluate its Watchpoints.
-
-        // It's a watchpoint, then, if the exc_sub_code indicates a
-        // known/enabled data break address from our watchpoint list.
-        lldb::WatchpointSP wp_sp;
-        if (target)
-          wp_sp = target->GetWatchpointList().FindByAddress(
-              (lldb::addr_t)exc_sub_code);
-        if (wp_sp && wp_sp->IsEnabled()) {
-          return StopInfo::CreateStopReasonWithWatchpointID(thread,
-                                                            wp_sp->GetID());
-        } else {
-          is_actual_breakpoint = true;
-          is_trace_if_actual_breakpoint_missing = true;
-        }
-      } else if (exc_code == 1) // EXC_ARM_BREAKPOINT
-      {
-        is_actual_breakpoint = true;
-        is_trace_if_actual_breakpoint_missing = true;
-      } else if (exc_code == 0) // FIXME not EXC_ARM_BREAKPOINT but a kernel
-                                // is currently returning this so accept it
-                                // as indicating a breakpoint until the
-                                // kernel is fixed
-      {
-        is_actual_breakpoint = true;
-        is_trace_if_actual_breakpoint_missing = true;
-      }
-      break;
+    // exc_code 2
+    if (exc_code == 2 && exc_sub_code == 0)
+      stopped_by_hitting_breakpoint = true;
+    if (exc_code == 2 && exc_sub_code != 0) {
+      stopped_by_hitting_breakpoint = true;
+      // Intel KDP software breakpoint
+      if (!pc_already_adjusted)
+        pc_decrement = 1;
+    }
 
-    case llvm::Triple::aarch64_32:
-    case llvm::Triple::aarch64: {
-      // xnu describes three things with type EXC_BREAKPOINT:
-      //
-      //   exc_code 0x102 [EXC_ARM_DA_DEBUG], exc_sub_code addr-of-insn
-      //      Watchpoint access.  exc_sub_code is the address of the
-      //      instruction which trigged the watchpoint trap.
-      //      debugserver may add the watchpoint number that was triggered
-      //      in exc_sub_sub_code.
-      //
-      //   exc_code 1 [EXC_ARM_BREAKPOINT], exc_sub_code 0
-      //      Instruction step has completed.
-      //
-      //   exc_code 1 [EXC_ARM_BREAKPOINT], exc_sub_code address-of-instruction
-      //      Software breakpoint instruction executed.
-
-      if (exc_code == 1 && exc_sub_code == 0) // EXC_ARM_BREAKPOINT
-      {
-        // This is hit when we single instruction step aka MDSCR_EL1 SS bit 0
-        // is set
-        is_actual_breakpoint = true;
-        is_trace_if_actual_breakpoint_missing = true;
-        if (thread.GetTemporaryResumeState() != eStateStepping)
-          not_stepping_but_got_singlestep_exception = true;
-      }
-      if (exc_code == 0x102) // EXC_ARM_DA_DEBUG
-      {
-        // LWP_TODO: We need to find the WatchpointResource that matches
-        // the address, and evaluate its Watchpoints.
-
-        // It's a watchpoint, then, if the exc_sub_code indicates a
-        // known/enabled data break address from our watchpoint list.
-        lldb::WatchpointSP wp_sp;
-        if (target)
-          wp_sp = target->GetWatchpointList().FindByAddress(
-              (lldb::addr_t)exc_sub_code);
-        if (wp_sp && wp_sp->IsEnabled()) {
-          return StopInfo::CreateStopReasonWithWatchpointID(thread,
-                                                            wp_sp->GetID());
-        }
-        // EXC_ARM_DA_DEBUG seems to be reused for EXC_BREAKPOINT as well as
-        // EXC_BAD_ACCESS
-        if (thread.GetTemporaryResumeState() == eStateStepping)
-          return StopInfo::CreateStopReasonToTrace(thread);
+    // exc_code 3
+    if (exc_code == 3)
+      stopped_by_completing_stepi = true;
+
+    // exc_code 0x102
+    if (exc_code == 0x102 && exc_sub_code != 0) {
+      if (cpu == llvm::Triple::arm || cpu == llvm::Triple::thumb) {
+        stopped_by_hitting_breakpoint = true;
+        stopped_by_completing_stepi = true;
       }
-      // It looks like exc_sub_code has the 4 bytes of the instruction that
-      // triggered the exception, i.e. our breakpoint opcode
-      is_actual_breakpoint = exc_code == 1;
-      break;
+      stopped_watchpoint = true;
+      value = exc_sub_code;
     }
 
-    default:
-      break;
-    }
+    // Go through the reasons why we stopped, starting
+    // with the easiest to detect unambiguously.  We
+    // may have multiple possible reasons set.
 
-    if (is_actual_breakpoint) {
+    if (stopped_by_hitting_breakpoint) {
+      ProcessSP process_sp(thread.GetProcess());
       RegisterContextSP reg_ctx_sp(thread.GetRegisterContext());
-      addr_t pc = reg_ctx_sp->GetPC() - pc_decrement;
-
-      ProcessSP process_sp(thread.CalculateProcess());
-
-      lldb::BreakpointSiteSP bp_site_sp;
-      if (process_sp)
+      BreakpointSiteSP bp_site_sp;
+      addr_t pc = LLDB_INVALID_ADDRESS;
+      if (reg_ctx_sp)
+        pc = reg_ctx_sp->GetPC() - pc_decrement;
+      else if (value)
+        pc = *value;
+
+      if (value)
+        bp_site_sp = process_sp->GetBreakpointSiteList().FindByAddress(*value);
+      if (!bp_site_sp && reg_ctx_sp) {
         bp_site_sp = process_sp->GetBreakpointSiteList().FindByAddress(pc);
+      }
       if (bp_site_sp && bp_site_sp->IsEnabled()) {
-        // Update the PC if we were asked to do so, but only do so if we find
-        // a breakpoint that we know about cause this could be a trap
-        // instruction in the code
-        if (pc_decrement > 0 && adjust_pc_if_needed)
-          reg_ctx_sp->SetPC(pc);
-
-        // If the breakpoint is for this thread, then we'll report the hit,
-        // but if it is for another thread, we can just report no reason.  We
-        // don't need to worry about stepping over the breakpoint here, that
-        // will be taken care of when the thread resumes and notices that
-        // there's a breakpoint under the pc. If we have an operating system
-        // plug-in, we might have set a thread specific breakpoint using the
-        // operating system thread ID, so we can't make any assumptions about
-        // the thread ID so we must always report the breakpoint regardless
-        // of the thread.
+        // If we have an operating system plug-in, we might have set a thread
+        // specific breakpoint using the operating system thread ID, so we
+        // can't make any assumptions about the thread ID so we must always
+        // report the breakpoint regardless of the thread.
         if (bp_site_sp->ValidForThisThread(thread) ||
-            thread.GetProcess()->GetOperatingSystem() != nullptr)
+            thread.GetProcess()->GetOperatingSystem() != nullptr) {
+          // Update the PC if we were asked to do so, but only do so if we find
+          // a breakpoint that we know about cause this could be a trap
+          // instruction in the code
+          if (pc_decrement > 0 && adjust_pc_if_needed && reg_ctx_sp)
+            reg_ctx_sp->SetPC(pc);
+          thread.SetThreadHitBreakpointAtAddr(pc);
           return StopInfo::CreateStopReasonWithBreakpointSiteID(
               thread, bp_site_sp->GetID());
-        else if (is_trace_if_actual_breakpoint_missing)
-          return StopInfo::CreateStopReasonToTrace(thread);
-        else
+        } else {
+          thread.SetThreadHitBreakpointAtAddr(pc);
           return StopInfoSP();
+        }
       }
+    }
 
-      // Don't call this a trace if we weren't single stepping this thread.
-      if (is_trace_if_actual_breakpoint_missing &&
-          thread.GetTemporaryResumeState() == eStateStepping) {
-        return StopInfo::CreateStopReasonToTrace(thread);
+    if (stopped_watchpoint && value) {
+      WatchpointResourceSP wp_rsrc_sp =
+          target->GetProcessSP()->GetWatchpointResourceList().FindByAddress(
+              *value);
+      if (wp_rsrc_sp && wp_rsrc_sp->GetNumberOfConstituents() > 0) {
+        return StopInfo::CreateStopReasonWithWatchpointID(
+            thread, wp_rsrc_sp->GetConstituentAtIndex(0)->GetID());
       }
     }
+
+    if (stopped_by_completing_stepi) {
+      if (thread.GetTemporaryResumeState() != eStateStepping)
+        not_stepping_but_got_singlestep_exception = true;
+      else
+        return StopInfo::CreateStopReasonToTrace(thread);
+    }
+
   } break;
 
   case 7:  // EXC_SYSCALL