Skip to content

[pull] master from torvalds:master #1942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 17, 2025
Merged

Conversation

pull[bot]
Copy link

@pull pull bot commented Jun 17, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

xinli-intel and others added 9 commits June 9, 2025 08:50
…rn from SIGTRAP handler

Clear the software event flag in the augmented SS to prevent immediate
repeat of single step trap on return from SIGTRAP handler if the trap
flag (TF) is set without an external debugger attached.

Following is a typical single-stepping flow for a user process:

1) The user process is prepared for single-stepping by setting
   RFLAGS.TF = 1.
2) When any instruction in user space completes, a #DB is triggered.
3) The kernel handles the #DB and returns to user space, invoking the
   SIGTRAP handler with RFLAGS.TF = 0.
4) After the SIGTRAP handler finishes, the user process performs a
   sigreturn syscall, restoring the original state, including
   RFLAGS.TF = 1.
5) Goto step 2.

According to the FRED specification:

A) Bit 17 in the augmented SS is designated as the software event
   flag, which is set to 1 for FRED event delivery of SYSCALL,
   SYSENTER, or INT n.
B) If bit 17 of the augmented SS is 1 and ERETU would result in
   RFLAGS.TF = 1, a single-step trap will be pending upon completion
   of ERETU.

In step 4) above, the software event flag is set upon the sigreturn
syscall, and its corresponding ERETU would restore RFLAGS.TF = 1.
This combination causes a pending single-step trap upon completion of
ERETU.  Therefore, another #DB is triggered before any user space
instruction is executed, which leads to an infinite loop in which the
SIGTRAP handler keeps being invoked on the same user space IP.

Fixes: 14619d9 ("x86/fred: FRED entry/exit and dispatch code")
Suggested-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li (Intel) <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Cc:[email protected]
Link: https://lore.kernel.org/all/20250609084054.2083189-2-xin%40zytor.com
When FRED is enabled, if the Trap Flag (TF) is set without an external
debugger attached, it can lead to an infinite loop in the SIGTRAP
handler.  To avoid this, the software event flag in the augmented SS
must be cleared, ensuring that no single-step trap remains pending when
ERETU completes.

This test checks for that specific scenario—verifying whether the kernel
correctly prevents an infinite SIGTRAP loop in this edge case when FRED
is enabled.

The test should _always_ pass with IDT event delivery, thus no need to
disable the test even when FRED is not enabled.

Signed-off-by: Xin Li (Intel) <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Cc:[email protected]
Link: https://lore.kernel.org/all/20250609084054.2083189-3-xin%40zytor.com
Two 'static inline' TDX helper functions (sc_retry() and
sc_retry_prerr()) take function pointer arguments which refer to
assembly functions.  Normally, the compiler inlines the TDX helper,
realizes that the function pointer targets are completely static --
thus can be resolved at compile time -- and generates direct call
instructions.

But, other times (like when CONFIG_CC_OPTIMIZE_FOR_SIZE=y), the
compiler declines to inline the helpers and will instead generate
indirect call instructions.

Indirect calls to assembly functions require special annotation (for
various Control Flow Integrity mechanisms).  But TDX assembly
functions lack the special annotations and can only be called
directly.

Annotate both the helpers as '__always_inline' to prod the compiler
into maintaining the direct calls. There is no guarantee here, but
Peter has volunteered to report the compiler bug if this assumption
ever breaks[1].

Fixes: 1e66a7e ("x86/virt/tdx: Handle SEAMCALL no entropy error in common code")
Fixes: df01f5a ("x86/virt/tdx: Add SEAMCALL error printing for module initialization")
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/ [1]
Link: https://lore.kernel.org/all/20250606130737.30713-1-kai.huang%40intel.com
Collapsing pages to a leaf PMD or PUD should be done only if
X86_FEATURE_PSE is available, which is not the case when running e.g.
as a Xen PV guest.

Fixes: 41d8848 ("x86/mm/pat: restore large ROX pages after fragmentation")
Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
…is set

Currently ROX cache in execmem is enabled regardless of
STRICT_MODULE_RWX setting. This breaks an assumption that module memory
is writable when STRICT_MODULE_RWX is disabled, for instance for kernel
debuggin.

Only enable ROX cache in execmem when STRICT_MODULE_RWX is set to
restore the original behaviour of module text permissions.

Fixes: 64f6a4e ("x86: re-enable EXECMEM_ROX support")
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
The of pages with ITS thunks allocated for modules are tracked by an
array in 'struct module'.

Since this is very architecture specific data structure, move it to
'struct mod_arch_specific'.

No functional changes.

Fixes: 872df34 ("x86/its: Use dynamic thunks for indirect branches")
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
execmem_alloc() sets permissions differently depending on the kernel
configuration, CPU support for PSE and whether a page is allocated
before or after mark_rodata_ro().

Add tracking for pages allocated for ITS when patching the core kernel
and make sure the permissions for ITS pages are explicitly managed for
both kernel and module allocations.

Fixes: 872df34 ("x86/its: Use dynamic thunks for indirect branches")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Co-developed-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
The commit d6d1e3e ("mm/execmem: Unify early execmem_cache
behaviour") changed early behaviour of execemem ROX cache to allow its
usage in early x86 code that allocates text pages when
CONFIG_MITGATION_ITS is enabled.

The permission management of the pages allocated from execmem for ITS
mitigation is now completely contained in arch/x86/kernel/alternatives.c
and therefore there is no need to special case early allocations in
execmem.

This reverts commit d6d1e3e.

Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
…inux/kernel/git/tip/tip

Pull x86 fixes from Dave Hansen:
 "This is a pretty scattered set of fixes. The majority of them are
  further fixups around the recent ITS mitigations.

  The rest don't really have a coherent story:

   - Some flavors of Xen PV guests don't support large pages, but the
     set_memory.c code assumes all CPUs support them.

     Avoid problems with a quick CPU feature check.

   - The TDX code has some wrappers to help retry calls to the TDX
     module. They use function pointers to assembly functions and the
     compiler usually generates direct CALLs. But some new compilers,
     plus -Os turned them in to indirect CALLs and the assembly code was
     not annotated for indirect calls.

     Force inlining of the helper to fix it up.

   - Last, a FRED issue showed up when single-stepping. It's fine when
     using an external debugger, but was getting stuck returning from a
     SIGTRAP handler otherwise.

     Clear the FRED 'swevent' bit to ensure that forward progress is
     made"

* tag 'x86_urgent_for_6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "mm/execmem: Unify early execmem_cache behaviour"
  x86/its: explicitly manage permissions for ITS pages
  x86/its: move its_pages array to struct mod_arch_specific
  x86/Kconfig: only enable ROX cache in execmem when STRICT_MODULE_RWX is set
  x86/mm/pat: don't collapse pages without PSE set
  x86/virt/tdx: Avoid indirect calls to TDX assembly functions
  selftests/x86: Add a test to detect infinite SIGTRAP handler loop
  x86/fred/signal: Prevent immediate repeat of single step trap on return from SIGTRAP handler
@pull pull bot added the ⤵️ pull label Jun 17, 2025
@pull pull bot merged commit 9afe652 into AwesomeGitHubRepos:master Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants