
[Deepin-Kernel-SIG] [linux 6.6-y] Fix a crash that may occur after suspend when HAOC is enabled on the ARM64 architecture #878


Merged: opsiff merged 1 commit into deepin-community:linux-6.6.y on Jun 16, 2025


@amjac27 (Contributor) commented Jun 13, 2025

HAOC: Fix conflict between IEE and KPTI while modifying ASID

community inclusion
category: bugfix


Fix 2 bugs:

Bug 1: Crash upon resume from suspend.
This crash was caused by a missing ISB (Instruction Synchronization Barrier) after TTBR0 was written through the IEE SIP interface. Without the ISB, the new TTBR0 value was not guaranteed to take effect immediately, so subsequent code that accessed addresses in the TTBR0 range could fail. The fix adds an ISB instruction to the IEE SIP gate.
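
The effect of the missing barrier can be illustrated with a toy model in plain C. This is not the kernel's code: the `toy_cpu` struct and the `toy_*` functions are hypothetical names invented for this sketch. It models the worst case the architecture permits, where a system-register write is only guaranteed to be visible to later instructions after a context synchronization event such as ISB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model, not kernel code. */
struct toy_cpu {
    uint64_t ttbr0_written; /* last value written to TTBR0 */
    uint64_t ttbr0_in_use;  /* value translation is guaranteed to use */
};

/* Models "msr ttbr0_el1, x0": the write is issued but not yet synchronized. */
static inline void toy_write_ttbr0(struct toy_cpu *cpu, uint64_t val)
{
    assert(cpu);
    cpu->ttbr0_written = val;
}

/* Models "isb": earlier system-register writes become visible. */
static inline void toy_isb(struct toy_cpu *cpu)
{
    assert(cpu);
    cpu->ttbr0_in_use = cpu->ttbr0_written;
}

/* Table base that an access right now is guaranteed to be translated with. */
static inline uint64_t toy_walk_base(const struct toy_cpu *cpu)
{
    assert(cpu);
    return cpu->ttbr0_in_use;
}
```

In this model, calling `toy_write_ttbr0()` and then `toy_walk_base()` still yields the old base. Only after `toy_isb()` does the new base show up, which is the window the added ISB closes.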

Bug 2: IEE ASID usage incompatible with KPTI.
When Kernel Page Table Isolation (KPTI) was enabled together with HAOC, incompatible ASID handling caused conflicts in the Translation Lookaside Buffer (TLB), which led to hangs during initialization. To resolve this, IEE's modifications to the original ARM kernel ASID handling logic were reverted. The IEE ASID is now stored in TTBR0 instead of TTBR1, while the kernel continues to use TTBR1 to store user ASIDs.
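
A minimal sketch of the register layout this relies on, assuming the standard AArch64 encoding: the ASID occupies bits [63:48] of TTBR0_EL1 and TTBR1_EL1, and TCR_EL1.A1 (bit 22) selects which of the two registers supplies the current ASID. The helper names and the example ASID values are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TTBR_ASID_SHIFT 48                                   /* ASID is TTBRx_EL1[63:48] */
#define TTBR_ASID_MASK  (UINT64_C(0xffff) << TTBR_ASID_SHIFT)
#define TCR_A1          (UINT64_C(1) << 22)                  /* TCR_EL1.A1 */

/* Combine a table base address with an ASID (hypothetical helper). */
static inline uint64_t ttbr_with_asid(uint64_t base, uint16_t asid)
{
    assert((base & TTBR_ASID_MASK) == 0); /* base must not carry an ASID yet */
    return base | ((uint64_t)asid << TTBR_ASID_SHIFT);
}

/* A1 = 0: the ASID comes from TTBR0. A1 = 1: the ASID comes from TTBR1. */
static inline uint16_t active_asid(uint64_t tcr, uint64_t ttbr0, uint64_t ttbr1)
{
    uint64_t src = (tcr & TCR_A1) ? ttbr1 : ttbr0;
    return (uint16_t)(src >> TTBR_ASID_SHIFT);
}
```

With the IEE ASID in TTBR0 and the user ASID in TTBR1, toggling only A1 switches between the two without rewriting either register, so the kernel's own TTBR1 ASID logic can stay as it was.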

Fixes: 6d2d4fa ("HAOC: Add support for AArch64 Isolated Execution Environment(IEE).")

Summary by Sourcery

Fix the suspend-resume crash under ARM64 HAOC by adding an ISB after TTBR0 updates. Restore kernel ASID handling to avoid TLB hangs with KPTI by moving the IEE ASID into TTBR0 and reverting the conflicting logic. Refactor the IEE context-switch and initialization code for consistency.

Bug Fixes:

  • Add an ISB in the IEE SIP gate after writing TTBR0 so the new value takes effect before subsequent instructions run, eliminating resume-from-suspend crashes.
  • Revert IEE’s kernel ASID modifications and shift IEE ASID storage into TTBR0 to resolve TLB conflicts and initialization hangs when KPTI is enabled.

Enhancements:

  • Refactor IEE SI handler and context-switch paths to consistently pass and install both TTBR1 and TTBR0 ASID values, including a pre-init branch.
  • Introduce a TTBR0 consistency check under CONFIG_IEE_PTRP and remove legacy IEE initialization data setup functions.


sourcery-ai bot commented Jun 13, 2025

Reviewer's Guide

This PR adds synchronization barriers and restructures ASID management: the IEE ASID moves into TTBR0 and the kernel's use of TTBR1 is restored. It also refactors the IEE SIP handler to handle new context-switch flags, updates the kernel MM path to segregate user and IEE ASIDs, and cleans up legacy initialization code. Together these changes prevent resume-from-suspend crashes and eliminate conflicts with KPTI.

Sequence Diagram for Updated IEE Context Switch

sequenceDiagram
    participant Kernel as cpu_do_switch_mm
    participant IEE_SIP as iee_rwx_gate
    participant IEE_SI as iee_si_handler
    participant CPU_Regs as CPU Registers

    Kernel->>IEE_SIP: Call iee_rwx_gate(IEE_SI_CONTEXT_SWITCH, ttbr1_with_user_ASID, ttbr0_base_pgd)
    IEE_SIP->>CPU_Regs: Modify TCR_EL1.A1 = 0 (selects TTBR0 for ASID lookup)
    IEE_SIP->>IEE_SI: Call iee_si_handler(IEE_SI_CONTEXT_SWITCH, ttbr1_with_user_ASID, ttbr0_base_pgd)
    IEE_SI->>CPU_Regs: Write ttbr1_with_user_ASID to TTBR1_EL1
    IEE_SI->>CPU_Regs: Write (ttbr0_base_pgd | IEE_ASID) to TTBR0_EL1
    IEE_SI-->>IEE_SIP: Return
    IEE_SIP->>IEE_SIP: Execute ISB instruction
    IEE_SIP->>CPU_Regs: Modify TCR_EL1.A1 = 1 (selects TTBR1 for user ASID lookup)
    IEE_SIP-->>Kernel: Return
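
The context-switch sequence in the diagram above can be walked through with a plain C simulation. This is only a sketch of the order of operations as the diagram states them. The real gate is assembly operating on system registers; here the registers are ordinary struct fields, and `demo_regs`, the `demo_*` functions and `DEMO_IEE_ASID` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TCR_A1        (UINT64_C(1) << 22) /* TCR_EL1.A1 */
#define ASID_SHIFT    48
#define DEMO_IEE_ASID UINT64_C(0xfe)      /* hypothetical reserved ASID value */

struct demo_regs {
    uint64_t tcr_el1, ttbr0_el1, ttbr1_el1;
    unsigned int isb_count;               /* counts simulated ISBs */
};

/* Handler step: install the user ASID in TTBR1 and the IEE ASID in TTBR0. */
static inline void demo_si_context_switch(struct demo_regs *r,
                                          uint64_t ttbr1_user, uint64_t ttbr0_pgd)
{
    r->ttbr1_el1 = ttbr1_user;
    r->ttbr0_el1 = ttbr0_pgd | (DEMO_IEE_ASID << ASID_SHIFT);
}

/* Gate step: clear A1, run the handler, ISB, then set A1 again. */
static inline void demo_gate_context_switch(struct demo_regs *r,
                                            uint64_t ttbr1_user, uint64_t ttbr0_pgd)
{
    assert((ttbr0_pgd >> ASID_SHIFT) == 0); /* the handler adds the ASID itself */
    r->tcr_el1 &= ~TCR_A1;                  /* enter: ASID taken from TTBR0 */
    demo_si_context_switch(r, ttbr1_user, ttbr0_pgd);
    r->isb_count++;                         /* the ISB added by this fix */
    r->tcr_el1 |= TCR_A1;                   /* exit: ASID taken from TTBR1 */
}
```

After a call, TTBR1 carries the user ASID, TTBR0 carries the page-table base plus the IEE ASID, and A1 is set again so user ASIDs are looked up through TTBR1.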

Sequence Diagram for IEE SIP Call with ISB Synchronization

sequenceDiagram
    participant Caller as Kernel Code
    participant IEE_SIP as iee_rwx_gate
    participant IEE_SI as iee_si_handler
    participant CPU_Regs as CPU Registers

    Caller->>IEE_SIP: Call iee_rwx_gate(flag, args...)
    IEE_SIP->>CPU_Regs: Modify TCR_EL1.A1 = 0 (Enter IEE Mode)
    IEE_SIP->>IEE_SI: Call iee_si_handler(flag, args...)
    IEE_SI->>CPU_Regs: Perform register write (e.g., TTBR0_EL1)
    IEE_SI-->>IEE_SIP: Return from handler
    IEE_SIP->>IEE_SIP: Execute ISB instruction (Synchronization Fix)
    IEE_SIP->>CPU_Regs: Modify TCR_EL1.A1 = 1 (Exit IEE Mode)
    IEE_SIP-->>Caller: Return from gate

Updated Class Diagram for IEE ASID Kernel Components

classDiagram
    direction TB

    class IEE_SI_Flags {
      <<enumeration>>
      +IEE_SI_SET_SCTLR
      +IEE_SI_SET_VBAR
      +IEE_SI_SET_TTBR0
      +IEE_SI_CONTEXT_SWITCH
      +IEE_SI_CONTEXT_SWITCH_PRE_INIT
    }

    class `iee_si_handler()` {
      +unsigned long iee_si_handler(int flag, ...)
      +void iee_si_check_ttbr0() 
    }
    `iee_si_handler()` ..> IEE_SI_Flags : uses

    class `cpu_do_switch_mm()` {
      +void cpu_do_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm)
    }

    class `iee_setup_asid()` {
      +void iee_setup_asid()
    }

    class `iee_rwx_gate()` {
      <<Assembly>>
      +void iee_rwx_gate(int flag, ...)
    }

    class `asids_update_limit()` {
        +int asids_update_limit()
    }

    class `asids_init()` {
        +int asids_init()
    }

    `cpu_do_switch_mm()` --> `iee_rwx_gate()` : calls
    `iee_rwx_gate()` --> `iee_si_handler()` : calls

File-Level Changes

Change: Ensure immediate effect of IEE TTBR modifications and correct TCR bit toggling via added barriers and assembly updates
Details:
  • Insert ISB after the IEE SI handler call to enforce TTBR updates
  • Replace OR TCR_A1 with BIC/OR sequences in IEE entry/exit gates
  • Remove TCR_A1 from the IEE TCR mask during MMU setup
Files:
  • arch/arm64/kernel/haoc/iee/iee-si-gate.S
  • arch/arm64/kernel/haoc/iee/iee-gate.S
  • arch/arm64/kernel/haoc/iee/iee-mmu.c

Change: Overhaul IEE SI handler to support new ASID placement and fix resume crashes
Details:
  • Add iee_si_check_ttbr0 to validate TTBR0 against current pgd
  • Implement SET_TTBR0 handling to load reserved IEE ASID and write TTBR0
  • Refactor CONTEXT_SWITCH and introduce CONTEXT_SWITCH_PRE_INIT with separate TTBR1/TTBR0 writes
Files:
  • arch/arm64/kernel/haoc/iee/iee-si.c
  • arch/arm64/include/asm/haoc/iee-si.h

Change: Revise kernel MM context switch to segregate user and IEE ASIDs across TTBR1/TTBR0
Details:
  • Mask and set user ASID in TTBR1, reserve TTBR0 for IEE ASID
  • Invoke IEE_SI gates with both TTBR1 and TTBR0 arguments
  • Enforce direct sysreg writes and ISB in pre- and post-init paths
Files:
  • arch/arm64/mm/context.c

Change: Adjust ASID mapping logic and macros for compatible even-numbered IEE ASID
Details:
  • Use IEE_ASID & ~ASID_BIT for asid_map and pinned_asid_map bit operations
  • Apply IEE ASID OR when writing reserved TTBR0 in cpu_set_reserved and cpu_install
  • Update IEE_ASID macro definition to reflect TTBR0 placement
Files:
  • arch/arm64/include/asm/mmu_context.h
  • arch/arm64/include/asm/haoc/iee-asm.h
  • arch/arm64/mm/context.c

Change: Simplify IEE initialization by removing legacy ASID relocation and redundant setup
Details:
  • Streamline iee_setup_asid to only write the IEE ASID to TTBR0 and issue ISB
  • Remove legacy context data setup and TCR_A1 modifications
  • Omit per-page init data copying logic
Files:
  • arch/arm64/kernel/haoc/iee/iee-init.c


@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign opsiff for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deepin-ci-robot

Hi @amjac27. Thanks for your PR.

I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


@sourcery-ai sourcery-ai bot left a comment


Hey @amjac27 - I've reviewed your changes and they look great!



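@opsiff changed the title from "Fix a crash that may occur after suspend when HAOC is enabled on the ARM64 architecture" to "[Deepin-Kernel-SIG] [linux 6.6-y] Fix a crash that may occur after suspend when HAOC is enabled on the ARM64 architecture" on Jun 13, 2025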
community inclusion
category: bugfix

--------------------------------

Fix 2 bugs:

Bug 1: Crash upon resume from suspend.
This crash was caused by a missing ISB (Instruction Synchronization Barrier)
after TTBR0 was written through the IEE SIP interface. Without the ISB, the new
TTBR0 value was not guaranteed to take effect immediately, so subsequent code
that accessed addresses in the TTBR0 range could fail. The fix adds an ISB
instruction to the IEE SIP gate.

Bug 2: IEE ASID usage incompatible with KPTI.
When Kernel Page Table Isolation (KPTI) was enabled together with HAOC,
incompatible ASID handling caused conflicts in the Translation Lookaside Buffer
(TLB), which led to hangs during initialization. To resolve this, IEE's
modifications to the original ARM kernel ASID handling logic were reverted. The
IEE ASID is now stored in TTBR0 instead of TTBR1, while the kernel continues to
use TTBR1 to store user ASIDs.

Fixes: 6d2d4fa ("HAOC: Add support for AArch64 Isolated Execution Environment(IEE).")
Signed-off-by: Lyu Jinglin <[email protected]>
Signed-off-by: Liu Zhehui <[email protected]>
@Avenger-285714 (Collaborator)

/ok-to-test


@Copilot Copilot AI left a comment


Pull Request Overview

This PR fixes two ARM64 HAOC issues: a suspend-resume crash due to missing ISB after TTBR0 updates, and a KPTI conflict by reverting ASID handling in TTBR1 and moving the IEE ASID into TTBR0. It refactors context-switch paths, updates SI handler logic, and cleans up legacy initialization code.

  • Add ISB in IEE SIP gate and adjust TCR.A1 toggling for proper barrier and isolation.
  • Move IEE ASID storage to TTBR0 and revert conflicting kernel ASID logic in TTBR1.
  • Refactor SI handler (iee-si.c), MMU init (iee-mmu.c), and init routines to streamline ASID setup and remove legacy data paths.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Summary per file (file: description):

  • arch/arm64/mm/context.c: Refactored ASID assignment in cpu_do_switch_mm for IEE/KPTI
  • arch/arm64/kernel/haoc/iee/iee-si.c: Rewrote SI handler, added TTBR0 consistency check and PTRP path
  • arch/arm64/kernel/haoc/iee/iee-si-gate.S: Added isb after gate entry and flipped TCR.A1 setup/teardown
  • arch/arm64/kernel/haoc/iee/iee-mmu.c: Updated TCR mask operations for IEE TCR initialization
  • arch/arm64/kernel/haoc/iee/iee-init.c: Simplified ASID setup, removed legacy init-data
  • arch/arm64/kernel/haoc/iee/iee-gate.S: Adjusted TCR.A1 bit toggling in protected RW gate
  • arch/arm64/include/asm/mmu_context.h: Applied IEE ASID in TTBR helpers under CONFIG_IEE
  • arch/arm64/include/asm/haoc/iee-si.h: Added IEE_SI_CONTEXT_SWITCH_PRE_INIT enum flag
  • arch/arm64/include/asm/haoc/iee-asm.h: Updated comment and moved IEE_ASID definition to TTBR0 usage

Comments suppressed due to low confidence (3)

arch/arm64/kernel/haoc/iee/iee-init.c:45

  • The original implementation cleared TCR.A1 (write_sysreg(tcr_el1 & ~TCR_A1)) after loading the IEE ASID. That write was removed here, so TCR.A1 may remain set and prevent the TTBR0 ASID change from taking effect. Restore the TCR.A1-clear write to ensure the ASID switch happens correctly.
write_sysreg(ttbr0, ttbr0_el1);

arch/arm64/kernel/haoc/iee/iee-si-gate.S:44

  • [nitpick] The comment above still describes setting TCR_A1 with OR, but this line clears it. Please update the comment to reflect that TCR_A1 is being cleared here to avoid confusion.
bic x12, x12, #TCR_A1

arch/arm64/kernel/haoc/iee/iee-gate.S:53

  • [nitpick] The preceding comment indicates OR-ing TCR_A1, but this line actually clears it. Consider updating the comment block to match the new behavior and maintain clarity.
bic x12, x12, #TCR_A1

@@ -77,6 +102,11 @@ unsigned long __iee_si_code iee_si_handler(int flag, ...)
return 0;
}

static inline void iee_si_setup_data(void)

Copilot AI Jun 16, 2025


The function iee_si_setup_data initializes iee_si_reserved_pg_dir but is never called. This means TTBR0 consistency checks will use an uninitialized value and always trigger errors. Consider invoking it early (for example in iee_si_init) before any SI operations.


@opsiff (Member) commented Jun 16, 2025

Tests passed.

@opsiff opsiff merged commit dc647ee into deepin-community:linux-6.6.y Jun 16, 2025
7 of 8 checks passed

@dongert dongert left a comment


nr_pinned_asids ++;
__set_bit(ctxid2asid(IEE_ASID), asid_map);

@Jinglin-lyu (Contributor)

nr_pinned_asids ++; __set_bit(ctxid2asid(IEE_ASID), asid_map);

I indeed did not set nr_pinned_asids, but this doesn't seem to significantly impact the kernel's normal operation. This variable appears to only affect the refresh rate of ASIDs. If necessary, we will submit a new patch to fix this.
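
The bookkeeping under discussion can be illustrated with a small standalone model, assuming a bitmap allocator in which a reserved ASID is both marked in the map and counted as pinned. This is a simplified sketch, not the code in arch/arm64/mm/context.c: `demo_asid_state`, the `demo_*` functions and the 256-entry size are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_NUM_ASIDS     256u
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MAP_LONGS \
    ((DEMO_NUM_ASIDS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_asid_state {
    unsigned long map[DEMO_MAP_LONGS]; /* bit set = ASID in use */
    unsigned int nr_pinned;            /* ASIDs that must never be recycled */
};

static inline int demo_test_bit(const struct demo_asid_state *s, unsigned int asid)
{
    return (int)((s->map[asid / DEMO_BITS_PER_LONG] >> (asid % DEMO_BITS_PER_LONG)) & 1ul);
}

static inline void demo_set_bit(struct demo_asid_state *s, unsigned int asid)
{
    s->map[asid / DEMO_BITS_PER_LONG] |= 1ul << (asid % DEMO_BITS_PER_LONG);
}

/* Reserve a fixed ASID and count it as pinned exactly once. */
static inline void demo_reserve_asid(struct demo_asid_state *s, unsigned int asid)
{
    assert(asid < DEMO_NUM_ASIDS);
    if (!demo_test_bit(s, asid)) {
        demo_set_bit(s, asid);
        s->nr_pinned++;
    }
}

/* Hand out the lowest free ASID; 0 means none is left in this model. */
static inline unsigned int demo_alloc_asid(struct demo_asid_state *s)
{
    for (unsigned int asid = 1; asid < DEMO_NUM_ASIDS; asid++) {
        if (!demo_test_bit(s, asid)) {
            demo_set_bit(s, asid);
            return asid;
        }
    }
    return 0;
}
```

Reserving ASID 2 in an empty state and then allocating twice hands out 1 and 3, and the pinned count is 1. Setting only the map bit would still keep the allocator away from the reserved ASID, but the count of pinned ASIDs would be one too low.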
