Rust inline assembly with static mutable #140059

Closed
EvansJahja opened this issue Apr 20, 2025 · 12 comments
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@EvansJahja

EvansJahja commented Apr 20, 2025

Overview: I think Rust is optimizing away reads of a static mut variable that is modified from inline assembly.

I tried this code:

#[unsafe(no_mangle)]
#[used]
pub static mut flag: bool = false;

fn set_flag() {
    unsafe {
        flag = true;
    }
}

fn set_callback() {
    unsafe {
        asm!("bl {}", in(reg) set_flag) ;
    }
}

#[unsafe(no_mangle)]
pub fn start() {
    set_callback();
    unsafe {
        while flag == false { 

        }
    }
}

I expected to see this happen: The program loops while flag is false, and I expect the loop to exit once flag becomes true. The flag is set by a function that is referenced from inline assembly.

Instead, this happened: The program gets stuck in an infinite loop and never re-checks the flag variable.

Currently, my only workaround is this:

        while flag == false { 
            black_box(flag);
        }

which of course works, but I'm curious if this is indeed a bug or if I'm using Rust wrong.

Meta

rustc --version --verbose:

rustc 1.88.0-nightly (9ffde4b08 2025-04-12)
binary: rustc
commit-hash: 9ffde4b089fe8e43d5891eb517001df27a8443ff
commit-date: 2025-04-12
host: x86_64-pc-windows-msvc
release: 1.88.0-nightly
LLVM version: 20.1.2

Note: I realized I was using an outdated nightly build, so I updated, but I still have the same issue with

rustc --version --verbose:

rustc 1.88.0-nightly (077cedc2a 2025-04-19)
binary: rustc
commit-hash: 077cedc2afa8ac0b727b7a6cbe012940ba228deb
commit-date: 2025-04-19
host: x86_64-pc-windows-msvc
release: 1.88.0-nightly
LLVM version: 20.1.2

I'm targeting armv4t-none-eabi; this is a no_std, build-std=["core"] setup.

LLC from Godbolt

__rustc[6dc022cbd14ae54d]::rust_begin_unwind:
.LBB0_1:                                @ %bb1
        b       .LBB0_1
projecthello::set_flag::h03825ae23cdb2371:
        ldr     r0, .LCPI1_0
        mov     r1, #1
        strb    r1, [r0]
        bx      lr
.LCPI1_0:
        .long   flag
start:
        ldr     r0, .LCPI2_0
        bl      r0
        ldr     r0, .LCPI2_1
        ldrb    r0, [r0]
        cmp     r0, #0
        bxne    lr
.LBB2_1:                                @ %bb1
        b       .LBB2_1
.LCPI2_0:
        .long   projecthello::set_flag::h03825ae23cdb2371
.LCPI2_1:
        .long   flag
flag:
        .zero   1

LL

; ModuleID = 'projecthello.7400e4f62d9edb55-cgu.0'
source_filename = "projecthello.7400e4f62d9edb55-cgu.0"
target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv4t-unknown-none-eabi"

@flag = dso_local global [1 x i8] zeroinitializer, align 1
@llvm.compiler.used = appending global [1 x ptr] [ptr @flag], section "llvm.metadata"

; __rustc::rust_begin_unwind
; Function Attrs: nofree norecurse noreturn nosync nounwind memory(none)
define hidden void @_RNvCs9qcmSER0O71_7___rustc17rust_begin_unwind(ptr noalias nocapture noundef readonly align 4 dereferenceable(12) %_1) unnamed_addr #0 {
start:
  br label %bb1

bb1:                                              ; preds = %bb1, %start
  br label %bb1
}

; projecthello::set_flag
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none)
define internal void @_ZN12projecthello8set_flag17h03825ae23cdb2371E() unnamed_addr #1 {
start:
  store i8 1, ptr @flag, align 1
  ret void
}

; Function Attrs: nounwind
define dso_local void @start() unnamed_addr #2 {
start:
  tail call void asm sideeffect alignstack "bl ${0}", "r,~{cc},~{memory}"(ptr nonnull @_ZN12projecthello8set_flag17h03825ae23cdb2371E) #3, !srcloc !1
  %0 = load i8, ptr @flag, align 1, !range !2, !noundef !3
  %_2 = trunc nuw i8 %0 to i1
  br i1 %_2, label %bb3, label %bb1

bb1:                                              ; preds = %start, %bb1
  br label %bb1

bb3:                                              ; preds = %start
  ret void
}

attributes #0 = { nofree norecurse noreturn nosync nounwind memory(none) "target-cpu"="generic" "target-features"="+soft-float,+strict-align,+atomics-32" }
attributes #1 = { mustprogress nofree norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none) "target-cpu"="generic" "target-features"="+soft-float,+strict-align,+atomics-32" }
attributes #2 = { nounwind "target-cpu"="generic" "target-features"="+soft-float,+strict-align,+atomics-32" }
attributes #3 = { nounwind }

!llvm.ident = !{!0}

!0 = !{!"rustc version 1.88.0-nightly (077cedc2a 2025-04-19)"}
!1 = !{i64 12846247185322}
!2 = !{i8 0, i8 2}
!3 = !{}

@EvansJahja EvansJahja added the C-bug Category: This is a bug. label Apr 20, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 20, 2025
@EvansJahja
Author

EvansJahja commented Apr 20, 2025

Other version

#[unsafe(no_mangle)]
#[used]
pub static mut flag: bool = false;

fn set_flag(flag_ptr: *mut bool) {
    unsafe {
        asm!(
            "strb {}, [{}]", in(reg) 1, in(reg) flag_ptr) ;
    }
}

#[unsafe(no_mangle)]
pub fn start() {
    set_flag(&raw mut flag);
    unsafe {
        while flag == false { 

        }
    }
}

LL

; ModuleID = 'projecthello.7400e4f62d9edb55-cgu.0'
source_filename = "projecthello.7400e4f62d9edb55-cgu.0"
target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv4t-unknown-none-eabi"

@flag = dso_local global [1 x i8] zeroinitializer, align 1
@llvm.compiler.used = appending global [1 x ptr] [ptr @flag], section "llvm.metadata"

; __rustc::rust_begin_unwind
; Function Attrs: nofree norecurse noreturn nosync nounwind memory(none)
define hidden void @_RNvCs9qcmSER0O71_7___rustc17rust_begin_unwind(ptr noalias nocapture noundef readonly align 4 dereferenceable(12) %_1) unnamed_addr #0 {
start:
  br label %bb1

bb1:                                              ; preds = %bb1, %start
  br label %bb1
}

; Function Attrs: nounwind
define dso_local void @start() unnamed_addr #1 {
start:
  tail call void asm sideeffect alignstack "strb ${0}, [${1}]", "r,r,~{cc},~{memory}"(i32 1, ptr nonnull @flag) #2, !srcloc !1
  %0 = load i8, ptr @flag, align 1, !range !2, !noundef !3
  %_3 = trunc nuw i8 %0 to i1
  br i1 %_3, label %bb3, label %bb1

bb1:                                              ; preds = %start, %bb1
  br label %bb1

bb3:                                              ; preds = %start
  ret void
}

attributes #0 = { nofree norecurse noreturn nosync nounwind memory(none) "target-cpu"="generic" "target-features"="+soft-float,+strict-align,+atomics-32" }
attributes #1 = { nounwind "target-cpu"="generic" "target-features"="+soft-float,+strict-align,+atomics-32" }
attributes #2 = { nounwind }

!llvm.ident = !{!0}

!0 = !{!"rustc version 1.88.0-nightly (077cedc2a 2025-04-19)"}
!1 = !{i64 12747462937483}
!2 = !{i8 0, i8 2}
!3 = !{}


LLC

__rustc[6dc022cbd14ae54d]::rust_begin_unwind:
.LBB0_1:                                @ %bb1
        b       .LBB0_1
start:
        ldr     r1, .LCPI1_0
        mov     r0, #1
        strb    r0, [r1]
        ldrb    r0, [r1]
        cmp     r0, #0
        bxne    lr
.LBB1_1:                                @ %bb1
        b       .LBB1_1
.LCPI1_0:
        .long   flag
flag:
        .zero   1

@usamoi
Contributor

usamoi commented Apr 20, 2025

When calling a function using core::arch::asm, you need to specify the clobbers. See https://doc.rust-lang.org/reference/inline-assembly.html.

The following code works on my machine.

#[unsafe(no_mangle)]
#[used]
pub static mut FLAG: bool = false;

extern "C" fn set_flag() {
    unsafe {
        FLAG = true;
    }
}

fn set_callback() {
    unsafe {
        core::arch::asm!("bl {}", sym set_flag, clobber_abi("C"));
    }
}

fn main() {
    set_callback();
    unsafe { while FLAG == false {} }
}

@asquared31415
Contributor

The problem here is that this introduces a data race. Part of the reason static mut needs unsafe is that reading or writing it can cause a data race, which is undefined behavior. The compiler assumes data races do not happen, so in some cases it can optimize on the assumption that the value never changes.
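As a minimal sketch of a race-free alternative, the flag can be an atomic rather than a static mut. The example below runs on a hosted target and uses a spawned thread as a hypothetical stand-in for whatever sets the flag asynchronously; run_demo and the 10 ms delay are illustrative assumptions, not part of the original report. On a no_std target the same types are available from core::sync::atomic, assuming the target supports atomic loads and stores of this width.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

// An atomic flag instead of `static mut`: concurrent access is not a data race.
static FLAG: AtomicBool = AtomicBool::new(false);

// Sets the flag; in the issue this role is played by the callback.
fn set_flag() {
    FLAG.store(true, Ordering::Release);
}

// Spins until the flag is observed as true. Each iteration performs a real
// atomic load, so the compiler may not assume the value never changes.
fn wait_for_flag() {
    while !FLAG.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
}

// Hypothetical driver: a thread stands in for the asynchronous setter.
fn run_demo() -> bool {
    let setter = thread::spawn(|| {
        thread::sleep(Duration::from_millis(10));
        set_flag();
    });
    wait_for_flag();
    setter.join().unwrap();
    FLAG.load(Ordering::Acquire)
}
```

Calling run_demo() returns once the spawned thread has stored true, and the loop in wait_for_flag exits at that point.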

@asquared31415
Contributor

@rustbot label -C-bug -needs-triage +C-discussion +T-compiler

@rustbot rustbot added C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 20, 2025
@EvansJahja
Author

@usamoi

The following code works on my machine.

I tried your code with some additions for the linker and no_std setup. repo here

When building in release mode, this is the disassembly output from Binary Ninja:

Image

The b 0x20134 is not what I expect...

Built with cargo build --release

@EvansJahja
Author

@asquared31415
https://github.com/EvansJahja/rust_140059/tree/interior_mut

The compiler assumes data races do not happen, so in some cases it can perform optimizations to assume the value never changes.

How do I tell the compiler that data races do happen, then? Is this something that interior mutability solves? I tried using SyncUnsafeCell here and it still produces the same output. Is it because there's nothing on the Sync trait?

Image

@usamoi
Contributor

usamoi commented Apr 21, 2025

the b 0x20134 is not what I expect...

Why is it not expected?

cmp r0, #0
popne {r11, lr}
bxne lr
b 0x20134

It's just

bool test = FLAG != 0;
if (test) {
    return;
}
while (true) {}

optimized from

bool test = FLAG != 0;
if (test) {
    return;
}
while (!test) {}

optimized from

if (!(FLAG == 0)) {
    return;
}
while (FLAG == 0) {}

optimized from

while (FLAG == 0) {}
return;
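The chain above amounts to hoisting the single load of FLAG out of the loop. One way to keep a load on every iteration is a volatile read, sketched below under stated assumptions: wait_for_flag and its max_polls bound are hypothetical additions so the example terminates, and set_flag stands in for the interrupt handler. Volatile accesses are not atomic and do not by themselves make concurrent access race-free, so an atomic is usually the more defensible choice for an interrupt-driven flag.

```rust
use core::ptr::{addr_of, addr_of_mut, read_volatile, write_volatile};

static mut FLAG: bool = false;

// Hypothetical stand-in for the interrupt handler that sets the flag.
fn set_flag() {
    // Raw-pointer volatile write; no reference to the static mut is created.
    unsafe { write_volatile(addr_of_mut!(FLAG), true) }
}

// Polls FLAG up to `max_polls` times. Every poll is a volatile read, which
// the compiler must not merge with earlier reads or hoist out of the loop.
// Returns the number of polls needed, or None if the flag was never set.
fn wait_for_flag(max_polls: u32) -> Option<u32> {
    for polls in 1..=max_polls {
        if unsafe { read_volatile(addr_of!(FLAG)) } {
            return Some(polls);
        }
        core::hint::spin_loop();
    }
    None
}
```

Before set_flag runs, wait_for_flag(3) gives up and returns None; after it runs, the first poll already sees the flag.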

@EvansJahja
Author

EvansJahja commented Apr 21, 2025

@usamoi it's not expected because I wrote

        while flag == false {  }

so I expect flag to be evaluated every iteration.

Perhaps my original code was not clear. The whole set_callback operation should've been something like

fn set_callback() {
    unsafe {
        core::arch::asm!("register TIMER_0's interrupt handler to address: {}", sym set_flag, clobber_abi("C"));
    }
}

Instead of simply calling a function with bl, it's meant to do some platform-specific interrupt handler (ISR) registration, so the function will be called at a later time (not immediately). I was under the impression that Rust shouldn't care too much about what's inside the assembly block (ref)

@usamoi
Contributor

usamoi commented Apr 21, 2025

I was under the impression that Rust shouldn't care too much about what's inside the assembly block

This has nothing to do with asm blocks; as long as you're still reading from or writing to FLAG using Rust, the compiler can always assume that undefined behavior does not exist and optimize accordingly.

it's meant to do some platform-specific interrupt registration (ISR)

See https://doc.rust-lang.org/std/sync/atomic/fn.compiler_fence.html.
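A minimal sketch of how compiler_fence is typically paired with relaxed atomics when the other party is an interrupt or signal handler on the same core, loosely following the pattern in the linked documentation. The names PAYLOAD, READY, interrupt_handler and poll are hypothetical. Note that compiler_fence only restricts reordering by the compiler, not by the hardware, and the flag itself is still accessed through an atomic here rather than a static mut.

```rust
use core::sync::atomic::{compiler_fence, AtomicBool, AtomicUsize, Ordering};

// Data published by the handler, plus a flag announcing it is ready.
static PAYLOAD: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

// Hypothetical handler body: write the data, then raise the flag.
fn interrupt_handler() {
    PAYLOAD.store(42, Ordering::Relaxed);
    // Keep the compiler from moving the PAYLOAD store below the READY store.
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

// Main-line code: check the flag, and only then read the data.
fn poll() -> Option<usize> {
    if READY.load(Ordering::Relaxed) {
        // Keep the compiler from moving the PAYLOAD load above the READY load.
        compiler_fence(Ordering::Acquire);
        Some(PAYLOAD.load(Ordering::Relaxed))
    } else {
        None
    }
}
```

In a wait loop, poll() would be called repeatedly; because READY is loaded atomically on each call, the check is not folded into a single read.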

@EvansJahja
Author

@usamoi
Thank you for pointing me to compiler_fence. I didn't know about it, and it looks like the right direction.

However, trying to use one in my code results in a linking error...

 rust-lld: error: undefined symbol: __sync_synchronize
          >>> referenced by concurrent.5d00e7467260b4cd-cgu.0
          >>>               C:\Users\EvansGrace02\Documents\Workspace\concurrent\target\armv4t-none-eabi\release\deps\concurrent-c7b7b77a0070874c.concurrent.5d00e7467260b4cd-cgu.0.rcgu.o:(concurrent::set_flag::hea7b7e83239fb54a)
          >>> referenced by concurrent.5d00e7467260b4cd-cgu.0
          >>>               C:\Users\EvansGrace02\Documents\Workspace\concurrent\target\armv4t-none-eabi\release\deps\concurrent-c7b7b77a0070874c.concurrent.5d00e7467260b4cd-cgu.0.rcgu.o:(main)

@usamoi
Contributor

usamoi commented Apr 21, 2025

However trying to use one in my code results in linking error...

I believe this is a real bug. Please open another issue for this.

If the target is an embedded platform, you can define this symbol manually.

#[unsafe(no_mangle)]
extern "C" fn __sync_synchronize() {}

@EvansJahja
Author

Thank you, everyone. I think this is expected behavior then; I'll read more about fences. As for the linker error, let's discuss it further in #140105
