Skip to content

Commit 2e6882a

Browse files
committed
Auto merge of #140664 - RalfJung:miri-sync, r=RalfJung
Miri subtree update r? `@ghost`
2 parents 0f73f0f + 4b8f88b commit 2e6882a

File tree

143 files changed

+1796
-1541
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

143 files changed

+1796
-1541
lines changed

src/tools/miri/.gitignore

+4-1
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,9 @@ tex/*/out
99
*.mm_profdata
1010
perf.data
1111
perf.data.old
12-
flamegraph.svg
12+
flamegraph*.svg
13+
rustc-ice*.txt
1314
tests/native-lib/libtestlib.so
1415
.auto-*
16+
17+
/genmc/

src/tools/miri/CONTRIBUTING.md

+4-6
Original file line numberDiff line numberDiff line change
@@ -19,12 +19,10 @@ When you get a review, please take care of the requested changes in new commits.
1919
existing commits. Generally avoid force-pushing. The only time you should force push is when there
2020
is a conflict with the master branch (in that case you should rebase across master, not merge), and
2121
all the way at the end of the review process when the reviewer tells you that the PR is done and you
22-
should squash the commits. For the latter case, use `git rebase --keep-base ...` to squash without
23-
changing the base commit your PR branches off of. Use your own judgment and the reviewer's guidance
24-
to decide whether the PR should be squashed into a single commit or multiple logically separate
25-
commits. (All this is to work around the fact that Github is quite bad at dealing with force pushes
26-
and does not support `git range-diff`. Maybe one day Github will be good at git and then life can
27-
become easier.)
22+
should squash the commits. If you are unsure how to use `git rebase` to squash commits, use `./miri
23+
squash` which automates the process but leaves little room for customization. (All this is to work
24+
around the fact that Github is quite bad at dealing with force pushes and does not support `git
25+
range-diff`. Maybe one day Github will be good at git and then life can become easier.)
2826

2927
Most PRs bounce back and forth between the reviewer and the author several times, so it is good to
3028
keep track of who is expected to take the next step. We are using the `S-waiting-for-review` and

src/tools/miri/Cargo.toml

+1
Original file line numberDiff line numberDiff line change
@@ -65,6 +65,7 @@ harness = false
6565

6666
[features]
6767
default = ["stack-cache"]
68+
genmc = []
6869
stack-cache = []
6970
stack-cache-consistency-check = ["stack-cache"]
7071

src/tools/miri/README.md

+33-22
Original file line numberDiff line numberDiff line change
@@ -277,22 +277,15 @@ Try running `cargo miri clean`.
277277
Miri adds its own set of `-Z` flags, which are usually set via the `MIRIFLAGS`
278278
environment variable. We first document the most relevant and most commonly used flags:
279279

280-
* `-Zmiri-address-reuse-rate=<rate>` changes the probability that a freed *non-stack* allocation
281-
will be added to the pool for address reuse, and the probability that a new *non-stack* allocation
282-
will be taken from the pool. Stack allocations never get added to or taken from the pool. The
283-
default is `0.5`.
284-
* `-Zmiri-address-reuse-cross-thread-rate=<rate>` changes the probability that an allocation which
285-
attempts to reuse a previously freed block of memory will also consider blocks freed by *other
286-
threads*. The default is `0.1`, which means by default, in 90% of the cases where an address reuse
287-
attempt is made, only addresses from the same thread will be considered. Reusing an address from
288-
another thread induces synchronization between those threads, which can mask data races and weak
289-
memory bugs.
290-
* `-Zmiri-compare-exchange-weak-failure-rate=<rate>` changes the failure rate of
291-
`compare_exchange_weak` operations. The default is `0.8` (so 4 out of 5 weak ops will fail).
292-
You can change it to any value between `0.0` and `1.0`, where `1.0` means it
293-
will always fail and `0.0` means it will never fail. Note that setting it to
294-
`1.0` will likely cause hangs, since it means programs using
295-
`compare_exchange_weak` cannot make progress.
280+
* `-Zmiri-deterministic-concurrency` makes Miri's concurrency-related behavior fully deterministic.
281+
Strictly speaking, Miri is always fully deterministic when isolation is enabled (the default
282+
mode), but this determinism is achieved by using an RNG with a fixed seed. Seemingly harmless
283+
changes to the program, or just running it for a different target architecture, can thus lead to
284+
completely different program behavior down the line. This flag disables the use of an RNG for
285+
concurrency-related decisions. Therefore, Miri cannot find bugs that only occur under some
286+
specific circumstances, but Miri's behavior will also be more stable across versions and targets.
287+
This is equivalent to `-Zmiri-fixed-schedule -Zmiri-compare-exchange-weak-failure-rate=0.0
288+
-Zmiri-address-reuse-cross-thread-rate=0.0 -Zmiri-disable-weak-memory-emulation`.
296289
* `-Zmiri-disable-isolation` disables host isolation. As a consequence,
297290
the program has access to host resources such as environment variables, file
298291
systems, and randomness.
@@ -334,9 +327,6 @@ environment variable. We first document the most relevant and most commonly used
334327
This will necessarily miss some bugs as those operations are not efficiently and accurately
335328
implementable in a sanitizer, but it will only miss bugs that concern memory/pointers which is
336329
subject to these operations.
337-
* `-Zmiri-preemption-rate` configures the probability that at the end of a basic block, the active
338-
thread will be preempted. The default is `0.01` (i.e., 1%). Setting this to `0` disables
339-
preemption.
340330
* `-Zmiri-report-progress` makes Miri print the current stacktrace every now and then, so you can
341331
tell what it is doing when a program just keeps running. You can customize how frequently the
342332
report is printed via `-Zmiri-report-progress=<blocks>`, which prints the report every N basic
@@ -365,6 +355,22 @@ The remaining flags are for advanced use only, and more likely to change or be r
365355
Some of these are **unsound**, which means they can lead
366356
to Miri failing to detect cases of undefined behavior in a program.
367357

358+
* `-Zmiri-address-reuse-rate=<rate>` changes the probability that a freed *non-stack* allocation
359+
will be added to the pool for address reuse, and the probability that a new *non-stack* allocation
360+
will be taken from the pool. Stack allocations never get added to or taken from the pool. The
361+
default is `0.5`.
362+
* `-Zmiri-address-reuse-cross-thread-rate=<rate>` changes the probability that an allocation which
363+
attempts to reuse a previously freed block of memory will also consider blocks freed by *other
364+
threads*. The default is `0.1`, which means by default, in 90% of the cases where an address reuse
365+
attempt is made, only addresses from the same thread will be considered. Reusing an address from
366+
another thread induces synchronization between those threads, which can mask data races and weak
367+
memory bugs.
368+
* `-Zmiri-compare-exchange-weak-failure-rate=<rate>` changes the failure rate of
369+
`compare_exchange_weak` operations. The default is `0.8` (so 4 out of 5 weak ops will fail).
370+
You can change it to any value between `0.0` and `1.0`, where `1.0` means it
371+
will always fail and `0.0` means it will never fail. Note that setting it to
372+
`1.0` will likely cause hangs, since it means programs using
373+
`compare_exchange_weak` cannot make progress.
368374
* `-Zmiri-disable-alignment-check` disables checking pointer alignment, so you
369375
can focus on other failures, but it means Miri can miss bugs in your program.
370376
Using this flag is **unsound**.
@@ -383,6 +389,10 @@ to Miri failing to detect cases of undefined behavior in a program.
383389
this flag is **unsound**.
384390
* `-Zmiri-disable-weak-memory-emulation` disables the emulation of some C++11 weak
385391
memory effects.
392+
* `-Zmiri-fixed-schedule` disables preemption (like `-Zmiri-preemption-rate=0.0`) and furthermore
393+
disables the randomization of the next thread to be picked, instead fixing a round-robin schedule.
394+
Note however that other aspects of Miri's concurrency behavior are still randomize; use
395+
`-Zmiri-deterministic-concurrency` to disable them all.
386396
* `-Zmiri-native-lib=<path to a shared object file>` is an experimental flag for providing support
387397
for calling native functions from inside the interpreter via FFI. The flag is supported only on
388398
Unix systems. Functions not provided by that file are still executed via the usual Miri shims.
@@ -412,6 +422,10 @@ to Miri failing to detect cases of undefined behavior in a program.
412422
without an explicit value), `none` means it never recurses, `scalar` means it only recurses for
413423
types where we would also emit `noalias` annotations in the generated LLVM IR (types passed as
414424
individual scalars or pairs of scalars). Setting this to `none` is **unsound**.
425+
* `-Zmiri-preemption-rate` configures the probability that at the end of a basic block, the active
426+
thread will be preempted. The default is `0.01` (i.e., 1%). Setting this to `0` disables
427+
preemption. Note that even without preemption, the schedule is still non-deterministic:
428+
if a thread blocks or yields, the next thread is chosen randomly.
415429
* `-Zmiri-provenance-gc=<blocks>` configures how often the pointer provenance garbage collector runs.
416430
The default is to search for and remove unreachable provenance once every `10000` basic blocks. Setting
417431
this to `0` disables the garbage collector, which causes some programs to have explosive memory
@@ -443,9 +457,6 @@ to Miri failing to detect cases of undefined behavior in a program.
443457
casts are not supported in this mode, but that may change in the future.
444458
* `-Zmiri-force-page-size=<num>` overrides the default page size for an architecture, in multiples of 1k.
445459
`4` is default for most targets. This value should always be a power of 2 and nonzero.
446-
* `-Zmiri-unique-is-unique` performs additional aliasing checks for `core::ptr::Unique` to ensure
447-
that it could theoretically be considered `noalias`. This flag is experimental and has
448-
an effect only when used with `-Zmiri-tree-borrows`.
449460

450461
[function ABI]: https://doc.rust-lang.org/reference/items/functions.html#extern-function-qualifier
451462

src/tools/miri/bench-cargo-miri/big-allocs/src/main.rs

+1-4
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,7 @@ fn main() {
77
// We can't use too big of an allocation or this code will encounter an allocation failure in
88
// CI. Since the allocation can't be huge, we need to do a few iterations so that the effect
99
// we're trying to measure is clearly visible above the interpreter's startup time.
10-
// FIXME (https://github.com/rust-lang/miri/issues/4253): On 32bit targets, we can run out of
11-
// usable addresses if we don't reuse, leading to random test failures.
12-
let count = if cfg!(target_pointer_width = "32") { 8 } else { 12 };
13-
for _ in 0..count {
10+
for _ in 0..20 {
1411
drop(Vec::<u8>::with_capacity(512 * 1024 * 1024));
1512
}
1613
}

src/tools/miri/miri-script/src/commands.rs

+72-3
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
11
use std::collections::HashMap;
22
use std::ffi::{OsStr, OsString};
3-
use std::fs::File;
4-
use std::io::{BufReader, BufWriter, Write};
3+
use std::fmt::Write as _;
4+
use std::fs::{self, File};
5+
use std::io::{self, BufRead, BufReader, BufWriter, Write as _};
56
use std::ops::Not;
67
use std::path::PathBuf;
78
use std::time::Duration;
@@ -169,7 +170,8 @@ impl Command {
169170
| Command::Toolchain { .. }
170171
| Command::Bench { .. }
171172
| Command::RustcPull { .. }
172-
| Command::RustcPush { .. } => {}
173+
| Command::RustcPush { .. }
174+
| Command::Squash => {}
173175
}
174176
// Then run the actual command.
175177
match self {
@@ -188,6 +190,7 @@ impl Command {
188190
Command::Toolchain { flags } => Self::toolchain(flags),
189191
Command::RustcPull { commit } => Self::rustc_pull(commit.clone()),
190192
Command::RustcPush { github_user, branch } => Self::rustc_push(github_user, branch),
193+
Command::Squash => Self::squash(),
191194
}
192195
}
193196

@@ -383,6 +386,72 @@ impl Command {
383386
Ok(())
384387
}
385388

389+
fn squash() -> Result<()> {
390+
let sh = Shell::new()?;
391+
sh.change_dir(miri_dir()?);
392+
// Figure out base wrt latest upstream master.
393+
// (We can't trust any of the local ones, they can all be outdated.)
394+
let origin_master = {
395+
cmd!(sh, "git fetch https://github.com/rust-lang/miri/")
396+
.quiet()
397+
.ignore_stdout()
398+
.ignore_stderr()
399+
.run()?;
400+
cmd!(sh, "git rev-parse FETCH_HEAD").read()?
401+
};
402+
let base = cmd!(sh, "git merge-base HEAD {origin_master}").read()?;
403+
// Rebase onto that, setting ourselves as the sequence editor so that we can edit the sequence programmatically.
404+
// We want to forward the host stdin so apparently we cannot use `cmd!`.
405+
let mut cmd = process::Command::new("git");
406+
cmd.arg("rebase").arg(&base).arg("--interactive");
407+
cmd.env("GIT_SEQUENCE_EDITOR", env::current_exe()?);
408+
cmd.env("MIRI_SCRIPT_IS_GIT_SEQUENCE_EDITOR", "1");
409+
cmd.current_dir(sh.current_dir());
410+
let result = cmd.status()?;
411+
if !result.success() {
412+
bail!("`git rebase` failed");
413+
}
414+
Ok(())
415+
}
416+
417+
pub fn squash_sequence_editor() -> Result<()> {
418+
let sequence_file = env::args().nth(1).expect("git should pass us a filename");
419+
if sequence_file == "fmt" {
420+
// This is probably us being called as a git hook as part of the rebase. Let's just
421+
// ignore this. Sadly `git rebase` does not have a flag to skip running hooks.
422+
return Ok(());
423+
}
424+
// Read the provided sequence and adjust it.
425+
let rebase_sequence = {
426+
let mut rebase_sequence = String::new();
427+
let file = fs::File::open(&sequence_file).with_context(|| {
428+
format!("failed to read rebase sequence from {sequence_file:?}")
429+
})?;
430+
let file = io::BufReader::new(file);
431+
for line in file.lines() {
432+
let line = line?;
433+
// The first line is left unchanged.
434+
if rebase_sequence.is_empty() {
435+
writeln!(rebase_sequence, "{line}").unwrap();
436+
continue;
437+
}
438+
// If this is a "pick" like, make it "squash".
439+
if let Some(rest) = line.strip_prefix("pick ") {
440+
writeln!(rebase_sequence, "squash {rest}").unwrap();
441+
continue;
442+
}
443+
// We've reached the end of the relevant part of the sequence, and we can stop.
444+
break;
445+
}
446+
rebase_sequence
447+
};
448+
// Write out the adjusted sequence.
449+
fs::write(&sequence_file, rebase_sequence).with_context(|| {
450+
format!("failed to write adjusted rebase sequence to {sequence_file:?}")
451+
})?;
452+
Ok(())
453+
}
454+
386455
fn bench(
387456
target: Option<String>,
388457
no_install: bool,

src/tools/miri/miri-script/src/main.rs

+8-1
Original file line numberDiff line numberDiff line change
@@ -133,6 +133,8 @@ pub enum Command {
133133
#[arg(default_value = "miri-sync")]
134134
branch: String,
135135
},
136+
/// Squash the commits of the current feature branch into one.
137+
Squash,
136138
}
137139

138140
impl Command {
@@ -154,7 +156,7 @@ impl Command {
154156
flags.extend(remainder);
155157
Ok(())
156158
}
157-
Self::Bench { .. } | Self::RustcPull { .. } | Self::RustcPush { .. } =>
159+
Self::Bench { .. } | Self::RustcPull { .. } | Self::RustcPush { .. } | Self::Squash =>
158160
bail!("unexpected \"--\" found in arguments"),
159161
}
160162
}
@@ -170,6 +172,11 @@ pub struct Cli {
170172
}
171173

172174
fn main() -> Result<()> {
175+
// If we are invoked as the git sequence editor, jump to that logic.
176+
if !std::env::var_os("MIRI_SCRIPT_IS_GIT_SEQUENCE_EDITOR").unwrap_or_default().is_empty() {
177+
return Command::squash_sequence_editor();
178+
}
179+
173180
// Split the arguments into the part before the `--` and the part after.
174181
// The `--` itself ends up in the second part.
175182
let miri_args: Vec<_> = std::env::args().take_while(|x| *x != "--").collect();

src/tools/miri/rust-version

+1-1
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
1b8ab72680f36e783af84c1a3c4f8508572bd9f9
1+
2ad5f8607d0e192b60b130e5cc416b477b351c18

src/tools/miri/src/alloc_addresses/mod.rs

+15-2
Original file line numberDiff line numberDiff line change
@@ -114,8 +114,16 @@ trait EvalContextExtPriv<'tcx>: crate::MiriInterpCxExt<'tcx> {
114114
memory_kind: MemoryKind,
115115
) -> InterpResult<'tcx, u64> {
116116
let this = self.eval_context_ref();
117-
let mut rng = this.machine.rng.borrow_mut();
118117
let info = this.get_alloc_info(alloc_id);
118+
119+
// Miri's address assignment leaks state across thread boundaries, which is incompatible
120+
// with GenMC execution. So we instead let GenMC assign addresses to allocations.
121+
if let Some(genmc_ctx) = this.machine.data_race.as_genmc_ref() {
122+
let addr = genmc_ctx.handle_alloc(&this.machine, info.size, info.align, memory_kind)?;
123+
return interp_ok(addr);
124+
}
125+
126+
let mut rng = this.machine.rng.borrow_mut();
119127
// This is either called immediately after allocation (and then cached), or when
120128
// adjusting `tcx` pointers (which never get freed). So assert that we are looking
121129
// at a live allocation. This also ensures that we never re-assign an address to an
@@ -197,6 +205,11 @@ trait EvalContextExtPriv<'tcx>: crate::MiriInterpCxExt<'tcx> {
197205
if global_state.next_base_addr > this.target_usize_max() {
198206
throw_exhaust!(AddressSpaceFull);
199207
}
208+
// If we filled up more than half the address space, start aggressively reusing
209+
// addresses to avoid running out.
210+
if global_state.next_base_addr > u64::try_from(this.target_isize_max()).unwrap() {
211+
global_state.reuse.address_space_shortage();
212+
}
200213

201214
interp_ok(base_addr)
202215
}
@@ -490,7 +503,7 @@ impl<'tcx> MiriMachine<'tcx> {
490503
// Also remember this address for future reuse.
491504
let thread = self.threads.active_thread();
492505
global_state.reuse.add_addr(rng, addr, size, align, kind, thread, || {
493-
if let Some(data_race) = &self.data_race {
506+
if let Some(data_race) = self.data_race.as_vclocks_ref() {
494507
data_race.release_clock(&self.threads, |clock| clock.clone())
495508
} else {
496509
VClock::default()

src/tools/miri/src/alloc_addresses/reuse_pool.rs

+11-4
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ pub struct ReusePool {
2020
/// allocations as address-size pairs, the list must be sorted by the size and then the thread ID.
2121
///
2222
/// Each of these maps has at most MAX_POOL_SIZE elements, and since alignment is limited to
23-
/// less than 64 different possible value, that bounds the overall size of the pool.
23+
/// less than 64 different possible values, that bounds the overall size of the pool.
2424
///
2525
/// We also store the ID and the data-race clock of the thread that donated this pool element,
2626
/// to ensure synchronization with the thread that picks up this address.
@@ -36,6 +36,15 @@ impl ReusePool {
3636
}
3737
}
3838

39+
/// Call this when we are using up a lot of the address space: if memory reuse is enabled at all,
40+
/// this will bump the intra-thread reuse rate to 100% so that we can keep running this program as
41+
/// long as possible.
42+
pub fn address_space_shortage(&mut self) {
43+
if self.address_reuse_rate > 0.0 {
44+
self.address_reuse_rate = 1.0;
45+
}
46+
}
47+
3948
fn subpool(&mut self, align: Align) -> &mut Vec<(u64, Size, ThreadId, VClock)> {
4049
let pool_idx: usize = align.bytes().trailing_zeros().try_into().unwrap();
4150
if self.pool.len() <= pool_idx {
@@ -55,9 +64,7 @@ impl ReusePool {
5564
clock: impl FnOnce() -> VClock,
5665
) {
5766
// Let's see if we even want to remember this address.
58-
// We don't remember stack addresses: there's a lot of them (so the perf impact is big),
59-
// and we only want to reuse stack slots within the same thread or else we'll add a lot of
60-
// undesired synchronization.
67+
// We don't remember stack addresses since there's so many of them (so the perf impact is big).
6168
if kind == MemoryKind::Stack || !rng.random_bool(self.address_reuse_rate) {
6269
return;
6370
}

0 commit comments

Comments
 (0)