[pull] master from torvalds:master #1953

Merged
merged 61 commits into from
Jun 23, 2025

@pull pull bot commented Jun 23, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Peter Zijlstra and others added 30 commits June 5, 2025 14:37
Baisheng Gao reported an ARM64 crash, which Mark decoded as being a
synchronous external abort -- most likely due to trying to access
MMIO in bad ways.

The crash further shows perf trying to do a user stack sample while in
exit_mmap()'s tlb_finish_mmu() -- i.e. while tearing down the address
space it is trying to access.

It turns out that we stop perf after we tear down the userspace mm; a
recipe for disaster, since perf likes to access userspace for
various reasons.

Flip this order by moving up where we stop perf in do_exit().

Additionally, harden PERF_SAMPLE_CALLCHAIN and PERF_SAMPLE_STACK_USER
to abort when the current task does not have an mm (exit_mm() makes
sure to set current->mm = NULL; before commencing with the actual
teardown), so that CPU-wide events don't trip over this same problem.

Fixes: c5ebced ("perf: Add ability to attach user stack dump to sample")
Reported-by: Baisheng Gao <[email protected]>
Suggested-by: Mark Rutland <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
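The hardening step above amounts to a guard on the task's mm pointer before any user stack access. A minimal userspace sketch of that guard follows; struct task_model, struct mm_model and sample_user_stack() are hypothetical names for illustration, not the kernel's code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm_struct and task_struct. */
struct mm_model { int unused; };
struct task_model { struct mm_model *mm; };

/*
 * Return how many bytes of user stack may be sampled for this task.
 * A NULL mm means the address space is gone (or never existed), so the
 * sample is refused instead of touching user memory.
 */
size_t sample_user_stack(const struct task_model *task, size_t want)
{
	if (!task->mm)
		return 0;
	return want;
}

/* Exercise the refused path: a task that has already dropped its mm. */
size_t sample_exiting_task(size_t want)
{
	struct task_model task = { .mm = NULL };

	return sample_user_stack(&task, want);
}

/* Exercise the normal path: a task that still owns an mm. */
size_t sample_live_task(size_t want)
{
	struct mm_model mm = { 0 };
	struct task_model task = { .mm = &mm };

	return sample_user_stack(&task, want);
}
```

The point of the sketch is only the ordering: the mm check happens before any access, so a CPU-wide event that fires during teardown gets an empty sample rather than a fault.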
While chasing down a missing perf_cgroup_event_disable() elsewhere,
Leo Yan found that both perf_put_aux_event() and
perf_remove_sibling_event() were also missing one.

Specifically, the rule is that events that switch to OFF,ERROR need to
call perf_cgroup_event_disable().

Unify the disable paths to ensure this.

Fixes: ab43762 ("perf: Allow normal events to output AUX data")
Fixes: 9f0c4fa ("perf/core: Add a new PERF_EV_CAP_SIBLING event capability")
Reported-by: Leo Yan <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Commit a3c3c66 ("perf/core: Fix child_total_time_enabled accounting
bug at task exit") moves the event->state update to before
list_del_event(). This makes the event->state test in list_del_event()
always false; never calling perf_cgroup_event_disable().

As a result, cpuctx->cgrp won't be cleared properly; causing havoc.

Fixes: a3c3c66 ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
Signed-off-by: Yeoreum Yun <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: David Wang <[email protected]>
Link: https://lore.kernel.org/all/aD2TspKH%[email protected]/
There may be concurrency between perf_cgroup_switch and
perf_cgroup_event_disable. Consider the following scenario: after a new
perf cgroup event is created on CPU0, the new event may not trigger
a reprogramming, causing ctx->is_active to be 0. In this case, when CPU1
disables this perf event, it executes __perf_remove_from_context->
list_del_event->perf_cgroup_event_disable on CPU1, which causes a race
with perf_cgroup_switch running on CPU0.

The following describes the details of this concurrency scenario:

CPU0						CPU1

perf_cgroup_switch:
   ...
   # cpuctx->cgrp is not NULL here
   if (READ_ONCE(cpuctx->cgrp) == NULL)
   	return;

						perf_remove_from_context:
						   ...
						   raw_spin_lock_irq(&ctx->lock);
						   ...
						   # ctx->is_active == 0 because reprogramming was not
						   # triggered, so CPU1 can do __perf_remove_from_context
						   # for CPU0
						   __perf_remove_from_context:
						         perf_cgroup_event_disable:
							    ...
							    if (--ctx->nr_cgroups)
							    ...

   # this warning triggers because CPU1 changed
   # ctx.nr_cgroups to 0.
   WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0);

[peterz: use guard instead of goto unlock]
Fixes: db4a835 ("perf/core: Set cgroup in CPU contexts for new cgroup events")
Signed-off-by: Luo Gengkun <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Better describe the event states.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Leo Yan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Mark reported that futex_priv_hash fails on ARM64.
It turns out that the command line parsing does not terminate properly
and ends in the default case assuming an invalid option was passed.

Use an int as the return type for getopt().

Closes: https://lore.kernel.org/all/[email protected]/
Fixes: 3163369 ("selftests/futex: Add futex_numa_mpol")
Fixes: cda95fa ("selftests/futex: Add futex_priv_hash")
Reported-by: Mark Brown <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
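A likely mechanism for this failure, stated here as an assumption since the message does not spell it out: getopt() reports the end of the options as the int value -1, and plain char is unsigned on ARM64 Linux, so a result stored in a char can never compare equal to -1. The sketch below shows only the comparison; the function names are made up for illustration.

```c
#include <stdbool.h>

/* getopt() signals "no more options" by returning the int value -1. */
#define END_OF_OPTIONS (-1)

/*
 * Storing the result in an unsigned char turns -1 into 255. After integer
 * promotion the comparison is 255 == -1, which is never true, so a parsing
 * loop written this way does not terminate properly.
 */
bool loop_ends_with_unsigned_char(void)
{
	unsigned char c = (unsigned char)END_OF_OPTIONS;

	return c == END_OF_OPTIONS;
}

/* With an int the sentinel survives and the loop terminates. */
bool loop_ends_with_int(void)
{
	int c = END_OF_OPTIONS;

	return c == END_OF_OPTIONS;
}
```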
The test fails at the MPOL step if multiple nodes are available. The
reason is that mbind() sets the policy but the home_node, which is
retrieved by the futex code, is not set. This causes the futex code to
retrieve the current node, and with multiple nodes it fails on one of the
iterations.

Use numa_set_mempolicy_home_node() to set the expected node.
Use ksft_exit_fail_msg() to fail and exit in order not to confuse ktap.

Fixes: 3163369 ("selftests/futex: Add futex_numa_mpol")
Suggested-by: Vlastimil Babka <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
syzbot used a negative node number which was not rejected early and led
to invalid memory access in node_possible().

Reject negative node numbers except for FUTEX_NO_NODE.

[bigeasy: Keep the FUTEX_NO_NODE check]

Closes: https://lore.kernel.org/all/[email protected]/
Fixes: cec199c ("futex: Implement FUTEX2_NUMA")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reported-by: [email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
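The check described above can be sketched as an early range test that lets the "no node" sentinel through. The constants below are assumptions for illustration: NO_NODE_MODEL stands in for FUTEX_NO_NODE (assumed to be -1) and MAX_NODES_MODEL is a made-up node count.

```c
#include <stdbool.h>

#define NO_NODE_MODEL   (-1)	/* assumed value of the "no node" sentinel */
#define MAX_NODES_MODEL 64	/* hypothetical number of possible nodes */

/*
 * Accept the sentinel, reject every other negative number and anything
 * beyond the node range before it is used as an index into a node mask.
 */
bool node_is_acceptable(int node)
{
	if (node == NO_NODE_MODEL)
		return true;
	if (node < 0 || node >= MAX_NODES_MODEL)
		return false;
	return true;
}
```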
Both ARM and IBM CI report RCU stalls, which can be reproduced by the
below perf command.
  perf record -a -e cpu-clock -- sleep 2

The issue is introduced by the generic throttle patch set, which
unconditionally invokes event_stop() when the throttle is triggered.

The cpu-clock and task-clock are two special SW events, which rely on
the hrtimer. The throttle is invoked in the hrtimer handler. The
event_stop()->hrtimer_cancel() waits for the handler to finish, which is
a deadlock. Instead of invoking the stop(), the HRTIMER_NORESTART should
be used to stop the timer.

There may be two ways to fix it:
 - Introduce a PMU flag to track the case. Avoid the event_stop in
   perf_event_throttle() if the flag is detected.
   It has been implemented in the
   https://lore.kernel.org/lkml/[email protected]/
   The new flag was thought to be an overkill for the issue.
 - Add a check in the event_stop. Return immediately if the throttle is
   invoked in the hrtimer handler. Rely on the existing HRTIMER_NORESTART
   method to stop the timer.

The latter is implemented here.

Move event->hw.interrupts = MAX_INTERRUPTS before the stop(). It makes
the order the same as perf_event_unthrottle(). Apart from this patch,
nothing checks hw.interrupts in the stop(), so the order change has no
impact.

When stopping from the throttle, the event should not be updated, i.e.
stop(event, 0). But cpu_clock_event_stop() doesn't handle the flag,
which is logically wrong. It caused no problems with the old
code, because the stop() was not invoked when handling the throttle.
Check the flag before updating the event.

Fixes: 9734e25 ("perf: Fix the throttle logic for a group")
Closes: https://lore.kernel.org/lkml/[email protected]/
Closes: https://lore.kernel.org/lkml/djxlh5fx326gcenwrr52ry3pk4wxmugu4jccdjysza7tlc5fef@ktp4rffawgcw/
Closes: https://lore.kernel.org/lkml/[email protected]/
Closes: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Leo Yan <[email protected]>
Reported-by: Aishwarya TCV <[email protected]>
Reported-by: Alexei Starovoitov <[email protected]>
Reported-by: Venkat Rao Bagalkote <[email protected]>
Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Once the global hash is requested there is no way to switch back to
the per-task private hash. This is checked at the beginning of the function.

It is possible that two threads simultaneously request the global hash
and both pass the initial check and block later on the
mm::futex_hash_lock. In this case the first thread performs the switch
to the global hash. The second thread will also attempt to switch to the
global hash and, while doing so, access the nonexistent slot 1 of the
struct futex_private_hash.
The same applies if the hash is made immutable: There is no reference
counting and the hash must not be replaced.

Verify under mm_struct::futex_phash that neither the global hash nor an
immutable hash is in use.

Tested-by: "Lai, Yi" <[email protected]>
Reported-by: "Lai, Yi" <[email protected]>
Closes: https://lore.kernel.org/all/aDwDw9Aygqo6oAx+@ly-workstation/
Fixes: bd54df5 ("futex: Allow to resize the private local hash")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Specify the properties which are essential and which are not for the
Tegra I2C driver to function correctly. This was not added correctly when
the TXT binding was converted to yaml. All the existing DT nodes have
these properties already and hence this does not break the ABI.

dmas and dma-names, which were specified as mandatory in the TXT binding,
are now optional since the driver can work in PIO mode if dmas are
missing.

Fixes: f10a9b7 ("dt-bindings: i2c: tegra: Convert to json-schema")
Signed-off-by: Akhil R <[email protected]>
Cc: <[email protected]> # v5.17+
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Convert the I2C subsystem to drop using the 'master_'-prefixed callbacks
in favor of the simplified ones. Fix alignment of '=' while here.

Signed-off-by: Wolfram Sang <[email protected]>
The perf_fuzzer found a hard-lockup crash on a RaptorLake machine:

  Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000
  CPU: 23 UID: 0 PID: 0 Comm: swapper/23
  Tainted: [W]=WARN
  Hardware name: Dell Inc. Precision 9660/0VJ762
  RIP: 0010:native_read_pmc+0x7/0x40
  Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ...
  RSP: 000:fffb03100273de8 EFLAGS: 00010046
  ....
  Call Trace:
    <TASK>
    icl_update_topdown_event+0x165/0x190
    ? ktime_get+0x38/0xd0
    intel_pmu_read_event+0xf9/0x210
    __perf_event_read+0xf9/0x210

CPUs 16-23 are E-core CPUs that don't support the perf metrics feature.
The icl_update_topdown_event() should not be invoked on these CPUs.

It's a regression of commit:

  f9bdf1f ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")

The bug introduced by that commit is that the is_topdown_event() function
is mistakenly used to replace the is_topdown_count() call to check if the
topdown functions for the perf metrics feature should be invoked.

Fix it.

Fixes: f9bdf1f ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
Closes: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Vince Weaver <[email protected]>
Cc: [email protected] # v6.15+
Link: https://lore.kernel.org/r/[email protected]
Commit 788019e ("genirq: Retain disable depth for managed interrupts
across CPU hotplug") intended to only decrement the disable depth once per
managed shutdown, but instead it decrements for each CPU hotplug in the
affinity mask, until its depth reaches a point where it finally gets
re-started.

For example, consider:

1. Interrupt is affine to CPU {M,N}
2. disable_irq() -> depth is 1
3. CPU M goes offline -> interrupt migrates to CPU N / depth is still 1
4. CPU N goes offline -> irq_shutdown() / depth is 2
5. CPU N goes online
    -> irq_restore_affinity_of_irq()
       -> irqd_is_managed_and_shutdown()==true
          -> irq_startup_managed() -> depth is 1
6. CPU M goes online
    -> irq_restore_affinity_of_irq()
       -> irqd_is_managed_and_shutdown()==true
          -> irq_startup_managed() -> depth is 0
          *** BUG: driver expects the interrupt is still disabled ***
             -> irq_startup() -> irqd_clr_managed_shutdown()
7. enable_irq() -> depth underflow / unbalanced enable_irq() warning

This should clear the managed-shutdown flag at step 6, so that further
hotplugs don't cause further imbalance.

Note: It might be cleaner to also remove the irqd_clr_managed_shutdown()
invocation from __irq_startup_managed(). But this is currently not possible
because of irq_update_affinity_desc() as it sets IRQD_MANAGED_SHUTDOWN and
expects irq_startup() to clear it.

Fixes: 788019e ("genirq: Retain disable depth for managed interrupts across CPU hotplug")
Reported-by: Aleksandrs Vinarskis <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Aleksandrs Vinarskis <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
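The numbered sequence above can be replayed with a toy model of the disable depth. This is a sketch under assumptions: struct irq_model and its helpers are hypothetical, and the "fixed" variant clears the managed-shutdown flag on the first managed startup, which is one reading of the change described here.

```c
#include <stdbool.h>

/* Hypothetical model of one managed interrupt. */
struct irq_model {
	int depth;		/* disable depth */
	bool managed_shutdown;	/* set while shut down by CPU hotplug */
};

/* The last CPU in the affinity mask goes offline. */
void model_shutdown(struct irq_model *irq)
{
	irq->depth++;
	irq->managed_shutdown = true;
}

/* A CPU in the affinity mask comes back online. */
void model_cpu_online(struct irq_model *irq, bool clear_flag_early)
{
	if (!irq->managed_shutdown)
		return;
	if (clear_flag_early)
		irq->managed_shutdown = false;	/* fixed behaviour */
	if (--irq->depth == 0)
		irq->managed_shutdown = false;	/* the real startup clears it */
}

/* Replay steps 2 to 6 and report the depth seen before enable_irq(). */
int depth_after_hotplug_cycle(bool clear_flag_early)
{
	struct irq_model irq = { .depth = 0, .managed_shutdown = false };

	irq.depth = 1;					/* 2. disable_irq() */
	/* 3. CPU M offline: interrupt migrates, nothing changes */
	model_shutdown(&irq);				/* 4. CPU N offline */
	model_cpu_online(&irq, clear_flag_early);	/* 5. CPU N online */
	model_cpu_online(&irq, clear_flag_early);	/* 6. CPU M online */
	return irq.depth;
}
```

In the unfixed variant the depth reaches 0 although the driver never called enable_irq(); with the flag cleared early it stays at 1, so the later enable_irq() balances the original disable_irq().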
Commit 788019e ("genirq: Retain disable depth for managed interrupts
across CPU hotplug") tried to make managed shutdown/startup properly
reference counted, but it missed the fact that the unplug and hotplug code
has an intentional imbalance by skipping IRQS_SUSPENDED interrupts on
the "restore" path.

This means that if a managed-affinity interrupt was both suspended and
managed-shutdown (such as may happen during system suspend / S3), resume
skips calling irq_startup_managed(), and would again have an unbalanced
depth this time, with a positive value (i.e., remaining unexpectedly
masked).

This IRQS_SUSPENDED check was introduced in commit a60dd06
("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity")
for essentially the same reason as commit 788019e, to prevent that
irq_startup() would unconditionally re-enable an interrupt too early.

Because irq_startup_managed() now respects the disable-depth count, the
IRQS_SUSPENDED check is no longer needed, and instead, it causes harm.

Thus, drop the IRQS_SUSPENDED check, and restore balance.

This effectively reverts commit a60dd06 ("genirq/cpuhotplug: Skip
suspended interrupts when restoring affinity"), because it is replaced
by commit 788019e ("genirq: Retain disable depth for managed
interrupts across CPU hotplug").

Fixes: 788019e ("genirq: Retain disable depth for managed interrupts across CPU hotplug")
Reported-by: Aleksandrs Vinarskis <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Aleksandrs Vinarskis <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Closes: https://lore.kernel.org/lkml/[email protected]/
Initialize `ops` member's pointers properly by using kzalloc() instead of
kmalloc() when allocating the simulation work context. Otherwise the
pointers contain random content leading to invalid dereferencing.

Signed-off-by: Gyeyoung Baek <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
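A userspace analogue of the kmalloc() versus kzalloc() difference, using calloc() in place of kzalloc(). struct sim_work and struct sim_ops are hypothetical shapes, not the kernel's structures.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical callback table and work context. */
struct sim_ops {
	void (*start)(void);
	void (*stop)(void);
};

struct sim_work {
	struct sim_ops ops;
	int irq;
};

/*
 * calloc() returns zeroed memory, so on Linux the callback pointers start
 * out as NULL and can be tested before use. malloc() would leave them
 * holding whatever bytes were in the allocation.
 */
struct sim_work *alloc_sim_work(void)
{
	return calloc(1, sizeof(struct sim_work));
}

/* Report whether a fresh context has both callbacks unset. */
bool fresh_work_has_null_ops(void)
{
	struct sim_work *work = alloc_sim_work();
	bool ok;

	if (!work)
		return false;
	ok = work->ops.start == NULL && work->ops.stop == NULL;
	free(work);
	return ok;
}
```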
…esctrl subsystem

In the resctrl subsystem's Sub-NUMA Cluster (SNC) mode, the rdt_mon_domain
structure representing a NUMA node relies on the cacheinfo interface
(rdt_mon_domain::ci) to store L3 cache information (e.g., shared_cpu_map)
for monitoring. The L3 cache information of a SNC NUMA node determines
which domains are summed for the "top level" L3-scoped events.

rdt_mon_domain::ci is initialized using the first online CPU of a NUMA
node. When this CPU goes offline, its shared_cpu_map is cleared to contain
only the offline CPU itself. Subsequently, attempting to read counters
via smp_call_on_cpu(offline_cpu) fails (and the error is ignored), returning
zero values for "top-level events" without any error indication.

Replace the cacheinfo references in struct rdt_mon_domain and struct
rmid_read with the cacheinfo ID (a unique identifier for the L3 cache).

rdt_domain_hdr::cpu_mask contains the online CPUs associated with that
domain. When reading "top-level events", select a CPU from
rdt_domain_hdr::cpu_mask and utilize its L3 shared_cpu_map to determine
valid CPUs for reading RMID counter via the MSR interface.

Considering all CPUs associated with the L3 cache improves the chances
of picking a housekeeping CPU on which the counter reading work can be
queued, avoiding an unnecessary IPI.

Fixes: 328ea68 ("x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files")
Signed-off-by: Qinyun Tan <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/[email protected]
AMD's Family 19h-based Models 70h-7fh support 4 unified memory controllers
(UMC) per processor die.

The amd64_edac driver, however, assumes only 2 UMCs are supported since
the max_mcs variable for these models has not been explicitly set to 4. This
results in incomplete or incorrect memory information being logged to dmesg by
the module during initialization in some instances.

Fixes: 6c79e42 ("EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh")
Closes: https://lore.kernel.org/all/[email protected]/
Reported-by: reox <[email protected]>
Signed-off-by: Avadhut Naik <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/[email protected]
As per the SBI specification, an SBI remote fence operation applies
to the entire address space if either:
1) start_addr and size are both 0
2) size is equal to 2^XLEN-1

From the above, only #1 is checked by SBI SFENCE calls so fix the
size parameter check in SBI SFENCE calls to cover #2 as well.

Fixes: 13acfec ("RISC-V: KVM: Add remote HFENCE functions based on VCPU requests")
Reviewed-by: Atish Patra <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Anup Patel <[email protected]>
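The two conditions listed above can be written as one predicate. A minimal sketch, assuming unsigned long is XLEN bits wide (so ULONG_MAX is 2^XLEN-1); fence_covers_all() is a made-up name.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * True when a remote fence request applies to the entire address space:
 * either start_addr and size are both 0, or size is 2^XLEN-1.
 */
bool fence_covers_all(unsigned long start_addr, unsigned long size)
{
	if (start_addr == 0 && size == 0)
		return true;
	if (size == ULONG_MAX)
		return true;
	return false;
}
```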
The SBI specification clearly states that SBI HFENCE calls should
return SBI_ERR_NOT_SUPPORTED when one of the target harts doesn't
support the hypervisor extension (aka nested virtualization in the case
of KVM RISC-V).

Fixes: c7fa3c4 ("RISC-V: KVM: Treat SBI HFENCE calls as NOPs")
Reviewed-by: Atish Patra <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Anup Patel <[email protected]>
…inux/kernel/git/andi.shyti/linux into i2c/for-current

i2c-host-fixes for v6.16-rc2

tegra: fix YAML conversion of device tree bindings
PTI uses separate ASIDs (aka. PCIDs) for kernel and user address
spaces. When the kernel needs to flush the user address space, it
just sets a bit in a bitmap and then flushes the entire PCID on
the next switch to userspace.

This bitmap is a single 'unsigned long' which is plenty for all 6
dynamic ASIDs. But, unfortunately, the INVLPGB support brings along a
bunch more user ASIDs, as many as ~2k more. The bitmap can't address
that many.

Fortunately, the bitmap is only needed for PTI and all the CPUs
with INVLPGB are AMD CPUs that aren't vulnerable to Meltdown and
don't need PTI. The only way someone can run into an issue in
practice is by booting with pti=on on a newer AMD CPU.

Disable INVLPGB if PTI is enabled. Avoid overrunning the small
bitmap.

Note: this will be fixed up properly by making the bitmap bigger.
For now, just avoid the mostly theoretical bug.

Fixes: 4afeb0e ("x86/mm: Enable broadcast TLB invalidation for multi-threaded processes")
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc:[email protected]
Link: https://lore.kernel.org/all/20250610222420.E8CBF472%40davehans-spike.ostc.intel.com
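The size limit described above follows from the bitmap being one unsigned long. A small sketch of the bound, with hypothetical helper names; it illustrates the limit, not the kernel's flush bookkeeping.

```c
#include <limits.h>
#include <stdbool.h>

/* Number of ASIDs a single unsigned long bitmap can track. */
#define BITMAP_BITS (sizeof(unsigned long) * CHAR_BIT)

bool asid_fits_in_bitmap(unsigned int asid)
{
	return asid < BITMAP_BITS;
}

/*
 * Mark a user ASID as needing a flush. Out-of-range ASIDs are refused
 * because shifting by the width of the type or more is undefined.
 */
bool mark_user_asid_for_flush(unsigned long *bitmap, unsigned int asid)
{
	if (!asid_fits_in_bitmap(asid))
		return false;
	*bitmap |= 1UL << asid;
	return true;
}
```

A handful of dynamic ASIDs fits easily, while an ASID in the thousands does not, which is why the message disables INVLPGB under PTI until the bitmap grows.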
Commit a82b264 ("x86/its: explicitly manage permissions for ITS
pages") reworks its_alloc() and introduces a typo in an ifdef
conditional, referring to CONFIG_MODULE instead of CONFIG_MODULES.

Fix this typo in its_alloc().

Fixes: a82b264 ("x86/its: explicitly manage permissions for ITS pages")
Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/all/20250616100432.22941-1-lukas.bulwahn%40redhat.com
The INVLPGB instruction has limits on how many pages it can invalidate
at once. That limit is enumerated in CPUID, read by the kernel, and
stored in 'invlpgb_count_max'. Ranged invalidations, like
invlpgb_kernel_range_flush(), break up their invalidations so
that they do not exceed the limit.

However, early boot code currently attempts to do ranged
invalidation before populating 'invlpgb_count_max'. There is a
for loop which is basically:

	for (...; addr < end; addr += invlpgb_count_max*PAGE_SIZE)

If invlpgb_kernel_range_flush is called before the kernel has read
the value of invlpgb_count_max from the hardware, the normally
bounded loop can become an infinite loop if invlpgb_count_max is
initialized to zero.

Fix that issue by initializing invlpgb_count_max to 1.

This way INVLPGB at early boot time will be a little bit slower
than normal (with initialized invlpgb_count_max), and not an
instant hang at bootup time.

Fixes: b7aa05c ("x86/mm: Add INVLPGB support code")
Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/all/20250606171112.4013261-3-riel%40surriel.com
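The loop shape quoted above can be tried in isolation. In this sketch PAGE_SIZE_MODEL and the MAX_BATCHES cap are assumptions added so that the zero-step case returns instead of spinning forever.

```c
#define PAGE_SIZE_MODEL 4096UL		/* assumed page size */
#define MAX_BATCHES     1000000L	/* cap standing in for "hangs" */

/*
 * Count how many batches a ranged flush would issue. A step of zero never
 * advances addr; the cap turns that endless loop into a -1 return value.
 */
long count_flush_batches(unsigned long start, unsigned long end,
			 unsigned long count_max)
{
	long batches = 0;
	unsigned long addr;

	for (addr = start; addr < end; addr += count_max * PAGE_SIZE_MODEL) {
		if (++batches > MAX_BATCHES)
			return -1;	/* no forward progress */
	}
	return batches;
}
```

With count_max set to 1 the loop issues one batch per page, slower but bounded, which matches the trade-off described above.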
Since smp_text_poke_single() does not expect that another
text_poke request is queued, it can leave text_poke_array
unsorted or cause a buffer overflow on text_poke_array.vec[].
This will cause an Oops in int3 because of bsearch failing:

   CPU 0                        CPU 1                      CPU 2
   -----                        -----                      -----

 smp_text_poke_batch_add()

			    smp_text_poke_single() <<-- Adds out of order

							<int3>
                                                	[Fails to find address
                                                        in text_poke_array ]
                                                        OOPS!

Or an unhandled page fault because of a buffer overflow:

   CPU 0                        CPU 1
   -----                        -----

 smp_text_poke_batch_add() <<+
 ...                         |
 smp_text_poke_batch_add() <<-- Adds TEXT_POKE_ARRAY_MAX times.

			     smp_text_poke_single() {
			     	__smp_text_poke_batch_add() <<-- Adds entry at
								TEXT_POKE_ARRAY_MAX + 1

                		smp_text_poke_batch_finish()
                        	  [Unhandled page fault because
				   text_poke_array.nr_entries is
				   overwritten]
				   BUG!
			     }

Use smp_text_poke_batch_add() instead of __smp_text_poke_batch_add()
so that it correctly flushes the queue if needed.

Closes: https://lore.kernel.org/all/CA+G9fYsLu0roY3DV=tKyqP7FEKbOEETRvTDhnpPxJGbA=Cg+4w@mail.gmail.com/
Fixes: c8976ad ("x86/alternatives: Simplify smp_text_poke_single() by using tp_vec and existing APIs")
Reported-by: Linux Kernel Functional Testing <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Tested-by: Linux Kernel Functional Testing <[email protected]>
Link: https://lkml.kernel.org/r/175020512308.3582717.13631440385506146631.stgit@mhiramat.tok.corp.google.com
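The difference between the raw add and the flushing add can be shown with a toy queue. Everything here is hypothetical: QUEUE_MAX stands in for TEXT_POKE_ARRAY_MAX, and "flush if needed" is assumed to mean "flush when the array is full or the new address would break ascending order".

```c
#include <stdbool.h>

#define QUEUE_MAX 4	/* hypothetical array capacity */

struct poke_queue {
	unsigned long addr[QUEUE_MAX];
	int nr_entries;
	int flushes;
};

/* Apply and drop everything queued so far. */
void queue_flush(struct poke_queue *q)
{
	if (q->nr_entries) {
		q->nr_entries = 0;
		q->flushes++;
	}
}

/*
 * Checked add: flush first when the array is full or the new address is
 * lower than the last one, so the array stays sorted and in bounds.
 */
void queue_add(struct poke_queue *q, unsigned long addr)
{
	bool full = q->nr_entries == QUEUE_MAX;
	bool unsorted = q->nr_entries > 0 &&
			q->addr[q->nr_entries - 1] > addr;

	if (full || unsorted)
		queue_flush(q);
	q->addr[q->nr_entries++] = addr;
}

/* Queue `count` ascending addresses and report how many remain queued. */
int entries_after_adding(int count)
{
	struct poke_queue q = { .nr_entries = 0, .flushes = 0 };

	for (int i = 0; i < count; i++)
		queue_add(&q, 0x1000UL + (unsigned long)i);
	return q.nr_entries;
}

/* Queue a high address, then a lower one, and report the flush count. */
int flushes_after_out_of_order_add(void)
{
	struct poke_queue q = { .nr_entries = 0, .flushes = 0 };

	queue_add(&q, 0x2000UL);
	queue_add(&q, 0x1000UL);
	return q.flushes;
}
```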
A kernel panic was reported with the following kernel log:

  EDAC igen6: Expected 2 mcs, but only 1 detected.
  BUG: unable to handle page fault for address: 000000000000d570
  ...
  Hardware name: Notebook V54x_6x_TU/V54x_6x_TU, BIOS Dasharo (coreboot+UEFI) v0.9.0 07/17/2024
  RIP: e030:ecclog_handler+0x7e/0xf0 [igen6_edac]
  ...
  igen6_probe+0x2a0/0x343 [igen6_edac]
  ...
  igen6_init+0xc5/0xff0 [igen6_edac]
  ...

This issue occurred because one memory controller was disabled by
the BIOS but the igen6_edac driver still checked all the memory
controllers, including this absent one, to identify the source of
the error. Accessing the null MMIO for the absent memory controller
resulted in the oops above.

Fix this issue by reverting the configuration structure to non-const
and updating the field 'res_cfg->num_imc' to reflect the number of
detected memory controllers.

Fixes: 20e190b ("EDAC/igen6: Skip absent memory controllers")
Reported-by: Marek Marczykowski-Górecki <[email protected]>
Closes: https://lore.kernel.org/all/aFFN7RlXkaK_loQb@mail-itl/
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Marek Marczykowski-Górecki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
… hostname when adding channels

When mounting a share with Kerberos authentication and multichannel
support, the share mounts correctly but fails to create secondary
channels. This occurs because the hostname is not populated when
adding the channels. The hostname is necessary for the userspace
cifs.upcall program to retrieve the required credentials and pass
them back to the kernel; without the hostname, secondary channels fail
to establish.

Cc: [email protected]
Reviewed-by: Shyam Prasad N <[email protected]>
Signed-off-by: Bharath SM <[email protected]>
Reported-by: xfuren <[email protected]>
Link: https://bugzilla.samba.org/show_bug.cgi?id=15824
Signed-off-by: Steve French <[email protected]>
… function

Commit 8bd25b6 ("smb: client: set correct d_type for reparse DFS/DFSR
and mount point") deduplicated the assignment of the fattr->cf_dtype member
from all places to the end of the function cifs_reparse_point_to_fattr(). The
only place that was missed is wsl_to_fattr(). Fix it.

Fixes: 8bd25b6 ("smb: client: set correct d_type for reparse DFS/DFSR and mount point")
Signed-off-by: Pali Rohár <[email protected]>
Signed-off-by: Steve French <[email protected]>
Wei-Lin reports that the tracking of shadow list registers is
majorly broken when resync'ing the L2 state after a run, as
we confuse the guest's LR index with the host's, potentially
losing the interrupt state.

While this could be fixed by adding yet another side index to
track it (Wei-Lin's fix), it may be better to refactor this
code to avoid having a side index altogether, limiting the
risk of introducing this class of bugs.

A key observation is that the shadow index is always the number
of bits in the lr_map bitmap. With that, the parallel indexing
scheme can be completely dropped.

While doing this, introduce a couple of helpers that abstract
the index conversion and some of the LR repainting, making the
whole exercise much simpler.

Reported-by: Wei-Lin Chang <[email protected]>
Reviewed-by: Wei-Lin Chang <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Explicitly treat type differences as GSI routing changes, as comparing MSI
data between two entries could get a false negative, e.g. if userspace
changed the type but left the type-specific data as-is.

Note, the same bug was fixed in x86 by commit bcda70c ("KVM: x86:
Explicitly treat routing entry type changes as changes").

Fixes: 4bf3693 ("KVM: arm64: Unmap vLPIs affected by changes to GSI routing information")
Signed-off-by: Sean Christopherson <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
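The false negative described above is easy to reproduce with a toy routing entry. The types below are hypothetical simplifications, not KVM's structures: the comparison checks the type first and looks at the MSI payload only when the types match.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified routing entry. */
enum route_type { ROUTE_IRQCHIP, ROUTE_MSI };

struct msi_payload {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct route_entry {
	enum route_type type;
	struct msi_payload msi;
};

/* A type change is a routing change, whatever the payload bytes say. */
bool route_changed(const struct route_entry *old, const struct route_entry *cur)
{
	if (old->type != cur->type)
		return true;
	return old->msi.address_lo != cur->msi.address_lo ||
	       old->msi.address_hi != cur->msi.address_hi ||
	       old->msi.data != cur->msi.data;
}

/* Same payload, different type: must be reported as changed. */
bool type_only_change_is_detected(void)
{
	struct route_entry old = { ROUTE_MSI, { 0x1000, 0, 42 } };
	struct route_entry cur = old;

	cur.type = ROUTE_IRQCHIP;
	return route_changed(&old, &cur);
}

/* Identical entries: must be reported as unchanged. */
bool identical_entries_are_unchanged(void)
{
	struct route_entry old = { ROUTE_MSI, { 0x1000, 0, 42 } };
	struct route_entry cur = old;

	return !route_changed(&old, &cur);
}
```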
mrutland-arm and others added 28 commits June 19, 2025 13:06
The NVHE/HVHE and VHE modes have separate implementations of
__activate_cptr_traps() and __deactivate_cptr_traps() in their
respective switch.c files. There's some duplication of logic, and it's
not currently possible to reuse this logic elsewhere.

Move the logic into the common switch.h header so that it can be reused,
and de-duplicate the common logic.

This rework changes the way SVE traps are deactivated in VHE mode,
aligning it with NVHE/HVHE modes:

* Before this patch, VHE's __deactivate_cptr_traps() would
  unconditionally enable SVE for host EL2 (but not EL0), regardless of
  whether the ARM64_SVE cpucap was set.

* After this patch, VHE's __deactivate_cptr_traps() will take the
  ARM64_SVE cpucap into account. When ARM64_SVE is not set, SVE will be
  trapped from EL2 and below.

The old and new behaviour are both benign:

* When ARM64_SVE is not set, the host will not touch SVE state, and will
  not reconfigure SVE traps. Host EL0 access to SVE will be trapped as
  expected.

* When ARM64_SVE is set, the host will configure EL0 SVE traps before
  returning to EL0 as part of reloading the EL0 FPSIMD/SVE/SME state.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Fuad Tabba <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
There's no need for fpsimd_sve_sync() to write to CPTR/CPACR. All
relevant traps are always disabled earlier within __kvm_vcpu_run(), when
__deactivate_cptr_traps() configures CPTR/CPACR.

With irrelevant details elided, the flow is:

handle___kvm_vcpu_run(...)
{
	flush_hyp_vcpu(...) {
		fpsimd_sve_flush(...);
	}

	__kvm_vcpu_run(...) {
		__activate_traps(...) {
			__activate_cptr_traps(...);
		}

		do {
			__guest_enter(...);
		} while (...);

		__deactivate_traps(....) {
			__deactivate_cptr_traps(...);
		}
	}

	sync_hyp_vcpu(...) {
		fpsimd_sve_sync(...);
	}
}

Remove the unnecessary write to CPTR/CPACR. An ISB is still necessary,
so a comment is added to describe this requirement.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Fuad Tabba <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
The hyp code FPSIMD/SVE/SME trap handling logic has some rather messy
open-coded manipulation of CPTR/CPACR. This is benign for non-nested
guests, but broken for nested guests, as the guest hypervisor's CPTR
configuration is not taken into account.

Consider the case where L0 provides FPSIMD+SVE to an L1 guest
hypervisor, and the L1 guest hypervisor only provides FPSIMD to an L2
guest (with L1 configuring CPTR/CPACR to trap SVE usage from L2). If the
L2 guest triggers an FPSIMD trap to the L0 hypervisor,
kvm_hyp_handle_fpsimd() will see that the vCPU supports FPSIMD+SVE, and
will configure CPTR/CPACR to NOT trap FPSIMD+SVE before returning to the
L2 guest. Consequently the L2 guest would be able to manipulate SVE
state even though the L1 hypervisor had configured CPTR/CPACR to forbid
this.

Clean this up, and fix the nested virt issue by always using
__deactivate_cptr_traps() and __activate_cptr_traps() to manage the CPTR
traps. This removes the need for the ad-hoc fixup in
kvm_hyp_save_fpsimd_host(), and ensures that any guest hypervisor
configuration of CPTR/CPACR is taken into account.
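
As a rough illustration of the nested case above, the traps in effect
while L2 runs can be modelled as the union of what L0 wants and what the
L1 guest hypervisor configured. This is a simplified userspace model, not
the kernel implementation; struct trap_cfg and effective_l2_traps() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one level's trap configuration. */
struct trap_cfg {
	bool trap_fpsimd;
	bool trap_sve;
};

/*
 * Effective traps while running an L2 guest: an access must trap if
 * either L0 or the L1 guest hypervisor asked for it to trap.
 */
static struct trap_cfg effective_l2_traps(struct trap_cfg l0,
					  struct trap_cfg l1)
{
	struct trap_cfg eff = {
		.trap_fpsimd = l0.trap_fpsimd || l1.trap_fpsimd,
		.trap_sve    = l0.trap_sve || l1.trap_sve,
	};

	/* Merging can only add traps, never drop one that L1 requested. */
	assert(!l1.trap_sve || eff.trap_sve);
	return eff;
}
```

In the scenario described, L0 traps nothing once the FPSIMD state is
loaded and L1 traps SVE only, so the merged result still traps SVE.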

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Fuad Tabba <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
We no longer use cpacr_clear_set().

Remove cpacr_clear_set() and its helper functions.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Fuad Tabba <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
The VHE hyp code has recently gained a few ISBs. Simplify this to one
unconditional ISB in __kvm_vcpu_run_vhe(), and remove the unnecessary
ISB from the kvm_call_hyp_ret() macro.

kvm_call_hyp_ret() is also used to invoke
__vgic_v3_get_gic_config(), but no ISB is necessary in that case either.

For the moment, an ISB is left in kvm_call_hyp(), as there are many more
users, and removing the ISB would require a more thorough audit.

Suggested-by: Marc Zyngier <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Fuad Tabba <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
…/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.16, take #3

- Fix another set of FP/SIMD/SVE bugs affecting NV, and plugging some
  missing synchronisation

- A small fix for the irqbypass hook fixes, tightening the check and
  ensuring that we only deal with MSI for both the old and the new
  route entry

- Rework the way the shadow LRs are addressed in a nesting
  configuration, plugging an embarrassing bug as well as simplifying
  the whole process

- Add yet another fix for the dreaded arch_timer_edge_cases selftest
 into HEAD

KVM/riscv fixes for 6.16, take #1

- Fix the size parameter check in SBI SFENCE calls
- Don't treat SBI HFENCE calls as NOPs
Add the new TDVMCALL status code TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED and
return it for unimplemented TDVMCALL subfunctions.

Returning TDVMCALL_STATUS_INVALID_OPERAND when a subfunction is not
implemented is ambiguous because TDX guests can't tell whether the error
is due to an unsupported subfunction or an invalid input to the
subfunction. The new GHCI spec adds TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED to
avoid the ambiguity. Use it instead of TDVMCALL_STATUS_INVALID_OPERAND.

Before the change, for common guest implementations, when a TDX guest
receives TDVMCALL_STATUS_INVALID_OPERAND, there are two possible cases:
1. Some operand is invalid. The guest could change the operand to another
   value and retry.
2. The subfunction is not supported.

For case 1, an invalid operand usually indicates a guest implementation
bug. Since the TDX guest can't tell which case applies, the best practice
for handling TDVMCALL_STATUS_INVALID_OPERAND is to stop calling that leaf,
treating the failure as fatal if the TDVMCALL is essential or ignoring
it if the TDVMCALL is optional.

With this change, TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED could be sent to
old TDX guests that do not know about it, but such guests are expected to
take the same action as for TDVMCALL_STATUS_INVALID_OPERAND.
Currently, no known TDX guest checks TDVMCALL_STATUS_INVALID_OPERAND
specifically; for example, Linux just checks for success.
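
The compatibility argument can be sketched from the guest side as follows.
This is a hypothetical guest policy for illustration only; the enum values
are stand-ins rather than the GHCI encodings, and handle_status() is not a
real kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in values; see the GHCI spec for real encodings. */
enum tdvmcall_status {
	STATUS_SUCCESS,
	STATUS_INVALID_OPERAND,
	STATUS_SUBFUNC_UNSUPPORTED,
};

enum guest_action {
	ACTION_CONTINUE,	/* call succeeded */
	ACTION_FATAL,		/* essential TDVMCALL failed */
	ACTION_IGNORE,		/* optional TDVMCALL failed, stop using it */
};

/* Hypothetical guest policy: only success is checked specifically. */
static enum guest_action handle_status(enum tdvmcall_status st,
					bool essential)
{
	if (st == STATUS_SUCCESS)
		return ACTION_CONTINUE;
	return essential ? ACTION_FATAL : ACTION_IGNORE;
}

/* The new status code leads to the same action as the old one. */
static void check_backward_compat(void)
{
	assert(handle_status(STATUS_SUBFUNC_UNSUPPORTED, true) ==
	       handle_status(STATUS_INVALID_OPERAND, true));
	assert(handle_status(STATUS_SUBFUNC_UNSUPPORTED, false) ==
	       handle_status(STATUS_INVALID_OPERAND, false));
}
```

A guest written this way behaves identically whichever of the two error
codes the host returns.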

Signed-off-by: Binbin Wu <[email protected]>
[Return it for untrapped KVM_HC_MAP_GPA_RANGE. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
Handle TDVMCALL for GetQuote to generate a TD-Quote.

GetQuote is a doorbell-like interface used by TDX guests to request that the
VMM generate a TD-Quote signed by a service hosting a TD-Quoting Enclave
operating on the host.  A TDX guest passes a TD Report (TDREPORT_STRUCT) in
a shared-memory area as a parameter.  The host VMM can access it and queue the
operation for a service hosting the TD-Quoting Enclave.  When completed, the
Quote is returned via the same shared-memory area.

KVM only checks that the GPA from the TDX guest has the shared-bit set and drops
the shared-bit before exiting to userspace to avoid bleeding the shared-bit
into KVM's exit ABI.  KVM forwards the request to userspace VMM (e.g. QEMU)
and userspace VMM queues the operation asynchronously.  KVM sets the return
code according to the 'ret' field set by userspace to notify the TDX guest
whether the request has been queued successfully or not.  When the request
has been queued successfully, the TDX guest can poll the status field in
the shared-memory area to check whether the Quote generation is completed
or not.  When completed, the generated Quote is returned via the same
buffer.
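
The shared-bit check and strip described above could look roughly like the
sketch below. The bit position (51) is an assumption for illustration, as
the real position depends on the guest's configured GPA width;
strip_shared_bit() is a hypothetical helper, not the KVM function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the shared bit is GPA bit 51. */
#define EXAMPLE_SHARED_BIT	(UINT64_C(1) << 51)

/*
 * Returns true and stores the GPA without the shared bit in *out if the
 * guest-supplied GPA is marked shared; returns false otherwise, in which
 * case *out is left untouched.
 */
static bool strip_shared_bit(uint64_t gpa, uint64_t *out)
{
	assert(out);
	if (!(gpa & EXAMPLE_SHARED_BIT))
		return false;
	*out = gpa & ~EXAMPLE_SHARED_BIT;
	return true;
}
```

Only the stripped value would be exposed to userspace, so the exit ABI
does not depend on where the shared bit lives.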

Add KVM_EXIT_TDX as a new exit reason to userspace. Userspace is
required to handle the KVM exit reason as the initial support for TDX,
by reentering KVM to ensure that the TDVMCALL is complete.  While at it,
add a note that KVM_EXIT_HYPERCALL also requires reentry with KVM_RUN.

Signed-off-by: Binbin Wu <[email protected]>
Tested-by: Mikko Ylinen <[email protected]>
Acked-by: Kai Huang <[email protected]>
[Adjust userspace API. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
Exit to userspace for TDG.VP.VMCALL<GetTdVmCallInfo> via KVM_EXIT_TDX,
to allow userspace to provide information about the support of
TDVMCALLs when r12 is 1 for the TDVMCALLs beyond the GHCI base API.

GHCI spec defines the GHCI base TDVMCALLs: <GetTdVmCallInfo>, <MapGPA>,
<ReportFatalError>, <Instruction.CPUID>, <#VE.RequestMMIO>,
<Instruction.HLT>, <Instruction.IO>, <Instruction.RDMSR> and
<Instruction.WRMSR>. They must be supported by VMM to support TDX guests.

For GetTdVmCallInfo
- When leaf (r12) to enumerate TDVMCALL functionality is set to 0,
  successful execution indicates all GHCI base TDVMCALLs listed above are
  supported.

  Update the KVM TDX document with the set of the GHCI base APIs.

- When leaf (r12) to enumerate TDVMCALL functionality is set to 1, it
  indicates the TDX guest is querying the supported TDVMCALLs beyond
  the GHCI base TDVMCALLs.
  Exit to userspace to let userspace set the TDVMCALL sub-function bit(s)
  in the leaf outputs accordingly.  KVM could set the TDVMCALL bit(s)
  supported by itself when the TDVMCALLs don't need support from userspace
  after returning from userspace and before entering the guest. Currently, no
  such TDVMCALLs are implemented, so KVM just sets the values returned from
  userspace.
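
A minimal sketch of the leaf dispatch described in the list above. All
names are hypothetical, and the handling of leaves other than 0 and 1 is
an assumption, since the text does not cover them.

```c
#include <assert.h>
#include <stdint.h>

enum info_result {
	INFO_DONE_IN_KERNEL,	/* leaf 0: base API, answered directly */
	INFO_EXIT_TO_USERSPACE,	/* leaf 1: userspace fills in the bits */
	INFO_REJECTED,		/* other leaves: assumption, not in the text */
};

static enum info_result get_td_vm_call_info(uint64_t leaf)
{
	switch (leaf) {
	case 0:
		return INFO_DONE_IN_KERNEL;
	case 1:
		return INFO_EXIT_TO_USERSPACE;
	default:
		return INFO_REJECTED;
	}
}

/*
 * After userspace returns for leaf 1, the kernel could OR in bits for
 * TDVMCALLs it implements by itself. The text says there are none today,
 * so kernel_bits would be 0 and the userspace value passes through.
 */
static uint64_t merge_leaf1_bits(uint64_t userspace_bits,
				 uint64_t kernel_bits)
{
	uint64_t merged = userspace_bits | kernel_bits;

	/* Bits reported by userspace are never dropped. */
	assert((merged & userspace_bits) == userspace_bits);
	return merged;
}
```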

Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Binbin Wu <[email protected]>
[Adjust userspace API. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
ath79_misc_irq_init() was defined but unused since commit 51fa4f8
("MIPS: ath79: drop legacy IRQ code"), so it's time to drop it.

The build also warns about a missing prototype of get_c0_perfcount_int().

Remove the stale leftover function and add the missing include.

Signed-off-by: Shiji Yang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/OSBPR01MB167032D2017645200787AAEBBC72A@OSBPR01MB1670.jpnprd01.prod.outlook.com
After fabc4ed, server_unresponsive has an additional condition that checks
whether the client needs to reconnect, based on server->lstrp. When the client
fails to reconnect for some time and aborts the connection, server->lstrp is
updated for the last time. In the following scenario, server->lstrp is too old,
which causes the next command to fail during re-negotiation rather than wait
for re-negotiation to complete.

1. mount -t cifs -o username=Everyone,echo_interval=10 //$server_ip/export /mnt
2. ssh $server_ip "echo b > /proc/sysrq-trigger &"
3. ls /mnt
4. sleep 21s
5. ssh $server_ip "service firewalld stop"
6. ls # return EHOSTDOWN

If the interval between steps 5 and 6 is too small, step 6 may trigger sending
a negotiation request. Before the background cifsd thread receives the
negotiation response from the server in cifs_readv_from_socket,
server_unresponsive may trigger cifs_reconnect, which causes step 6 to fail:

ls thread
----------------
  smb2_negotiate
    server->tcpStatus = CifsInNegotiate
    compound_send_recv
      wait_for_compound_request

cifsd thread
----------------
  cifs_readv_from_socket
    server_unresponsive
      server->tcpStatus == CifsInNegotiate && jiffies > server->lstrp + 20s
        cifs_reconnect
          cifs_abort_connection: mid_state = MID_RETRY_NEEDED

ls thread
----------------
      cifs_sync_mid_result return EAGAIN
  smb2_negotiate return EHOSTDOWN

Though server->lstrp means last server response time, it is also updated in
cifs_abort_connection and cifs_get_tcp_session. We can likewise update
server->lstrp before switching into CifsInNegotiate state to avoid the failure
in step 6.
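
The effect of refreshing lstrp can be shown with a small userspace model.
This is a sketch under simplifying assumptions: time is counted in seconds
instead of jiffies, and struct server, start_negotiate() and the refresh
flag are hypothetical stand-ins for the real cifs code.

```c
#include <assert.h>
#include <stdbool.h>

enum tcp_status { STATUS_NEED_NEGOTIATE, STATUS_IN_NEGOTIATE };

struct server {
	enum tcp_status status;
	unsigned long lstrp;	/* last response time, in seconds here */
};

#define NEGOTIATE_TIMEOUT 20UL	/* from the trace above: lstrp + 20s */

/* Mirrors the check shown in the cifsd trace above. */
static bool server_unresponsive(const struct server *s, unsigned long now)
{
	assert(s);
	return s->status == STATUS_IN_NEGOTIATE &&
	       now > s->lstrp + NEGOTIATE_TIMEOUT;
}

/* With refresh set, lstrp is updated before entering negotiation. */
static void start_negotiate(struct server *s, unsigned long now,
			    bool refresh)
{
	assert(s);
	if (refresh)
		s->lstrp = now;
	s->status = STATUS_IN_NEGOTIATE;
}
```

With a stale lstrp the server is considered unresponsive as soon as
negotiation starts; with the refresh, the full timeout applies again.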

Fixes: 7ccc146 ("smb: client: fix hang in wait_for_response() for negproto")
Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
Acked-by: Meetakshi Setiya <[email protected]>
Signed-off-by: zhangjian <[email protected]>
Signed-off-by: Steve French <[email protected]>
This fixes the following problem:

[  749.901015] [   T8673] run fstests cifs/001 at 2025-06-17 09:40:30
[  750.346409] [   T9870] ==================================================================
[  750.346814] [   T9870] BUG: KASAN: slab-out-of-bounds in smb_set_sge+0x2cc/0x3b0 [cifs]
[  750.347330] [   T9870] Write of size 8 at addr ffff888011082890 by task xfs_io/9870
[  750.347705] [   T9870]
[  750.348077] [   T9870] CPU: 0 UID: 0 PID: 9870 Comm: xfs_io Kdump: loaded Not tainted 6.16.0-rc2-metze.02+ #1 PREEMPT(voluntary)
[  750.348082] [   T9870] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  750.348085] [   T9870] Call Trace:
[  750.348086] [   T9870]  <TASK>
[  750.348088] [   T9870]  dump_stack_lvl+0x76/0xa0
[  750.348106] [   T9870]  print_report+0xd1/0x640
[  750.348116] [   T9870]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  750.348120] [   T9870]  ? kasan_complete_mode_report_info+0x26/0x210
[  750.348124] [   T9870]  kasan_report+0xe7/0x130
[  750.348128] [   T9870]  ? smb_set_sge+0x2cc/0x3b0 [cifs]
[  750.348262] [   T9870]  ? smb_set_sge+0x2cc/0x3b0 [cifs]
[  750.348377] [   T9870]  __asan_report_store8_noabort+0x17/0x30
[  750.348381] [   T9870]  smb_set_sge+0x2cc/0x3b0 [cifs]
[  750.348496] [   T9870]  smbd_post_send_iter+0x1990/0x3070 [cifs]
[  750.348625] [   T9870]  ? __pfx_smbd_post_send_iter+0x10/0x10 [cifs]
[  750.348741] [   T9870]  ? update_stack_state+0x2a0/0x670
[  750.348749] [   T9870]  ? cifs_flush+0x153/0x320 [cifs]
[  750.348870] [   T9870]  ? cifs_flush+0x153/0x320 [cifs]
[  750.348990] [   T9870]  ? update_stack_state+0x2a0/0x670
[  750.348995] [   T9870]  smbd_send+0x58c/0x9c0 [cifs]
[  750.349117] [   T9870]  ? __pfx_smbd_send+0x10/0x10 [cifs]
[  750.349231] [   T9870]  ? unwind_get_return_address+0x65/0xb0
[  750.349235] [   T9870]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[  750.349242] [   T9870]  ? arch_stack_walk+0xa7/0x100
[  750.349250] [   T9870]  ? stack_trace_save+0x92/0xd0
[  750.349254] [   T9870]  __smb_send_rqst+0x931/0xec0 [cifs]
[  750.349374] [   T9870]  ? kernel_text_address+0x173/0x190
[  750.349379] [   T9870]  ? kasan_save_stack+0x39/0x70
[  750.349382] [   T9870]  ? kasan_save_track+0x18/0x70
[  750.349385] [   T9870]  ? __kasan_slab_alloc+0x9d/0xa0
[  750.349389] [   T9870]  ? __pfx___smb_send_rqst+0x10/0x10 [cifs]
[  750.349508] [   T9870]  ? smb2_mid_entry_alloc+0xb4/0x7e0 [cifs]
[  750.349626] [   T9870]  ? cifs_call_async+0x277/0xb00 [cifs]
[  750.349746] [   T9870]  ? cifs_issue_write+0x256/0x610 [cifs]
[  750.349867] [   T9870]  ? netfs_do_issue_write+0xc2/0x340 [netfs]
[  750.349900] [   T9870]  ? netfs_advance_write+0x45b/0x1270 [netfs]
[  750.349929] [   T9870]  ? netfs_write_folio+0xd6c/0x1be0 [netfs]
[  750.349958] [   T9870]  ? netfs_writepages+0x2e9/0xa80 [netfs]
[  750.349987] [   T9870]  ? do_writepages+0x21f/0x590
[  750.349993] [   T9870]  ? filemap_fdatawrite_wbc+0xe1/0x140
[  750.349997] [   T9870]  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.350002] [   T9870]  smb_send_rqst+0x22e/0x2f0 [cifs]
[  750.350131] [   T9870]  ? __pfx_smb_send_rqst+0x10/0x10 [cifs]
[  750.350255] [   T9870]  ? local_clock_noinstr+0xe/0xd0
[  750.350261] [   T9870]  ? kasan_save_alloc_info+0x37/0x60
[  750.350268] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.350271] [   T9870]  ? _raw_spin_lock+0x81/0xf0
[  750.350275] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.350278] [   T9870]  ? smb2_setup_async_request+0x293/0x580 [cifs]
[  750.350398] [   T9870]  cifs_call_async+0x477/0xb00 [cifs]
[  750.350518] [   T9870]  ? __pfx_smb2_writev_callback+0x10/0x10 [cifs]
[  750.350636] [   T9870]  ? __pfx_cifs_call_async+0x10/0x10 [cifs]
[  750.350756] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.350760] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.350763] [   T9870]  ? __smb2_plain_req_init+0x933/0x1090 [cifs]
[  750.350891] [   T9870]  smb2_async_writev+0x15ff/0x2460 [cifs]
[  750.351008] [   T9870]  ? sched_clock_noinstr+0x9/0x10
[  750.351012] [   T9870]  ? local_clock_noinstr+0xe/0xd0
[  750.351018] [   T9870]  ? __pfx_smb2_async_writev+0x10/0x10 [cifs]
[  750.351144] [   T9870]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  750.351150] [   T9870]  ? _raw_spin_unlock+0xe/0x40
[  750.351154] [   T9870]  ? cifs_pick_channel+0x242/0x370 [cifs]
[  750.351275] [   T9870]  cifs_issue_write+0x256/0x610 [cifs]
[  750.351554] [   T9870]  ? cifs_issue_write+0x256/0x610 [cifs]
[  750.351677] [   T9870]  netfs_do_issue_write+0xc2/0x340 [netfs]
[  750.351710] [   T9870]  netfs_advance_write+0x45b/0x1270 [netfs]
[  750.351740] [   T9870]  ? rolling_buffer_append+0x12d/0x440 [netfs]
[  750.351769] [   T9870]  netfs_write_folio+0xd6c/0x1be0 [netfs]
[  750.351798] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.351804] [   T9870]  netfs_writepages+0x2e9/0xa80 [netfs]
[  750.351835] [   T9870]  ? __pfx_netfs_writepages+0x10/0x10 [netfs]
[  750.351864] [   T9870]  ? exit_files+0xab/0xe0
[  750.351867] [   T9870]  ? do_exit+0x148f/0x2980
[  750.351871] [   T9870]  ? do_group_exit+0xb5/0x250
[  750.351874] [   T9870]  ? arch_do_signal_or_restart+0x92/0x630
[  750.351879] [   T9870]  ? exit_to_user_mode_loop+0x98/0x170
[  750.351882] [   T9870]  ? do_syscall_64+0x2cf/0xd80
[  750.351886] [   T9870]  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.351890] [   T9870]  do_writepages+0x21f/0x590
[  750.351894] [   T9870]  ? __pfx_do_writepages+0x10/0x10
[  750.351897] [   T9870]  filemap_fdatawrite_wbc+0xe1/0x140
[  750.351901] [   T9870]  __filemap_fdatawrite_range+0xba/0x100
[  750.351904] [   T9870]  ? __pfx___filemap_fdatawrite_range+0x10/0x10
[  750.351912] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.351916] [   T9870]  filemap_write_and_wait_range+0x7d/0xf0
[  750.351920] [   T9870]  cifs_flush+0x153/0x320 [cifs]
[  750.352042] [   T9870]  filp_flush+0x107/0x1a0
[  750.352046] [   T9870]  filp_close+0x14/0x30
[  750.352049] [   T9870]  put_files_struct.part.0+0x126/0x2a0
[  750.352053] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.352058] [   T9870]  exit_files+0xab/0xe0
[  750.352061] [   T9870]  do_exit+0x148f/0x2980
[  750.352065] [   T9870]  ? __pfx_do_exit+0x10/0x10
[  750.352069] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.352072] [   T9870]  ? _raw_spin_lock_irq+0x8a/0xf0
[  750.352076] [   T9870]  do_group_exit+0xb5/0x250
[  750.352080] [   T9870]  get_signal+0x22d3/0x22e0
[  750.352086] [   T9870]  ? __pfx_get_signal+0x10/0x10
[  750.352089] [   T9870]  ? fpregs_assert_state_consistent+0x68/0x100
[  750.352101] [   T9870]  ? folio_add_lru+0xda/0x120
[  750.352105] [   T9870]  arch_do_signal_or_restart+0x92/0x630
[  750.352109] [   T9870]  ? __pfx_arch_do_signal_or_restart+0x10/0x10
[  750.352115] [   T9870]  exit_to_user_mode_loop+0x98/0x170
[  750.352118] [   T9870]  do_syscall_64+0x2cf/0xd80
[  750.352123] [   T9870]  ? __kasan_check_read+0x11/0x20
[  750.352126] [   T9870]  ? count_memcg_events+0x1b4/0x420
[  750.352132] [   T9870]  ? handle_mm_fault+0x148/0x690
[  750.352136] [   T9870]  ? _raw_spin_lock_irq+0x8a/0xf0
[  750.352140] [   T9870]  ? __kasan_check_read+0x11/0x20
[  750.352143] [   T9870]  ? fpregs_assert_state_consistent+0x68/0x100
[  750.352146] [   T9870]  ? irqentry_exit_to_user_mode+0x2e/0x250
[  750.352151] [   T9870]  ? irqentry_exit+0x43/0x50
[  750.352154] [   T9870]  ? exc_page_fault+0x75/0xe0
[  750.352160] [   T9870]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.352163] [   T9870] RIP: 0033:0x7858c94ab6e2
[  750.352167] [   T9870] Code: Unable to access opcode bytes at 0x7858c94ab6b8.
[  750.352175] [   T9870] RSP: 002b:00007858c9248ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000022
[  750.352179] [   T9870] RAX: fffffffffffffdfe RBX: 00007858c92496c0 RCX: 00007858c94ab6e2
[  750.352182] [   T9870] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  750.352184] [   T9870] RBP: 00007858c9248d10 R08: 0000000000000000 R09: 0000000000000000
[  750.352185] [   T9870] R10: 0000000000000000 R11: 0000000000000246 R12: fffffffffffffde0
[  750.352187] [   T9870] R13: 0000000000000020 R14: 0000000000000002 R15: 00007ffc072d2230
[  750.352191] [   T9870]  </TASK>
[  750.352195] [   T9870]
[  750.395206] [   T9870] Allocated by task 9870 on cpu 0 at 750.346406s:
[  750.395523] [   T9870]  kasan_save_stack+0x39/0x70
[  750.395532] [   T9870]  kasan_save_track+0x18/0x70
[  750.395536] [   T9870]  kasan_save_alloc_info+0x37/0x60
[  750.395539] [   T9870]  __kasan_slab_alloc+0x9d/0xa0
[  750.395543] [   T9870]  kmem_cache_alloc_noprof+0x13c/0x3f0
[  750.395548] [   T9870]  mempool_alloc_slab+0x15/0x20
[  750.395553] [   T9870]  mempool_alloc_noprof+0x135/0x340
[  750.395557] [   T9870]  smbd_post_send_iter+0x63e/0x3070 [cifs]
[  750.395694] [   T9870]  smbd_send+0x58c/0x9c0 [cifs]
[  750.395819] [   T9870]  __smb_send_rqst+0x931/0xec0 [cifs]
[  750.395950] [   T9870]  smb_send_rqst+0x22e/0x2f0 [cifs]
[  750.396081] [   T9870]  cifs_call_async+0x477/0xb00 [cifs]
[  750.396232] [   T9870]  smb2_async_writev+0x15ff/0x2460 [cifs]
[  750.396359] [   T9870]  cifs_issue_write+0x256/0x610 [cifs]
[  750.396492] [   T9870]  netfs_do_issue_write+0xc2/0x340 [netfs]
[  750.396544] [   T9870]  netfs_advance_write+0x45b/0x1270 [netfs]
[  750.396576] [   T9870]  netfs_write_folio+0xd6c/0x1be0 [netfs]
[  750.396608] [   T9870]  netfs_writepages+0x2e9/0xa80 [netfs]
[  750.396639] [   T9870]  do_writepages+0x21f/0x590
[  750.396643] [   T9870]  filemap_fdatawrite_wbc+0xe1/0x140
[  750.396647] [   T9870]  __filemap_fdatawrite_range+0xba/0x100
[  750.396651] [   T9870]  filemap_write_and_wait_range+0x7d/0xf0
[  750.396656] [   T9870]  cifs_flush+0x153/0x320 [cifs]
[  750.396787] [   T9870]  filp_flush+0x107/0x1a0
[  750.396791] [   T9870]  filp_close+0x14/0x30
[  750.396795] [   T9870]  put_files_struct.part.0+0x126/0x2a0
[  750.396800] [   T9870]  exit_files+0xab/0xe0
[  750.396803] [   T9870]  do_exit+0x148f/0x2980
[  750.396808] [   T9870]  do_group_exit+0xb5/0x250
[  750.396813] [   T9870]  get_signal+0x22d3/0x22e0
[  750.396817] [   T9870]  arch_do_signal_or_restart+0x92/0x630
[  750.396822] [   T9870]  exit_to_user_mode_loop+0x98/0x170
[  750.396827] [   T9870]  do_syscall_64+0x2cf/0xd80
[  750.396832] [   T9870]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.396836] [   T9870]
[  750.397150] [   T9870] The buggy address belongs to the object at ffff888011082800
                           which belongs to the cache smbd_request_0000000008f3bd7b of size 144
[  750.397798] [   T9870] The buggy address is located 0 bytes to the right of
                           allocated 144-byte region [ffff888011082800, ffff888011082890)
[  750.398469] [   T9870]
[  750.398800] [   T9870] The buggy address belongs to the physical page:
[  750.399141] [   T9870] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11082
[  750.399148] [   T9870] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[  750.399155] [   T9870] page_type: f5(slab)
[  750.399161] [   T9870] raw: 000fffffc0000000 ffff888022d65640 dead000000000122 0000000000000000
[  750.399165] [   T9870] raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
[  750.399169] [   T9870] page dumped because: kasan: bad access detected
[  750.399172] [   T9870]
[  750.399505] [   T9870] Memory state around the buggy address:
[  750.399863] [   T9870]  ffff888011082780: fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  750.400247] [   T9870]  ffff888011082800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  750.400618] [   T9870] >ffff888011082880: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  750.400982] [   T9870]                          ^
[  750.401370] [   T9870]  ffff888011082900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  750.401774] [   T9870]  ffff888011082980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  750.402171] [   T9870] ==================================================================
[  750.402696] [   T9870] Disabling lock debugging due to kernel taint
[  750.403202] [   T9870] BUG: unable to handle page fault for address: ffff8880110a2000
[  750.403797] [   T9870] #PF: supervisor write access in kernel mode
[  750.404204] [   T9870] #PF: error_code(0x0003) - permissions violation
[  750.404581] [   T9870] PGD 5ce01067 P4D 5ce01067 PUD 5ce02067 PMD 78aa063 PTE 80000000110a2021
[  750.404969] [   T9870] Oops: Oops: 0003 [#1] SMP KASAN PTI
[  750.405394] [   T9870] CPU: 0 UID: 0 PID: 9870 Comm: xfs_io Kdump: loaded Tainted: G    B               6.16.0-rc2-metze.02+ #1 PREEMPT(voluntary)
[  750.406510] [   T9870] Tainted: [B]=BAD_PAGE
[  750.406967] [   T9870] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  750.407440] [   T9870] RIP: 0010:smb_set_sge+0x15c/0x3b0 [cifs]
[  750.408065] [   T9870] Code: 48 83 f8 ff 0f 84 b0 00 00 00 48 ba 00 00 00 00 00 fc ff df 4c 89 e1 48 c1 e9 03 80 3c 11 00 0f 85 69 01 00 00 49 8d 7c 24 08 <49> 89 04 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f
[  750.409283] [   T9870] RSP: 0018:ffffc90005e2e758 EFLAGS: 00010246
[  750.409803] [   T9870] RAX: ffff888036c53400 RBX: ffffc90005e2e878 RCX: 1ffff11002214400
[  750.410323] [   T9870] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff8880110a2008
[  750.411217] [   T9870] RBP: ffffc90005e2e798 R08: 0000000000000001 R09: 0000000000000400
[  750.411770] [   T9870] R10: ffff888011082800 R11: 0000000000000000 R12: ffff8880110a2000
[  750.412325] [   T9870] R13: 0000000000000000 R14: ffffc90005e2e888 R15: ffff88801a4b6000
[  750.412901] [   T9870] FS:  0000000000000000(0000) GS:ffff88812bc68000(0000) knlGS:0000000000000000
[  750.413477] [   T9870] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  750.414077] [   T9870] CR2: ffff8880110a2000 CR3: 000000005b0a6005 CR4: 00000000000726f0
[  750.414654] [   T9870] Call Trace:
[  750.415211] [   T9870]  <TASK>
[  750.415748] [   T9870]  smbd_post_send_iter+0x1990/0x3070 [cifs]
[  750.416449] [   T9870]  ? __pfx_smbd_post_send_iter+0x10/0x10 [cifs]
[  750.417128] [   T9870]  ? update_stack_state+0x2a0/0x670
[  750.417685] [   T9870]  ? cifs_flush+0x153/0x320 [cifs]
[  750.418380] [   T9870]  ? cifs_flush+0x153/0x320 [cifs]
[  750.419055] [   T9870]  ? update_stack_state+0x2a0/0x670
[  750.419624] [   T9870]  smbd_send+0x58c/0x9c0 [cifs]
[  750.420297] [   T9870]  ? __pfx_smbd_send+0x10/0x10 [cifs]
[  750.420936] [   T9870]  ? unwind_get_return_address+0x65/0xb0
[  750.421456] [   T9870]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[  750.421954] [   T9870]  ? arch_stack_walk+0xa7/0x100
[  750.422460] [   T9870]  ? stack_trace_save+0x92/0xd0
[  750.422948] [   T9870]  __smb_send_rqst+0x931/0xec0 [cifs]
[  750.423579] [   T9870]  ? kernel_text_address+0x173/0x190
[  750.424056] [   T9870]  ? kasan_save_stack+0x39/0x70
[  750.424813] [   T9870]  ? kasan_save_track+0x18/0x70
[  750.425323] [   T9870]  ? __kasan_slab_alloc+0x9d/0xa0
[  750.425831] [   T9870]  ? __pfx___smb_send_rqst+0x10/0x10 [cifs]
[  750.426548] [   T9870]  ? smb2_mid_entry_alloc+0xb4/0x7e0 [cifs]
[  750.427231] [   T9870]  ? cifs_call_async+0x277/0xb00 [cifs]
[  750.427882] [   T9870]  ? cifs_issue_write+0x256/0x610 [cifs]
[  750.428909] [   T9870]  ? netfs_do_issue_write+0xc2/0x340 [netfs]
[  750.429425] [   T9870]  ? netfs_advance_write+0x45b/0x1270 [netfs]
[  750.429882] [   T9870]  ? netfs_write_folio+0xd6c/0x1be0 [netfs]
[  750.430345] [   T9870]  ? netfs_writepages+0x2e9/0xa80 [netfs]
[  750.430809] [   T9870]  ? do_writepages+0x21f/0x590
[  750.431239] [   T9870]  ? filemap_fdatawrite_wbc+0xe1/0x140
[  750.431652] [   T9870]  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.432041] [   T9870]  smb_send_rqst+0x22e/0x2f0 [cifs]
[  750.432586] [   T9870]  ? __pfx_smb_send_rqst+0x10/0x10 [cifs]
[  750.433108] [   T9870]  ? local_clock_noinstr+0xe/0xd0
[  750.433482] [   T9870]  ? kasan_save_alloc_info+0x37/0x60
[  750.433855] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.434214] [   T9870]  ? _raw_spin_lock+0x81/0xf0
[  750.434561] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.434903] [   T9870]  ? smb2_setup_async_request+0x293/0x580 [cifs]
[  750.435394] [   T9870]  cifs_call_async+0x477/0xb00 [cifs]
[  750.435892] [   T9870]  ? __pfx_smb2_writev_callback+0x10/0x10 [cifs]
[  750.436388] [   T9870]  ? __pfx_cifs_call_async+0x10/0x10 [cifs]
[  750.436881] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.437237] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.437579] [   T9870]  ? __smb2_plain_req_init+0x933/0x1090 [cifs]
[  750.438062] [   T9870]  smb2_async_writev+0x15ff/0x2460 [cifs]
[  750.438557] [   T9870]  ? sched_clock_noinstr+0x9/0x10
[  750.438906] [   T9870]  ? local_clock_noinstr+0xe/0xd0
[  750.439293] [   T9870]  ? __pfx_smb2_async_writev+0x10/0x10 [cifs]
[  750.439786] [   T9870]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  750.440143] [   T9870]  ? _raw_spin_unlock+0xe/0x40
[  750.440495] [   T9870]  ? cifs_pick_channel+0x242/0x370 [cifs]
[  750.440989] [   T9870]  cifs_issue_write+0x256/0x610 [cifs]
[  750.441492] [   T9870]  ? cifs_issue_write+0x256/0x610 [cifs]
[  750.441987] [   T9870]  netfs_do_issue_write+0xc2/0x340 [netfs]
[  750.442387] [   T9870]  netfs_advance_write+0x45b/0x1270 [netfs]
[  750.442969] [   T9870]  ? rolling_buffer_append+0x12d/0x440 [netfs]
[  750.443376] [   T9870]  netfs_write_folio+0xd6c/0x1be0 [netfs]
[  750.443768] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.444145] [   T9870]  netfs_writepages+0x2e9/0xa80 [netfs]
[  750.444541] [   T9870]  ? __pfx_netfs_writepages+0x10/0x10 [netfs]
[  750.444936] [   T9870]  ? exit_files+0xab/0xe0
[  750.445312] [   T9870]  ? do_exit+0x148f/0x2980
[  750.445672] [   T9870]  ? do_group_exit+0xb5/0x250
[  750.446028] [   T9870]  ? arch_do_signal_or_restart+0x92/0x630
[  750.446402] [   T9870]  ? exit_to_user_mode_loop+0x98/0x170
[  750.446762] [   T9870]  ? do_syscall_64+0x2cf/0xd80
[  750.447132] [   T9870]  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.447499] [   T9870]  do_writepages+0x21f/0x590
[  750.447859] [   T9870]  ? __pfx_do_writepages+0x10/0x10
[  750.448236] [   T9870]  filemap_fdatawrite_wbc+0xe1/0x140
[  750.448595] [   T9870]  __filemap_fdatawrite_range+0xba/0x100
[  750.448953] [   T9870]  ? __pfx___filemap_fdatawrite_range+0x10/0x10
[  750.449336] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.449697] [   T9870]  filemap_write_and_wait_range+0x7d/0xf0
[  750.450062] [   T9870]  cifs_flush+0x153/0x320 [cifs]
[  750.450592] [   T9870]  filp_flush+0x107/0x1a0
[  750.450952] [   T9870]  filp_close+0x14/0x30
[  750.451322] [   T9870]  put_files_struct.part.0+0x126/0x2a0
[  750.451678] [   T9870]  ? __pfx__raw_spin_lock+0x10/0x10
[  750.452033] [   T9870]  exit_files+0xab/0xe0
[  750.452401] [   T9870]  do_exit+0x148f/0x2980
[  750.452751] [   T9870]  ? __pfx_do_exit+0x10/0x10
[  750.453109] [   T9870]  ? __kasan_check_write+0x14/0x30
[  750.453459] [   T9870]  ? _raw_spin_lock_irq+0x8a/0xf0
[  750.453787] [   T9870]  do_group_exit+0xb5/0x250
[  750.454082] [   T9870]  get_signal+0x22d3/0x22e0
[  750.454406] [   T9870]  ? __pfx_get_signal+0x10/0x10
[  750.454709] [   T9870]  ? fpregs_assert_state_consistent+0x68/0x100
[  750.455031] [   T9870]  ? folio_add_lru+0xda/0x120
[  750.455347] [   T9870]  arch_do_signal_or_restart+0x92/0x630
[  750.455656] [   T9870]  ? __pfx_arch_do_signal_or_restart+0x10/0x10
[  750.455967] [   T9870]  exit_to_user_mode_loop+0x98/0x170
[  750.456282] [   T9870]  do_syscall_64+0x2cf/0xd80
[  750.456591] [   T9870]  ? __kasan_check_read+0x11/0x20
[  750.456897] [   T9870]  ? count_memcg_events+0x1b4/0x420
[  750.457280] [   T9870]  ? handle_mm_fault+0x148/0x690
[  750.457616] [   T9870]  ? _raw_spin_lock_irq+0x8a/0xf0
[  750.457925] [   T9870]  ? __kasan_check_read+0x11/0x20
[  750.458297] [   T9870]  ? fpregs_assert_state_consistent+0x68/0x100
[  750.458672] [   T9870]  ? irqentry_exit_to_user_mode+0x2e/0x250
[  750.459191] [   T9870]  ? irqentry_exit+0x43/0x50
[  750.459600] [   T9870]  ? exc_page_fault+0x75/0xe0
[  750.460130] [   T9870]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  750.460570] [   T9870] RIP: 0033:0x7858c94ab6e2
[  750.461206] [   T9870] Code: Unable to access opcode bytes at 0x7858c94ab6b8.
[  750.461780] [   T9870] RSP: 002b:00007858c9248ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000022
[  750.462327] [   T9870] RAX: fffffffffffffdfe RBX: 00007858c92496c0 RCX: 00007858c94ab6e2
[  750.462653] [   T9870] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  750.462969] [   T9870] RBP: 00007858c9248d10 R08: 0000000000000000 R09: 0000000000000000
[  750.463290] [   T9870] R10: 0000000000000000 R11: 0000000000000246 R12: fffffffffffffde0
[  750.463640] [   T9870] R13: 0000000000000020 R14: 0000000000000002 R15: 00007ffc072d2230
[  750.463965] [   T9870]  </TASK>
[  750.464285] [   T9870] Modules linked in: siw ib_uverbs ccm cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs softdog vboxsf vboxguest cpuid intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec polyval_clmulni ghash_clmulni_intel sha1_ssse3 aesni_intel rapl i2c_piix4 i2c_smbus joydev input_leds mac_hid sunrpc binfmt_misc kvm_intel kvm irqbypass sch_fq_codel efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs ip_tables x_tables autofs4 hid_generic vboxvideo usbhid drm_vram_helper psmouse vga16fb vgastate drm_ttm_helper serio_raw hid ahci libahci ttm pata_acpi video wmi [last unloaded: vboxguest]
[  750.467127] [   T9870] CR2: ffff8880110a2000

cc: Tom Talpey <[email protected]>
cc: [email protected]
Reviewed-by: David Howells <[email protected]>
Reviewed-by: Tom Talpey <[email protected]>
Fixes: c45ebd6 ("cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs")
Signed-off-by: Stefan Metzmacher <[email protected]>
Signed-off-by: Steve French <[email protected]>
Fix cifs_prepare_write() to negotiate the wsize if it is unset.

Reviewed-by: Shyam Prasad N <[email protected]>
Reviewed-by: Bharath SM <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Paulo Alcantara <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
Signed-off-by: Steve French <[email protected]>
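The "negotiate if unset" idea in the commit above can be sketched in plain userspace C. This is only an illustration under assumptions: `sketch_ctx`, `sketch_negotiate_wsize()` and `sketch_prepare_write_size()` are hypothetical names, and treating 0 as "unset" is an assumption of the sketch, not a statement about the cifs code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mount context; only the field this sketch needs. */
struct sketch_ctx {
    unsigned int wsize; /* 0 means "not negotiated yet" in this sketch */
};

/* Hypothetical negotiation step: clamp a preferred size to the server limit. */
static unsigned int sketch_negotiate_wsize(unsigned int preferred,
                                           unsigned int server_max)
{
    return preferred < server_max ? preferred : server_max;
}

/* Return a usable write size, negotiating it first if it is still unset. */
unsigned int sketch_prepare_write_size(struct sketch_ctx *ctx,
                                       unsigned int preferred,
                                       unsigned int server_max)
{
    assert(ctx != NULL);
    if (ctx->wsize == 0)
        ctx->wsize = sketch_negotiate_wsize(preferred, server_max);
    return ctx->wsize;
}
```

Once a value has been negotiated, later calls reuse it instead of negotiating again.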
Under low-memory conditions, close_all_cached_dirs() can't move the
dentries to a separate list to dput() them once the locks are dropped.
This will result in a "Dentry still in use" error, so add an error
message that makes it clear this is what happened:

[  495.281119] CIFS: VFS: \\otters.example.com\share Out of memory while dropping dentries
[  495.281595] ------------[ cut here ]------------
[  495.281887] BUG: Dentry ffff888115531138{i=78,n=/}  still in use (2) [unmount of cifs cifs]
[  495.282391] WARNING: CPU: 1 PID: 2329 at fs/dcache.c:1536 umount_check+0xc8/0xf0

Also, bail out of looping through all tcons as soon as a single
allocation fails, since we're already in trouble, and kmalloc() attempts
for subsequent tcons are likely to fail just like the first one did.

Signed-off-by: Paul Aurich <[email protected]>
Acked-by: Bharath SM <[email protected]>
Suggested-by: Ruben Devos <[email protected]>
Cc: [email protected]
Signed-off-by: Steve French <[email protected]>
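The "bail out as soon as a single allocation fails" behaviour described above can be sketched roughly as follows. This is a userspace illustration, not the cifs code: `sketch_entry`, `sketch_alloc()`, `sketch_alloc_budget` and `sketch_collect()` are hypothetical, and `malloc()` stands in for `kmalloc()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node, one per source (think: one per tcon). */
struct sketch_entry {
    int source;
    struct sketch_entry *next;
};

/* Hypothetical failure injection: -1 means unlimited allocations. */
int sketch_alloc_budget = -1;

static void *sketch_alloc(size_t size)
{
    if (sketch_alloc_budget == 0)
        return NULL;            /* simulate memory pressure */
    if (sketch_alloc_budget > 0)
        sketch_alloc_budget--;
    return malloc(size);
}

/*
 * Queue one entry per source. Stop at the first failed allocation
 * instead of trying the remaining sources, and report -ENOMEM so the
 * caller can log what happened.
 */
int sketch_collect(int nr_sources, struct sketch_entry **head, int *queued)
{
    assert(head != NULL && queued != NULL);
    *head = NULL;
    *queued = 0;
    for (int i = 0; i < nr_sources; i++) {
        struct sketch_entry *e = sketch_alloc(sizeof(*e));

        if (e == NULL)
            return -ENOMEM;     /* bail out of the whole loop */
        e->source = i;
        e->next = *head;
        *head = e;
        (*queued)++;
    }
    return 0;
}

/* Free whatever was queued before the failure (or everything on success). */
void sketch_release(struct sketch_entry *head)
{
    while (head != NULL) {
        struct sketch_entry *next = head->next;

        free(head);
        head = next;
    }
}
```

Entries queued before the failure stay on the list, so the caller can still process them after logging the error.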
Change the pos field in struct cached_dirents from int to loff_t
to support large directory offsets. This avoids overflow and
matches kernel conventions for directory positions.

Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
Signed-off-by: Bharath SM <[email protected]>
Signed-off-by: Steve French <[email protected]>
Replaced hardcoded length with sizeof(flags_string).

Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
Signed-off-by: Bharath SM <[email protected]>
Signed-off-by: Steve French <[email protected]>
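A minimal sketch of the sizeof pattern, assuming a local array named `flags_string` as in the commit subject. The buffer size, the output format and the `sketch_render_flags()` helper are hypothetical and only serve to show the idiom.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a flags word into a caller-supplied buffer and return the
 * resulting string length.
 */
size_t sketch_render_flags(unsigned int flags, char *out, size_t out_len)
{
    char flags_string[16];  /* hypothetical size */

    assert(out != NULL && out_len > 0);

    /*
     * sizeof(flags_string) follows the declaration, so resizing the
     * array cannot leave a stale hardcoded length behind.
     */
    memset(flags_string, 0, sizeof(flags_string));
    snprintf(flags_string, sizeof(flags_string), "flags=0x%x", flags);

    snprintf(out, out_len, "%s", flags_string);
    return strlen(out);
}
```

Note that sizeof only yields the array size while `flags_string` is a true array in scope; on a pointer parameter it would return the pointer size, which is why `out_len` is passed explicitly.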
Replaced hardcoded value 16 with SMB2_NTLMV2_SESSKEY_SIZE
in the auth_key definition and memcpy call.

Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
Signed-off-by: Bharath SM <[email protected]>
Signed-off-by: Steve French <[email protected]>
If spacemit_i2c_xfer_msg() times out waiting for a message transfer to
complete, or if the hardware reports an error, it returns a negative
error code (-ETIMEDOUT, -EAGAIN, -ENXIO, or -EIO).

The sole caller of spacemit_i2c_xfer_msg() is spacemit_i2c_xfer(),
which is the i2c_algorithm->xfer callback function.  It currently
does not save the value returned by spacemit_i2c_xfer_msg().

The result is that transfer errors go unreported, and a caller
has no indication anything is wrong.

When this code was out for review, the return value *was* checked
in early versions.  But for some reason, that assignment got dropped
between versions 5 and 6 of the series, perhaps related to reworking
the code to merge spacemit_i2c_xfer_core() into spacemit_i2c_xfer().

Simply assigning the value returned to "ret" fixes the problem.

Fixes: 5ea5584 ("i2c: spacemit: add support for SpacemiT K1 SoC")
Signed-off-by: Alex Elder <[email protected]>
Cc: <[email protected]> # v6.15+
Reviewed-by: Troy Mitchell <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Andi Shyti <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
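The bug pattern described above, a per-message return value that is never saved, can be sketched in userspace C like this. `sketch_msg`, `sketch_xfer_msg()` and `sketch_xfer()` are hypothetical stand-ins, not the driver's code. The convention that an xfer callback returns the number of messages on success and a negative errno on failure is the usual one for i2c_algorithm.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical message: 'result' is what the fake hardware reports. */
struct sketch_msg {
    int result; /* 0 on success, negative errno on failure */
};

/* Stand-in for the per-message transfer helper. */
static int sketch_xfer_msg(const struct sketch_msg *msg)
{
    return msg->result;
}

/*
 * Stand-in for the xfer callback: returns the number of messages
 * transferred, or the first negative error code.
 */
int sketch_xfer(const struct sketch_msg *msgs, int num)
{
    int ret = 0;

    assert(msgs != NULL || num == 0);
    for (int i = 0; i < num; i++) {
        /* Without this assignment, 'ret' stays 0 and errors are lost. */
        ret = sketch_xfer_msg(&msgs[i]);
        if (ret < 0)
            break;
    }
    return ret < 0 ? ret : num;
}
```

With the assignment in place, a timeout on the second message reaches the caller as -ETIMEDOUT instead of being reported as a successful two-message transfer.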
…ench/cifs-2.6

Pull smb client fixes from Steve French:

 - Multichannel channel allocation fix for Kerberos mounts

 - Two reconnect fixes

 - Fix netfs_writepages crash with smbdirect/RDMA

 - Directory caching fix

 - Three minor cleanup fixes

 - Log error when close cached dirs fails

* tag 'v6.16-rc2-smb3-client-fixes-v2' of git://git.samba.org/sfrench/cifs-2.6:
  smb: minor fix to use SMB2_NTLMV2_SESSKEY_SIZE for auth_key size
  smb: minor fix to use sizeof to initialize flags_string buffer
  smb: Use loff_t for directory position in cached_dirents
  smb: Log an error when close_all_cached_dirs fails
  cifs: Fix prepare_write to negotiate wsize if needed
  smb: client: fix max_sge overflow in smb_extract_folioq_to_rdma()
  smb: client: fix first command failure during re-negotiation
  cifs: Remove duplicate fattr->cf_dtype assignment from wsl_to_fattr() function
  smb: fix secondary channel creation issue with kerberos by populating hostname when adding channels
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix another set of FP/SIMD/SVE bugs affecting NV, and plugging some
     missing synchronisation

   - A small fix for the irqbypass hook fixes, tightening the check and
     ensuring that we only deal with MSI for both the old and the new
     route entry

   - Rework the way the shadow LRs are addressed in a nesting
     configuration, plugging an embarrassing bug as well as simplifying
     the whole process

   - Add yet another fix for the dreaded arch_timer_edge_cases selftest

  RISC-V:

   - Fix the size parameter check in SBI SFENCE calls

   - Don't treat SBI HFENCE calls as NOPs

  x86 TDX:

   - Complete API for handling complex TDVMCALLs in userspace.

     This was delayed because the spec lacked a way for userspace to
     deny supporting these calls; the new exit code is now approved"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: TDX: Exit to userspace for GetTdVmCallInfo
  KVM: TDX: Handle TDG.VP.VMCALL<GetQuote>
  KVM: TDX: Add new TDVMCALL status code for unsupported subfuncs
  KVM: arm64: VHE: Centralize ISBs when returning to host
  KVM: arm64: Remove cpacr_clear_set()
  KVM: arm64: Remove ad-hoc CPTR manipulation from kvm_hyp_handle_fpsimd()
  KVM: arm64: Remove ad-hoc CPTR manipulation from fpsimd_sve_sync()
  KVM: arm64: Reorganise CPTR trap manipulation
  KVM: arm64: VHE: Synchronize CPTR trap deactivation
  KVM: arm64: VHE: Synchronize restore of host debug registers
  KVM: arm64: selftests: Close the GIC FD in arch_timer_edge_cases
  KVM: arm64: Explicitly treat routing entry type changes as changes
  KVM: arm64: nv: Fix tracking of shadow list registers
  RISC-V: KVM: Don't treat SBI HFENCE calls as NOPs
  RISC-V: KVM: Fix the size parameter check in SBI SFENCE calls
…/linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:

 - amd64: Correct the number of memory controllers on some AMD Zen
   clients

 - igen6: Handle firmware-disabled memory controllers properly

* tag 'edac_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/igen6: Fix NULL pointer dereference
  EDAC/amd64: Correct number of UMCs for family 19h models 70h-7fh
…scm/linux/kernel/git/tip/tip

Pull locking fixes from Borislav Petkov:

 - Make sure the switch to the global hash is requested always under a
   lock so that two threads requesting that simultaneously cannot get to
   an inconsistent state

 - Reject negative NUMA nodes earlier in the futex NUMA interface
   handling code

 - Selftests fixes

* tag 'locking_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Verify under the lock if hash can be replaced
  futex: Handle invalid node numbers supplied by user
  selftests/futex: Set the home_node in futex_numa_mpol
  selftests/futex: getopt() requires int as return value.
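As a rough illustration of rejecting invalid node numbers early, the sketch below validates a node value that arrives from userspace as an unsigned 32-bit word. Everything here is an assumption for illustration: the `SKETCH_NO_NODE` sentinel, the `SKETCH_MAX_NODES` limit and the `sketch_check_user_node()` helper are hypothetical and do not reproduce the futex code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_NO_NODE   (-1)   /* hypothetical "no preference" sentinel */
#define SKETCH_MAX_NODES 64     /* hypothetical upper bound */

/*
 * Validate a node number read from userspace before it is used as an
 * index. Returns 0 if acceptable, -EINVAL otherwise.
 */
int sketch_check_user_node(uint32_t raw)
{
    /*
     * Reinterpret as signed so large unsigned values show up as
     * negative numbers (two's complement, as on common platforms).
     */
    int node = (int)raw;

    if (node == SKETCH_NO_NODE)
        return 0;               /* sentinel is explicitly allowed */
    if (node < 0 || node >= SKETCH_MAX_NODES)
        return -EINVAL;         /* reject before any table lookup */
    return 0;
}
```

Checking the range before the value is used as an index keeps a negative node number from ever reaching a per-node lookup.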
…/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Avoid a crash on a heterogeneous machine where not all cores support
   the same hw events features

 - Avoid a deadlock when throttling events

 - Document the perf event states more

 - Make sure a number of perf paths switching off or rescheduling events
   call perf_cgroup_event_disable()

 - Make sure perf does task sampling before its userspace mapping is
   torn down, and not after

* tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix crash in icl_update_topdown_event()
  perf: Fix the throttle error of some clock events
  perf: Add comment to enum perf_event_state
  perf/core: Fix WARN in perf_cgroup_switch()
  perf: Fix dangling cgroup pointer in cpuctx
  perf: Fix cgroup state vs ERROR
  perf: Fix sample vs do_exit()
…linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Fix missing prototypes warnings

 - Properly initialize work context when allocating it

 - Remove a method tracking when managed interrupts are suspended during
   hotplug, in favor of the code using an IRQ disable depth tracking now,
   and have interrupts get properly enabled again on restore

 - Make sure multiple CPUs getting hotplugged don't cause wrong tracking
   of the managed IRQ disable depth

* tag 'irq_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/ath79-misc: Fix missing prototypes warnings
  genirq/irq_sim: Initialize work context pointers properly
  genirq/cpuhotplug: Restore affinity even for suspended IRQ
  genirq/cpuhotplug: Rebalance managed interrupts across multi-CPU hotplug
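The disable depth tracking mentioned in the irq pull above is, in essence, a nesting counter: an interrupt is only unmasked again when every disable has been matched by an enable. A minimal userspace sketch, with hypothetical `sketch_irq` names rather than the genirq structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interrupt state. */
struct sketch_irq {
    unsigned int depth; /* number of outstanding disables */
    bool masked;        /* what the "hardware" currently sees */
};

/* The first disable masks the line; nested disables only count. */
void sketch_irq_disable(struct sketch_irq *irq)
{
    assert(irq != NULL);
    if (irq->depth++ == 0)
        irq->masked = true;
}

/* Only the enable that brings the depth back to zero unmasks the line. */
void sketch_irq_enable(struct sketch_irq *irq)
{
    assert(irq != NULL && irq->depth > 0);
    if (--irq->depth == 0)
        irq->masked = false;
}
```

If two CPUs going offline each disable the same managed interrupt, it stays masked until both disables have been undone, which is why unbalanced accounting across multi-CPU hotplug leaves an interrupt stuck.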
…linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Make sure the array tracking which kernel text positions need to be
   alternatives-patched doesn't get mishandled by out-of-order
   modifications, leading to it overflowing and causing page faults when
   patching

 - Avoid an infinite loop when early code does a ranged TLB invalidation
   before the limit on how many pages a broadcast TLB invalidation can
   flush has been read from CPUID

 - Fix a CONFIG_MODULES typo

 - Disable broadcast TLB invalidation when PTI is enabled to avoid an
   overflow of the bitmap tracking dynamic ASIDs which need to be
   flushed when the kernel switches between the user and kernel address
   space

 - Handle the case of a CPU going offline and thus reporting zeroes when
   reading top-level events in the resctrl code

* tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Fix int3 handling failure from broken text_poke array
  x86/mm: Fix early boot use of INVPLGB
  x86/its: Fix an ifdef typo in its_alloc()
  x86/mm: Disable INVLPGB when PTI is enabled
  x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem
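The infinite loop described in the x86 pull above is the classic failure of a ranged loop whose step size is still zero. The sketch below shows the shape of the problem and one possible guard. The names are hypothetical, and treating an unread limit as 1 is an assumption of this sketch, not a claim about how the kernel fix is written.

```c
#include <assert.h>

/*
 * Count how many flush operations are needed to cover nr_pages when
 * each operation can cover at most max_per_flush pages.
 */
long sketch_count_flushes(unsigned long nr_pages, unsigned long max_per_flush)
{
    long flushes = 0;

    /*
     * If the limit has not been read yet it is still 0, and a loop
     * that advances by min(nr_pages, 0) would never terminate.
     * Fall back to one page per operation.
     */
    if (max_per_flush == 0)
        max_per_flush = 1;

    while (nr_pages > 0) {
        unsigned long step = nr_pages < max_per_flush ? nr_pages
                                                      : max_per_flush;

        assert(step > 0);   /* guarantees forward progress */
        nr_pages -= step;
        flushes++;
    }
    return flushes;
}
```

With the guard, an unread limit only makes the loop slower (one page per operation) rather than endless.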
…rnel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - subsystem: convert drivers to use recent callbacks of struct
   i2c_algorithm. A typical after-rc1 cleanup, which I couldn't send in
   time for rc2

 - tegra: fix YAML conversion of device tree bindings

 - k1: re-add a check which got lost during upstreaming

* tag 'i2c-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: k1: check for transfer error
  i2c: use inclusive callbacks in struct i2c_algorithm
  dt-bindings: i2c: nvidia,tegra20-i2c: Specify the required properties
@pull pull bot added the ⤵️ pull label Jun 23, 2025
@pull pull bot merged commit 86731a2 into AwesomeGitHubRepos:master Jun 23, 2025