linux-stable-mirror

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2026-05-14 21:38:46 +02:00

Author	SHA1	Message	Date
Rafael J. Wysocki	771e8f4835	Merge branches 'acpi-cppc' and 'acpi-docs' Merge two documentation fixes for 6.18-rc5, a commet typo fix in the ACPI CPPC library (Chu Guangqing) and fixes for two ASL examples in the firmware guide (Jonas Gorski). * acpi-cppc: ACPI: CPPC: Fix typo in a comment * acpi-docs: Documentation: ACPI: i2c-muxes: fix I2C device references	2025-11-06 22:10:27 +01:00
Linus Torvalds	a1388fcb52	Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fixes from Eric Biggers: "Two Curve25519 related fixes: - Re-enable KASAN support on curve25519-hacl64.c with gcc. - Disable the arm optimized Curve25519 code on CPU_BIG_ENDIAN kernels. It has always been broken in that configuration" * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: arm/curve25519: Disable on CPU_BIG_ENDIAN lib/crypto: curve25519-hacl64: Fix older clang KASAN workaround for GCC	2025-11-06 12:48:18 -08:00
Linus Torvalds	c668da99b9	Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux Pull fscrypt fix from Eric Biggers: "Fix an UBSAN warning that started occurring when the block layer started supporting logical_block_size > PAGE_SIZE" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: fix left shift underflow when inode->i_blkbits > PAGE_SHIFT	2025-11-06 12:45:15 -08:00
Gao Xiang	f2a12cc3b9	erofs: avoid infinite loop due to incomplete zstd-compressed data Currently, the decompression logic incorrectly spins if compressed data is truncated in crafted (deliberately corrupted) images. Fixes: `7c35de4df1` ("erofs: Zstandard compression support") Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/r/50958.1761605413@localhost Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org>	2025-11-07 04:10:45 +08:00
Linus Torvalds	c90841db35	Merge tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: "This is a work-around for a (now fixed) corner case in the arm32 build with Clang KCFI enabled. - Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)" * tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk() ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS compiler_types: Introduce __nocfi_generic	2025-11-06 11:54:59 -08:00
Johannes Berg	a9da90e618	wifi: mac80211: reject address change while connecting While connecting, the MAC address can already no longer be changed. The change is already rejected if netif_carrier_ok(), but of course that's not true yet while connecting. Check for auth_data or assoc_data, so the MAC address cannot be changed. Also more comprehensively check that there are no stations on the interface being changed - if any peer station is added it will know about our address already, so we cannot change it. Cc: stable@vger.kernel.org Fixes: `3c06e91b40` ("wifi: mac80211: Support POWERED_ADDR_CHANGE feature") Link: https://patch.msgid.link/20251105154119.f9f6c1df81bb.I9bb3760ede650fb96588be0d09a5a7bdec21b217@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-11-06 19:07:47 +01:00
Krzysztof Kozlowski	4436f484cb	gpio: tb10x: Drop unused tb10x_set_bits() function tb10x_set_bits() is not referenced anywhere leading to W=1 warning: gpio-tb10x.c:59:20: error: unused function 'tb10x_set_bits' [-Werror,-Wunused-function] After its removal, tb10x_reg_write() becomes unused as well. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20251106-gpio-of-match-v1-1-50c7115a045e@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-11-06 18:19:44 +01:00
Haotian Zhang	3dc8c73365	ASoC: codecs: va-macro: fix resource leak in probe error path In the commit referenced by the Fixes tag, clk_hw_get_clk() was added in va_macro_probe() to get the fsgen clock, but forgot to add the corresponding clk_put() in va_macro_remove(). This leads to a clock reference leak when the driver is unloaded. Switch to devm_clk_hw_get_clk() to automatically manage the clock resource. Fixes: `30097967e0` ("ASoC: codecs: va-macro: use fsgen as clock") Suggested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20251106143114.729-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org>	2025-11-06 17:07:05 +00:00
Wayne Lin	3c6a743c69	drm/amd/display: Enable mst when it's detected but yet to be initialized [Why] drm_dp_mst_topology_queue_probe() is used under the assumption that mst is already initialized. If we connect system with SST first then switch to the mst branch during suspend, we will fail probing topology by calling the wrong API since the mst manager is yet to be initialized. [How] At dm_resume(), once it's detected as mst branc connected, check if the mst is initialized already. If not, call dm_helpers_dp_mst_start_top_mgr() instead to initialize mst V2: Adjust the commit msg a bit Fixes: `bc068194f5` ("drm/amd/display: Don't write DP_MSTM_CTRL after LT") Cc: Fangzhi Zuo <jerry.zuo@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `62320fb8d9`) Cc: stable@vger.kernel.org	2025-11-06 11:58:55 -05:00
Lijo Lazar	570a66b48c	drm/amdgpu: Fix wait after reset sequence in S3 For a mode-1 reset done at the end of S3 on PSPv11 dGPUs, only check if TOS is unloaded. Fixes: `32f73741d6` ("drm/amdgpu: Wait for bootloader after PSPv11 reset") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4649 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1ad25fd272`)	2025-11-06 11:58:32 -05:00
Mario Limonciello	b09cb2996c	drm/amd: Fix suspend failure with secure display TA commit `c760bcda83` ("drm/amd: Check whether secure display TA loaded successfully") attempted to fix extra messages, but failed to port the cleanup that was in commit `5c6d52ff4b` ("drm/amd: Don't try to enable secure display TA multiple times") to prevent multiple tries. Add that to the failure handling path even on a quick failure. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4679 Fixes: `c760bcda83` ("drm/amd: Check whether secure display TA loaded successfully") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `4104c0a454`)	2025-11-06 11:58:10 -05:00
Samuel Zhang	eb6e7f520d	drm/amdgpu: fix gpu page fault after hibernation on PF passthrough On PF passthrough environment, after hibernate and then resume, coralgemm will cause gpu page fault. Mode1 reset happens during hibernate, but partition mode is not restored on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right after resume. When CP access the MQD BO, wrong stride size is used, this will cause out of bound access on the MQD BO, resulting page fault. The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called when resume from a hibernation. KFD resume is called separately during a reset recovery or resume from suspend sequence. Hence it's not required to be called as part of partition switch. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `5d1b32cfe4`)	2025-11-06 11:57:08 -05:00
Linus Torvalds	c2c2ccfd4b	Merge tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: Including fixes from bluetooth and wireless. Current release - new code bugs: - ptp: expose raw cycles only for clocks with free-running counter - bonding: fix null-deref in actor_port_prio setting - mdio: ERR_PTR-check regmap pointer returned by device_node_to_regmap() - eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG Previous releases - regressions: - virtio_net: fix perf regression due to bad alignment of virtio_net_hdr_v1_hash - Revert "wifi: ath10k: avoid unnecessary wait for service ready message" caused regressions for QCA988x and QCA9984 - Revert "wifi: ath12k: Fix missing station power save configuration" caused regressions for WCN7850 - eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory corruptions after kexec Previous releases - always broken: - virtio-net: fix received packet length check for big packets - sctp: fix races in socket diag handling - wifi: add an hrtimer-based delayed work item to avoid low granularity of timers set relatively far in the future, and use it where it matters (e.g. when performing AP-scheduled channel switch) - eth: mlx5e: - correctly propagate error in case of module EEPROM read failure - fix HW-GRO on systems with PAGE_SIZE == 64kB - dsa: b53: fixes for tagging, link configuration / RMII, FDB, multicast - phy: lan8842: implement latest errata" * tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits) selftests/vsock: avoid false-positives when checking dmesg net: bridge: fix MST static key usage net: bridge: fix use-after-free due to MST port state bypass lan966x: Fix sleeping in atomic context bonding: fix NULL pointer dereference in actor_port_prio setting net: dsa: microchip: Fix reserved multicast address table programming net: wan: framer: pef2256: Switch to devm_mfd_add_devices() net: libwx: fix device bus LAN ID net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages net/mlx5e: SHAMPO, Fix skb size check for 64K pages net/mlx5e: SHAMPO, Fix header mapping for 64K pages net: ti: icssg-prueth: Fix fdb hash size configuration net/mlx5e: Fix return value in case of module EEPROM read error net: gro_cells: Reduce lock scope in gro_cell_poll libie: depend on DEBUG_FS when building LIBIE_FWLOG wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup netpoll: Fix deadlock in memory allocation under spinlock net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error virtio-net: fix received length check in big packets bnxt_en: Fix warning in bnxt_dl_reload_down() ...	2025-11-06 08:52:30 -08:00
Nathan Chancellor	a26a6c93ed	kbuild: Strip trailing padding bytes from modules.builtin.modinfo After commit `d50f210913` ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"), running modules_install with certain versions of kmod (such as 29.1 in Ubuntu Jammy) in certain configurations may fail with: depmod: ERROR: kmod_builtin_iter_next: unexpected string without modname prefix The additional padding bytes to ensure .modinfo is aligned within vmlinux.unstripped are unexpected by kmod, as this section has always just been null-terminated strings. Strip the trailing padding bytes from modules.builtin.modinfo after it has been extracted from vmlinux.unstripped to restore the format that kmod expects while keeping .modinfo aligned within vmlinux.unstripped to avoid regressing the Authenticode calculation fix for EDK2. Cc: stable@vger.kernel.org Fixes: `d50f210913` ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat") Reported-by: Omar Sandoval <osandov@fb.com> Reported-by: Samir M <samir@linux.ibm.com> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/7fef7507-ad64-4e51-9bb8-c9fb6532e51e@linux.ibm.com/ Tested-by: Omar Sandoval <osandov@fb.com> Tested-by: Samir M <samir@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/20251105-kbuild-fix-builtin-modinfo-for-kmod-v1-1-b419d8ad4606@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>	2025-11-06 09:50:23 -07:00
Bobby Eshleman	3534e03e0e	selftests/vsock: avoid false-positives when checking dmesg Sometimes VMs will have some intermittent dmesg warnings that are unrelated to vsock. Change the dmesg parsing to filter on strings containing 'vsock' to avoid false positive failures that are unrelated to vsock. The downside is that it is possible for some vsock related warnings to not contain the substring 'vsock', so those will be missed. Fixes: `a4a65c6fe0` ("selftests/vsock: add initial vmtest.sh for vsock") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20251105-vsock-vmtest-dmesg-fix-v2-1-1a042a14892c@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:34:50 -08:00
Jakub Kicinski	13fef4fb05	Merge branch 'net-bridge-fix-two-mst-bugs' Nikolay Aleksandrov says: ==================== net: bridge: fix two MST bugs Patch 01 fixes a race condition that exists between expired fdb deletion and port deletion when MST is enabled. Learning can happen after the port's state has been changed to disabled which could lead to that port's memory being used after it's been freed. The issue was reported by syzbot, more information in patch 01. Patch 02 fixes an issue with MST's static key which Ido spotted, we can have multiple bridges with MST and a single bridge can erroneously disable it for all. ==================== Link: https://patch.msgid.link/20251105111919.1499702-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:32:20 -08:00
Nikolay Aleksandrov	ee87c63f9b	net: bridge: fix MST static key usage As Ido pointed out, the static key usage in MST is buggy and should use inc/dec instead of enable/disable because we can have multiple bridges with MST enabled which means a single bridge can disable MST for all. Use static_branch_inc/dec to avoid that. When destroying a bridge decrement the key if MST was enabled. Fixes: `ec7328b591` ("net: bridge: mst: Multiple Spanning Tree (MST) mode") Reported-by: Ido Schimmel <idosch@nvidia.com> Closes: https://lore.kernel.org/netdev/20251104120313.1306566-1-razor@blackwall.org/T/#m6888d87658f94ed1725433940f4f4ebb00b5a68b Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251105111919.1499702-3-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:32:17 -08:00
Nikolay Aleksandrov	8dca36978a	net: bridge: fix use-after-free due to MST port state bypass syzbot reported[1] a use-after-free when deleting an expired fdb. It is due to a race condition between learning still happening and a port being deleted, after all its fdbs have been flushed. The port's state has been toggled to disabled so no learning should happen at that time, but if we have MST enabled, it will bypass the port's state, that together with VLAN filtering disabled can lead to fdb learning at a time when it shouldn't happen while the port is being deleted. VLAN filtering must be disabled because we flush the port VLANs when it's being deleted which will stop learning. This fix adds a check for the port's vlan group which is initialized to NULL when the port is getting deleted, that avoids the port state bypass. When MST is enabled there would be a minimal new overhead in the fast-path because the port's vlan group pointer is cache-hot. [1] https://syzkaller.appspot.com/bug?extid=dd280197f0f7ab3917be Fixes: `ec7328b591` ("net: bridge: mst: Multiple Spanning Tree (MST) mode") Reported-by: syzbot+dd280197f0f7ab3917be@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69088ffa.050a0220.29fc44.003d.GAE@google.com/ Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251105111919.1499702-2-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:32:17 -08:00
Horatiu Vultur	0216721ce7	lan966x: Fix sleeping in atomic context The following warning was seen when we try to connect using ssh to the device. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:575 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 104, name: dropbear preempt_count: 1, expected: 0 INFO: lockdep is turned off. CPU: 0 UID: 0 PID: 104 Comm: dropbear Tainted: G W 6.18.0-rc2-00399-g6f1ab1b109b9-dirty #530 NONE Tainted: [W]=WARN Hardware name: Generic DT based system Call trace: unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x7c/0xac dump_stack_lvl from __might_resched+0x16c/0x2b0 __might_resched from __mutex_lock+0x64/0xd34 __mutex_lock from mutex_lock_nested+0x1c/0x24 mutex_lock_nested from lan966x_stats_get+0x5c/0x558 lan966x_stats_get from dev_get_stats+0x40/0x43c dev_get_stats from dev_seq_printf_stats+0x3c/0x184 dev_seq_printf_stats from dev_seq_show+0x10/0x30 dev_seq_show from seq_read_iter+0x350/0x4ec seq_read_iter from seq_read+0xfc/0x194 seq_read from proc_reg_read+0xac/0x100 proc_reg_read from vfs_read+0xb0/0x2b0 vfs_read from ksys_read+0x6c/0xec ksys_read from ret_fast_syscall+0x0/0x1c Exception stack(0xf0b11fa8 to 0xf0b11ff0) 1fa0: 00000001 00001000 00000008 be9048d8 00001000 00000001 1fc0: 00000001 00001000 00000008 00000003 be905920 0000001e 00000000 00000001 1fe0: 0005404c be9048c0 00018684 b6ec2cd8 It seems that we are using a mutex in a atomic context which is wrong. Change the mutex with a spinlock. Fixes: `12c2d0a5b8` ("net: lan966x: add ethtool configuration and statistics") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251105074955.1766792-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:31:34 -08:00
Nicolas Escande	9065b96875	wifi: ath11k: zero init info->status in wmi_process_mgmt_tx_comp() When reporting tx completion using ieee80211_tx_status_xxx() family of functions, the status part of the struct ieee80211_tx_info nested in the skb is used to report things like transmit rates & retry count to mac80211 On the TX data path, this is correctly memset to 0 before calling ieee80211_tx_status_ext(), but on the tx mgmt path this was not done. This leads to mac80211 treating garbage values as valid transmit counters (like tx retries for example) and accounting them as real statistics that makes their way to userland via station dump. The same issue was resolved in ath12k by commit `9903c0986f` ("wifi: ath12k: Add memset and update default rate value in wmi tx completion") Tested-on: QCN9074 PCI WLAN.HK.2.9.0.1-01977-QCAHKSWPL_SILICONZ-1 Fixes: `d5c65159f2` ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Signed-off-by: Nicolas Escande <nico.escande@gmail.com> Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Link: https://patch.msgid.link/20251104083957.717825-1-nico.escande@gmail.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>	2025-11-06 07:26:21 -08:00
Hangbin Liu	067bf016e9	bonding: fix NULL pointer dereference in actor_port_prio setting Liang reported an issue where setting a slave’s actor_port_prio to predefined values such as 0, 255, or 65535 would cause a system crash. The problem occurs because in bond_opt_parse(), when the provided value matches a predefined table entry, the function returns that table entry, which does not contain slave information. Later, in bond_option_actor_port_prio_set(), calling bond_slave_get_rtnl() leads to a NULL pointer dereference. Since actor_port_prio is defined as a u16 and initialized to the default value of 255 in ad_initialize_port(), there is no need for the bond_actor_port_prio_tbl. Using the BOND_OPTFLAG_RAWVAL flag is sufficient. Fixes: `6b6dc81ee7` ("bonding: add support for per-port LACP actor priority") Reported-by: Liang Li <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251105072620.164841-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:16:37 -08:00
Tristram Ha	96baf482ca	net: dsa: microchip: Fix reserved multicast address table programming KSZ9477/KSZ9897 and LAN937X families of switches use a reserved multicast address table for some specific forwarding with some multicast addresses, like the one used in STP. The hardware assumes the host port is the last port in KSZ9897 family and port 5 in LAN937X family. Most of the time this assumption is correct but not in other cases like KSZ9477. Originally the function just setups the first entry, but the others still need update, especially for one common multicast address that is used by PTP operation. LAN937x also uses different register bits when accessing the reserved table. Fixes: `457c182af5` ("net: dsa: microchip: generic access to ksz9477 static and reserved table") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Tested-by: Łukasz Majewski <lukma@nabladev.com> Link: https://patch.msgid.link/20251105033741.6455-1-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-06 07:11:36 -08:00
Sukrit Bhatnagar	d0164c1619	KVM: VMX: Fix check for valid GVA on an EPT violation On an EPT violation, bit 7 of the exit qualification is set if the guest linear-address is valid. The derived page fault error code should not be checked for this bit. Fixes: `f300948251` ("KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid") Cc: stable@vger.kernel.org Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://patch.msgid.link/20251106052853.3071088-1-Sukrit.Bhatnagar@sony.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-06 06:06:18 -08:00
Robin Gong	86d57d9c07	spi: imx: keep dma request disabled before dma transfer setup Since sdma hardware configure postpone to transfer phase, have to disable dma request before dma transfer setup because there is a hardware limitation on sdma event enable(ENBLn) as below: "It is thus essential for the Arm platform to program them before any DMA request is triggered to the SDMA, otherwise an unpredictable combination of channels may be started." Signed-off-by: Carlos Song <carlos.song@nxp.com> Signed-off-by: Robin Gong <yibin.gong@nxp.com> Link: https://patch.msgid.link/20251024055320.408482-1-carlos.song@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2025-11-06 13:13:57 +00:00
Niranjan H Y	84f5526e4d	ASoC: tas2783A: Fix issues in firmware parsing During firmware download, if the size of the firmware is too small, it wrongly assumes the firmware download is successful. If there is size mismatch with chunk's header, invalid memory is accessed. Fix these issues by throwing error during these cases. Fixes: `4cc9bd8d7b` (ASoc: tas2783A: Add soundwire based codec driver) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202510291226.2R3fbYNh-lkp@intel.com/ Signed-off-by: Niranjan H Y <niranjan.hy@ti.com> Link: https://patch.msgid.link/20251030151637.566-1-niranjan.hy@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>	2025-11-06 13:12:34 +00:00
Miaoqian Lin	1a58d865f4	ASoC: sdw_utils: fix device reference leak in is_sdca_endpoint_present() The bus_find_device_by_name() function returns a device pointer with an incremented reference count, but the original code was missing put_device() calls in some return paths, leading to reference count leaks. Fix this by ensuring put_device() is called before function exit after bus_find_device_by_name() succeeds This follows the same pattern used elsewhere in the kernel where bus_find_device_by_name() is properly paired with put_device(). Found via static analysis and code review. Fixes: `4f8ef33dd4` ("ASoC: soc_sdw_utils: skip the endpoint that doesn't present") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://patch.msgid.link/20251029071804.8425-1-linmq006@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2025-11-06 13:12:33 +00:00
Haotian Zhang	6b6eddc63c	ASoC: cs4271: Fix regulator leak on probe failure The probe function enables regulators at the beginning but fails to disable them in its error handling path. If any operation after enabling the regulators fails, the probe will exit with an error, leaving the regulators permanently enabled, which could lead to a resource leak. Add a proper error handling path to call regulator_bulk_disable() before returning an error. Fixes: `9a397f4736` ("ASoC: cs4271: add regulator consumer support") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20251105062246.1955-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org>	2025-11-06 13:12:32 +00:00
LiangCheng Wang	b750f5a9d6	drm/tiny: pixpaper: add explicit dependency on MMU The DRM_GEM_SHMEM_HELPER helper requires MMU enabled because it uses vmf_insert_pfn() in its mmap implementation. On NOMMU configurations (e.g. some RISC-V randconfig builds), this symbol is unavailable and selecting DRM_GEM_SHMEM_HELPER causes a modpost undefined reference: ERROR: modpost: "vmf_insert_pfn" [drivers/gpu/drm/drm_shmem_helper.ko] undefined! Normally, Kconfig prevents this helper from being selected when CONFIG_MMU=n. However, in some randconfig builds (such as those used by 0day CI), select statements can override unmet dependencies, triggering the issue. Add an explicit dependency on MMU to DRM_PIXPAPER to prevent this. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510280213.0rlYA4T3-lkp@intel.com/ Fixes: `0c4932f6dd` ("drm/tiny: pixpaper: Fix missing dependency on DRM_GEM_SHMEM_HELPER") Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: LiangCheng Wang <zaq14760@gmail.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251028-bar-v1-1-edfbd13fafff@gmail.com	2025-11-06 13:47:29 +01:00
Peter Zijlstra	4cb5ac2626	futex: Optimize per-cpu reference counting Shrikanth noted that the per-cpu reference counter was still some 10% slower than the old immutable option (which removes the reference counting entirely). Further optimize the per-cpu reference counter by: - switching from RCU to preempt; - using __this_cpu_() since we now have preempt disabled; - switching from smp_load_acquire() to READ_ONCE(). This is all safe because disabling preemption inhibits the RCU grace period exactly like rcu_read_lock(). Having preemption disabled allows using __this_cpu_() provided the only access to the variable is in task context -- which is the case here. Furthermore, since we know changing fph->state to FR_ATOMIC demands a full RCU grace period we can rely on the implied smp_mb() from that to replace the acquire barrier(). This is very similar to the percpu_down_read_internal() fast-path. The reason this is significant for PowerPC is that it uses the generic this_cpu_() implementation which relies on local_irq_disable() (the x86 implementation relies on it being a single memop instruction to be IRQ-safe). Switching to preempt_disable() and __this_cpu() avoids this IRQ state swizzling. Also, PowerPC needs LWSYNC for the ACQUIRE barrier, not having to use explicit barriers safes a bunch. Combined this reduces the performance gap by half, down to some 5%. Fixes: `760e6f7bef` ("futex: Remove support for IMMUTABLE") Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20251106092929.GR4067720@noisy.programming.kicks-ass.net	2025-11-06 12:30:54 +01:00
Aaron Lu	956dfda6a7	sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining When a cfs_rq is to be throttled, its limbo list should be empty and that's why there is a warn in tg_throttle_down() for non empty cfs_rq->throttled_limbo_list. When running a test with the following hierarchy: root / \ A* ... / \| \ ... B / \ C* where both A and C have quota settings, that warn on non empty limbo list is triggered for a cfs_rq of C, let's call it cfs_rq_c(and ignore the cpu part of the cfs_rq for the sake of simpler representation). Debug showed it happened like this: Task group C is created and quota is set, so in tg_set_cfs_bandwidth(), cfs_rq_c is initialized with runtime_enabled set, runtime_remaining equals to 0 and unthrottled. Before any tasks are enqueued to cfs_rq_c, multiple throttled tasks can migrate to cfs_rq_c (e.g., due to task group changes). When enqueue_task_fair(cfs_rq_c, throttled_task) is called and cfs_rq_c is in a throttled hierarchy (e.g., A is throttled), these throttled tasks are directly placed into cfs_rq_c's limbo list by enqueue_throttled_task(). Later, when A is unthrottled, tg_unthrottle_up(cfs_rq_c) enqueues these tasks. The first enqueue triggers check_enqueue_throttle(), and with zero runtime_remaining, cfs_rq_c can be throttled in throttle_cfs_rq() if it can't get more runtime and enters tg_throttle_down(), where the warning is hit due to remaining tasks in the limbo list. I think it's a chaos to trigger throttle on unthrottle path, the status of a being unthrottled cfs_rq can be in a mixed state in the end, so fix this by granting 1ns to cfs_rq in tg_set_cfs_bandwidth(). This ensures cfs_rq_c has a positive runtime_remaining when initialized as unthrottled and cannot enter tg_unthrottle_up() with zero runtime_remaining. Also, update outdated comments in tg_throttle_down() since unthrottle_cfs_rq() is no longer called with zero runtime_remaining. While at it, remove a redundant assignment to se in tg_throttle_down(). Fixes: `e1fad12dcb` ("sched/fair: Switch to task based throttle model") Reviewed-By: Benjamin Segall <bsegall@google.com> Suggested-by: Benjamin Segall <bsegall@google.com> Signed-off-by: Aaron Lu <ziqianlu@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Hao Jia <jiahao1@lixiang.com> Link: https://patch.msgid.link/20251030032755.560-1-ziqianlu@bytedance.com	2025-11-06 12:30:52 +01:00
Takashi Iwai	82420bd4e1	ALSA: hda/hdmi: Fix breakage at probing nvhdmi-mcp driver After restructuring and splitting the HDMI codec driver code, each HDMI codec driver contains the own build_controls and build_pcms ops. A copy-n-paste error put the wrong entries for nvhdmi-mcp driver; both build_controls and build_pcms are swapped. Unfortunately both callbacks have the very same form, and the compiler didn't complain it, either. This resulted in a NULL dereference because the PCM instance hasn't been initialized at calling the build_controls callback. Fix it by passing the proper entries. Fixes: `ad781b550f` ("ALSA: hda/hdmi: Rewrite to new probe method") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=220743 Link: https://patch.msgid.link/20251106104647.25805-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2025-11-06 11:49:35 +01:00
Christoph Hellwig	d8a823c6f0	xfs: free xfs_busy_extents structure when no RT extents are queued kmemleak occasionally reports leaking xfs_busy_extents structure from xfs_scrub calls after running xfs/528 (but attributed to following tests), which seems to be caused by not freeing the xfs_busy_extents structure when tr.queued is 0 and xfs_trim_rtgroup_extents breaks out of the main loop. Free the structure in this case. Fixes: `a3315d1130` ("xfs: use rtgroup busy extent list for FITRIM") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-11-06 08:59:19 +01:00
Vlastimil Babka	c379b745e1	slab: prevent infinite loop in kmalloc_nolock() with debugging In review of a followup work, Harry noticed a potential infinite loop. Upon closed inspection, it already exists for kmalloc_nolock() on a cache with debugging enabled, since commit `af92793e52` ("slab: Introduce kmalloc_nolock() and kfree_nolock().") When alloc_single_from_new_slab() fails to trylock node list_lock, we keep retrying to get partial slab or allocate a new slab. If we indeed interrupted somebody holding the list_lock, the trylock fill fail deterministically and we end up allocating and defer-freeing slabs indefinitely with no progress. To fix it, fail the allocation if spinning is not allowed. This is acceptable in the restricted context of kmalloc_nolock(), especially with debugging enabled. Reported-by: Harry Yoo <harry.yoo@oracle.com> Closes: https://lore.kernel.org/all/aQLqZjjq1SPD3Fml@hyeyoo/ Fixes: `af92793e52` ("slab: Introduce kmalloc_nolock() and kfree_nolock().") Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Link: https://patch.msgid.link/20251103-fix-nolock-loop-v1-1-6e2b3e82b9da@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz>	2025-11-06 08:13:12 +01:00
Miaoqian Lin	59b0afd01b	crypto: hisilicon/qm - Fix device reference leak in qm_get_qos_value The qm_get_qos_value() function calls bus_find_device_by_name() which increases the device reference count, but fails to call put_device() to balance the reference count and lead to a device reference leak. Add put_device() calls in both the error path and success path to properly balance the reference count. Found via static analysis. Fixes: `22d7a6c39c` ("crypto: hisilicon/qm - add pci bdf number check") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Longfang Liu <liulongfang@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-11-06 14:29:49 +08:00
Jakub Kicinski	7d1988a943	Merge tag 'wireless-2025-11-05' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Just two small fixes: - ath12k: revert a change that caused performance regressions - hwsim: don't ignore netns on netlink socket matching * tag 'wireless-2025-11-05' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup Revert "wifi: ath12k: Fix missing station power save configuration" ==================== Link: https://patch.msgid.link/20251105152827.53254-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 18:04:55 -08:00
Haotian Zhang	4d6ec3a793	net: wan: framer: pef2256: Switch to devm_mfd_add_devices() The driver calls mfd_add_devices() but fails to call mfd_remove_devices() in error paths after successful MFD device registration and in the remove function. This leads to resource leaks where MFD child devices are not properly unregistered. Replace mfd_add_devices with devm_mfd_add_devices to automatically manage the device resources. Fixes: `c96e976d9a` ("net: wan: framer: Add support for the Lantiq PEF2256 framer") Suggested-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Acked-by: Herve Codina <herve.codina@bootlin.com> Link: https://patch.msgid.link/20251105034716.662-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 18:02:34 -08:00
Jiawen Wu	a04ea57aae	net: libwx: fix device bus LAN ID The device bus LAN ID was obtained from PCI_FUNC(), but when a PF port is passthrough to a virtual machine, the function number may not match the actual port index on the device. This could cause the driver to perform operations such as LAN reset on the wrong port. Fix this by reading the LAN ID from port status register. Fixes: `a34b3e6ed8` ("net: txgbe: Store PCI info") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/B60A670C1F52CB8E+20251104062321.40059-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:52:13 -08:00
Jakub Kicinski	b1d9154878	Merge branch 'net-mlx5e-shampo-fixes-for-64kb-page-size' Tariq Toukan says: ==================== net/mlx5e: SHAMPO fixes for 64KB page size This series by Dragos contains fixes for HW-GRO issues found on systems with 64KB page size. ==================== Link: https://patch.msgid.link/1762238915-1027590-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:48:41 -08:00
Dragos Tatulea	d8a7ed9586	net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages The MLX5E_SHAMPO_WQ_HEADER_PER_PAGE and MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE macros are used directly in several places under the assumption that there will always be more headers per WQE than headers per page. However, this assumption doesn't hold for 64K page sizes and higher MTUs (> 4K). This can be first observed during header page allocation: ksm_entries will become 0 during alignment to MLX5E_SHAMPO_WQ_HEADER_PER_PAGE. This patch introduces 2 additional members to the mlx5e_shampo_hd struct which are meant to be used instead of the macrose mentioned above. When the number of headers per WQE goes below MLX5E_SHAMPO_WQ_HEADER_PER_PAGE, clamp the number of headers per page and expand the header size accordingly so that the headers for one WQE cover a full page. All the formulas are adapted to use these two new members. Fixes: `945ca432bf` ("net/mlx5e: SHAMPO, Drop info array") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1762238915-1027590-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:48:37 -08:00
Dragos Tatulea	bacd8d8018	net/mlx5e: SHAMPO, Fix skb size check for 64K pages mlx5e_hw_gro_skb_has_enough_space() uses a formula to check if there is enough space in the skb frags to store more data. This formula is incorrect for 64K page sizes and it triggers early GRO session termination because the first fragment will blow up beyond GRO_LEGACY_MAX_SIZE. This patch adds a special case for page sizes >= GRO_LEGACY_MAX_SIZE (64K) which uses the skb->len instead. Within this context, the check is safe from fragment overflow because the hardware will continuously fill the data up to the reservation size of 64K and the driver will coalesce all data from the same page to the same fragment. This means that the data will span one fragment or at most two for such a large page size. It is expected that the if statement will be optimized out as the check is done with constants. Fixes: `92552d3abd` ("net/mlx5e: HW_GRO cqe handler implementation") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1762238915-1027590-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:48:36 -08:00
Dragos Tatulea	665a7e13c2	net/mlx5e: SHAMPO, Fix header mapping for 64K pages HW-GRO is broken on mlx5 for 64K page sizes. The patch in the fixes tag didn't take into account larger page sizes when doing an align down of max_ksm_entries. For 64K page size, max_ksm_entries is 0 which will skip mapping header pages via WQE UMR. This breaks header-data split and will result in the following syndrome: mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x4c9, ci 0x0, qn 0x1133, opcode 0xe, syndrome 0x4, vendor syndrome 0x32 00000000: 00 00 00 00 04 4a 00 00 00 00 00 00 20 00 93 32 00000010: 55 00 00 00 fb cc 00 00 00 00 00 00 07 18 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a 00000030: 00 00 3b c7 93 01 32 04 00 00 00 00 00 00 bf e0 mlx5_core 0000:00:08.0 eth2: ERR CQE on RQ: 0x1133 Furthermore, the function that fills in WQE UMRs for the headers (mlx5e_build_shampo_hd_umr()) only supports mapping page sizes that fit in a single UMR WQE. This patch goes back to the old non-aligned max_ksm_entries value and it changes mlx5e_build_shampo_hd_umr() to support mapping a large page over multiple UMR WQEs. This means that mlx5e_build_shampo_hd_umr() can now leave a page only partially mapped. The caller, mlx5e_alloc_rx_hd_mpwqe(), ensures that there are enough UMR WQEs to cover complete pages by working on ksm_entries that are multiples of MLX5E_SHAMPO_WQ_HEADER_PER_PAGE. Fixes: `8a0ee54027` ("net/mlx5e: SHAMPO, Simplify UMR allocation for headers") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1762238915-1027590-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:48:36 -08:00
Meghana Malladi	ae4789affd	net: ti: icssg-prueth: Fix fdb hash size configuration The ICSSG driver does the initial FDB configuration which includes setting the control registers. Other run time management like learning is managed by the PRU's. The default FDB hash size used by the firmware is 512 slots, which is currently missing in the current driver. Update the driver FDB config to include FDB hash size as well. Please refer trm [1] 6.4.14.12.17 section on how the FDB config register gets configured. From the table 6-1404, there is a reset field for FDB_HAS_SIZE which is 4, meaning 1024 slots. Currently the driver is not updating this reset value from 4(1024 slots) to 3(512 slots). This patch fixes this by updating the reset value to 512 slots. [1]: https://www.ti.com/lit/pdf/spruim2 Fixes: `abd5576b9c` ("net: ti: icssg-prueth: Add support for ICSSG switch firmware") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251104104415.3110537-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:43:08 -08:00
Gal Pressman	d1c94bc5b9	net/mlx5e: Fix return value in case of module EEPROM read error mlx5e_get_module_eeprom_by_page() has weird error handling. First, it is treating -EINVAL as a special case, but it is unclear why. Second, it tries to fail "gracefully" by returning the number of bytes read even in case of an error. This results in wrongly returning success (0 return value) if the error occurs before any bytes were read. Simplify the error handling by returning an error when such occurs. This also aligns with the error handling we have in mlx5e_get_module_eeprom() for the old API. This fixes the following case where the query fails, but userspace ethtool wrongly treats it as success and dumps an output: # ethtool -m eth2 netlink warning: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5 netlink warning: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5 Offset Values ------ ------ 0x0000: 00 00 00 00 05 00 04 00 00 00 00 00 05 00 05 00 0x0010: 00 00 00 00 05 00 06 00 50 00 00 00 67 65 20 66 0x0020: 61 69 6c 65 64 2c 20 72 65 61 64 20 30 20 62 79 0x0030: 74 65 73 2c 20 65 72 72 20 2d 35 00 14 00 03 00 0x0040: 08 00 01 00 03 00 00 00 08 00 02 00 1a 00 00 00 0x0050: 14 00 04 00 08 00 01 00 04 00 00 00 08 00 02 00 0x0060: 0e 00 00 00 14 00 05 00 08 00 01 00 05 00 00 00 0x0070: 08 00 02 00 1a 00 00 00 14 00 06 00 08 00 01 00 Fixes: `e109d2b204` ("net/mlx5: Implement get_module_eeprom_by_page()") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Alex Lazar <alazar@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1762265736-1028868-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:42:37 -08:00
Sebastian Andrzej Siewior	d917c217b6	net: gro_cells: Reduce lock scope in gro_cell_poll One GRO-cell device's NAPI callback can nest into the GRO-cell of another device if the underlying device is also using GRO-cell. This is the case for IPsec over vxlan. These two GRO-cells are separate devices. From lockdep's point of view it is the same because each device is sharing the same lock class and so it reports a possible deadlock assuming one device is nesting into itself. Hold the bh_lock only while accessing gro_cell::napi_skbs in gro_cell_poll(). This reduces the locking scope and avoids acquiring the same lock class multiple times. Fixes: `25718fdcbd` ("net: gro_cells: Use nested-BH locking for gro_cell") Reported-by: Gal Pressman <gal@nvidia.com> Closes: https://lore.kernel.org/all/66664116-edb8-48dc-ad72-d5223696dd19@nvidia.com/ Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20251104153435.ty88xDQt@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:41:29 -08:00
Michal Swiatkowski	b1d16f7c00	libie: depend on DEBUG_FS when building LIBIE_FWLOG LIBIE_FWLOG is unusable without DEBUG_FS. Mark it in Kconfig. Fix build error on ixgbe when DEBUG_FS is not set. To not add another layer of #if IS_ENABLED(LIBIE_FWLOG) in ixgbe fwlog code define debugfs dentry even when DEBUG_FS isn't enabled. In this case the dummy functions of LIBIE_FWLOG will be used, so not initialized dentry isn't a problem. Fixes: `641585bc97` ("ixgbe: fwlog support for e610") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/lkml/f594c621-f9e1-49f2-af31-23fbcb176058@roeck-us.net/ Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251104172333.752445-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-05 17:38:03 -08:00
Alexei Starovoitov	e427054ae7	Merge branch 'x86-fgraph-bpf-fix-orc-stack-unwind-from-return-probe' Jiri Olsa says: ==================== x86/fgraph,bpf: Fix ORC stack unwind from return probe sending fix for ORC stack unwind issue reported in here [1], where the ORC unwinder won't go pass the return_to_handler function and we get no stacktrace. Sending fix for that together with unrelated stacktrace fix (patch 1), so the attached test can work properly. It's based on: https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git probes/for-next v1: https://lore.kernel.org/bpf/20251027131354.1984006-1-jolsa@kernel.org/ v2: https://lore.kernel.org/bpf/20251103220924.36371-3-jolsa@kernel.org/ v3 changes: - fix assert condition in test thanks, jirka [1] https://lore.kernel.org/bpf/aObSyt3qOnS_BMcy@krava/ ==================== Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20251104215405.168643-1-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:14:42 -08:00
Jiri Olsa	3490d29964	selftests/bpf: Add stacktrace ips test for raw_tp Adding test that verifies we get expected initial 2 entries from stacktrace for rawtp probe via ORC unwind. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Jiri Olsa	c9e208fa93	selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi Adding test that attaches kprobe/kretprobe multi and verifies the ORC stacktrace matches expected functions. Adding bpf_testmod_stacktrace_test function to bpf_testmod kernel module which is called through several functions so we get reliable call path for stacktrace. The test is only for ORC unwinder to keep it simple. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Jiri Olsa	20a0bc1027	x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe Currently we don't get stack trace via ORC unwinder on top of fgraph exit handler. We can see that when generating stacktrace from kretprobe_multi bpf program which is based on fprobe/fgraph. The reason is that the ORC unwind code won't get pass the return_to_handler callback installed by fgraph return probe machinery. Solving this by creating stack frame in return_to_handler expected by ftrace_graph_ret_addr function to recover original return address and continue with the unwind. Also updating the pt_regs data with cs/flags/rsp which are needed for successful stack retrieval from ebpf bpf_get_stackid helper. - in get_perf_callchain we check user_mode(regs) so CS has to be set - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED has to be unset Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Jiri Olsa	6d08340d1e	Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()" This reverts commit `83f44ae0f8`. Currently we store initial stacktrace entry twice for non-HW ot_regs, which means callers that fail perf_hw_regs(regs) condition in perf_callchain_kernel. It's easy to reproduce this bpftrace: # bpftrace -e 'tracepoint:sched:sched_process_exec { print(kstack()); }' Attaching 1 probe... bprm_execve+1767 bprm_execve+1767 do_execveat_common.isra.0+425 __x64_sys_execve+56 do_syscall_64+133 entry_SYSCALL_64_after_hwframe+118 When perf_callchain_kernel calls unwind_start with first_frame, AFAICS we do not skip regs->ip, but it's added as part of the unwind process. Hence reverting the extra perf_callchain_store for non-hw regs leg. I was not able to bisect this, so I'm not really sure why this was needed in v5.2 and why it's not working anymore, but I could see double entries as far as v5.10. I did the test for both ORC and framepointer unwind with and without the this fix and except for the initial entry the stacktraces are the same. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00

... 6 7 8 9 10 ...

1398251 Commits