linux-stable-mirror

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2026-04-08 12:02:33 +02:00

Author	SHA1	Message	Date
Jesper Dangaard Brouer	dd419a3f2e	veth: more robust handing of race to avoid txq getting stuck [ Upstream commit `5442a9da69` ] Commit `dc82a33297` ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") introduced a race condition that can lead to a permanently stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra Max). The race occurs in veth_xmit(). The producer observes a full ptr_ring and stops the queue (netif_tx_stop_queue()). The subsequent conditional logic, intended to re-wake the queue if the consumer had just emptied it (if (__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a "lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and traffic halts. This failure is caused by an incorrect use of the __ptr_ring_empty() API from the producer side. As noted in kernel comments, this check is not guaranteed to be correct if a consumer is operating on another CPU. The empty test is based on ptr_ring->consumer_head, making it reliable only for the consumer. Using this check from the producer side is fundamentally racy. This patch fixes the race by adopting the more robust logic from an earlier version V4 of the patchset, which always flushed the peer: (1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier are removed. Instead, after stopping the queue, we unconditionally call __veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled, making it solely responsible for re-waking the TXQ. This handles the race where veth_poll() consumes all packets and completes NAPI before veth_xmit() on the producer side has called netif_tx_stop_queue. The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule NAPI. (2) On the consumer side, the logic for waking the peer TXQ is moved out of veth_xdp_rcv() and placed at the end of the veth_poll() function. This placement is part of fixing the race, as the netif_tx_queue_stopped() check must occur after rx_notify_masked is potentially set to false during NAPI completion. This handles the race where veth_poll() consumes all packets, but haven't finished (rx_notify_masked is still true). The producer veth_xmit() stops the TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning not starting NAPI. Then veth_poll() change rx_notify_masked to false and stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake it up. Fixes: `dc82a33297` ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: `a14602fcae` ("veth: reduce XDP no_direct return section to fix race") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Jesper Dangaard Brouer	76bed44c5a	veth: prevent NULL pointer dereference in veth_xdp_rcv [ Upstream commit `9337c54401` ] The veth peer device is RCU protected, but when the peer device gets deleted (veth_dellink) then the pointer is assigned NULL (via RCU_INIT_POINTER). This patch adds a necessary NULL check in veth_xdp_rcv when accessing the veth peer net_device. This fixes a bug introduced in commit `dc82a33297` ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops"). The bug is a race and only triggers when having inflight packets on a veth that is being deleted. Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev> Closes: https://lore.kernel.org/all/fecfcad0-7a16-42b8-bff2-66ee83a6e5c4@linux.dev/ Reported-by: syzbot+c4c7bf27f6b0c4bd97fe@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/683da55e.a00a0220.d8eae.0052.GAE@google.com/ Fixes: `dc82a33297` ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://patch.msgid.link/174964557873.519608.10855046105237280978.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: `a14602fcae` ("veth: reduce XDP no_direct return section to fix race") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Jesper Dangaard Brouer	9fe31b3f31	veth: apply qdisc backpressure on full ptr_ring to reduce TX drops [ Upstream commit `dc82a33297` ] In production, we're seeing TX drops on veth devices when the ptr_ring fills up. This can occur when NAPI mode is enabled, though it's relatively rare. However, with threaded NAPI - which we use in production - the drops become significantly more frequent. The underlying issue is that with threaded NAPI, the consumer often runs on a different CPU than the producer. This increases the likelihood of the ring filling up before the consumer gets scheduled, especially under load, leading to drops in veth_xmit() (ndo_start_xmit()). This patch introduces backpressure by returning NETDEV_TX_BUSY when the ring is full, signaling the qdisc layer to requeue the packet. The txq (netdev queue) is stopped in this condition and restarted once veth_poll() drains entries from the ring, ensuring coordination between NAPI and qdisc. Backpressure is only enabled when a qdisc is attached. Without a qdisc, the driver retains its original behavior - dropping packets immediately when the ring is full. This avoids unexpected behavior changes in setups without a configured qdisc. With a qdisc in place (e.g. fq, sfq) this allows Active Queue Management (AQM) to fairly schedule packets across flows and reduce collateral damage from elephant flows. A known limitation of this approach is that the full ring sits in front of the qdisc layer, effectively forming a FIFO buffer that introduces base latency. While AQM still improves fairness and mitigates flow dominance, the latency impact is measurable. In hardware drivers, this issue is typically addressed using BQL (Byte Queue Limits), which tracks in-flight bytes needed based on physical link rate. However, for virtual drivers like veth, there is no fixed bandwidth constraint - the bottleneck is CPU availability and the scheduler's ability to run the NAPI thread. It is unclear how effective BQL would be in this context. This patch serves as a first step toward addressing TX drops. Future work may explore adapting a BQL-like mechanism to better suit virtual devices like veth. Reported-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Link: https://patch.msgid.link/174559294022.827981.1282809941662942189.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: `a14602fcae` ("veth: reduce XDP no_direct return section to fix race") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Jesper Dangaard Brouer	bf7528722e	net: sched: generalize check for no-queue qdisc on TX queue [ Upstream commit `34dd0fecaa` ] The "noqueue" qdisc can either be directly attached, or get default attached if net_device priv_flags has IFF_NO_QUEUE. In both cases, the allocated Qdisc structure gets it's enqueue function pointer reset to NULL by noqueue_init() via noqueue_qdisc_ops. This is a common case for software virtual net_devices. For these devices with no-queue, the transmission path in __dev_queue_xmit() will bypass the qdisc layer. Directly invoking device drivers ndo_start_xmit (via dev_hard_start_xmit). In this mode the device driver is not allowed to ask for packets to be queued (either via returning NETDEV_TX_BUSY or stopping the TXQ). The simplest and most reliable way to identify this no-queue case is by checking if enqueue == NULL. The vrf driver currently open-codes this check (!qdisc->enqueue). While functionally correct, this low-level detail is better encapsulated in a dedicated helper for clarity and long-term maintainability. To make this behavior more explicit and reusable, this patch introduce a new helper: qdisc_txq_has_no_queue(). Helper will also be used by the veth driver in the next patch, which introduces optional qdisc-based backpressure. This is a non-functional change. Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/174559293172.827981.7583862632045264175.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: `a14602fcae` ("veth: reduce XDP no_direct return section to fix race") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Luiz Augusto von Dentz	429bcd8ffa	Bluetooth: SMP: Fix not generating mackey and ltk when repairing [ Upstream commit `545d7827b2` ] The change `eed467b517` ("Bluetooth: fix passkey uninitialized when used") introduced a goto that bypasses the creation of temporary mackey and ltk which are later used by the likes of DHKey Check step. Later `ffee202a78` ("Bluetooth: Always request for user confirmation for Just Works (LE SC)") which means confirm_hint is always set in case JUST_WORKS so the branch checking for an existing LTK becomes pointless as confirm_hint will always be set, so this just merge both cases of malicious or legitimate devices to be confirmed before continuing with the pairing procedure. Link: https://github.com/bluez/bluez/issues/1622 Fixes: `eed467b517` ("Bluetooth: fix passkey uninitialized when used") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Edward Adam Davis	e90c05fc5b	Bluetooth: hci_sock: Prevent race in socket write iter and sock bind [ Upstream commit `89bb613511` ] There is a potential race condition between sock bind and socket write iter. bind may free the same cmd via mgmt_pending before write iter sends the cmd, just as syzbot reported in UAF[1]. Here we use hci_dev_lock to synchronize the two, thereby avoiding the UAF mentioned in [1]. [1] syzbot reported: BUG: KASAN: slab-use-after-free in mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316 Read of size 8 at addr ffff888077164818 by task syz.0.17/5989 Call Trace: mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316 set_link_security+0x5c2/0x710 net/bluetooth/mgmt.c:1918 hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719 hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 sock_write_iter+0x279/0x360 net/socket.c:1195 Allocated by task 5989: mgmt_pending_add+0x35/0x140 net/bluetooth/mgmt_util.c:296 set_link_security+0x557/0x710 net/bluetooth/mgmt.c:1910 hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719 hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 sock_write_iter+0x279/0x360 net/socket.c:1195 Freed by task 5991: mgmt_pending_free net/bluetooth/mgmt_util.c:311 [inline] mgmt_pending_foreach+0x30d/0x380 net/bluetooth/mgmt_util.c:257 mgmt_index_removed+0x112/0x2f0 net/bluetooth/mgmt.c:9477 hci_sock_bind+0xbe9/0x1000 net/bluetooth/hci_sock.c:1314 Fixes: `6fe26f694c` ("Bluetooth: MGMT: Protect mgmt_pending list with its own lock") Reported-by: syzbot+9aa47cd4633a3cf92a80@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9aa47cd4633a3cf92a80 Tested-by: syzbot+9aa47cd4633a3cf92a80@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Luiz Augusto von Dentz	b1225ccdbd	Bluetooth: hci_core: Fix triggering cmd_timer for HCI_OP_NOP [ Upstream commit `275ddfeb3f` ] HCI_OP_NOP means no command was actually sent so there is no point in triggering cmd_timer which may cause a hdev->reset in the process since it is assumed that the controller is stuck processing a command. Fixes: `e2d471b780` ("Bluetooth: ISO: Fix not using SID from adv report") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Chris Lu	421e88a0d8	Bluetooth: btusb: mediatek: Fix kernel crash when releasing mtk iso interface [ Upstream commit `4015b97976` ] When performing reset tests and encountering abnormal card drop issues that lead to a kernel crash, it is necessary to perform a null check before releasing resources to avoid attempting to release a null pointer. <4>[ 29.158070] Hardware name: Google Quigon sku196612/196613 board (DT) <4>[ 29.158076] Workqueue: hci0 hci_cmd_sync_work [bluetooth] <4>[ 29.158154] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) <4>[ 29.158162] pc : klist_remove+0x90/0x158 <4>[ 29.158174] lr : klist_remove+0x88/0x158 <4>[ 29.158180] sp : ffffffc0846b3c00 <4>[ 29.158185] pmr_save: 000000e0 <4>[ 29.158188] x29: ffffffc0846b3c30 x28: ffffff80cd31f880 x27: ffffff80c1bdc058 <4>[ 29.158199] x26: dead000000000100 x25: ffffffdbdc624ea3 x24: ffffff80c1bdc4c0 <4>[ 29.158209] x23: ffffffdbdc62a3e6 x22: ffffff80c6c07000 x21: ffffffdbdc829290 <4>[ 29.158219] x20: 0000000000000000 x19: ffffff80cd3e0648 x18: 000000031ec97781 <4>[ 29.158229] x17: ffffff80c1bdc4a8 x16: ffffffdc10576548 x15: ffffff80c1180428 <4>[ 29.158238] x14: 0000000000000000 x13: 000000000000e380 x12: 0000000000000018 <4>[ 29.158248] x11: ffffff80c2a7fd10 x10: 0000000000000000 x9 : 0000000100000000 <4>[ 29.158257] x8 : 0000000000000000 x7 : 7f7f7f7f7f7f7f7f x6 : 2d7223ff6364626d <4>[ 29.158266] x5 : 0000008000000000 x4 : 0000000000000020 x3 : 2e7325006465636e <4>[ 29.158275] x2 : ffffffdc11afeff8 x1 : 0000000000000000 x0 : ffffffdc11be4d0c <4>[ 29.158285] Call trace: <4>[ 29.158290] klist_remove+0x90/0x158 <4>[ 29.158298] device_release_driver_internal+0x20c/0x268 <4>[ 29.158308] device_release_driver+0x1c/0x30 <4>[ 29.158316] usb_driver_release_interface+0x70/0x88 <4>[ 29.158325] btusb_mtk_release_iso_intf+0x68/0xd8 [btusb (HASH:e8b6 5)] <4>[ 29.158347] btusb_mtk_reset+0x5c/0x480 [btusb (HASH:e8b6 5)] <4>[ 29.158361] hci_cmd_sync_work+0x10c/0x188 [bluetooth (HASH:a4fa 6)] <4>[ 29.158430] process_scheduled_works+0x258/0x4e8 <4>[ 29.158441] worker_thread+0x300/0x428 <4>[ 29.158448] kthread+0x108/0x1d0 <4>[ 29.158455] ret_from_fork+0x10/0x20 <0>[ 29.158467] Code: 91343000 940139d1 f9400268 927ff914 (f9401297) <4>[ 29.158474] ---[ end trace 0000000000000000 ]--- <0>[ 29.167129] Kernel panic - not syncing: Oops: Fatal exception <2>[ 29.167144] SMP: stopping secondary CPUs <4>[ 29.167158] ------------[ cut here ]------------ Fixes: `ceac1cb025` ("Bluetooth: btusb: mediatek: add ISO data transmission functions") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Marc Kleine-Budde	ad55004a3c	can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing data [ Upstream commit `395d988f93` ] The URB received in gs_usb_receive_bulk_callback() contains a struct gs_host_frame. The length of the data after the header depends on the gs_host_frame hf::flags and the active device features (e.g. time stamping). Introduce a new function gs_usb_get_minimum_length() and check that we have at least received the required amount of data before accessing it. Only copy the data to that skb that has actually been received. Fixes: `d08e973a77` ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-3-a29b42eacada@pengutronix.de [mkl: rename gs_usb_get_minimum_length() -> +gs_usb_get_minimum_rx_length()] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Marc Kleine-Budde	616eee3e89	can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing header [ Upstream commit `6fe9f3279f` ] The driver expects to receive a struct gs_host_frame in gs_usb_receive_bulk_callback(). Use struct_group to describe the header of the struct gs_host_frame and check that we have at least received the header before accessing any members of it. To resubmit the URB, do not dereference the pointer chain "dev->parent->hf_size_rx" but use "parent->hf_size_rx" instead. Since "urb->context" contains "parent", it is always defined, while "dev" is not defined if the URB it too short. Fixes: `d08e973a77` ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-2-a29b42eacada@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Marc Kleine-Budde	4a82072e45	can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBs [ Upstream commit `516a0cd1c0` ] The driver lacks the cleanup of failed transfers of URBs. This reduces the number of available URBs per error by 1. This leads to reduced performance and ultimately to a complete stop of the transmission. If the sending of a bulk URB fails do proper cleanup: - increase netdev stats - mark the echo_sbk as free - free the driver's context and do accounting - wake the send queue Closes: https://github.com/candle-usb/candleLight_fw/issues/187 Fixes: `d08e973a77` ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-1-a29b42eacada@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Seungjin Bae	0897cea266	can: kvaser_usb: leaf: Fix potential infinite loop in command parsers [ Upstream commit `0c73772cd2` ] The `kvaser_usb_leaf_wait_cmd()` and `kvaser_usb_leaf_read_bulk_callback` functions contain logic to zero-length commands. These commands are used to align data to the USB endpoint's wMaxPacketSize boundary. The driver attempts to skip these placeholders by aligning the buffer position `pos` to the next packet boundary using `round_up()` function. However, if zero-length command is found exactly on a packet boundary (i.e., `pos` is a multiple of wMaxPacketSize, including 0), `round_up` function will return the unchanged value of `pos`. This prevents `pos` to be increased, causing an infinite loop in the parsing logic. This patch fixes this in the function by using `pos + 1` instead. This ensures that even if `pos` is on a boundary, the calculation is based on `pos + 1`, forcing `round_up()` to always return the next aligned boundary. Fixes: `7259124eac` ("can: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c") Signed-off-by: Seungjin Bae <eeodqql09@gmail.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20251023162709.348240-1-eeodqql09@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-07 06:24:54 +09:00
Greg Kroah-Hartman	318a47068f	Linux 6.12.60 Link: https://lore.kernel.org/r/20251127144032.705323598@linuxfoundation.org Tested-by: Pavel Machek (CIP) <pavel@denx.de> Link: https://lore.kernel.org/r/20251127150346.125775439@linuxfoundation.org Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Dileep Malepu <dileep.debian@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> v6.12.60	2025-12-01 11:43:41 +01:00
Charles Keepax	81fdac6853	Revert "gpio: swnode: don't use the swnode's name as the key for GPIO lookup" This reverts commit `25decf0469`. This software node change doesn't actually fix any current issues with the kernel, it is an improvement to the lookup process rather than fixing a live bug. It also causes a couple of regressions with shipping laptops, which relied on the label based lookup. There is a fix for the regressions in mainline, the first 5 patches of [1]. However, those patches are fairly substantial changes and given the patch causing the regression doesn't actually fix a bug it seems better to just revert it in stable. CC: stable@vger.kernel.org # 6.12, 6.17 Link: https://lore.kernel.org/linux-sound/20251120-reset-gpios-swnodes-v7-0-a100493a0f4b@linaro.org/ [1] Closes: https://github.com/thesofproject/linux/issues/5599 Closes: https://github.com/thesofproject/linux/issues/5603 Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:40 +01:00
Fangzhi Zuo	53ca559992	drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched [ Upstream commit `cfa0904a35` ] [why] 1. With allow_0_dtb_clk enabled, the time required to latch DTBCLK to 600 MHz depends on the SMU. If DTBCLK is not latched to 600 MHz before set_mode completes, gating DTBCLK causes the DP2 sink to lose its clock source. 2. The existing DTBCLK gating sequence ungates DTBCLK based on both pix_clk and ref_dtbclk, but gates DTBCLK when either pix_clk or ref_dtbclk is zero. pix_clk can be zero outside the set_mode sequence before DTBCLK is properly latched, which can lead to DTBCLK being gated by mistake. [how] Consider both pixel_clk and ref_dtbclk when determining when it is safe to gate DTBCLK; this is more accurate. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4701 Fixes: `5949e7c489` ("drm/amd/display: Enable Dynamic DTBCLK Switch") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d04eb0c402`) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:40 +01:00
Charlene Liu	25dcf6299d	drm/amd/display: Insert dccg log for easy debug [ Upstream commit `35bcc9168f` ] [why] Log for sequence tracking Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com> Reviewed-by: Yihan Zhu <yihan.zhu@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `cfa0904a35` ("drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:40 +01:00
Charlene Liu	b1515304a5	drm/amd/display: disable DPP RCG before DPP CLK enable [ Upstream commit `1bcd679209` ] [why] DPP CLK enable needs to disable DPPCLK RCG first. The DPPCLK_en in dccg should always be enabled when the corresponding pipe is enabled. Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `cfa0904a35` ("drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:40 +01:00
Charlene Liu	467904aabb	drm/amd/display: avoid reset DTBCLK at clock init [ Upstream commit `0ae47e971b` ] [why & how] this is to init to HW real DTBCLK. and use real HW DTBCLK status to update internal logic state Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `cfa0904a35` ("drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:40 +01:00
Darrick J. Wong	7c2d68e091	xfs: fix out of bounds memory read error in symlink repair [ Upstream commit `678e1cc2f4` ] xfs/286 produced this report on my test fleet: ================================================================== BUG: KFENCE: out-of-bounds read in memcpy_orig+0x54/0x110 Out-of-bounds read at 0xffff88843fe9e038 (184B right of kfence-#184): memcpy_orig+0x54/0x110 xrep_symlink_salvage_inline+0xb3/0xf0 [xfs] xrep_symlink_salvage+0x100/0x110 [xfs] xrep_symlink+0x2e/0x80 [xfs] xrep_attempt+0x61/0x1f0 [xfs] xfs_scrub_metadata+0x34f/0x5c0 [xfs] xfs_ioc_scrubv_metadata+0x387/0x560 [xfs] xfs_file_ioctl+0xe23/0x10e0 [xfs] __x64_sys_ioctl+0x76/0xc0 do_syscall_64+0x4e/0x1e0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 kfence-#184: 0xffff88843fe9df80-0xffff88843fe9dfea, size=107, cache=kmalloc-128 allocated by task 3470 on cpu 1 at 263329.131592s (192823.508886s ago): xfs_init_local_fork+0x79/0xe0 [xfs] xfs_iformat_local+0xa4/0x170 [xfs] xfs_iformat_data_fork+0x148/0x180 [xfs] xfs_inode_from_disk+0x2cd/0x480 [xfs] xfs_iget+0x450/0xd60 [xfs] xfs_bulkstat_one_int+0x6b/0x510 [xfs] xfs_bulkstat_iwalk+0x1e/0x30 [xfs] xfs_iwalk_ag_recs+0xdf/0x150 [xfs] xfs_iwalk_run_callbacks+0xb9/0x190 [xfs] xfs_iwalk_ag+0x1dc/0x2f0 [xfs] xfs_iwalk_args.constprop.0+0x6a/0x120 [xfs] xfs_iwalk+0xa4/0xd0 [xfs] xfs_bulkstat+0xfa/0x170 [xfs] xfs_ioc_fsbulkstat.isra.0+0x13a/0x230 [xfs] xfs_file_ioctl+0xbf2/0x10e0 [xfs] __x64_sys_ioctl+0x76/0xc0 do_syscall_64+0x4e/0x1e0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 CPU: 1 UID: 0 PID: 1300113 Comm: xfs_scrub Not tainted 6.18.0-rc4-djwx #rc4 PREEMPT(lazy) 3d744dd94e92690f00a04398d2bd8631dcef1954 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014 ================================================================== On further analysis, I realized that the second parameter to min() is not correct. xfs_ifork::if_bytes is the size of the xfs_ifork::if_data buffer. if_bytes can be smaller than the data fork size because: (a) the forkoff code tries to keep the data area as large as possible (b) for symbolic links, if_bytes is the ondisk file size + 1 (c) forkoff is always a multiple of 8. Case in point: for a single-byte symlink target, forkoff will be 8 but the buffer will only be 2 bytes long. In other words, the logic here is wrong and we walk off the end of the incore buffer. Fix that. Cc: stable@vger.kernel.org # v6.10 Fixes: `2651923d8d` ("xfs: online repair of symbolic links") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:39 +01:00
Marcelo Moreira	12335f6ce2	xfs: Replace strncpy with memcpy [ Upstream commit `33ddc796ec` ] The changes modernizes the code by aligning it with current kernel best practices. It improves code clarity and consistency, as strncpy is deprecated as explained in Documentation/process/deprecated.rst. This change does not alter the functionality or introduce any behavioral changes. Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Marcelo Moreira <marcelomoreira1905@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Stable-dep-of: `678e1cc2f4` ("xfs: fix out of bounds memory read error in symlink repair") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:39 +01:00
Eric Dumazet	6d3275d4ca	mptcp: fix a race in mptcp_pm_del_add_timer() [ Upstream commit `426358d9be` ] mptcp_pm_del_add_timer() can call sk_stop_timer_sync(sk, &entry->add_timer) while another might have free entry already, as reported by syzbot. Add RCU protection to fix this issue. Also change confusing add_timer variable with stop_timer boolean. syzbot report: BUG: KASAN: slab-use-after-free in __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616 Read of size 4 at addr ffff8880311e4150 by task kworker/1:1/44 CPU: 1 UID: 0 PID: 44 Comm: kworker/1:1 Not tainted syzkaller #0 PREEMPT_{RT,(full)} Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 Workqueue: events mptcp_worker Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616 sk_stop_timer_sync+0x1b/0x90 net/core/sock.c:3631 mptcp_pm_del_add_timer+0x283/0x310 net/mptcp/pm.c:362 mptcp_incoming_options+0x1357/0x1f60 net/mptcp/options.c:1174 tcp_data_queue+0xca/0x6450 net/ipv4/tcp_input.c:5361 tcp_rcv_established+0x1335/0x2670 net/ipv4/tcp_input.c:6441 tcp_v4_do_rcv+0x98b/0xbf0 net/ipv4/tcp_ipv4.c:1931 tcp_v4_rcv+0x252a/0x2dc0 net/ipv4/tcp_ipv4.c:2374 ip_protocol_deliver_rcu+0x221/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:239 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 __netif_receive_skb_one_core net/core/dev.c:6079 [inline] __netif_receive_skb+0x143/0x380 net/core/dev.c:6192 process_backlog+0x31e/0x900 net/core/dev.c:6544 __napi_poll+0xb6/0x540 net/core/dev.c:7594 napi_poll net/core/dev.c:7657 [inline] net_rx_action+0x5f7/0xda0 net/core/dev.c:7784 handle_softirqs+0x22f/0x710 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] __local_bh_enable_ip+0x1a0/0x2e0 kernel/softirq.c:302 mptcp_pm_send_ack net/mptcp/pm.c:210 [inline] mptcp_pm_addr_send_ack+0x41f/0x500 net/mptcp/pm.c:-1 mptcp_pm_worker+0x174/0x320 net/mptcp/pm.c:1002 mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762 process_one_work kernel/workqueue.c:3263 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Allocated by task 44: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:400 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417 kasan_kmalloc include/linux/kasan.h:262 [inline] __kmalloc_cache_noprof+0x1ef/0x6c0 mm/slub.c:5748 kmalloc_noprof include/linux/slab.h:957 [inline] mptcp_pm_alloc_anno_list+0x104/0x460 net/mptcp/pm.c:385 mptcp_pm_create_subflow_or_signal_addr+0xf9d/0x1360 net/mptcp/pm_kernel.c:355 mptcp_pm_nl_fully_established net/mptcp/pm_kernel.c:409 [inline] __mptcp_pm_kernel_worker+0x417/0x1ef0 net/mptcp/pm_kernel.c:1529 mptcp_pm_worker+0x1ee/0x320 net/mptcp/pm.c:1008 mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762 process_one_work kernel/workqueue.c:3263 [inline] process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Freed by task 6630: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 __kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587 kasan_save_free_info mm/kasan/kasan.h:406 [inline] poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2523 [inline] slab_free mm/slub.c:6611 [inline] kfree+0x197/0x950 mm/slub.c:6818 mptcp_remove_anno_list_by_saddr+0x2d/0x40 net/mptcp/pm.c:158 mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_kernel.c:1209 [inline] mptcp_nl_flush_addrs_list net/mptcp/pm_kernel.c:1240 [inline] mptcp_pm_nl_flush_addrs_doit+0x593/0xbb0 net/mptcp/pm_kernel.c:1281 genl_family_rcv_msg_doit+0x215/0x300 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0x846/0xa10 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 ____sys_sendmsg+0x508/0x820 net/socket.c:2630 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684 __sys_sendmsg net/socket.c:2716 [inline] __do_sys_sendmsg net/socket.c:2721 [inline] __se_sys_sendmsg net/socket.c:2719 [inline] __x64_sys_sendmsg+0x1a1/0x260 net/socket.c:2719 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Cc: stable@vger.kernel.org Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Reported-by: syzbot+2a6fbf0f0530375968df@syzkaller.appspotmail.com Closes: https://lore.kernel.org/691ad3c3.a70a0220.f6df1.0004.GAE@google.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251117100745.1913963-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:39 +01:00
Imre Deak	3e5271f224	drm/i915/dp_mst: Disable Panel Replay [ Upstream commit `f2687d3cc9` ] Disable Panel Replay on MST links until it's properly implemented. For instance the required VSC SDP is not programmed on MST and FEC is not enabled if Panel Replay is enabled. Fixes: `3257e55d3e` ("drm/i915/panelreplay: enable/disable panel replay") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15174 Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Animesh Manna <animesh.manna@intel.com> Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20251107124141.911895-1-imre.deak@intel.com (cherry picked from commit `e109f644b8`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [ placed MST check at function start since DPCD read was moved to caller ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:39 +01:00
Martin Kaiser	4ade59d68a	maple_tree: fix tracepoint string pointers commit `91a5409002` upstream. maple_tree tracepoints contain pointers to function names. Such a pointer is saved when a tracepoint logs an event. There's no guarantee that it's still valid when the event is parsed later and the pointer is dereferenced. The kernel warns about these unsafe pointers. event 'ma_read' has unsafe pointer field 'fn' WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4 Mark the function names as tracepoint_string() to fix the events. One case that doesn't work without my patch would be trace-cmd record to save the binary ringbuffer and trace-cmd report to parse it in userspace. The address of __func__ can't be dereferenced from userspace but tracepoint_string will add an entry to /sys/kernel/tracing/printk_formats Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx Fixes: `54a611b605` ("Maple Tree: add new data structure") Signed-off-by: Martin Kaiser <martin@kaiser.cx> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:39 +01:00
Jari Ruusu	c95e5af4b6	tty/vt: fix up incorrect backport to stable releases Below is a patch for 6.12.58+ and 6.17.8+ stable branches only. Upstream does not need this. Signed-off-by: Jari Ruusu <jariruusu@protonmail.com> Fixes: `da7e8b3823` ("tty/vt: Add missing return value for VT_RESIZE in vt_ioctl()") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:38 +01:00
Henrique Carvalho	1ebfea90f9	smb: client: fix incomplete backport in cfids_invalidation_worker() The previous commit `bdb596ceb4` ("smb: client: fix potential UAF in smb2_close_cached_fid()") was an incomplete backport and missed one kref_put() call in cfids_invalidation_worker() that should have been converted to close_cached_dir(). Fixes: `065bd62412` ("smb: client: fix potential UAF in smb2_close_cached_fid()")" Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-12-01 11:43:38 +01:00
Samuel Zhang	a45d6359ee	drm/amdgpu: fix gpu page fault after hibernation on PF passthrough [ Upstream commit `eb6e7f520d` ] On PF passthrough environment, after hibernate and then resume, coralgemm will cause gpu page fault. Mode1 reset happens during hibernate, but partition mode is not restored on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right after resume. When CP access the MQD BO, wrong stride size is used, this will cause out of bound access on the MQD BO, resulting page fault. The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called when resume from a hibernation. KFD resume is called separately during a reset recovery or resume from suspend sequence. Hence it's not required to be called as part of partition switch. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `5d1b32cfe4`) Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:38 +01:00
Zhang Chujun	2e628227bc	tracing/tools: Fix incorrcet short option in usage text for --threads [ Upstream commit `53afec2c8f` ] The help message incorrectly listed '-t' as the short option for --threads, but the actual getopt_long configuration uses '-e'. This mismatch can confuse users and lead to incorrect command-line usage. This patch updates the usage string to correctly show: "-e, --threads NRTHR" to match the implementation. Note: checkpatch.pl reports a false-positive spelling warning on 'Run', which is intentional. Link: https://patch.msgid.link/20251106031040.1869-1-zhangchujun@cmss.chinamobile.com Signed-off-by: Zhang Chujun <zhangchujun@cmss.chinamobile.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:38 +01:00
Nishanth Menon	fbb53727ca	net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error [ Upstream commit `90a88306eb` ] Make knav_dma_open_channel consistently return NULL on error instead of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h returns NULL when the driver is disabled, but the driver implementation does not even return NULL or ERR_PTR on failure, causing inconsistency in the users. This results in a crash in netcp_free_navigator_resources as followed (trimmed): Unhandled fault: alignment exception (0x221) at 0xfffffff2 [fffffff2] pgd=80000800207003, pmd=82ffda003, *pte=00000000 Internal error: : 221 [#1] SMP ARM Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE Hardware name: Keystone PC is at knav_dma_close_channel+0x30/0x19c LR is at netcp_free_navigator_resources+0x2c/0x28c [... TRIM...] Call trace: knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c netcp_ndo_open from __dev_open+0x114/0x29c __dev_open from __dev_change_flags+0x190/0x208 __dev_change_flags from netif_change_flags+0x1c/0x58 netif_change_flags from dev_change_flags+0x38/0xa0 dev_change_flags from ip_auto_config+0x2c4/0x11f0 ip_auto_config from do_one_initcall+0x58/0x200 do_one_initcall from kernel_init_freeable+0x1cc/0x238 kernel_init_freeable from kernel_init+0x1c/0x12c kernel_init from ret_from_fork+0x14/0x38 [... TRIM...] Standardize the error handling by making the function return NULL on all error conditions. The API is used in just the netcp_core.c so the impact is limited. Note, this change, in effect reverts commit `5b6cb43b4d` ("net: ethernet: ti: netcp_core: return error while dma channel open issue"), but provides a less error prone implementation. Suggested-by: Simon Horman <horms@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Nishanth Menon <nm@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:38 +01:00
René Rebe	05695cec60	ALSA: usb-audio: fix uac2 clock source at terminal parser [ Upstream commit `d26e9f669c` ] Since `8b3a087f7f` ("ALSA: usb-audio: Unify virtual type units type to UAC3 values") usb-audio is using UAC3_CLOCK_SOURCE instead of bDescriptorSubtype, later refactored with `e0ccdef926` ("ALSA: usb-audio: Clean up check_input_term()") into parse_term_uac2_clock_source(). This breaks the clock source selection for at least my 1397:0003 BEHRINGER International GmbH FCA610 Pro. Fix by using UAC2_CLOCK_SOURCE in parse_term_uac2_clock_source(). Fixes: `8b3a087f7f` ("ALSA: usb-audio: Unify virtual type units type to UAC3 values") Signed-off-by: René Rebe <rene@exactco.de> Link: https://patch.msgid.link/20251125.154149.1121389544970412061.rene@exactco.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:37 +01:00
Heiko Carstens	b514ad872a	s390/mm: Fix __ptep_rdp() inline assembly [ Upstream commit `31475b8811` ] When a zero ASCE is passed to the __ptep_rdp() inline assembly, the generated instruction should have the R3 field of the instruction set to zero. However the inline assembly is written incorrectly: for such cases a zero is loaded into a register allocated by the compiler and this register is then used by the instruction. This means that selected TLB entries may not be flushed since the specified ASCE does not match the one which was used when the selected TLB entries were created. Fix this by removing the asce and opt parameters of __ptep_rdp(), since all callers always pass zero, and use a hard-coded register zero for the R3 field. Fixes: `0807b85652` ("s390/mm: add support for RDP (Reset DAT-Protection)") Cc: stable@vger.kernel.org Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:37 +01:00
Shuicheng Lin	23ba534d73	drm/xe: Prevent BIT() overflow when handling invalid prefetch region [ Upstream commit `d52dea485c` ] If user provides a large value (such as 0x80) for parameter prefetch_mem_region_instance in vm_bind ioctl, it will cause BIT(prefetch_region) overflow as below: " ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7 shift exponent 128 is too large for 64-bit type 'long unsigned int' CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G W 6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary) Tainted: [W]=WARN Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 Call Trace: <TASK> dump_stack_lvl+0xa0/0xc0 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_shift_out_of_bounds+0x10e/0x170 ? mutex_unlock+0x12/0x20 xe_vm_bind_ioctl.cold+0x20/0x3c [xe] ... " Fix it by validating prefetch_region before the BIT() usage. v2: Add Closes and Cc stable kernels. (Matt) Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478 Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com (cherry picked from commit `8f565bdd14`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `d52dea485c`) Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:37 +01:00
Wentao Guan	ac9cc4db54	Revert "RDMA/irdma: Update Kconfig" Revert commit `8ced3cb73c` which is upstream commit `060842fed5` It causes regression in 6.12.58 stable, no issues in upstream. The Kconfig dependency change `060842fed5` ("RDMA/irdma: Update Kconfig") went in linux kernel 6.18 where RDMA IDPF support was merged. Even though IDPF driver exists in older kernels, it doesn't provide RDMA support so there is no need for IRDMA to depend on IDPF in kernels <= 6.17. Link: https://lore.kernel.org/all/IA1PR11MB7727692DE0ECFE84E9B52F02CBD5A@IA1PR11MB7727.namprd11.prod.outlook.com/ Link: https://lore.kernel.org/all/IA1PR11MB772718B36A3B27D2F07B0109CBD5A@IA1PR11MB7727.namprd11.prod.outlook.com/ Cc: stable@vger.kernel.org # v6.12.58 Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:37 +01:00
Marc Zyngier	2678ceed58	KVM: arm64: Make all 32bit ID registers fully writable commit `3f9eacf4f0` upstream. 32bit ID registers aren't getting much love these days, and are often missed in updates. One of these updates broke restoring a GICv2 guest on a GICv3 machine. Instead of performing a piecemeal fix, just bite the bullet and make all 32bit ID regs fully writable. KVM itself never relies on them for anything, and if the VMM wants to mess up the guest, so be it. Fixes: `5cb57a1aff` ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest") Reported-by: Peter Maydell <peter.maydell@linaro.org> Cc: stable@vger.kernel.org Reviewed-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251030122707.2033690-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:37 +01:00
Takashi Iwai	fdf0dc82eb	ALSA: usb-audio: Fix missing unlock at error path of maxpacksize check The recent backport of the upstream commit `05a1fc5efd` ("ALSA: usb-audio: Fix potential overflow of PCM transfer buffer") on the older stable kernels like 6.12.y was broken since it doesn't consider the mutex unlock, where the upstream code manages with guard(). In the older code, we still need an explicit unlock. This is a fix that corrects the error path, applied only on old stable trees. Reported-by: Pavel Machek <pavel@denx.de> Closes: https://lore.kernel.org/aSWtH0AZH5+aeb+a@duo.ucw.cz Fixes: `98e9d5e33b` ("ALSA: usb-audio: Fix potential overflow of PCM transfer buffer") Reviewed-by: Pavel Machek <pavel@denx.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:36 +01:00
Jakub Horký	f16b97babd	kconfig/nconf: Initialize the default locale at startup [ Upstream commit `43c2931a95` ] Fix bug where make nconfig doesn't initialize the default locale, which causes ncurses menu borders to be displayed incorrectly (lqqqqk) in UTF-8 terminals that don't support VT100 ACS by default, such as PuTTY. Signed-off-by: Jakub Horký <jakub.git@horky.net> Link: https://patch.msgid.link/20251014144405.3975275-2-jakub.git@horky.net [nathan: Alphabetize locale.h include] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:36 +01:00
Jakub Horký	9e3a382929	kconfig/mconf: Initialize the default locale at startup [ Upstream commit `3927c4a108` ] Fix bug where make menuconfig doesn't initialize the default locale, which causes ncurses menu borders to be displayed incorrectly (lqqqqk) in UTF-8 terminals that don't support VT100 ACS by default, such as PuTTY. Signed-off-by: Jakub Horký <jakub.git@horky.net> Link: https://patch.msgid.link/20251014154933.3990990-1-jakub.git@horky.net [nathan: Alphabetize locale.h include] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:36 +01:00
Shahar Shitrit	c944a850eb	net: tls: Cancel RX async resync request on rcd_delta overflow [ Upstream commit `c15d5c62ab` ] When a netdev issues a RX async resync request for a TLS connection, the TLS module handles it by logging record headers and attempting to match them to the tcp_sn provided by the device. If a match is found, the TLS module approves the tcp_sn for resynchronization. While waiting for a device response, the TLS module also increments rcd_delta each time a new TLS record is received, tracking the distance from the original resync request. However, if the device response is delayed or fails (e.g due to unstable connection and device getting out of tracking, hardware errors, resource exhaustion etc.), the TLS module keeps logging and incrementing, which can lead to a WARN() when rcd_delta exceeds the threshold. To address this, introduce tls_offload_rx_resync_async_request_cancel() to explicitly cancel resync requests when a device response failure is detected. Call this helper also as a final safeguard when rcd_delta crosses its threshold, as reaching this point implies that earlier cancellation did not occur. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761508983-937977-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:36 +01:00
Carlos Llamas	1f0f07fd8f	blk-crypto: use BLK_STS_INVAL for alignment errors [ Upstream commit `0b39ca4572` ] Make __blk_crypto_bio_prep() propagate BLK_STS_INVAL when IO segments fail the data unit alignment check. This was flagged by an LTP test that expects EINVAL when performing an O_DIRECT read with a misaligned buffer [1]. Cc: Eric Biggers <ebiggers@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/all/aP-c5gPjrpsn0vJA@google.com/ [1] Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:36 +01:00
Shahar Shitrit	74bf749662	net: tls: Change async resync helpers argument [ Upstream commit `34892cfec0` ] Update tls_offload_rx_resync_async_request_start() and tls_offload_rx_resync_async_request_end() to get a struct tls_offload_resync_async parameter directly, rather than extracting it from struct sock. This change aligns the function signatures with the upcoming tls_offload_rx_resync_async_request_cancel() helper, which will be introduced in a subsequent patch. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1761508983-937977-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:35 +01:00
Po-Hsu Lin	e8d7fa04c3	selftests: net: use BASH for bareudp testing [ Upstream commit `9311e9540a` ] In bareudp.sh, this script uses /bin/sh and it will load another lib.sh BASH script at the very beginning. But on some operating systems like Ubuntu, /bin/sh is actually pointed to DASH, thus it will try to run BASH commands with DASH and consequently leads to syntax issues: # ./bareudp.sh: 4: ./lib.sh: Bad substitution # ./bareudp.sh: 5: ./lib.sh: source: not found # ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected Fix this by explicitly using BASH for bareudp.sh. This fixes test execution failures on systems where /bin/sh is not BASH. Reported-by: Edoardo Canepa <edoardo.canepa@canonical.com> Link: https://bugs.launchpad.net/bugs/2129812 Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20251027095710.2036108-2-po-hsu.lin@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:35 +01:00
Borislav Petkov (AMD)	09c4f1a378	x86/microcode/AMD: Limit Entrysign signature checking to known generations [ Upstream commit `8a9fb5129e` ] Limit Entrysign sha256 signature checking to CPUs in the range Zen1-Zen5. X86_BUG cannot be used here because the loading on the BSP happens way too early, before the cpufeatures machinery has been set up. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/all/20251023124629.5385-1-bp@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:35 +01:00
Bart Van Assche	47c8b35a1f	scsi: core: Fix a regression triggered by scsi_host_busy() [ Upstream commit `a0b7780602` ] Commit `995412e23b` ("blk-mq: Replace tags->lock with SRCU for tag iterators") introduced the following regression: Call trace: __srcu_read_lock+0x30/0x80 (P) blk_mq_tagset_busy_iter+0x44/0x300 scsi_host_busy+0x38/0x70 ufshcd_print_host_state+0x34/0x1bc ufshcd_link_startup.constprop.0+0xe4/0x2e0 ufshcd_init+0x944/0xf80 ufshcd_pltfrm_init+0x504/0x820 ufs_rockchip_probe+0x2c/0x88 platform_probe+0x5c/0xa4 really_probe+0xc0/0x38c __driver_probe_device+0x7c/0x150 driver_probe_device+0x40/0x120 __driver_attach+0xc8/0x1e0 bus_for_each_dev+0x7c/0xdc driver_attach+0x24/0x30 bus_add_driver+0x110/0x230 driver_register+0x68/0x130 __platform_driver_register+0x20/0x2c ufs_rockchip_pltform_init+0x1c/0x28 do_one_initcall+0x60/0x1e0 kernel_init_freeable+0x248/0x2c4 kernel_init+0x20/0x140 ret_from_fork+0x10/0x20 Fix this regression by making scsi_host_busy() check whether the SCSI host tag set has already been initialized. tag_set->ops is set by scsi_mq_setup_tags() just before blk_mq_alloc_tag_set() is called. This fix is based on the assumption that scsi_host_busy() and scsi_mq_setup_tags() calls are serialized. This is the case in the UFS driver. Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com> Closes: https://lore.kernel.org/linux-block/pnezafputodmqlpumwfbn644ohjybouveehcjhz2hmhtcf2rka@sdhoiivync4y/ Cc: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com> Link: https://patch.msgid.link/20251007214800.1678255-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:35 +01:00
Steve French	cfc16a0fb0	cifs: fix typo in enable_gcm_256 module parameter [ Upstream commit `f765fdfcd8` ] Fix typo in description of enable_gcm_256 module parameter Suggested-by: Thomas Spear <speeddymon@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:35 +01:00
Rafał Miłecki	62df4bd320	bcma: don't register devices disabled in OF [ Upstream commit `a2a69add80` ] Some bus devices can be marked as disabled for specific SoCs or models. Those should not be registered to avoid probing them. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251003125126.27950-1-zajec5@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:34 +01:00
Michal Luczaj	f1c170cae2	vsock: Ignore signal/timeout on connect() if already established [ Upstream commit `002541ef65` ] During connect(), acting on a signal/timeout by disconnecting an already established socket leads to several issues: 1. connect() invoking vsock_transport_cancel_pkt() -> virtio_transport_purge_skbs() may race with sendmsg() invoking virtio_transport_get_credit(). This results in a permanently elevated `vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling. 2. connect() resetting a connected socket's state may race with socket being placed in a sockmap. A disconnected socket remaining in a sockmap breaks sockmap's assumptions. And gives rise to WARNs. 3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a transport change/drop after TCP_ESTABLISHED. Which poses a problem for any simultaneous sendmsg() or connect() and may result in a use-after-free/null-ptr-deref. Do not disconnect socket on signal/timeout. Keep the logic for unconnected sockets: they don't linger, can't be placed in a sockmap, are rejected by sendmsg(). [1]: https://lore.kernel.org/netdev/e07fd95c-9a38-4eea-9638-133e38c2ec9b@rbox.co/ [2]: https://lore.kernel.org/netdev/20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co/ [3]: https://lore.kernel.org/netdev/60f1b7db-3099-4f6a-875e-af9f6ef194f6@rbox.co/ Fixes: `d021c34405` ("VSOCK: Introduce VM Sockets") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20251119-vsock-interrupted-connect-v2-1-70734cf1233f@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:34 +01:00
Shaurya Rane	48d6929027	cifs: fix memory leak in smb3_fs_context_parse_param error path [ Upstream commit `7e4d9120cf` ] Add proper cleanup of ctx->source and fc->source to the cifs_parse_mount_err error handler. This ensures that memory allocated for the source strings is correctly freed on all error paths, matching the cleanup already performed in the success path by smb3_cleanup_fs_context_contents(). Pointers are also set to NULL after freeing to prevent potential double-free issues. This change fixes a memory leak originally detected by syzbot. The leak occurred when processing Opt_source mount options if an error happened after ctx->source and fc->source were successfully allocated but before the function completed. The specific leak sequence was: 1. ctx->source = smb3_fs_context_fullpath(ctx, '/') allocates memory 2. fc->source = kstrdup(ctx->source, GFP_KERNEL) allocates more memory 3. A subsequent error jumps to cifs_parse_mount_err 4. The old error handler freed passwords but not the source strings, causing the memory to leak. This issue was not addressed by commit `e8c73eb7db` ("cifs: client: fix memory leak in smb3_fs_context_parse_param"), which only fixed leaks from repeated fsconfig() calls but not this error path. Patch updated with minor change suggested by kernel test robot Reported-by: syzbot+87be6809ed9bf6d718e3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=87be6809ed9bf6d718e3 Fixes: `24e0a1eff9` ("cifs: switch to new mount api") Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:34 +01:00
Thomas Weißschuh	55d879d1f8	LoongArch: Use UAPI types in ptrace UAPI header [ Upstream commit `20d7338f2d` ] The kernel UAPI headers already contain fixed-width integer types, there is no need to rely on the libc types. There may not be a libc available or the libc may not provides the <stdint.h>, like for example on nolibc. This also aligns the header with the rest of the LoongArch UAPI headers. Fixes: `803b0fc5c3` ("LoongArch: Add process management") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:34 +01:00
Kuniyuki Iwashima	2b7b4efca0	af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic(). [ Upstream commit `7bf3a476ce` ] Miao Wang reported a bug of SO_PEEK_OFF on AF_UNIX SOCK_STREAM socket. The unexpected behaviour is triggered when the peek offset is larger than the recv queue and the thread is unblocked by new data. Let's assume a socket which has "aaaa" in the recv queue and the peek offset is 4. First, unix_stream_read_generic() reads the offset 4 and skips the skb(s) of "aaaa" with the code below: skip = max(sk_peek_offset(sk, flags), 0); /* @skip is 4. / do { ... while (skip >= unix_skb_len(skb)) { skip -= unix_skb_len(skb); ... skb = skb_peek_next(skb, &sk->sk_receive_queue); if (!skb) goto again; / @skip is 0. / } The thread jumps to the 'again' label and goes to sleep since new data has not arrived yet. Later, new data "bbbb" unblocks the thread, and the thread jumps to the 'redo:' label to restart the entire process from the first skb in the recv queue. do { ... redo: ... last = skb = skb_peek(&sk->sk_receive_queue); ... again: if (skb == NULL) { ... timeo = unix_stream_data_wait(sk, timeo, last, last_len, freezable); ... goto redo; / @skip is 0 !! */ However, the peek offset is not reset in the path. If the buffer size is 8, recv() will return "aaaabbbb" without skipping any data, and the final offset will be 12 (the original offset 4 + peeked skbs' length 8). After sleeping in unix_stream_read_generic(), we have to fetch the peek offset again. Let's move the redo label before mutex_lock(&u->iolock). Fixes: `9f389e3567` ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag") Reported-by: Miao Wang <shankerwangmiao@gmail.com> Closes: https://lore.kernel.org/netdev/3B969F90-F51F-4B9D-AB1A-994D9A54D460@gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251117174740.3684604-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:34 +01:00
Kuniyuki Iwashima	232bd2cf50	af_unix: Cache state->msg in unix_stream_read_generic(). [ Upstream commit `8b77338eb2` ] In unix_stream_read_generic(), state->msg is fetched multiple times. Let's cache it in a local variable. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702223606.1054680-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: `7bf3a476ce` ("af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic().") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:33 +01:00
Pradyumn Rahar	6ebd02cf2d	net/mlx5: Clean up only new IRQ glue on request_irq() failure [ Upstream commit `d47515af6c` ] The mlx5_irq_alloc() function can inadvertently free the entire rmap and end up in a crash[1] when the other threads tries to access this, when request_irq() fails due to exhausted IRQ vectors. This commit modifies the cleanup to remove only the specific IRQ mapping that was just added. This prevents removal of other valid mappings and ensures precise cleanup of the failed IRQ allocation's associated glue object. Note: This error is observed when both fwctl and rds configs are enabled. [1] mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 general protection fault, probably for non-canonical address 0xe277a58fde16f291: 0000 [#1] SMP NOPTI RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Call Trace: <TASK> ? show_trace_log_lvl+0x1d6/0x2f9 ? show_trace_log_lvl+0x1d6/0x2f9 ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] ? __die_body.cold+0x8/0xa ? die_addr+0x39/0x53 ? exc_general_protection+0x1c4/0x3e9 ? dev_vprintk_emit+0x5f/0x90 ? asm_exc_general_protection+0x22/0x27 ? free_irq_cpu_rmap+0x23/0x7d mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] irq_pool_request_vector+0x7d/0x90 [mlx5_core] mlx5_irq_request+0x2e/0xe0 [mlx5_core] mlx5_irq_request_vector+0xad/0xf7 [mlx5_core] comp_irq_request_pci+0x64/0xf0 [mlx5_core] create_comp_eq+0x71/0x385 [mlx5_core] ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core] mlx5_comp_eqn_get+0x72/0x90 [mlx5_core] ? xas_load+0x8/0x91 mlx5_comp_irqn_get+0x40/0x90 [mlx5_core] mlx5e_open_channel+0x7d/0x3c7 [mlx5_core] mlx5e_open_channels+0xad/0x250 [mlx5_core] mlx5e_open_locked+0x3e/0x110 [mlx5_core] mlx5e_open+0x23/0x70 [mlx5_core] __dev_open+0xf1/0x1a5 __dev_change_flags+0x1e1/0x249 dev_change_flags+0x21/0x5c do_setlink+0x28b/0xcc4 ? __nla_parse+0x22/0x3d ? inet6_validate_link_af+0x6b/0x108 ? cpumask_next+0x1f/0x35 ? __snmp6_fill_stats64.constprop.0+0x66/0x107 ? __nla_validate_parse+0x48/0x1e6 __rtnl_newlink+0x5ff/0xa57 ? kmem_cache_alloc_trace+0x164/0x2ce rtnl_newlink+0x44/0x6e rtnetlink_rcv_msg+0x2bb/0x362 ? __netlink_sendskb+0x4c/0x6c ? netlink_unicast+0x28f/0x2ce ? rtnl_calcit.isra.0+0x150/0x146 netlink_rcv_skb+0x5f/0x112 netlink_unicast+0x213/0x2ce netlink_sendmsg+0x24f/0x4d9 __sock_sendmsg+0x65/0x6a ____sys_sendmsg+0x28f/0x2c9 ? import_iovec+0x17/0x2b ___sys_sendmsg+0x97/0xe0 __sys_sendmsg+0x81/0xd8 do_syscall_64+0x35/0x87 entry_SYSCALL_64_after_hwframe+0x6e/0x0 RIP: 0033:0x7fc328603727 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727 RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc </TASK> ---[ end trace f43ce73c3c2b13a2 ]--- RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00 74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31 f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff RSP: 0018:ff384881640eaca0 EFLAGS: 00010282 RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400 RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88 R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8 FS: 00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) kvm-guest: disable async PF for cpu 0 Fixes: `3354822cde` ("net/mlx5: Use dynamic msix vectors allocation") Signed-off-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Tested-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1763381768-1234998-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-12-01 11:43:33 +01:00

1 2 3 4 5 ...

1323657 Commits