Commit Graph

1398251 Commits

Author SHA1 Message Date
Stefano Garzarella f7c877e753 vsock: fix lock inversion in vsock_assign_transport()
Syzbot reported a potential lock inversion deadlock between
vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.

The issue was introduced by commit 687aa0c558 ("vsock: Fix
transport_* TOCTOU") which added vsock_register_mutex locking in
vsock_assign_transport() around the transport->release() call, that can
call vsock_linger(). vsock_assign_transport() can be called with sk_lock
held. vsock_linger() calls sk_wait_event() that temporarily releases and
re-acquires sk_lock. During this window, if another thread hold
vsock_register_mutex while trying to acquire sk_lock, a circular
dependency is created.

Fix this by releasing vsock_register_mutex before calling
transport->release() and vsock_deassign_transport(). This is safe
because we don't need to hold vsock_register_mutex while releasing the
old transport, and we ensure the new transport won't disappear by
obtaining a module reference first via try_module_get().

Reported-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Tested-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Fixes: 687aa0c558 ("vsock: Fix transport_* TOCTOU")
Cc: mhal@rbox.co
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251021121718.137668-1-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 16:07:58 +02:00
Paolo Abeni df890ceeb2 Merge branch 'fix-poll-behaviour-for-tcp-based-tunnel-protocols'
Ralf Lici says:

====================
fix poll behaviour for TCP-based tunnel protocols

This patch series introduces a polling function for datagram-style
sockets that operates on custom skb queues, and updates ovpn (the
OpenVPN data-channel offload module) and espintcp (the TCP Encapsulation
of IKE and IPsec Packets implementation) to use it accordingly.

Protocols like the aforementioned one decapsulate packets received over
TCP and deliver userspace-bound data through a separate skb queue, not
the standard sk_receive_queue. Previously, both relied on
datagram_poll(), which would signal readiness based on non-userspace
packets, leading to misleading poll results and unnecessary recv
attempts in userspace.

Patch 1 introduces datagram_poll_queue(), a variant of datagram_poll()
that accepts an explicit receive queue. This builds on the approach
introduced in commit b50b058, which extended other skb-related functions
to support custom queues. Patch 2 and 3 update espintcp_poll() and
ovpn_tcp_poll() respectively to use this helper, ensuring readiness is
only signaled when userspace data is available.

Each patch is self-contained and the ovpn one includes rationale and
lifecycle enforcement where appropriate.
====================

Link: https://patch.msgid.link/20251021100942.195010-1-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 15:46:10 +02:00
Ralf Lici efd729408b ovpn: use datagram_poll_queue for socket readiness in TCP
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.

Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.

Fixes: 11851cbd60 ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-4-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 15:46:04 +02:00
Ralf Lici 0fc3e32c2c espintcp: use datagram_poll_queue for socket readiness
espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue, which can lead to false readiness signals when that
queue contains non-userspace packets.

Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.

Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-3-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 15:46:04 +02:00
Ralf Lici f6ceec6434 net: datagram: introduce datagram_poll_queue for custom receive queues
Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.

Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.

Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 15:46:04 +02:00
Alok Tiwari c5efc6a0b3 io_uring: correct __must_hold annotation in io_install_fixed_file
The __must_hold annotation references &req->ctx->uring_lock, but req
is not in scope in io_install_fixed_file. This change updates the
annotation to reference the correct ctx->uring_lock.
improving code clarity.

Fixes: f110ed8498 ("io_uring: split out fixed file installation and removal")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-23 07:25:07 -06:00
Shengjiu Wang ba3a5e1aea ASoC: fsl_micfil: correct the endian format for DSD
The DSD format supported by micfil is that oldest bit is in bit 31, so
the format should be DSD little endian format.

Fixes: 21aa330fec ("ASoC: fsl_micfil: Add decimation filter bypass mode support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Link: https://patch.msgid.link/20251023064538.368850-3-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-23 13:42:34 +01:00
Shengjiu Wang d9fbe5b0bf ASoC: fsl_sai: fix bit order for DSD format
The DSD little endian format requires the msb first, because oldest bit
is in msb.
found this issue by testing with pipewire.

Fixes: c111c2ddb3 ("ASoC: fsl_sai: Add PDM daifmt support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20251023064538.368850-2-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-23 13:42:33 +01:00
Cezary Rojewski 64007ad3e2 ASoC: Intel: avs: Use snd_codec format when initializing probe
The data probing is a debug feature. Currently parameters channels and
rate specified by the application are read while the format is ignored.
More robust approach is to read all of them.

Audio format, while not used by the Probe module for PCM streaming,
takes part in the gateway initialization on the DSP side. With full
parametrization we gain better coverage with the data probing feature.

Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20251023092348.3119313-4-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-23 13:42:27 +01:00
Cezary Rojewski 845f716dc5 ASoC: Intel: avs: Disable periods-elapsed work when closing PCM
avs_dai_fe_shutdown() handles the shutdown procedure for HOST HDAudio
stream while period-elapsed work services its IRQs. As the former
frees the DAI's private context, these two operations shall be
synchronized to avoid slab-use-after-free or worse errors.

Fixes: 0dbb186c35 ("ASoC: Intel: avs: Update stream status in a separate thread")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20251023092348.3119313-3-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-23 13:42:26 +01:00
Cezary Rojewski cfca1637bc ASoC: Intel: avs: Unprepare a stream when XRUN occurs
The pcm->prepare() function may be called multiple times in a row by the
userspace, as mentioned in the documentation. The driver shall take that
into account and prevent redundancy. However, the exact same function is
called during XRUNs and in such case, the particular stream shall be
reset and setup anew.

Fixes: 9114700b49 ("ASoC: Intel: avs: Generic PCM FE operations")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20251023092348.3119313-2-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-10-23 13:42:25 +01:00
Haotian Zhang 4c4e6ea4a1 gpio: ljca: Fix duplicated IRQ mapping
The generic_handle_domain_irq() function resolves the hardware IRQ
internally. The driver performed a duplicative mapping by calling
irq_find_mapping() first, which could lead to an RCU stall.

Delete the redundant irq_find_mapping() call and pass the hardware IRQ
directly to generic_handle_domain_irq().

Fixes: c5a4b6fd31 ("gpio: Add support for Intel LJCA USB GPIO driver")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20251023070231.1305-1-vulab@iscas.ac.cn
[Bartosz: remove unused variable]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-10-23 14:30:11 +02:00
LI Qingwu 622865c73a USB: serial: option: add Telit FN920C04 ECM compositions
Add support for the Telit Cinterion FN920C04 module when operating in
ECM (Ethernet Control Model) mode. The following USB product IDs are
used by the module when AT#USBCFG is set to 3 or 7.

0x10A3: ECM + tty (NMEA) + tty (DUN) [+ tty (DIAG)]
T:  Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  3 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10a3 Rev= 5.15
S:  Manufacturer=Telit Cinterion
S:  Product=FN920
S:  SerialNumber=76e7cb38
C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether
E:  Ad=82(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

0x10A8: ECM + tty (DUN) + tty (AUX) [+ tty (DIAG)]
T:  Bus=03 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  3 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10a8 Rev= 5.15
S:  Manufacturer=Telit Cinterion
S:  Product=FN920
S:  SerialNumber=76e7cb38
C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether
E:  Ad=82(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Adding these IDs allows the option driver to automatically create the
corresponding /dev/ttyUSB* ports under ECM mode.

Tested with FN920C04 under ECM configuration (USBCFG=3 and 7).

Signed-off-by: LI Qingwu <Qing-wu.Li@leica-geosystems.com.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2025-10-23 14:11:41 +02:00
Rafael J. Wysocki 114cbd67db Merge branch 'acpi-property'
Merge an ACPI device properties handling change fixing the order of
__acpi_node_get_property_reference() arguments broken by a recent
update (Sunil V L)

* 'acpi-property':
  ACPI: property: Fix argument order in __acpi_node_get_property_reference()
2025-10-23 13:25:02 +02:00
Rafael J. Wysocki b62bd2cf7e Merge branches 'pm-cpuidle' and 'pm-cpufreq'
Merge cpuidle and cpufreq fixes for 6.18-rc3:

 - Revert a cpuidle menu governor change that introduced a serious
   performance regression on Chromebooks with Intel Jasper Lake
   processors (Rafael Wysocki)

 - Fix an amd-pstate driver regression leading to EPP=0 after
   hibernation (Mario Limonciello)

* pm-cpuidle:
  Revert "cpuidle: menu: Avoid discarding useful information"

* pm-cpufreq:
  cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate
2025-10-23 13:12:24 +02:00
Tonghao Zhang 10843e1492 net: bonding: fix possible peer notify event loss or dup issue
If the send_peer_notif counter and the peer event notify are not synchronized.
It may cause problems such as the loss or dup of peer notify event.

Before this patch:
- If should_notify_peers is true and the lock for send_peer_notif-- fails, peer
  event may be sent again in next mii_monitor loop, because should_notify_peers
  is still true.
- If should_notify_peers is true and the lock for send_peer_notif-- succeeded,
  but the lock for peer event fails, the peer event will be lost.

This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.

Fixes: 07a4ddec3c ("bonding: add an option to specify a delay between peer notifications")
Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Vincent Bernat <vincent@bernat.ch>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 13:08:53 +02:00
Samuel Wu 79816d4b9e Revert "PM: sleep: Make pm_wakeup_clear() call more clear"
This reverts commit 56a232d93c.

The above commit changed the position of pm_wakeup_clear() for the
suspend call path, but other call paths with references to
freeze_processes() were not updated. This means that other call
paths, such as hibernate(), will not have pm_wakeup_clear() called.

Suggested-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Samuel Wu <wusamuel@google.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251022222830.634086-1-wusamuel@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-23 12:48:04 +02:00
Bartosz Golaszewski 5f4bfd03bc Merge tag 'intel-gpio-v6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio fixes for v6.18-1

* Make set debounce errors non-fatal in GPIO ACPI case
* Use human readable error when printing a message in GPIO ACPI code
2025-10-23 10:06:59 +02:00
David Howells 64c9471aa9 cifs: #include cifsglob.h before trace.h to allow structs in tracepoints
Make cifs #include cifsglob.h in advance of #including trace.h so that the
structures defined in cifsglob.h can be accessed directly by the cifs
tracepoints rather than the callers having to manually pass in the bits and
pieces.

This should allow the tracepoints to be made more efficient to use as well
as easier to read in the code.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-23 02:47:20 -05:00
David Howells 4b1d7f6222 cifs: Call the calc_signature functions directly
As the SMB1 and SMB2/3 calc_signature functions are called from separate
sign and verify paths, just call them directly rather than using a function
pointer.  The SMB3 calc_signature then jumps to the SMB2 variant if
necessary.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Shyam Prasad N <sprasad@microsoft.com>
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-23 02:47:20 -05:00
Paulo Alcantara 72ed55b4c3 smb: client: get rid of d_drop() in cifs_do_rename()
There is no need to force a lookup by unhashing the moved dentry after
successfully renaming the file on server.  The file metadata will be
re-fetched from server, if necessary, in the next call to
->d_revalidate() anyways.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-23 02:46:50 -05:00
Andy Shevchenko b1055678a0 gpiolib: acpi: Use %pe when passing an error pointer to dev_err()
One of the coccinelle recipe suggests to use %pe when we deal with
an error pointer. Do it so.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202510231350.calxvXIm-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2025-10-23 08:40:46 +02:00
Hans de Goede e4a77f9c85 gpiolib: acpi: Make set debounce errors non fatal
Commit 16c07342b5 ("gpiolib: acpi: Program debounce when finding GPIO")
adds a gpio_set_debounce_timeout() call to acpi_find_gpio() and makes
acpi_find_gpio() fail if this fails.

But gpio_set_debounce_timeout() failing is a somewhat normal occurrence,
since not all debounce values are supported on all GPIO/pinctrl chips.

Making this an error for example break getting the card-detect GPIO for
the micro-sd slot found on many Bay Trail tablets, breaking support for
the micro-sd slot on these tablets.

acpi_request_own_gpiod() already treats gpio_set_debounce_timeout()
failures as non-fatal, just warning about them.

Add a acpi_gpio_set_debounce_timeout() helper which wraps
gpio_set_debounce_timeout() and warns on failures and replace both existing
gpio_set_debounce_timeout() calls with the helper.

Since the helper only warns on failures this fixes the card-detect issue.

Fixes: 16c07342b5 ("gpiolib: acpi: Program debounce when finding GPIO")
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <superm1@kernel.org>
Signed-off-by: Hans de Goede <hansg@kernel.org>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/stable/20250920201200.20611-1-hansg%40kernel.org
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2025-10-23 08:36:53 +02:00
Haotian Zhang 3c9bf72cc1 crypto: aspeed - fix double free caused by devm
The clock obtained via devm_clk_get_enabled() is automatically managed
by devres and will be disabled and freed on driver detach. Manually
calling clk_disable_unprepare() in error path and remove function
causes double free.

Remove the manual clock cleanup in both aspeed_acry_probe()'s error
path and aspeed_acry_remove().

Fixes: 2f1cf4e50c ("crypto: aspeed - Add ACRY RSA driver")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-23 12:53:23 +08:00
Harald Freudenberger 3ac2939bc4 crypto: s390/phmac - Do not modify the req->nbytes value
The phmac implementation used the req->nbytes field on combined
operations (finup, digest) to track the state:
with req->nbytes > 0 the update needs to be processed,
while req->nbytes == 0 means to do the final operation. For
this purpose the req->nbytes field was set to 0 after successful
update operation. However, aead uses the req->nbytes field after a
successful hash operation to determine the amount of data to
en/decrypt. So an implementation must not modify the nbytes field.

Fixed by a slight rework on the phmac implementation. There is
now a new field async_op in the request context which tracks
the (asynch) operation to process. So the 'state' via req->nbytes
is not needed any more and now this field is untouched and may
be evaluated even after a request is processed by the phmac
implementation.

Fixes: cbbc675506 ("crypto: s390 - New s390 specific protected key hash phmac")
Reported-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-10-23 12:53:23 +08:00
Keith Busch 0194b65ab5 nvme-pci: use blk_map_iter for p2p metadata
The dma_map_bvec helper doesn't work for p2p data, so use the same
blk_map_iter method that sgl uses for this memory type.

Reported-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-10-22 19:46:25 -07:00
Fernando Fernandez Mancera c0178eec88 net: hsr: prevent creation of HSR device with slaves from another netns
HSR/PRP driver does not handle correctly having slaves/interlink devices
in a different net namespace. Currently, it is possible to create a HSR
link in a different net namespace than the slaves/interlink with the
following command:

 ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2

As there is no use-case on supporting this scenario, enforce that HSR
device link matches netns defined by IFLA_LINK_NETNSID.

The iproute2 command mentioned above will throw the following error:

 Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link.

Fixes: f421436a59 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 19:22:22 -07:00
Alexey Simakov 441f0647f7 sctp: avoid NULL dereference when chunk data buffer is missing
chunk->skb pointer is dereferenced in the if-block where it's supposed
to be NULL only.

chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list
instead and do it just before replacing chunk->skb. We're sure that
otherwise chunk->skb is non-NULL because of outer if() condition.

Fixes: 90017accff ("sctp: Add GSO support")
Signed-off-by: Alexey Simakov <bigalex934@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 19:19:31 -07:00
Jiasheng Jiang a767957e7a ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1]
instead of bp->sma[i] inside a for-loop, which caused only SMA[1]
to have its DIRECTION_CAN_CHANGE capability cleared. This led to
inconsistent capability flags across SMA pins.

Fixes: 09eeb3aecc ("ptp_ocp: implement DPLL ops")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251021182456.9729-1-jiashengjiangcool@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 19:18:39 -07:00
David Howells 5b2ff4873a cifs: Fix TCP_Server_Info::credits to be signed
Fix TCP_Server_Info::credits to be signed, just as echo_credits and
oplock_credits are.  This also fixes what ought to get at least a
compilation warning if not an outright error in *get_credits_field() as a
pointer to the unsigned server->credits field is passed back as a pointer
to a signed int.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cifs@vger.kernel.org
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Acked-by: Pavel Shilovskiy <pshilovskiy@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:22:18 -05:00
Jakub Kicinski 4c3aa496a2 Merge branch 'net-ravb-fix-soc-specific-configuration-and-descriptor-handling-issues'
Lad Prabhakar says:

====================
net: ravb: Fix SoC-specific configuration and descriptor handling issues [part]

This series addresses several issues in the Renesas Ethernet AVB (ravb)
driver related descriptor ordering.

A potential ordering hazard in descriptor setup could cause
the DMA engine to start prematurely, leading to TX stalls on some
platforms.

The series includes the following changes:

Enforce descriptor type ordering to prevent early DMA start
Ensure proper write ordering of TX descriptor type fields to prevent the
DMA engine from observing an incomplete descriptor chain. This fixes
observed TX stalls on RZ/G2L platforms running RT kernels.

Tested on R/G1x Gen2, RZ/G2x Gen3 and RZ/G2L family hardware.
====================

Link: https://patch.msgid.link/20251017151830.171062-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 18:16:17 -07:00
Lad Prabhakar 706136c572 net: ravb: Ensure memory write completes before ringing TX doorbell
Add a final dma_wmb() barrier before triggering the transmit request
(TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to
the DMA engine.

According to the hardware manual, a read-back operation is required
before writing to the doorbell register to guarantee completion of
previous writes. Instead of performing a dummy read, a dma_wmb() is
used to both enforce the same ordering semantics on the CPU side and
also to ensure completion of writes.

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-5-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 18:15:14 -07:00
Lad Prabhakar 5370c31e84 net: ravb: Enforce descriptor type ordering
Ensure the TX descriptor type fields are published in a safe order so the
DMA engine never begins processing a descriptor chain before all descriptor
fields are fully initialised.

For multi-descriptor transmits the driver writes DT_FEND into the last
descriptor and DT_FSTART into the first. The DMA engine begins processing
when it observes DT_FSTART. Move the dma_wmb() barrier so it executes
immediately after DT_FEND and immediately before writing DT_FSTART
(and before DT_FSINGLE in the single-descriptor case). This guarantees
that all prior CPU writes to the descriptor memory are visible to the
device before DT_FSTART is seen.

This avoids a situation where compiler/CPU reordering could publish
DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to
start on a partially initialised chain and causing corrupted transmissions
or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as
transmit queue timeouts and device resets.

Fixes: 2f45d1902a ("ravb: minimize TX data copying")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 18:15:13 -07:00
Stefan Metzmacher 123111ea62 smb: client: make use of smbdirect_socket.send_io.lcredits.*
This makes the logic to prevent on overflow of
the send submission queue with ib_post_send() easier.

As we first get a local credit and then a remote credit
before we mark us as pending.

For now we'll keep the logic around smbdirect_socket.send_io.pending.*,
but that will likely change or be removed completely.

The server will get a similar logic soon, so
we'll be able to share the send code in future.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:11:05 -05:00
Stefan Metzmacher 0158e864cc smb: server: make use of smbdirect_socket.send_io.lcredits.*
This introduces logic to prevent on overflow of
the send submission queue with ib_post_send() easier.

As we first get a local credit and then a remote credit
before we mark us as pending.

From reading the git history of the linux smbdirect
implementations in client and server) it was seen
that a peer granted more credits than we requested.
I guess that only happened because of bugs in our
implementation which was active as client and server.
I guess Windows won't do that.

So the local credits make sure we only use the amount
of credits we asked for.

Fixes: 0626e6641f ("cifsd: add server handler for central processing and tranport layers")
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:11:01 -05:00
Stefan Metzmacher a90227462a smb: server: simplify sibling_list handling in smb_direct_flush_send_list/send_done
We have a list handling that is much easier to understand:

1. Before smb_direct_flush_send_list() is called all
   struct smbdirect_send_io messages are part of
   send_ctx->msg_list

2. Before smb_direct_flush_send_list() calls
   smb_direct_post_send() we remove the last
   element in send_ctx->msg_list and move all
   others into last->sibling_list. As only
   last has IB_SEND_SIGNALED and gets a completion
   vis send_done().

3. send_done() has an easy way to free all others
   in sendmsg->sibling_list (if there are any).
   And use list_for_each_entry_safe() instead of
   a complex custom logic.

This will help us to share send_done() in common
code soon, as it will work fine for the client too,
where last->sibling_list is currently always an empty list.

Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:10:53 -05:00
Stefan Metzmacher 8059c64049 smb: server: smb_direct_disconnect_rdma_connection() already wakes all waiters on error
There's no need to care about pending or credit counters when we
already disconnecting.

And all related wait_event conditions already check for broken
connections too.

This will simplify the code and makes the following changes simpler.

Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:10:44 -05:00
Stefan Metzmacher 68335cbcdd smb: smbdirect: introduce smbdirect_socket.send_io.lcredits.*
This will be used to implement a logic in order to make sure
we don't overflow the send submission queue for ib_post_send().

We will initialize the local credits with the
fixed sp->send_credit_target value, which matches
the reserved slots in the submission queue for ib_post_send().

We will be a local credit first and then wait for a remote credit,
if we managed to get both we are allowed to post an
IB_WR_SEND[_WITH_INV]. The local credit is given back to
the pool when we get the local ib_post_send() completion,
while remote credits are granted by the peer.

From reading the git history of the linux smbdirect
implementations in client and server) it was seen
that a peer granted more credits than we requested.
I guess that only happened because of bugs in our
implementation which was active as client and server.
I guess Windows won't do that.

So the local credits make sure we only use the amount
of credits we asked for.

The client already has some logic for this based on
smbdirect_socket.send_io.pending.count, but that
counts in the order direction and makes it complex it
share common logic for various credits classes.
That logic will be replaced soon.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:10:19 -05:00
Stefan Metzmacher 0bd73ae09b smb: server: allocate enough space for RW WRs and ib_drain_qp()
Make use of rdma_rw_mr_factor() to calculate the number of rw
credits and the number of pages per RDMA RW operation.

We get the same numbers for iWarp connections, tested
with siw.ko and irdma.ko (in iWarp mode).

siw:

CIFS: max_qp_rd_atom=128, max_fast_reg_page_list_len = 256
CIFS: max_sgl_rd=0, max_sge_rd=1
CIFS: responder_resources=32 max_frmr_depth=256 mr_io.type=0
CIFS: max_send_wr 384, device reporting max_cqe 3276800 max_qp_wr 32768
ksmbd: max_fast_reg_page_list_len = 256, max_sgl_rd=0, max_sge_rd=1
ksmbd: device reporting max_cqe 3276800 max_qp_wr 32768
ksmbd: Old sc->rw_io.credits: max = 9, num_pages = 256
ksmbd: New sc->rw_io.credits: max = 9, num_pages = 256, maxpages=2048
ksmbd: Info: rdma_send_wr 27 + max_send_wr 256 = 283

irdma (in iWarp mode):

CIFS: max_qp_rd_atom=127, max_fast_reg_page_list_len = 262144
CIFS: max_sgl_rd=0, max_sge_rd=13
CIFS: responder_resources=32 max_frmr_depth=2048 mr_io.type=0
CIFS: max_send_wr 384, device reporting max_cqe 1048574 max_qp_wr 4063
ksmbd: max_fast_reg_page_list_len = 262144, max_sgl_rd=0, max_sge_rd=13
ksmbd: device reporting max_cqe 1048574 max_qp_wr 4063
ksmbd: Old sc->rw_io.credits: max = 9, num_pages = 256
ksmbd: New sc->rw_io.credits: max = 9, num_pages = 256, maxpages=2048
ksmbd: rdma_send_wr 27 + max_send_wr 256 = 283

This means that we get the different correct numbers for ROCE,
tested with rdma_rxe.ko and irdma.ko (in RoCEv2 mode).

rxe:

CIFS: max_qp_rd_atom=128, max_fast_reg_page_list_len = 512
CIFS: max_sgl_rd=0, max_sge_rd=32
CIFS: responder_resources=32 max_frmr_depth=512 mr_io.type=0
CIFS: max_send_wr 384, device reporting max_cqe 32767 max_qp_wr 1048576
ksmbd: max_fast_reg_page_list_len = 512, max_sgl_rd=0, max_sge_rd=32
ksmbd: device reporting max_cqe 32767 max_qp_wr 1048576
ksmbd: Old sc->rw_io.credits: max = 9, num_pages = 256
ksmbd: New sc->rw_io.credits: max = 65, num_pages = 32, maxpages=2048
ksmbd: rdma_send_wr 65 + max_send_wr 256 = 321

irdma (in RoCEv2 mode):

CIFS: max_qp_rd_atom=127, max_fast_reg_page_list_len = 262144,
CIFS: max_sgl_rd=0, max_sge_rd=13
CIFS: responder_resources=32 max_frmr_depth=2048 mr_io.type=0
CIFS: max_send_wr 384, device reporting max_cqe 1048574 max_qp_wr 4063
ksmbd: max_fast_reg_page_list_len = 262144, max_sgl_rd=0, max_sge_rd=13
ksmbd: device reporting max_cqe 1048574 max_qp_wr 4063
ksmbd: Old sc->rw_io.credits: max = 9, num_pages = 256,
ksmbd: New sc->rw_io.credits: max = 159, num_pages = 13, maxpages=2048
ksmbd: rdma_send_wr 159 + max_send_wr 256 = 415

And rely on rdma_rw_init_qp() to setup ib_mr_pool_init() for
RW MRs. ib_mr_pool_destroy() will be called by rdma_rw_cleanup_mrs().

It seems the code was implemented before the rdma_rw_* layer
was fully established in the kernel.

While there also add additional space for ib_drain_qp().

This should make sure ib_post_send() will never fail
because the submission queue is full.

Fixes: ddbdc861e3 ("ksmbd: smbd: introduce read/write credits for RDMA read/write")
Fixes: 4c564f03e2 ("smb: server: make use of common smbdirect_socket")
Fixes: 177368b992 ("smb: server: make use of common smbdirect_socket_parameters")
Fixes: 95475d8886 ("smb: server: make use smbdirect_socket.rw_io.credits")
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-22 20:10:12 -05:00
Linus Torvalds 43e9ad0c55 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "All driver fixes. The big change is the storvsc one to rejig the
  hyper-v channel handling to be more efficient for SMP virtual
  machines"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: phy: dt-bindings: Add QMP UFS PHY compatible for Kaanapali
  scsi: ufs: qcom: dt-bindings: Document the Kaanapali UFS controller
  scsi: libfc: Prevent integer overflow in fc_fcp_recv_data()
  scsi: qla4xxx: Fix typos in comments
  scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU
2025-10-22 15:00:34 -10:00
Linus Torvalds 0f3ad9c610 Merge tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "17 hotfixes. 12 are cc:stable and 14 are for MM.

  There's a two-patch DAMON series from SeongJae Park which addresses a
  missed check and possible memory leak. Apart from that it's all
  singletons - please see the changelogs for details"

* tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  csky: abiv2: adapt to new folio flags field
  mm/damon/core: use damos_commit_quota_goal() for new goal commit
  mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme
  hugetlbfs: move lock assertions after early returns in huge_pmd_unshare()
  vmw_balloon: indicate success when effectively deflating during migration
  mm/damon/core: fix list_add_tail() call on damon_call()
  mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap
  mm: prevent poison consumption when splitting THP
  ocfs2: clear extent cache after moving/defragmenting extents
  mm: don't spin in add_stack_record when gfp flags don't allow
  dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC
  mm/damon/sysfs: dealloc commit test ctx always
  mm/damon/sysfs: catch commit test ctx alloc failure
  hung_task: fix warnings caused by unaligned lock pointers
2025-10-22 14:57:35 -10:00
Hannes Reinecke 60ad1de8e5 nvmet-auth: update sc_c in host response
The target code should set the sc_c bit in calculating the host response
based on the status of the 'concat' setting, otherwise we'll get an
authentication mismatch for hosts setting that bit correctly.

Fixes: 7e091add9c ("nvme-auth: update sc_c in host response")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-10-22 17:28:18 -07:00
Kaushlendra Kumar 20594cd104 ACPI: button: Call input_free_device() on failing input device registration
Make acpi_button_add() call input_free_device() when
input_register_device() fails as required according to the
documentation of the latter.

Fixes: 0d51157dfa ("ACPI: button: Eliminate the driver notify callback")
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Cc: 6.5+ <stable@vger.kernel.org> # 6.5+
[ rjw: Subject and changelog rewrite, Fixes: tag ]
Link: https://patch.msgid.link/20251006084706.971855-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:19:23 +02:00
Eric Biggers 1af424b154 lib/crypto: poly1305: Restore dependency of arch code on !KMSAN
Restore the dependency of the architecture-optimized Poly1305 code on
!KMSAN.  It was dropped by commit b646b782e5 ("lib/crypto: poly1305:
Consolidate into single module").

Unlike the other hash algorithms in lib/crypto/ (e.g., SHA-512), the way
the architecture-optimized Poly1305 code is integrated results in
assembly code initializing memory, for several different architectures.
Thus, it generates false positive KMSAN warnings.  These could be
suppressed with kmsan_unpoison_memory(), but it would be needed in quite
a few places.  For now let's just restore the dependency on !KMSAN.

Note: this should have been caught by running poly1305_kunit with
CONFIG_KMSAN=y, which I did.  However, due to an unrelated KMSAN bug
(https://lore.kernel.org/r/20251022030213.GA35717@sol/), KMSAN currently
isn't working reliably.  Thus, the warning wasn't noticed until later.

Fixes: b646b782e5 ("lib/crypto: poly1305: Consolidate into single module")
Reported-by: syzbot+01fcd39a0d90cdb0e3df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68f6a48f.050a0220.91a22.0452.GAE@google.com/
Reported-by: Pei Xiao <xiaopei01@kylinos.cn>
Closes: https://lore.kernel.org/r/751b3d80293a6f599bb07770afcef24f623c7da0.1761026343.git.xiaopei01@kylinos.cn/
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251022033405.64761-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-22 10:52:10 -07:00
David Wei 060aa0b0c2 io_uring zcrx: add MAINTAINERS entry
Same as [1] but also with netdev@ as an additional mailing list.
io_uring zero copy receive is of particular interest to netdev
participants too, given its tight integration to netdev core.

With this updated entry, folks running get_maintainer.pl on patches that
touch io_uring/zcrx.* will know to send it to netdev@ as well.

Note that this doesn't mean all changes require explicit acks from
netdev; this is purely for wider visibility and for other contributors
to know where to send patches.

[1]: https://lore.kernel.org/io-uring/989528e611b51d71fb712691ebfb76d2059ba561.1755461246.git.asml.silence@gmail.com/

Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
[axboe: use correct io_uring tree URL]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:05:19 -06:00
Ranganath V N 915651b7c9 io_uring: Fix code indentation error
Fix the indentation to ensure consistent code style and improve
readability and to fix the errors:
ERROR: code indent should use tabs where possible
+               return io_net_import_vec(req, kmsg, sr->buf, sr->len, ITER_SOURCE);$

ERROR: code indent should use tabs where possible
+^I^I^I           struct io_big_cqe *big_cqe)$

Tested by running the /scripts/checkpatch.pl

Signed-off-by: Ranganath V N <vnranganath.20@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 10:56:11 -06:00
Jens Axboe a94e065726 io_uring/sqpoll: be smarter on when to update the stime usage
The current approach is a bit naive, and hence calls the time querying
way too often. Only start the "doing work" timer when there's actual
work to do, and then use that information to terminate (and account) the
work time once done. This greatly reduces the frequency of these calls,
when they cannot have changed anyway.

Running a basic random reader that is setup to use SQPOLL, a profile
before this change shows these as the top cycle consumers:

+   32.60%  iou-sqp-1074  [kernel.kallsyms]  [k] thread_group_cputime_adjusted
+   19.97%  iou-sqp-1074  [kernel.kallsyms]  [k] thread_group_cputime
+   12.20%  io_uring      io_uring           [.] submitter_uring_fn
+    4.13%  iou-sqp-1074  [kernel.kallsyms]  [k] getrusage
+    2.45%  iou-sqp-1074  [kernel.kallsyms]  [k] io_submit_sqes
+    2.18%  iou-sqp-1074  [kernel.kallsyms]  [k] __pi_memset_generic
+    2.09%  iou-sqp-1074  [kernel.kallsyms]  [k] cputime_adjust

and after this change, top of profile looks as follows:

+   36.23%  io_uring     io_uring           [.] submitter_uring_fn
+   23.26%  iou-sqp-819  [kernel.kallsyms]  [k] io_sq_thread
+   10.14%  iou-sqp-819  [kernel.kallsyms]  [k] io_sq_tw
+    6.52%  iou-sqp-819  [kernel.kallsyms]  [k] tctx_task_work_run
+    4.82%  iou-sqp-819  [kernel.kallsyms]  [k] nvme_submit_cmds.part.0
+    2.91%  iou-sqp-819  [kernel.kallsyms]  [k] io_submit_sqes
[...]
     0.02%  iou-sqp-819  [kernel.kallsyms]  [k] cputime_adjust

where it's spending the cycles on things that actually matter.

Reported-by: Fengnan Chang <changfengnan@bytedance.com>
Cc: stable@vger.kernel.org
Fixes: 3fcb9d1720 ("io_uring/sqpoll: statistics of the true utilization of sq threads")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 10:55:33 -06:00
Jens Axboe 8ac9b0d33e io_uring/sqpoll: switch away from getrusage() for CPU accounting
getrusage() does a lot more than what the SQPOLL accounting needs, the
latter only cares about (and uses) the stime. Rather than do a full
RUSAGE_SELF summation, just query the used stime instead.

Cc: stable@vger.kernel.org
Fixes: 3fcb9d1720 ("io_uring/sqpoll: statistics of the true utilization of sq threads")
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 10:51:20 -06:00
Ilpo Järvinen f294a5fd34 MIPS: Malta: Use pcibios_align_resource() to block io range
According to Maciej W. Rozycki <macro@orcam.me.uk>, the
mips_pcibios_init() for malta adjusts root bus IO resource start
address to prevent interfering with PIIX4 I/O cycle decoding. Adjusting
lower bound leaves PIIX4 IO resources outside of the root bus resource
and assign_fixed_resource_on_bus() does not link the resources into the
resource tree.

Prior to commit ae81aad5c2 ("MIPS: PCI: Use pci_enable_resources()") the
arch specific pcibios_enable_resources() did not check if the resources
were assigned which diverges from what PCI core checks, effectively hiding
the PIIX4 IO resources were not properly within the resource tree. After
starting to use pcibios_enable_resources() from PCI core, enabling PIIX4
fails:

  ata_piix 0000:00:0a.1: BAR 0 [io  0x01f0-0x01f7]: not claimed; can't enable device
  ata_piix 0000:00:0a.1: probe with driver ata_piix failed with error -22

MIPS PCI code already has support for enforcing lower bounds using
PCIBIOS_MIN_IO in pcibios_align_resource() without altering the IO window
start address itself. Make malta PCI code too to use PCIBIOS_MIN_IO.

Fixes: ae81aad5c2 ("MIPS: PCI: Use pci_enable_resources()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/linux-pci/9085ab12-1559-4462-9b18-f03dcb9a4088@roeck-us.net/
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/linux-pci/alpine.DEB.2.21.2510132229120.39634@angie.orcam.me.uk/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://patch.msgid.link/20251017110903.1973-1-ilpo.jarvinen@linux.intel.com
2025-10-22 11:06:31 -05:00
Maciej W. Rozycki 1d5d166361 MIPS: Malta: Fix PCI southbridge legacy resource reservations
Covering the PCI southbridge legacy port I/O range with a northbridge
resource reservation prevents MIPS Malta platform code from claiming its
standard legacy resources.  This is because request_resource() calls
cause a clash with the previous reservation and consequently fail.

Change to using insert_resource() so as to prevent the clash, switching
the legacy reservations from:

  00000000-00ffffff : MSC PCI I/O
    00000020-00000021 : pic1
    00000070-00000077 : rtc0
    000000a0-000000a1 : pic2
    [...]

to:

  00000000-00ffffff : MSC PCI I/O
    00000000-0000001f : dma1
    00000020-00000021 : pic1
    00000040-0000005f : timer
    00000060-0000006f : keyboard
    00000070-00000077 : rtc0
    00000080-0000008f : dma page reg
    000000a0-000000a1 : pic2
    000000c0-000000df : dma2
    [...]

Fixes: ae81aad5c2 ("MIPS: PCI: Use pci_enable_resources()")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: stable@vger.kernel.org # v6.18+
Link: https://patch.msgid.link/alpine.DEB.2.21.2510212001250.8377@angie.orcam.me.uk
2025-10-22 11:06:24 -05:00