Commit Graph

4497 Commits

Author SHA1 Message Date
Linus Torvalds b9c8fc2cae Merge tag 'net-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from IPsec, Bluetooth and netfilter

  Current release - regressions:

   - wifi: fix dev_alloc_name() return value check

   - rds: fix recursive lock in rds_tcp_conn_slots_available

  Current release - new code bugs:

   - vsock: lock down child_ns_mode as write-once

  Previous releases - regressions:

   - core:
      - do not pass flow_id to set_rps_cpu()
      - consume xmit errors of GSO frames

   - netconsole: avoid OOB reads, msg is not nul-terminated

   - netfilter: h323: fix OOB read in decode_choice()

   - tcp: re-enable acceptance of FIN packets when RWIN is 0

   - udplite: fix null-ptr-deref in __udp_enqueue_schedule_skb().

   - wifi: brcmfmac: fix potential kernel oops when probe fails

   - phy: register phy led_triggers during probe to avoid AB-BA deadlock

   - eth:
      - bnxt_en: fix deleting of Ntuple filters
      - wan: farsync: fix use-after-free bugs caused by unfinished tasklets
      - xscale: check for PTP support properly

  Previous releases - always broken:

   - tcp: fix potential race in tcp_v6_syn_recv_sock()

   - kcm: fix zero-frag skb in frag_list on partial sendmsg error

   - xfrm:
      - fix race condition in espintcp_close()
      - always flush state and policy upon NETDEV_UNREGISTER event

   - bluetooth:
      - purge error queues in socket destructors
      - fix response to L2CAP_ECRED_CONN_REQ

   - eth:
      - mlx5:
         - fix circular locking dependency in dump
         - fix "scheduling while atomic" in IPsec MAC address query
      - gve: fix incorrect buffer cleanup for QPL
      - team: avoid NETDEV_CHANGEMTU event when unregistering slave
      - usb: validate USB endpoints"

* tag 'net-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
  netfilter: nf_conntrack_h323: fix OOB read in decode_choice()
  dpaa2-switch: validate num_ifs to prevent out-of-bounds write
  net: consume xmit errors of GSO frames
  vsock: document write-once behavior of the child_ns_mode sysctl
  vsock: lock down child_ns_mode as write-once
  selftests/vsock: change tests to respect write-once child ns mode
  net/mlx5e: Fix "scheduling while atomic" in IPsec MAC address query
  net/mlx5: Fix missing devlink lock in SRIOV enable error path
  net/mlx5: E-switch, Clear legacy flag when moving to switchdev
  net/mlx5: LAG, disable MPESW in lag_disable_change()
  net/mlx5: DR, Fix circular locking dependency in dump
  selftests: team: Add a reference count leak test
  team: avoid NETDEV_CHANGEMTU event when unregistering slave
  net: mana: Fix double destroy_workqueue on service rescan PCI path
  MAINTAINERS: Update maintainer entry for QUALCOMM ETHQOS ETHERNET DRIVER
  dpll: zl3073x: Remove redundant cleanup in devm_dpll_init()
  selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0
  tcp: re-enable acceptance of FIN packets when RWIN is 0
  vsock: Use container_of() to get net namespace in sysctl handlers
  net: usb: kaweth: validate USB endpoints
  ...
2026-02-26 08:00:13 -08:00
Kees Cook 189f164e57 Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch && !(file in "tools") && !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 08:26:33 -08:00
Linus Torvalds 32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds 323bbfcf1e Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook 69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Pavan Chebbi c1bbd9900d bnxt_en: Fix deleting of Ntuple filters
Ntuple filters can be deleted when the interface
is down. The current code blindly sends the filter
delete command to FW. When the interface is down, all
the VNICs are deleted in the FW. When the VNIC is
freed in the FW, all the associated filters are also
freed. We need not send the free command explicitly.
Sending such command will generate FW error in the
dmesg.

In order to fix this, we can safely return from
bnxt_hwrm_cfa_ntuple_filter_free() when BNXT_STATE_OPEN
is not true which confirms the VNICs have been deleted.

Fixes: 8336a974f3 ("bnxt_en: Save user configured filters in a lookup list")
Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260219185313.2682148-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-20 16:34:47 -08:00
Pavan Chebbi e123d9302d bnxt_en: Fix RSS context delete logic
We need to free the corresponding RSS context VNIC
in FW everytime an RSS context is deleted in driver.
Commit 667ac333db added a check to delete the VNIC
in FW only when netif_running() is true to help delete
RSS contexts with interface down.

Having that condition will make the driver leak VNICs
in FW whenever close() happens with active RSS contexts.
On the subsequent open(), as part of RSS context restoration,
we will end up trying to create extra VNICs for which we
did not make any reservation. FW can fail this request,
thereby making us lose active RSS contexts.

Suppose an RSS context is deleted already and we try to
process a delete request again, then the HWRM functions
will check for validity of the request and they simply
return if the resource is already freed. So, even for
delete-when-down cases, netif_running() check is not
necessary.

Remove the netif_running() condition check when deleting
an RSS context.

Reported-by: Jakub Kicinski <kicinski@meta.com>
Fixes: 667ac333db ("eth: bnxt: allow deleting RSS contexts when the device is down")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260219185313.2682148-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-20 16:34:47 -08:00
Vikas Gupta 604530085b bnge: fix reserving resources from FW
HWRM_FUNC_CFG is used to reserve resources, whereas HWRM_FUNC_QCFG is
intended for querying resource information from the firmware.
Since __bnge_hwrm_reserve_pf_rings() reserves resources for a specific
PF, the command type should be HWRM_FUNC_CFG.

Fixes: 627c67f038 ("bng_en: Add resource management support")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260218052755.4097468-1-vikas.gupta@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-19 09:29:59 -08:00
Chen Ni fe6515f595 bng_en: Remove duplicate include
Remove duplicate inclusion of <net/netdev_queues.h>.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Link: https://patch.msgid.link/20260211032021.2719742-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-12 18:40:57 -08:00
Vikas Gupta 42d1c54d62 bnge/bng_re: Add a new HSI
The HSI is shared between the firmware and the driver and is
automatically generated.
Add a new HSI for the BNGE driver. The current HSI refers to BNXT,
which will become incompatible with ThorUltra devices as the
BNGE driver adds more features. The BNGE driver will not use the HSI
located in the bnxt folder.
Also, add an HSI for ThorUltra RoCE driver.

Changes in v3:
- Fix in bng_roce_hsi.h reported by Jakub (AI review)
  https://lore.kernel.org/netdev/20260207051422.4181717-1-kuba@kernel.org/
- Add an entry in MAINTAINERS

Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Link: https://patch.msgid.link/20260208172925.1861255-1-vikas.gupta@broadcom.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-11 13:44:47 +01:00
Michael Chan b9355ad52b bnxt_en: Check RSS contexts in bnxt_need_reserve_rings()
bnxt_need_reserve_rings() checks all resources except HW RSS contexts
to determine if a new reservation is required.  For completeness, add
the check for HW RSS contexts.  This makes the code more complete after
the recent commit to increase the number of RSS contexts for a larger
RSS indirection table:

Fixes: 51b9d3f948 ("bnxt_en: Use a larger RSS indirection table on P5_PLUS chips")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260207235118.1987301-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-10 20:17:42 -08:00
Michael Chan 5a2f3aa289 bnxt_en: Refactor bnxt_need_reserve_rings()
bnxt_need_reserve_rings() checks 6 ring resources against the reserved
values to determine if a new reservation is needed.  Factor out the code
to collect the total resources into a new helper function
bnxt_get_total_resources() to make the code cleaner and easier to read.
Instead of individual scalar variables, use the struct bnxt_hw_rings to
hold all the ring resources.  Using the struct, hwr.cp replaces the nq
variable and the chip specific hwr.cp_p5 replaces cp on newer chips.

There is no change in behavior.  This will make it easier to check the
RSS context resource in the next patch.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260207235118.1987301-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-10 20:17:36 -08:00
Jakub Kicinski a7fc1a488f eth: bnxt: gather and report HW-GRO stats
Count and report HW-GRO stats as seen by the kernel.
The device stats for GRO seem to not reflect the reality,
perhaps they count sessions which did not actually result
in any aggregation. Also they count wire packets, so we
have to count super-frames, anyway.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260207003509.3927744-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-09 21:08:32 -08:00
Alice Mikityanska 28df1c6927 bng_en: Remove jumbo_remove step from TX path
Now that the kernel doesn't insert HBH for BIG TCP IPv6 packets, remove
unnecessary steps from the bng_en TX path, that used to check and
remove HBH.

Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Link: https://patch.msgid.link/20260205133925.526371-12-alice.kernel@fastmail.im
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-06 20:50:13 -08:00
Alice Mikityanska 3f1bff1d7f bnxt_en: Remove jumbo_remove step from TX path
Now that the kernel doesn't insert HBH for BIG TCP IPv6 packets, remove
unnecessary steps from the bnxt_en TX path, that used to check and
remove HBH.

Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260205133925.526371-9-alice.kernel@fastmail.im
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-06 20:50:12 -08:00
Alok Tiwari 1384a81e54 bng_en: fix misleading error message for generic firmware version
The devlink info_get handler incorrectly reports "roce firmware" when
populating the generic firmware version field.

Update the error message to correctly describe the failing operation.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Link: https://patch.msgid.link/20260202033848.22993-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-03 19:11:29 -08:00
Joe Damato 61cef6454c bnxt_en: Allow ntuple filters for drops
It appears that in commit 7efd79c0e6 ("bnxt_en: Add drop action
support for ntuple"), bnxt gained support for ntuple filters for packet
drops.

However, support for this does not seem to work in recent kernels or
against net-next:

  % sudo ethtool -U eth0 flow-type udp4 src-ip 1.1.1.1 action -1
    rmgr: Cannot insert RX class rule: Operation not supported
    Cannot insert classification rule

The issue is that the existing code uses ethtool_get_flow_spec_ring_vf,
which will return a non-zero value if the ring_cookie is set to
RX_CLS_FLOW_DISC, which then causes bnxt_add_ntuple_cls_rule to return
-EOPNOTSUPP because it thinks the user is trying to set an ntuple filter
for a vf.

Fix this by first checking that the ring_cookie is not RX_CLS_FLOW_DISC.

After this patch, ntuple filters for drops can be added:

  % sudo ethtool -U eth0 flow-type udp4 src-ip 1.1.1.1 action -1
  Added rule with ID 0

  % ethtool -n eth0
  44 RX rings available
  Total 1 rules

  Filter: 0
      Rule Type: UDP over IPv4
      Src IP addr: 1.1.1.1 mask: 0.0.0.0
      Dest IP addr: 0.0.0.0 mask: 255.255.255.255
      TOS: 0x0 mask: 0xff
      Src port: 0 mask: 0xffff
      Dest port: 0 mask: 0xffff
      Action: Drop

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260131003042.2570434-1-joe@dama.to
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-02 17:07:48 -08:00
Bhargava Marreddy c2effd12c9 bng_en: Add support for TPA events
Enable TPA functionality in the VNIC and add functions
to handle TPA events, which help in processing LRO/GRO.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-9-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:57 -08:00
Bhargava Marreddy 58165c9932 bng_en: Add TPA related functions
Add the functions to handle TPA events in RX path.
This helps the next patch enable TPA functionality.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-8-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy c858ac8735 bng_en: Add support to handle AGG events
Add AGG event handling in the RX path to receive packet data
on AGG rings. This enables Jumbo and HDS functionality.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-7-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy 5deaeae186 bng_en: Add ndo_features_check support
Implement ndo_features_check to validate hardware constraints per-packet:
- Disable SG if nr_frags exceeds hardware limit.
- Disable GSO if packet/fragment length exceeds supported maximum.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Rahul Gupta <rahul-rg.gupta@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-6-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy bd5ad9c052 bng_en: Add TX support
Add functions to support xmit along with TSO/GSO.
Also, add functions to handle TX completion events in the NAPI context.
This commit introduces the fundamental transmit data path

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Rahul Gupta <rahul-rg.gupta@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-5-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy 23cfc4e8e1 bng_en: Handle an HWRM completion request
Since the HWRM completion for a sent request lands on the NQ,
add functions to handle the HWRM completion event.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-4-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy 4d6a60057c bng_en: Add RX support
Add support to receive packet using NAPI, build and deliver the skb
to stack. With help of meta data available in completions, fill the
appropriate information in skb.

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Reviewed-by: Ajit Kumar Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Rahul Gupta <rahul-rg.gupta@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-3-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:56 -08:00
Bhargava Marreddy 705c7933d9 bng_en: Extend bnge_set_ring_params() for rx-copybreak
Add rx-copybreak support in bnge_set_ring_params()

Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20260128185623.26559-2-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 19:49:55 -08:00
Jakub Kicinski 3cf48c0496 eth: bnxt: make sure we populate the qcfg defaults on old FW/HW
The driver now depends on the core to tell it what the rx page size
should be for the agg ring. We must populate the ndo_default_qcfg
callback even if we don't support any queue ops.

This fixes:
  Oops: divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
  RIP: 0010:bnxt_alloc_rx_page_pool (drivers/net/ethernet/broadcom/bnxt/bnxt.c:3852)

with fw version 225.1.109.0.

Link: https://lore.kernel.org/20250421222827.283737-20-kuba@kernel.org
Fixes: f96e1b3577 ("eth: bnxt: support qcfg provided rx page size")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260128193258.125274-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 18:51:11 -08:00
Jakub Kicinski a010fe8d86 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc8).

No adjacent changes, conflicts:

drivers/net/ethernet/spacemit/k1_emac.c
  2c84959167 ("net: spacemit: Check for netif_carrier_ok() in emac_stats_update()")
  f66086798f ("net: spacemit: Remove broken flow control support")
https://lore.kernel.org/aXjAqZA3iEWD_DGM@sirena.org.uk

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-29 17:28:54 -08:00
Justin Chen 1fd1281250 net: bcmasp: streamline early exit in probe
Streamline the bcmasp_probe early exit. As support for other
functionality is added(i.e. ptp), it is easier to keep track of early
exit cleanup when it is all in one place.

Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260122194949.1145107-3-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26 18:58:22 -08:00
Justin Chen 9271fc3190 net: bcmasp: clean up some legacy logic
Removed wol_irq check. This was needed for brcm,asp-v2.0, which was
removed in previous commits.

Removed bcmasp_intf_ops. These function pointers were added to make
it easier to implement pseudo channels. These channels were removed
in newer versions of the hardware and were never implemented.

Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260122194949.1145107-2-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-26 18:58:22 -08:00
Jakub Kicinski b33006ebb7 eth: bnxt: plug bnxt_validate_qcfg() into qops
Plug bnxt_validate_qcfg() back into qops, where it was in my old RFC.

Link: https://patch.msgid.link/20260122005113.2476634-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 11:49:02 -08:00
Jakub Kicinski b9ac2c60a3 net: introduce a trivial netdev_queue_config()
We may choose to extend or reimplement the logic which renders
the per-queue config. The drivers should not poke directly into
the queue state. Add a helper for drivers to use when they want
to query the config for a specific queue.

Link: https://patch.msgid.link/20260122005113.2476634-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 11:49:01 -08:00
Jakub Kicinski 1410c7416d eth: bnxt: always set the queue mgmt ops
Core provides a centralized callback for validating per-queue settings
but the callback is part of the queue management ops. Having the ops
conditionally set complicates the parts of the driver which could
otherwise lean on the core to feed it the correct settings.

Always set the queue ops, but provide no restart-related callbacks if
queue ops are not supported by the device. This should maintain current
behavior, the check in netdev_rx_queue_restart() looks both at op struct
and individual ops.

Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/20260122005113.2476634-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 11:49:01 -08:00
Justin Chen 6de4436bf3 net: bcmasp: fix early exit leak with fixed phy
We are not deregistering the fixed phy link when hitting the early
exit condition. Add the correct early exit sequence.

Fixes: 490cb41200 ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260122194001.1098859-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 10:50:49 -08:00
Jakub Kicinski 9abf22075d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc7).

Conflicts:

drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
  b35a6fd37a ("hinic3: Add adaptive IRQ coalescing with DIM")
  fb2bb2a1eb ("hinic3: Fix netif_queue_set_napi queue_index input parameter error")
https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com

drivers/net/wireless/ath/ath12k/mac.c
drivers/net/wireless/ath/ath12k/wifi7/hw.c
  3170757210 ("wifi: ath12k: Fix wrong P2P device link id issue")
  c26f294fef ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module")
https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au

Adjacent changes:

drivers/net/wireless/ath/ath12k/mac.c
  8b8d6ee53d ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel")
  914c890d3b ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-22 20:14:36 -08:00
Justin Chen bbb11b8d75 net: bcmasp: Fix network filter wake for asp-3.0
We need to apply the tx_chan_offset to the netfilter cfg channel or the
output channel will be incorrect for asp-3.0 and newer.

Fixes: e9f31435ee ("net: bcmasp: Add support for asp-v3.0")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260120192339.2031648-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:29:52 -08:00
Jakub Kicinski 677a51790b Merge tag 'net-queue-rx-buf-len-v9' of https://github.com/isilence/linux
Pavel Begunkov says:

====================
Add support for providers with large rx buffer

Many modern NICs support configurable receive buffer lengths, and zcrx and
memory providers can use buffers larger than 4K to improve performance.
When paired with hw-gro larger rx buffer sizes can drastically reduce
the number of buffers traversing the stack and save a lot of processing
time. It also allows to give to users larger contiguous chunks of data.

Single stream benchmarks showed up to ~30% CPU util improvement.
E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC:

packets=23987040 (MB=2745098), rps=199559 (MB/s=22837)
CPU    %usr   %nice    %sys %iowait    %irq   %soft   %idle
  0    1.53    0.00   27.78    2.72    1.31   66.45    0.22
packets=24078368 (MB=2755550), rps=200319 (MB/s=22924)
CPU    %usr   %nice    %sys %iowait    %irq   %soft   %idle
  0    0.69    0.00    8.26   31.65    1.83   57.00    0.57

This series adds net infrastructure for memory providers configuring
the size and implements it for bnxt. It's an opt-in feature for drivers,
they should advertise support for the parameter in the qops and must check
if the hardware supports the given size. It's limited to memory providers
as it drastically simplifies implementation. It doesn't affect the fast
path zcrx uAPI, and the user exposed parameter is defined in zcrx terms,
which allows it to be flexible and adjusted in the future.

A liburing example can be found at [2]

full branch:
[1] https://github.com/isilence/linux.git zcrx/large-buffers-v8
Liburing example:
[2] https://github.com/isilence/liburing.git zcrx/rx-buf-len

* tag 'net-queue-rx-buf-len-v9' of https://github.com/isilence/linux:
  io_uring/zcrx: document area chunking parameter
  selftests: iou-zcrx: test large chunk sizes
  eth: bnxt: support qcfg provided rx page size
  eth: bnxt: adjust the fill level of agg queues with larger buffers
  eth: bnxt: store rx buffer size per queue
  net: pass queue rx page size from memory provider
  net: add bare bone queue configs
  net: reduce indent of struct netdev_queue_mgmt_ops members
  net: memzero mp params when closing a queue
====================

Link: https://patch.msgid.link/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:10:04 -08:00
Pavan Chebbi dc634118aa bnxt_en: Fix build break on non-x86 platforms
Commit c470195b98 added .getcrosststamp() interface where
the code uses boot_cpu_has() function which is available only
in x86 platforms. This fails the build on any other platform.

Since the interface is going to be supported only on x86 anyway,
we can simply compile out the entire support on non-x86 platforms.

Cover the .getcrosststamp support under CONFIG_X86

Fixes: c470195b98 ("bnxt_en: Add PTP .getcrosststamp() interface to get device/host times")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601111808.WnBJCuWI-lkp@intel.com
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20260113183422.508851-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:56:13 -08:00
Pavel Begunkov f96e1b3577 eth: bnxt: support qcfg provided rx page size
Implement support for qcfg provided rx page sizes. For that, implement
the ndo_default_qcfg callback and validate the config on restart. Also,
use the current config's value in bnxt_init_ring_struct to retain the
correct size across resets.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2026-01-14 02:13:37 +00:00
Jakub Kicinski c55bf90a21 eth: bnxt: adjust the fill level of agg queues with larger buffers
The driver tries to provision more agg buffers than header buffers
since multiple agg segments can reuse the same header. The calculation
/ heuristic tries to provide enough pages for 65k of data for each header
(or 4 frags per header if the result is too big). This calculation is
currently global to the adapter. If we increase the buffer sizes 8x
we don't want 8x the amount of memory sitting on the rings.
Luckily we don't have to fill the rings completely, adjust
the fill level dynamically in case particular queue has buffers
larger than the global size.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[pavel: rebase on top of agg_size_fac]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2026-01-14 02:13:37 +00:00
Pavel Begunkov f57efb32aa eth: bnxt: store rx buffer size per queue
Instead of using a constant buffer length, allow configuring the size
for each queue separately. There is no way to change the length yet, and
it'll be passed from memory providers in a later patch.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2026-01-14 02:13:37 +00:00
Pavel Begunkov efcb9a4d32 net: add bare bone queue configs
We'll need to pass extra parameters when allocating a queue for memory
providers. Define a new structure for queue configurations, and pass it
to qapi callbacks. It's empty for now, actual parameters will be added
in following patches.

Configurations should persist across resets, and for that they're
default-initialised on device registration and stored in struct
netdev_rx_queue. We also add a new qapi callback for defaulting a given
config. It must be implemented if a driver wants to use queue configs
and is optional otherwise.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2026-01-14 02:13:36 +00:00
Michael Chan bc87b14594 bnxt_en: Implement ethtool_ops -> get_link_ext_state()
Map the link_down_reason from the FW to the ethtool link_ext_state
when it is available.  Also log it to the link down dmesg when it is
available.  Add 2 new link_ext_state enums to the UAPI:

ETHTOOL_LINK_EXT_STATE_OTP_SPEED_VIOLATION
ETHTOOL_LINK_EXT_STATE_BMC_REQUEST_DOWN

to cover OTP (one-time-programmable) speed restrictions and
BMC (Baseboard management controller) forcing the link down.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:51 -08:00
Michael Chan 51b9d3f948 bnxt_en: Use a larger RSS indirection table on P5_PLUS chips
The driver currently uses a chip supported RSS indirection table size
just big enough to cover the number of RX rings.  Each table with 64
entries requires one HW RSS context.  The HW supported table sizes are
64, 128, 256, and 512 entries.  Using the smallest table size can cause
unbalanced RSS packet distributions.  For example, if the number of
rings is 48, the table size using existing logic will be 64.  32 rings
will have a weight of 1 and 16 rings will have a weight of 2 when
set to default even distribution.  This represents a 100% difference in
weights between some of the rings.

Newer FW has increased the RSS indirection table resource.  When the
increased resource is detected, use the largest RSS indirection table
size (512 entries) supported by the chip.  Using the same example
above, the weights of the 48 rings will be either 10 or 11 when set to
default even distribution.  The weight difference is only 10%.

If there are thousands of VFs, there is a possiblity that we may not
be able to allocate this larger RSS indirection table from the FW, so
we add a check to fall back to the legacy scheme.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:51 -08:00
Pavan Chebbi 743e683596 bnxt_en: Defrag the NVRAM region when resizing UPDATE region fails
When updating to a new firmware pkg, the driver checks if the UPDATE
region is big enough for the pkg and if it's not big enough, it
issues an NVM_WRITE cmd to update with the requested size.

This NVM_WRITE cmd can fail indicating fragmented region. Currently
the driver fails the fw update when this happens. We can improve the
situation by defragmenting the region and try the NVM_WRITE cmd
again. This will make firmware update more reliable.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:50 -08:00
Michael Chan e1c9c8928b bnxt_en: Add support for FEC bin histograms
Fill in the struct ethtool_fec_hist passed to the bnxt_get_fec_stats()
callback if the FW supports the feature.  Bins 0 to 15 inclusive are
available when the feature is supported.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:50 -08:00
Pavan Chebbi c470195b98 bnxt_en: Add PTP .getcrosststamp() interface to get device/host times
.getcrosststamp() helps the applications to obtain a snapshot of
device and host time almost taken at the same time. This function
will report PCIe PTM device and host times to any application using
the ioctl PTP_SYS_OFFSET_PRECISE. The device time from the HW is
48-bit and needs to be converted to 64-bit.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 15:19:50 -08:00
Breno Leitao 3358995b1a bnxt_en: Fix NULL pointer crash in bnxt_ptp_enable during error cleanup
When bnxt_init_one() fails during initialization (e.g.,
bnxt_init_int_mode returns -ENODEV), the error path calls
bnxt_free_hwrm_resources() which destroys the DMA pool and sets
bp->hwrm_dma_pool to NULL. Subsequently, bnxt_ptp_clear() is called,
which invokes ptp_clock_unregister().

Since commit a60fc3294a ("ptp: rework ptp_clock_unregister() to
disable events"), ptp_clock_unregister() now calls
ptp_disable_all_events(), which in turn invokes the driver's .enable()
callback (bnxt_ptp_enable()) to disable PTP events before completing the
unregistration.

bnxt_ptp_enable() attempts to send HWRM commands via bnxt_ptp_cfg_pin()
and bnxt_ptp_cfg_event(), both of which call hwrm_req_init(). This
function tries to allocate from bp->hwrm_dma_pool, causing a NULL
pointer dereference:

  bnxt_en 0000:01:00.0 (unnamed net_device) (uninitialized): bnxt_init_int_mode err: ffffffed
  KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
  Call Trace:
   __hwrm_req_init (drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c:72)
   bnxt_ptp_enable (drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:323 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:517)
   ptp_disable_all_events (drivers/ptp/ptp_chardev.c:66)
   ptp_clock_unregister (drivers/ptp/ptp_clock.c:518)
   bnxt_ptp_clear (drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:1134)
   bnxt_init_one (drivers/net/ethernet/broadcom/bnxt/bnxt.c:16889)

Lines are against commit f8f9c1f4d0 ("Linux 6.19-rc3")

Fix this by clearing and unregistering ptp (bnxt_ptp_clear()) before
freeing HWRM resources.

Suggested-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: a60fc3294a ("ptp: rework ptp_clock_unregister() to disable events")
Cc: stable@vger.kernel.org
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20260106-bnxt-v3-1-71f37e11446a@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:44:41 -08:00
Srijit Bose ffeafa65b2 bnxt_en: Fix potential data corruption with HW GRO/LRO
Fix the max number of bits passed to find_first_zero_bit() in
bnxt_alloc_agg_idx().  We were incorrectly passing the number of
long words.  find_first_zero_bit() may fail to find a zero bit and
cause a wrong ID to be used.  If the wrong ID is already in use, this
can cause data corruption.  Sometimes an error like this can also be
seen:

bnxt_en 0000:83:00.0 enp131s0np0: TPA end agg_buf 2 != expected agg_bufs 1

Fix it by passing the correct number of bits MAX_TPA_P5.  Use
DECLARE_BITMAP() to more cleanly define the bitmap.  Add a sanity
check to warn if a bit cannot be found and reset the ring [MChan].

Fixes: ec4d8e7cf0 ("bnxt_en: Add TPA ID mapping logic for 57500 chips.")
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Srijit Bose <srijit.bose@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251231083625.3911652-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-04 10:49:26 -08:00
Markus Blöchl d7065436e8 net: bnge: add AUXILIARY_BUS to Kconfig dependencies
The build can currently fail with

    ld: drivers/net/ethernet/broadcom/bnge/bnge_auxr.o: in function `bnge_rdma_aux_device_add':
    bnge_auxr.c:(.text+0x366): undefined reference to `__auxiliary_device_add'
    ld: drivers/net/ethernet/broadcom/bnge/bnge_auxr.o: in function `bnge_rdma_aux_device_init':
    bnge_auxr.c:(.text+0x43c): undefined reference to `auxiliary_device_init'

if BNGE is enabled but no other driver pulls in AUXILIARY_BUS.

Select AUXILIARY_BUS in BNGE like in all other drivers which create
an auxiliary_device.

Fixes: 8ac050ec3b ("bng_en: Add RoCE aux device support")
Signed-off-by: Markus Blöchl <markus@blochl.de>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Link: https://patch.msgid.link/20251228-bnge_aux_bus-v1-1-82e273ebfdac@blochl.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-04 09:44:23 -08:00