[ Upstream commit 4c2c8f03a5 ]
Moshe Kol, Amit Klein, and Yossi Gilad reported being able to accurately
identify a client by forcing it to emit only 40 times more connections
than there are entries in the table_perturb[] table. The previous two
improvements consisting in resalting the secret every 10s and adding
randomness to each port selection only slightly improved the situation,
and the current value of 2^8 was too small as it's not very difficult
to make a client emit 10k connections in less than 10 seconds.
Thus we're increasing the perturb table from 2^8 to 2^16 so that the
same precision now requires 2.6M connections, which is more difficult in
this time frame and harder to hide as a background activity. The impact
is that the table now uses 256 kB instead of 1 kB, which could mostly
affect devices making frequent outgoing connections. However such
components usually target a small set of destinations (load balancers,
database clients, perf assessment tools), and in practice only a few
entries will be visited, like before.
A live test at 1 million connections per second showed no performance
difference from the previous value.
Reported-by: Moshe Kol <moshe.kol@mail.huji.ac.il>
Reported-by: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e926147618 ]
We'll need to further increase the size of this table and it's likely
that at some point its size will not be suitable anymore for a static
table. Let's allocate it on boot from inet_hashinfo2_init(), which is
called from tcp_init().
Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
Cc: Amit Klein <aksecurity@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca7af04025 ]
Here we're randomly adding between 0 and 7 random increments to the
selected source port in order to add some noise in the source port
selection that will make the next port less predictable.
With the default port range of 32768-60999 this means a worst case
reuse scenario of 14116/8=1764 connections between two consecutive
uses of the same port, with an average of 14116/4.5=3137. This code
was stressed at more than 800000 connections per second to a fixed
target with all connections closed by the client using RSTs (worst
condition) and only 2 connections failed among 13 billion, despite
the hash being reseeded every 10 seconds, indicating a perfectly
safe situation.
Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
Cc: Amit Klein <aksecurity@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2d057560b ]
SipHash replaced MD5 in secure_ipv{4,6}_port_ephemeral() via commit
7cd23e5300 ("secure_seq: use SipHash in place of MD5"), but the output
remained truncated to 32-bit only. In order to exploit more bits from the
hash, let's make the functions return the full 64-bit of siphash_3u32().
We also make sure the port offset calculation in __inet_hash_connect()
remains done on 32-bit to avoid the need for div_u64_rem() and an extra
cost on 32-bit systems.
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
Cc: Amit Klein <aksecurity@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2069624dac ]
As noted elsewhere, various GPON SFP modules exhibit non-standard
TX-fault behaviour. In the tested case, the Huawei MA5671A, when used
in combination with a Marvell mv88e6085 switch, was found to
persistently assert TX-fault, resulting in the module being disabled.
This patch adds a quirk to ignore the SFP_F_TX_FAULT state, allowing the
module to function.
Change from v1: removal of erroneous return statment (Andrew Lunn)
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220502223315.1973376-1-mnhagan88@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b800528b97 ]
In xemaclite_open() function we are setting the max speed of
emaclite to 100Mb using phy_set_max_speed() function so,
there is no need to write the advertising registers to stop
giga-bit speed and the phy_start() function starts the
auto-negotiation so, there is no need to handle it separately
using advertising registers. Remove the phy_read and phy_write
of advertising registers in xemaclite_open() function.
Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2fbe467bcb ]
The max98090 driver has a custom put function for some controls which can
only be updated in certain circumstances which makes no effort to validate
that input is suitable for the control, allowing out of spec values to be
written to the hardware and presented to userspace. Fix this by returning
an error when invalid values are written.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220420193454.2647908-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a25f2ea0e ]
Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
entries to not be invalidated correctly. The problem is that the walk
cache index generated for IOVA is not same across translation and
invalidation requests. This is leading to page faults when PMD entry is
released during unmap and populated with new PTE table during subsequent
map request. Disabling large page mappings avoids the release of PMD
entry and avoid translations seeing stale PMD entry in walk cache.
Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
Tegra234 devices. This is recommended fix from Tegra hardware design
team.
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Co-developed-by: Pritesh Raithatha <praithatha@nvidia.com>
Signed-off-by: Pritesh Raithatha <praithatha@nvidia.com>
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Link: https://lore.kernel.org/r/20220421081504.24678-1-amhetre@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 679ab61bf5 ]
There is a deadlock in irdma_cleanup_cm_core(), which is shown below:
(Thread 1) | (Thread 2)
| irdma_schedule_cm_timer()
irdma_cleanup_cm_core() | add_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | irdma_cm_timer_tick()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold cm_core->ht_lock in position (1) of thread 1 and use
del_timer_sync() to wait timer to stop, but timer handler also need
cm_core->ht_lock in position (2) of thread 2. As a result,
irdma_cleanup_cm_core() will block forever.
This patch removes the check of timer_pending() in
irdma_cleanup_cm_core(), because the del_timer_sync() function will just
return directly if there isn't a pending timer. As a result, the lock is
redundant, because there is no resource it could protect.
Link: https://lore.kernel.org/r/20220418153322.42524-1-duoming@zju.edu.cn
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d031a8866e ]
When a write cannot be carried out in full, gfs2_iomap_end() releases
blocks that have been allocated for this write but haven't been used.
To compute the end of the allocation, gfs2_iomap_end() incorrectly
rounded the end of the attempted write down to the next block boundary
to arrive at the end of the allocation. It would have to round up, but
the end of the allocation is also available as iomap->offset +
iomap->length, so just use that instead.
In addition, use round_up() for computing the start of the unused range.
Fixes: 64bc06bb32 ("gfs2: iomap buffered write support")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d6595b4cd ]
Port of the vmwgfx to SVGAv3 lacked support for fencing. SVGAv3 removed
FIFO's and replaced them with command buffers and extra registers.
The initial version of SVGAv3 lacked support for most advanced features
(e.g. 3D) which made fences unnecessary. That is no longer the case,
especially as 3D support is being turned on.
Switch from FIFO commands and capabilities to command buffers and extra
registers to enable fences on SVGAv3.
Fixes: 2cd80dbd35 ("drm/vmwgfx: Add basic support for SVGA3")
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302152426.885214-5-zack@kde.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3740651bf7 ]
The commit cited below claims to fix a use-after-free condition after
tls_device_down. Apparently, the description wasn't fully accurate. The
context stayed alive, but ctx->netdev became NULL, and the offload was
torn down without a proper fallback, so a bug was present, but a
different kind of bug.
Due to misunderstanding of the issue, the original patch dropped the
refcount_dec_and_test line for the context to avoid the alleged
premature deallocation. That line has to be restored, because it matches
the refcount_inc_not_zero from the same function, otherwise the contexts
that survived tls_device_down are leaked.
This patch fixes the described issue by restoring refcount_dec_and_test.
After this change, there is no leak anymore, and the fallback to
software kTLS still works.
Fixes: c55dcdd435 ("net/tls: Fix use-after-free after the TLS device goes down and up")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220512091830.678684-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1fa89ffbc0 ]
In the NIC ->probe() callback, ->mtd_probe() callback is called.
If NIC has 2 ports, ->probe() is called twice and ->mtd_probe() too.
In the ->mtd_probe(), which is efx_ef10_mtd_probe() it allocates and
initializes mtd partiion.
But mtd partition for sfc is shared data.
So that allocated mtd partition data from last called
efx_ef10_mtd_probe() will not be used.
Therefore it must be freed.
But it doesn't free a not used mtd partition data in efx_ef10_mtd_probe().
kmemleak reports:
unreferenced object 0xffff88811ddb0000 (size 63168):
comm "systemd-udevd", pid 265, jiffies 4294681048 (age 348.586s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffffa3767749>] kmalloc_order_trace+0x19/0x120
[<ffffffffa3873f0e>] __kmalloc+0x20e/0x250
[<ffffffffc041389f>] efx_ef10_mtd_probe+0x11f/0x270 [sfc]
[<ffffffffc0484c8a>] efx_pci_probe.cold.17+0x3df/0x53d [sfc]
[<ffffffffa414192c>] local_pci_probe+0xdc/0x170
[<ffffffffa4145df5>] pci_device_probe+0x235/0x680
[<ffffffffa443dd52>] really_probe+0x1c2/0x8f0
[<ffffffffa443e72b>] __driver_probe_device+0x2ab/0x460
[<ffffffffa443e92a>] driver_probe_device+0x4a/0x120
[<ffffffffa443f2ae>] __driver_attach+0x16e/0x320
[<ffffffffa4437a90>] bus_for_each_dev+0x110/0x190
[<ffffffffa443b75e>] bus_add_driver+0x39e/0x560
[<ffffffffa4440b1e>] driver_register+0x18e/0x310
[<ffffffffc02e2055>] 0xffffffffc02e2055
[<ffffffffa3001af3>] do_one_initcall+0xc3/0x450
[<ffffffffa33ca574>] do_init_module+0x1b4/0x700
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Fixes: 8127d661e7 ("sfc: Add support for Solarflare SFC9100 family")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220512054709.12513-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7be130c5d ]
After commit 2d1f90f9ba ("net: dsa/bcm_sf2: fix incorrect usage of
state->link") the interface suspend path would call our mac_link_down()
call back which would forcibly set the link down, thus preventing
Wake-on-LAN packets from reaching our management port.
Fix this by looking at whether the port is enabled for Wake-on-LAN and
not clearing the link status in that case to let packets go through.
Fixes: 2d1f90f9ba ("net: dsa/bcm_sf2: fix incorrect usage of state->link")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220512021731.2494261-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fed53de56 ]
drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_connector_detect’:
drivers/gpu/drm/vc4/vc4_hdmi.c:228:7: error: implicit declaration of function ‘gpiod_get_value_cansleep’; did you mean ‘gpio_get_value_cansleep’? [-Werror=implicit-function-declaration]
if (gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio))
^~~~~~~~~~~~~~~~~~~~~~~~
gpio_get_value_cansleep
CC [M] drivers/gpu/drm/vc4/vc4_validate.o
CC [M] drivers/gpu/drm/vc4/vc4_v3d.o
CC [M] drivers/gpu/drm/vc4/vc4_validate_shaders.o
CC [M] drivers/gpu/drm/vc4/vc4_debugfs.o
drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
drivers/gpu/drm/vc4/vc4_hdmi.c:2883:23: error: implicit declaration of function ‘devm_gpiod_get_optional’; did you mean ‘devm_clk_get_optional’? [-Werror=implicit-function-declaration]
vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN);
^~~~~~~~~~~~~~~~~~~~~~~
devm_clk_get_optional
drivers/gpu/drm/vc4/vc4_hdmi.c:2883:59: error: ‘GPIOD_IN’ undeclared (first use in this function); did you mean ‘GPIOF_IN’?
vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN);
^~~~~~~~
GPIOF_IN
drivers/gpu/drm/vc4/vc4_hdmi.c:2883:59: note: each undeclared identifier is reported only once for each function it appears in
cc1: all warnings being treated as errors
Fixes: 6800234cee ("drm/vc4: hdmi: Convert to gpiod")
Signed-off-by: Hui Tang <tanghui20@huawei.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220510135148.247719-1-tanghui20@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b77c06655 ]
The interrupt controller supplying the Wake-on-LAN interrupt line maybe
modular on some platforms (irq-bcm7038-l1.c) and might be probed at a
later time than the GENET driver. We need to specifically check for
-EPROBE_DEFER and propagate that error to ensure that we eventually
fetch the interrupt descriptor.
Fixes: 9deb48b53e ("bcmgenet: add WOL IRQ check")
Fixes: 5b1f0e6294 ("net: bcmgenet: Avoid touching non-existent interrupt")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Stefan Wahren <stefan.wahren@i2se.com>
Link: https://lore.kernel.org/r/20220511031752.2245566-1-f.fainelli@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b796475fd ]
Currently pedit tries to ensure that the accessed skb offset
is writable via skb_unclone(). The action potentially allows
touching any skb bytes, so it may end-up modifying shared data.
The above causes some sporadic MPTCP self-test failures, due to
this code:
tc -n $ns2 filter add dev ns2eth$i egress \
protocol ip prio 1000 \
handle 42 fw \
action pedit munge offset 148 u8 invert \
pipe csum tcp \
index 100
The above modifies a data byte outside the skb head and the skb is
a cloned one, carrying a TCP output packet.
This change addresses the issue by keeping track of a rough
over-estimate highest skb offset accessed by the action and ensuring
such offset is really writable.
Note that this may cause performance regressions in some scenarios,
but hopefully pedit is not in the critical path.
Fixes: db2c24175d ("act_pedit: access skb->data safely")
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/1fcf78e6679d0a287dd61bb0f04730ce33b3255d.1652194627.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 671bb35c8e ]
smatch complains about
drivers/s390/net/lcs.c:1741 lcs_get_control() warn: variable dereferenced before check 'card->dev' (see line 1739)
Fixes: 27eb5ac8f0 ("[PATCH] s390: lcs driver bug fixes and improvements [1/2]")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c0b20587b ]
smatch complains about
drivers/s390/net/ctcm_mpc.c:1210 ctcmpc_unpack_skb() warn: possible memory leak of 'mpcginfo'
mpc_action_discontact() did not free mpcginfo. Consolidate the freeing in
ctcmpc_unpack_skb().
Fixes: 293d984f0e ("ctcm: infrastructure for replaced ctc driver")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c50c6867c ]
Found by cppcheck and smatch.
smatch complains about
drivers/s390/net/ctcm_sysfs.c:43 ctcm_buffer_write() warn: variable dereferenced before check 'priv' (see line 42)
Fixes: 3c09e2647b ("ctcm: rename READ/WRITE defines to avoid redefinitions")
Reported-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 41c240099f ]
The tools/testing/selftests/vm/Makefile uses the variable TARGETS
internally to generate a list of platform-specific binary build targets
suffixed with _{32,64}. When building the selftests using its own
Makefile directly, such as via the following command run in a kernel tree:
One receives an error such as the following:
make: Entering directory '/root/linux/tools/testing/selftests'
make --no-builtin-rules ARCH=x86 -C ../../.. headers_install
make[1]: Entering directory '/root/linux'
INSTALL ./usr/include
make[1]: Leaving directory '/root/linux'
make[1]: Entering directory '/root/linux/tools/testing/selftests/vm'
make[1]: *** No rule to make target 'vm.c', needed by '/root/linux/tools/testing/selftests/vm/vm_64'. Stop.
make[1]: Leaving directory '/root/linux/tools/testing/selftests/vm'
make: *** [Makefile:175: all] Error 2
make: Leaving directory '/root/linux/tools/testing/selftests'
The TARGETS variable passed to tools/testing/selftests/Makefile collides
with the TARGETS used in tools/testing/selftests/vm/Makefile, so rename
the latter to VMTARGETS, eliminating the collision with no functional
change.
Link: https://lkml.kernel.org/r/20220504213454.1282532-1-jsavitz@redhat.com
Fixes: f21fda8f64 ("selftests: vm: pkeys: fix multilib builds for x86")
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Acked-by: Nico Pache <npache@redhat.com>
Cc: Joel Savitz <jsavitz@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 151d6dcbed ]
Building with SENSORS_LTQ_CPUTEMP=y with SOC_FALCON=y causes build
errors since FALCON does not support the same features as XWAY.
Change this symbol to depend on SOC_XWAY since that provides the
necessary interfaces.
Repairs these build errors:
../drivers/hwmon/ltq-cputemp.c: In function 'ltq_cputemp_enable':
../drivers/hwmon/ltq-cputemp.c:23:9: error: implicit declaration of function 'ltq_cgu_w32'; did you mean 'ltq_ebu_w32'? [-Werror=implicit-function-declaration]
23 | ltq_cgu_w32(ltq_cgu_r32(CGU_GPHY1_CR) | CGU_TEMP_PD, CGU_GPHY1_CR);
../drivers/hwmon/ltq-cputemp.c:23:21: error: implicit declaration of function 'ltq_cgu_r32'; did you mean 'ltq_ebu_r32'? [-Werror=implicit-function-declaration]
23 | ltq_cgu_w32(ltq_cgu_r32(CGU_GPHY1_CR) | CGU_TEMP_PD, CGU_GPHY1_CR);
../drivers/hwmon/ltq-cputemp.c: In function 'ltq_cputemp_probe':
../drivers/hwmon/ltq-cputemp.c:92:31: error: 'SOC_TYPE_VR9_2' undeclared (first use in this function)
92 | if (ltq_soc_type() != SOC_TYPE_VR9_2)
Fixes: 7074d0a927 ("hwmon: (ltq-cputemp) add cpu temp sensor driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Florian Eckert <fe@dev.tdt.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: linux-hwmon@vger.kernel.org
Link: https://lore.kernel.org/r/20220509234740.26841-1-rdunlap@infradead.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee1444b5e1 ]
The W=2 build pointed out that the code wasn't initializing all the
variables in the dim_cq_moder declarations with the struct initializers.
The net change here is zero since these structs were already static
const globals and were initialized with zeros by the compiler, but
removing compiler warnings has value in and of itself.
lib/dim/net_dim.c: At top level:
lib/dim/net_dim.c:54:9: warning: missing initializer for field ‘comps’ of ‘const struct dim_cq_moder’ [-Wmissing-field-initializers]
54 | NET_DIM_RX_EQE_PROFILES,
| ^~~~~~~~~~~~~~~~~~~~~~~
In file included from lib/dim/net_dim.c:6:
./include/linux/dim.h:45:13: note: ‘comps’ declared here
45 | u16 comps;
| ^~~~~
and repeats for the tx struct, and once you fix the comps entry then
the cq_period_mode field needs the same treatment.
Use the commonly accepted style to indicate to the compiler that we
know what we're doing, and add a comma at the end of each struct
initializer to clean up the issue, and use explicit initializers
for the fields we are initializing which makes the compiler happy.
While here and fixing these lines, clean up the code slightly with
a fix for the super long lines by removing the word "_MODERATION" from a
couple defines only used in this file.
Fixes: f8be17b81d ("lib/dim: Fix -Wunused-const-variable warnings")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20220507011038.14568-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 085d16d5f9 ]
Turns out that ever since this mount option was added, passing
`softreval` in NFS mount options cancelled all other flags while not
affecting the underlying flag `NFS_MOUNT_SOFTREVAL`.
Fixes: c74dfe97c1 ("NFS: Add mount option 'softreval'")
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49e6123c65 ]
It fixes memory leak in ring buffer change logic.
When ring buffer size is changed(ethtool -G eth0 rx 4096), sfc driver
works like below.
1. stop all channels and remove ring buffers.
2. allocates new buffer array.
3. allocates rx buffers.
4. start channels.
While the above steps are working, it skips some steps if the channel
doesn't have a ->copy callback function.
Due to ptp channel doesn't have ->copy callback, these above steps are
skipped for ptp channel.
It eventually makes some problems.
a. ptp channel's ring buffer size is not changed, it works only
1024(default).
b. memory leak.
The reason for memory leak is to use the wrong ring buffer values.
There are some values, which is related to ring buffer size.
a. efx->rxq_entries
- This is global value of rx queue size.
b. rx_queue->ptr_mask
- used for access ring buffer as circular ring.
- roundup_pow_of_two(efx->rxq_entries) - 1
c. rx_queue->max_fill
- efx->rxq_entries - EFX_RXD_HEAD_ROOM
These all values should be based on ring buffer size consistently.
But ptp channel's values are not.
a. efx->rxq_entries
- This is global(for sfc) value, always new ring buffer size.
b. rx_queue->ptr_mask
- This is always 1023(default).
c. rx_queue->max_fill
- This is new ring buffer size - EFX_RXD_HEAD_ROOM.
Let's assume we set 4096 for rx ring buffer,
normal channel ptp channel
efx->rxq_entries 4096 4096
rx_queue->ptr_mask 4095 1023
rx_queue->max_fill 4086 4086
sfc driver allocates rx ring buffers based on these values.
When it allocates ptp channel's ring buffer, 4086 ring buffers are
allocated then, these buffers are attached to the allocated array.
But ptp channel's ring buffer array size is still 1024(default)
and ptr_mask is still 1023 too.
So, 3062 ring buffers will be overwritten to the array.
This is the reason for memory leak.
Test commands:
ethtool -G <interface name> rx 4096
while :
do
ip link set <interface name> up
ip link set <interface name> down
done
In order to avoid this problem, it adds ->copy callback to ptp channel
type.
So that rx_queue->ptr_mask value will be updated correctly.
Fixes: 7c236c43b8 ("sfc: Add support for IEEE-1588 PTP")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c7ab9cd98 ]
Using min_t(int, ...) as a potential array index implies to the compiler
that negative offsets should be allowed. This is not the case, though.
Replace "int" with "unsigned int". Fixes the following warning exposed
under future CONFIG_FORTIFY_SOURCE improvements:
In file included from include/linux/string.h:253,
from include/linux/bitmap.h:11,
from include/linux/cpumask.h:12,
from include/linux/smp.h:13,
from include/linux/lockdep.h:14,
from include/linux/rcupdate.h:29,
from include/linux/rculist.h:11,
from include/linux/pid.h:5,
from include/linux/sched.h:14,
from include/linux/delay.h:23,
from drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:35:
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c: In function 't4_get_raw_vpd_params':
include/linux/fortify-string.h:46:33: warning: '__builtin_memcpy' pointer overflow between offset 29 and size [2147483648, 4294967295] [-Warray-bounds]
46 | #define __underlying_memcpy __builtin_memcpy
| ^
include/linux/fortify-string.h:388:9: note: in expansion of macro '__underlying_memcpy'
388 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
include/linux/fortify-string.h:433:26: note: in expansion of macro '__fortify_memcpy_chk'
433 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2796:9: note: in expansion of macro 'memcpy'
2796 | memcpy(p->id, vpd + id, min_t(int, id_len, ID_LEN));
| ^~~~~~
include/linux/fortify-string.h:46:33: warning: '__builtin_memcpy' pointer overflow between offset 0 and size [2147483648, 4294967295] [-Warray-bounds]
46 | #define __underlying_memcpy __builtin_memcpy
| ^
include/linux/fortify-string.h:388:9: note: in expansion of macro '__underlying_memcpy'
388 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
include/linux/fortify-string.h:433:26: note: in expansion of macro '__fortify_memcpy_chk'
433 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2798:9: note: in expansion of macro 'memcpy'
2798 | memcpy(p->sn, vpd + sn, min_t(int, sn_len, SERNUM_LEN));
| ^~~~~~
Additionally remove needless cast from u8[] to char * in last strim()
call.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202205031926.FVP7epJM-lkp@intel.com
Fixes: fc9279298e ("cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()")
Fixes: 24c521f81c ("cxgb4: Use pci_vpd_find_id_string() to find VPD ID string")
Cc: Raju Rangoju <rajur@chelsio.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220505233101.1224230-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d5076fe404 ]
netlink_recvmsg() does not need to change transport header.
If transport header was needed, it should have been reset
by the producer (netlink_dump()), not the consumer(s).
The following trace probably happened when multiple threads
were using MSG_PEEK.
BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
write to 0xffff88811e9f15b2 of 2 bytes by task 32012 on cpu 1:
skb_reset_transport_header include/linux/skbuff.h:2760 [inline]
netlink_recvmsg+0x1de/0x790 net/netlink/af_netlink.c:1978
sock_recvmsg_nosec net/socket.c:948 [inline]
sock_recvmsg net/socket.c:966 [inline]
__sys_recvfrom+0x204/0x2c0 net/socket.c:2097
__do_sys_recvfrom net/socket.c:2115 [inline]
__se_sys_recvfrom net/socket.c:2111 [inline]
__x64_sys_recvfrom+0x74/0x90 net/socket.c:2111
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
write to 0xffff88811e9f15b2 of 2 bytes by task 32005 on cpu 0:
skb_reset_transport_header include/linux/skbuff.h:2760 [inline]
netlink_recvmsg+0x1de/0x790 net/netlink/af_netlink.c:1978
____sys_recvmsg+0x162/0x2f0
___sys_recvmsg net/socket.c:2674 [inline]
__sys_recvmsg+0x209/0x3f0 net/socket.c:2704
__do_sys_recvmsg net/socket.c:2714 [inline]
__se_sys_recvmsg net/socket.c:2711 [inline]
__x64_sys_recvmsg+0x42/0x50 net/socket.c:2711
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0xffff -> 0x0000
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 32005 Comm: syz-executor.4 Not tainted 5.18.0-rc1-syzkaller-00328-ge1f700ebd6be-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220505161946.2867638-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ab244be47a ]
If successful ida_simple_get() calls are not undone when needed, some
additional memory may be allocated and wasted.
Here, an ID between 0 and MAX_INT is required. If this ID is >=100, it is
not taken into account and is wasted. It should be released.
Instead of calling ida_simple_remove(), take advantage of the 'max'
parameter to require the ID not to be too big. Should it be too big, it
is not allocated and don't need to be freed.
While at it, use ida_alloc_xxx()/ida_free() instead to
ida_simple_get()/ida_simple_remove().
The latter is deprecated and more verbose.
Fixes: db1a0ae214 ("drm/nouveau/bl: Assign different names to interfaces")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[Fixed formatting warning from checkpatch]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/9ba85bca59df6813dc029e743a836451d5173221.1644386541.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a11b6c1a38 ]
Read stale PTP Tx timestamps from PHY on cleanup.
After running out of Tx timestamps request handlers, hardware (HW) stops
reporting finished requests. Function ice_ptp_tx_tstamp_cleanup() used
to only clean up stale handlers in driver and was leaving the hardware
registers not read. Not reading stale PTP Tx timestamps prevents next
interrupts from arriving and makes timestamping unusable.
Fixes: ea9b847cda ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Michal Michalik <michal.michalik@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 486b9eee57 ]
Function ice_plug_aux_dev() assigns pf->adev field too early prior
aux device initialization and on other side ice_unplug_aux_dev()
starts aux device deinit and at the end assigns NULL to pf->adev.
This is wrong because pf->adev should always be non-NULL only when
aux device is fully initialized and ready. This wrong order causes
a crash when ice_send_event_to_aux() call occurs because that function
depends on non-NULL value of pf->adev and does not assume that
aux device is half-initialized or half-destroyed.
After order correction the race window is tiny but it is still there,
as Leon mentioned and manipulation with pf->adev needs to be protected
by mutex.
Fix (un-)plugging functions so pf->adev field is set after aux device
init and prior aux device destroy and protect pf->adev assignment by
new mutex. This mutex is also held during ice_send_event_to_aux()
call to ensure that aux device is valid during that call.
Note that device lock used ice_send_event_to_aux() needs to be kept
to avoid race with aux drv unload.
Reproducer:
cycle=1
while :;do
echo "#### Cycle: $cycle"
ip link set ens7f0 mtu 9000
ip link add bond0 type bond mode 1 miimon 100
ip link set bond0 up
ifenslave bond0 ens7f0
ip link set bond0 mtu 9000
ethtool -L ens7f0 combined 1
ip link del bond0
ip link set ens7f0 mtu 1500
sleep 1
let cycle++
done
In short when the device is added/removed to/from bond the aux device
is unplugged/plugged. When MTU of the device is changed an event is
sent to aux device asynchronously. This can race with (un)plugging
operation and because pf->adev is set too early (plug) or too late
(unplug) the function ice_send_event_to_aux() can touch uninitialized
or destroyed fields. In the case of crash below pf->adev->dev.mutex.
Crash:
[ 53.372066] bond0: (slave ens7f0): making interface the new active one
[ 53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[ 53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up
link
[ 54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[ 54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[ 54.248204] bond0: (slave ens7f0): Releasing backup interface
[ 54.253955] bond0: (slave ens7f1): making interface the new active one
[ 54.274875] bond0: (slave ens7f1): Releasing backup interface
[ 54.289153] bond0 (unregistering): Released all slaves
[ 55.383179] MII link monitoring set to 100 ms
[ 55.398696] bond0: (slave ens7f0): making interface the new active one
[ 55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080
[ 55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[ 55.412198] #PF: supervisor write access in kernel mode
[ 55.412200] #PF: error_code(0x0002) - not-present page
[ 55.412201] PGD 25d2ad067 P4D 0
[ 55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S
5.17.0-13579-g57f2d6540f03 #1
[ 55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up
link
[ 55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/
2021
[ 55.430226] Workqueue: ice ice_service_task [ice]
[ 55.468169] RIP: 0010:mutex_unlock+0x10/0x20
[ 55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 <f0> 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48
[ 55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246
[ 55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001
[ 55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080
[ 55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041
[ 55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0
[ 55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000
[ 55.532076] FS: 0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000
[ 55.540163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0
[ 55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 55.567305] PKRU: 55555554
[ 55.570018] Call Trace:
[ 55.572474] <TASK>
[ 55.574579] ice_service_task+0xaab/0xef0 [ice]
[ 55.579130] process_one_work+0x1c5/0x390
[ 55.583141] ? process_one_work+0x390/0x390
[ 55.587326] worker_thread+0x30/0x360
[ 55.590994] ? process_one_work+0x390/0x390
[ 55.595180] kthread+0xe6/0x110
[ 55.598325] ? kthread_complete_and_exit+0x20/0x20
[ 55.603116] ret_from_fork+0x1f/0x30
[ 55.606698] </TASK>
Fixes: f9f5301e7e ("ice: Register auxiliary device to provide RDMA")
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44acfc22c7 ]
When building the Surface Aggregator Module (SAM) core, registry, and
other SAM client drivers as builtin modules (=y), proper initialization
order is not guaranteed. Due to this, client driver registration
(triggered by device registration in the registry) races against bus
initialization in the core.
If any attempt is made at registering the device driver before the bus
has been initialized (i.e. if bus initialization fails this race) driver
registration will fail with a message similar to:
Driver surface_battery was unable to register with bus_type surface_aggregator because the bus was not initialized
Switch from module_init() to subsys_initcall() to resolve this issue.
Note that the serdev subsystem uses postcore_initcall() so we are still
able to safely register the serdev device driver for the core.
Fixes: c167b9c7e3 ("platform/surface: Add Surface Aggregator subsystem")
Reported-by: Blaž Hrastnik <blaz@mxxn.io>
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20220429195738.535751-1-luzmaximilian@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3c9a924aa ]
The driver is calling framebuffer_release() in its .remove callback, but
this will cause the struct fb_info to be freed too early. Since it could
be that a reference is still hold to it if user-space opened the fbdev.
This would lead to a use-after-free error if the framebuffer device was
unregistered but later a user-space process tries to close the fbdev fd.
To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy
instead of doing it in the driver's .remove callback.
Strictly speaking, the code flow in the driver is still wrong because all
the hardware cleanupd (i.e: iounmap) should be done in .remove while the
software cleanup (i.e: releasing the framebuffer) should be done in the
.fb_destroy handler. But this at least makes to match the behavior before
commit 27599aacba ("fbdev: Hot-unplug firmware fb devices on forced removal").
Fixes: 27599aacba ("fbdev: Hot-unplug firmware fb devices on forced removal")
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220505220631.366371-1-javierm@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d258d00fb9 ]
The driver is calling framebuffer_release() in its .remove callback, but
this will cause the struct fb_info to be freed too early. Since it could
be that a reference is still hold to it if user-space opened the fbdev.
This would lead to a use-after-free error if the framebuffer device was
unregistered but later a user-space process tries to close the fbdev fd.
To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy
instead of doing it in the driver's .remove callback.
Strictly speaking, the code flow in the driver is still wrong because all
the hardware cleanupd (i.e: iounmap) should be done in .remove while the
software cleanup (i.e: releasing the framebuffer) should be done in the
.fb_destroy handler. But this at least makes to match the behavior before
commit 27599aacba ("fbdev: Hot-unplug firmware fb devices on forced removal").
Fixes: 27599aacba ("fbdev: Hot-unplug firmware fb devices on forced removal")
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220505220540.366218-1-javierm@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>