Commit Graph

3063 Commits

Author SHA1 Message Date
Kaushlendra Kumar ecd2fc9e4c thermal: int340x: Fix sysfs group leak on DLVR registration failure
[ Upstream commit 15176b818e ]

When DLVR sysfs group creation fails in proc_thermal_rfim_add(),
the function returns immediately without cleaning up the FIVR group
that may have been created earlier.

Add proper error unwinding to remove the FIVR group before returning
failure.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/LV3PR11MB876881B77D32A2854AD2908EF563A@LV3PR11MB8768.namprd11.prod.outlook.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04 07:21:27 -05:00
Felix Gu 8af710156c thermal/of: Fix reference leak in thermal_of_cm_lookup()
[ Upstream commit a1fe789a96 ]

In thermal_of_cm_lookup(), tr_np is obtained via of_parse_phandle(), but
never released.

Use the __free(device_node) cleanup attribute to automatically release
the node and fix the leak.

Fixes: 423de5b5bc ("thermal/of: Fix cdev lookup in thermal_of_should_bind()")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260124-thermal_of-v1-1-54d3416948cf@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04 07:19:30 -05:00
Rafael J. Wysocki 3c08d86d0c thermal: intel: x86_pkg_temp_thermal: Handle invalid temperature
[ Upstream commit 9635c586a5 ]

After commit be0a3600aa ("thermal: sysfs: Rework the handling of trip
point updates"), THERMAL_TEMP_INVALID can be passed to sys_set_trip_temp()
and it is treated as a regular temperature value there, so the sysfs
write fails even though it is expected to succeed and disable the given
trip point.

Address this by making sys_set_trip_temp() clear its temp variable when
it is equal to THERMAL_TEMP_INVALID.

Fixes: be0a3600aa ("thermal: sysfs: Rework the handling of trip point updates")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2815400.mvXUDI8C0e@rafael.j.wysocki
Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04 07:19:28 -05:00
Rafael J. Wysocki 9150750366 thermal: gov_step_wise: Allow cooling level to be reduced earlier
[ Upstream commit 2e82368359 ]

The current behavior of the Step-wise thermal governor is to increase
the cooling level one step at a time after trip point threshold passing
by thermal zone temperature until the temperature stops to rise.  Then,
nothing is done until the temperature decreases below the (possibly
updated) trip point threshold, at which point the cooling level is
reduced straight to the applicable minimum.

While this generally works, it is not in agreement with the throttling
logic description comment in step_wise_manage() any more after some
relatively recent changes, and in the case of passive cooling, it may
lead to undesirable performance oscillations between high and low
levels.

For this reason, modify the governor's cooling device state selection
function, get_target_state(), to reduce cooling by one level even if
the temperature is still above the thermal zone threshold, but the
temperature has started to fall down.  However, ensure that the cooling
level will remain above the applicable minimum in that case to pull
the zone temperature further down, possibly until it falls below the
trip threshold (which may now be equal to the low temperature of the
trip).

Doing so should help higher performance to be restored earlier in some
cases which is desirable especially for passive trip points with
relatively high hysteresis values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/1947735.tdWV9SEqCh@rafael.j.wysocki
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:04 -05:00
Dmitry Baryshkov 7ac2241635 thermal/drivers/qcom/lmh: Add missing IRQ includes
[ Upstream commit b50b2c53f9 ]

As reported by LKP, the Qualcomm LMH driver needs to include several
IRQ-related headers, which decrlare necessary IRQ functionality.
Currently driver builds on ARM64 platforms, where the headers are pulled
in implicitly by other headers, but fails to build on other platforms.

Fixes: 53bca371cd ("thermal/drivers/qcom: Add support for LMh driver")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507270042.KdK0KKht-lkp@intel.com/
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20250728-lmh-scm-v2-2-33bc58388ca5@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-15 12:00:03 +02:00
Dmitry Baryshkov 482f1b6f56 thermal/drivers/qcom: Make LMH select QCOM_SCM
[ Upstream commit 57eda47bd1 ]

The QCOM_SCM symbol is not user-visible, so it makes little sense to
depend on it. Make LMH driver select QCOM_SCM as all other drivers do
and, as the dependecy is now correctly handled, enable || COMPILE_TEST
in order to include the driver into broader set of build tests.

Fixes: 9e5a4fb842 ("thermal/drivers/qcom/lmh: make QCOM_LMH depends on QCOM_SCM")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20250728-lmh-scm-v2-1-33bc58388ca5@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-15 12:00:03 +02:00
Nícolas F. R. A. Prado 523aefb90b thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
[ Upstream commit fa17ff8e32 ]

In order to get working interrupts, a low offset value needs to be
configured. The minimum value for it is 20 Celsius, which is what is
configured when there's no lower thermal trip (ie the thermal core
passes -INT_MAX as low trip temperature). However, when the temperature
gets that low and fluctuates around that value it causes an interrupt
storm.

Prevent that interrupt storm by not enabling the low offset interrupt if
the low threshold is the minimum one.

Cc: stable@vger.kernel.org
Fixes: 77354eaef8 ("thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-3-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ Adapted interrupt mask definitions ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-09 18:58:22 +02:00
Mason Chang 9a7141d480 thermal/drivers/mediatek/lvts_thermal: Add mt7988 lvts commands
commit 685a755089 upstream.

These commands are necessary to avoid severely abnormal and inaccurate
temperature readings that are caused by using the default commands.

Signed-off-by: Mason Chang <mason-cw.chang@mediatek.com>
Link: https://lore.kernel.org/r/20250526102659.30225-4-mason-cw.chang@mediatek.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04 15:31:56 +02:00
Mason Chang 739229eb4d thermal/drivers/mediatek/lvts_thermal: Add lvts commands and their sizes to driver data
commit 6203a5e6fd upstream.

Add LVTS commands and their sizes to driver data in preparation for
adding different commands.

Signed-off-by: Mason Chang <mason-cw.chang@mediatek.com>
Link: https://lore.kernel.org/r/20250526102659.30225-3-mason-cw.chang@mediatek.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04 15:31:56 +02:00
Mason Chang d1f4b09d9b thermal/drivers/mediatek/lvts_thermal: Change lvts commands array to static const
commit c5d5a72c01 upstream.

Change the LVTS commands array to static const in preparation for
adding different commands.

Signed-off-by: Mason Chang <mason-cw.chang@mediatek.com>
Link: https://lore.kernel.org/r/20250526102659.30225-2-mason-cw.chang@mediatek.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-04 15:31:56 +02:00
Hsin-Te Yuan e90a4edb7e thermal: sysfs: Return ENODATA instead of EAGAIN for reads
[ Upstream commit 1a4aabc27e ]

According to POSIX spec, EAGAIN returned by read with O_NONBLOCK set
means the read would block. Hence, the common implementation in
nonblocking model will poll the file when the nonblocking read returns
EAGAIN. However, when the target file is thermal zone, this mechanism
will totally malfunction because thermal zone doesn't implement sysfs
notification and thus the poll will never return.

For example, the read in Golang implemnts such method and sometimes
hangs at reading some thermal zones via sysfs.

Change to return -ENODATA instead of -EAGAIN to userspace.

Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Link: https://patch.msgid.link/20250620-temp-v3-1-6becc6aeb66c@chromium.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20 18:30:24 +02:00
David Collins f740ee4a07 thermal/drivers/qcom-spmi-temp-alarm: Enable stage 2 shutdown when required
[ Upstream commit f8e157ff2d ]

Certain TEMP_ALARM GEN2 PMIC peripherals need over-temperature stage 2
automatic PMIC partial shutdown. This will ensure that in the event of
reaching the hotter stage 3 over-temperature threshold, repeated faults
will be avoided during the automatic PMIC hardware full shutdown.
Modify the stage 2 shutdown control logic to ensure that stage 2
shutdown is enabled on all affected PMICs. Read the digital major
and minor revision registers to identify these PMICs.

Signed-off-by: David Collins <david.collins@oss.qualcomm.com>
Signed-off-by: Anjelique Melendez <anjelique.melendez@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250710224555.3047790-2-anjelique.melendez@oss.qualcomm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20 18:30:23 +02:00
Arnd Bergmann fe794d1f9b thermal/drivers/mediatek/lvts: Remove unused lvts_debugfs_exit
commit 3159c96ac2 upstream.

When debugfs is disabled, the function has no reference any more:

drivers/thermal/mediatek/lvts_thermal.c:266:13: error: 'lvts_debugfs_exit' defined but not used [-Werror=unused-function]
  266 | static void lvts_debugfs_exit(struct lvts_domain *lvts_td) { }
      |             ^~~~~~~~~~~~~~~~~

Fixes: ef280c17a840 ("thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250505052502.1812867-1-arnd@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 15:32:38 +02:00
AngeloGioacchino Del Regno 0440baa4d4 thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure
[ Upstream commit b49825661a ]

When running the probe function for this driver, the function
lvts_debugfs_init() gets called in lvts_domain_init() which, in
turn, gets called in lvts_probe() before registering threaded
interrupt handlers.

Even though it's unlikely, the last call may fail and, if it does,
there's nothing removing the already created debugfs folder and
files.

In order to fix that, instead of calling the lvts debugfs cleanup
function upon failure, register a devm action that will take care
of calling that upon failure or driver removal.

Since devm was used, also delete the call to lvts_debugfs_exit()
in the lvts_remove() callback, as now that's done automatically.

Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20250402083852.20624-1-angelogioacchino.delregno@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19 15:31:33 +02:00
Zhang Rui 56081f5d14 thermal: intel: x86_pkg_temp_thermal: Fix bogus trip temperature
commit cf948c8e27 upstream.

The tj_max value obtained from the Intel TCC library are in Celsius,
whereas the thermal subsystem operates in milli-Celsius.

This discrepancy leads to incorrect trip temperature calculations.

Fix bogus trip temperature by converting tj_max to milli-Celsius Unit.

Fixes: 8ef0ca4a17 ("Merge back other thermal control material for 6.3.")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reported-by: zhang ning <zhangn1985@outlook.com>
Closes: https://lore.kernel.org/all/TY2PR01MB3786EF0FE24353026293F5ACCD97A@TY2PR01MB3786.jpnprd01.prod.outlook.com/
Tested-by: zhang ning <zhangn1985@outlook.com>
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
Link: https://patch.msgid.link/20250519070901.1031233-1-rui.zhang@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-29 11:03:21 +02:00
Alice Guo 7cfde2a482 thermal/drivers/qoriq: Power down TMU on system suspend
[ Upstream commit 229f3feb4b ]

Enable power-down of TMU (Thermal Management Unit) for TMU version 2 during
system suspend to save power. Save approximately 4.3mW on VDD_ANA_1P8 on
i.MX93 platforms.

Signed-off-by: Alice Guo <alice.guo@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20241209164859.3758906-2-Frank.Li@nxp.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:02:07 +02:00
Nícolas F. R. A. Prado c347928320 thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
[ Upstream commit 2738fb3ec6 ]

Interrupts are enabled per sensor in lvts_update_irq_mask() as needed,
there's no point in enabling all of them during initialization. Change
the MONINT register initial value so all sensor interrupts start
disabled.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-4-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:02:07 +02:00
Nícolas F. R. A. Prado d25a240c5a thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
commit c612cbcdf6 upstream.

The Stage 3 thermal threshold is currently configured during
the controller initialization to 105 Celsius. From the kernel
perspective, this configuration is harmful because:
* The stage 3 interrupt that gets triggered when the threshold is
  crossed is not handled in any way by the IRQ handler, it just gets
  cleared. Besides, the temperature used for stage 3 comes from the
  sensors, and the critical thermal trip points described in the
  Devicetree will already cause a shutdown when crossed (at a lower
  temperature, of 100 Celsius, for all SoCs currently using this
  driver).
* The only effect of crossing the stage 3 threshold that has been
  observed is that it causes the machine to no longer be able to enter
  suspend. Even if that was a result of a momentary glitch in the
  temperature reading of a sensor (as has been observed on the
  MT8192-based Chromebooks).

For those reasons, disable the Stage 3 thermal threshold configuration.

Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-2-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-20 10:16:00 +02:00
Nícolas F. R. A. Prado 0131251d93 thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
commit 65594b3745 upstream.

When configured in filtered mode, the LVTS thermal controller will
monitor the temperature from the sensors and trigger an interrupt once a
thermal threshold is crossed.

Currently this is true even during suspend and resume. The problem with
that is that when enabling the internal clock of the LVTS controller in
lvts_ctrl_set_enable() during resume, the temperature reading can glitch
and appear much higher than the real one, resulting in a spurious
interrupt getting generated.

Disable the temperature monitoring and give some time for the signals to
stabilize during suspend in order to prevent such spurious interrupts.

Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: 8137bb9060 ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-1-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-20 10:15:59 +02:00
Trevor Woerner 63d71ae010 thermal/drivers/rockchip: Add missing rk3328 mapping entry
commit ee022e5cae upstream.

The mapping table for the rk3328 is missing the entry for -25C which is
found in the TRM section 9.5.2 "Temperature-to-code mapping".

NOTE: the kernel uses the tsadc_q_sel=1'b1 mode which is defined as:
      4096-<code in table>. Whereas the table in the TRM gives the code
      "3774" for -25C, the kernel uses 4096-3774=322.

[Dragan Simic] : "After going through the RK3308 and RK3328 TRMs, as
  well as through the downstream kernel code, it seems we may have
  some troubles at our hands.  Let me explain, please.

  To sum it up, part 1 of the RK3308 TRM v1.1 says on page 538 that
  the equation for the output when tsadc_q_sel equals 1 is (4096 -
  tsadc_q), while part 1 of the RK3328 TRM v1.2 says that the output
  equation is (1024 - tsadc_q) in that case.

  The downstream kernel code, however, treats the RK3308 and RK3328
  tables and their values as being the same.  It even mentions 1024 as
  the "offset" value in a comment block for the rk_tsadcv3_control()
  function, just like the upstream code does, which is obviously wrong
  "offset" value when correlated with the table on page 544 of part 1
  of the RK3308 TRM v1.1.

  With all this in mind, it's obvious that more work is needed to make
  it clear where's the actual mistake (it could be that the TRM is
  wrong), which I'll volunteer for as part of the SoC binning project.
  In the meantime, this patch looks fine as-is to me, by offering
  what's a clear improvement to the current state of the upstream
  code"

Link: https://opensource.rock-chips.com/images/9/97/Rockchip_RK3328TRM_V1.1-Part1-20170321.pdf
Cc: stable@vger.kernel.org
Fixes: eda519d5f7 ("thermal: rockchip: Support the RK3328 SOC in thermal driver")
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20250207175048.35959-1-twoerner@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-20 10:15:51 +02:00
Chenyuan Yang 953d28a4f4 thermal: int340x: Add NULL check for adev
[ Upstream commit 2542a3f70e ]

Not all devices have an ACPI companion fwnode, so adev might be NULL.
This is similar to the commit cd2fd6eab4
("platform/x86: int3472: Check for adev == NULL").

Add a check for adev not being set and return -ENODEV in that case to
avoid a possible NULL pointer deref in int3402_thermal_probe().

Note, under the same directory, int3400_thermal_probe() has such a
check.

Fixes: 77e337c6e2 ("Thermal: introduce INT3402 thermal driver")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250313043611.1212116-1-chenyuan0y@gmail.com
[ rjw: Subject edit, added Fixes: ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10 14:39:11 +02:00
Daniel Lezcano a91922e9ea thermal/cpufreq_cooling: Remove structure member documentation
[ Upstream commit a6768c4f92 ]

The structure member documentation refers to a member which does not
exist any more. Remove it.

Link: https://lore.kernel.org/all/202501220046.h3PMBCti-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501220046.h3PMBCti-lkp@intel.com/
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250211084712.2746705-1-daniel.lezcano@linaro.org
[ rjw: Minor changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22 12:54:17 -07:00
Rafael J. Wysocki 72cbaf8b41 thermal: gov_power_allocator: Add missing NULL pointer check
commit ac1f43c03f upstream.

Commit 0dc23567c2 ("thermal: core: Move lists of thermal instances
to trip descriptors") overlooked the case in which the Power Allocator
governor attempts to bind to a tripless thermal zone and params->trip_max
is NULL in check_power_actors().

No power actors can be found in that case, so check_power_actors() needs
to be made return 0 then to restore its previous behavior.

Fixes: 0dc23567c2 ("thermal: core: Move lists of thermal instances to trip descriptors")
Closes: https://lore.kernel.org/linux-pm/Z0NeGF4ryCe_b5rr@sashalap/
Reported-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2761105.mvXUDI8C0e@rjwysocki.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-07 18:25:46 +01:00
Yu-Che Cheng 27a144c3be thermal: gov_power_allocator: Update total_weight on bind and cdev updates
[ Upstream commit 0cde378a10 ]

params->total_weight is not initialized during bind and not updated when
the bound cdev changes. The cooling device weight will not be used due
to the uninitialized total_weight, until an update via sysfs is
triggered.

The bound cdevs are updated during thermal zone registration, where each
cooling device will be bound to the thermal zone one by one, but
power_allocator_bind() can be called without an additional cdev update
when manually changing the policy of a thermal zone via sysfs.

Add a new function to handle weight update logic, including updating
total_weight, and call it when bind, weight changes, and cdev updates to
ensure total_weight is always correct.

Fixes: a3cd6db4cc ("thermal: gov_power_allocator: Support new update callback of weights")
Signed-off-by: Yu-Che Cheng <giver@chromium.org>
Link: https://patch.msgid.link/20250222-fix-power-allocator-weight-v2-1-a94de86b685a@chromium.org
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07 18:25:35 +01:00
Rafael J. Wysocki 19cd2dc4d4 thermal: core: Move lists of thermal instances to trip descriptors
[ Upstream commit 0dc23567c2 ]

In almost all places where a thermal zone's list of thermal instances
is walked, there is a check to match a specific trip point and it is
walked in vain whenever there are no cooling devices associated with
the given trip.

To address this, store the lists of thermal instances in trip point
descriptors instead of storing them in thermal zones and adjust all
code using those lists accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5522726.Sb9uPGUboI@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Stable-dep-of: 0cde378a10 ("thermal: gov_power_allocator: Update total_weight on bind and cdev updates")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07 18:25:35 +01:00
Rafael J. Wysocki e11df3bffd thermal/of: Fix cdev lookup in thermal_of_should_bind()
[ Upstream commit 423de5b5bc ]

Since thermal_of_should_bind() terminates the loop after processing
the first child found in cooling-maps, it will never match more than
one cdev to a given trip point which is incorrect, as there may be
cooling-maps associating one trip point with multiple cooling devices.

Address this by letting the loop continue until either all
children have been processed or a matching one has been found.

To avoid adding conditionals or goto statements, put the loop in
question into a separate function and make that function return
right away after finding a matching cooling-maps entry.

Fixes: 94c6110b0b ("thermal/of: Use the .should_bind() thermal zone callback")
Link: https://lore.kernel.org/linux-pm/20250219-fix-thermal-of-v1-1-de36e7a590c4@chromium.org/
Reported-by: Yu-Che Cheng <giver@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Yu-Che Cheng <giver@chromium.org>
Tested-by: Yu-Che Cheng <giver@chromium.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2788228.mvXUDI8C0e@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07 18:25:35 +01:00
Krzysztof Kozlowski 07a82c78d8 thermal: of: Simplify thermal_of_should_bind with scoped for each OF child
[ Upstream commit 69f3aa6ad9 ]

Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make code a bit simpler.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20241010-b4-cleanup-h-of-node-put-thermal-v4-1-bfbe29ad81f4@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 423de5b5bc ("thermal/of: Fix cdev lookup in thermal_of_should_bind()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07 18:25:35 +01:00
Yu-Che Cheng 806437d047 thermal: gov_power_allocator: Fix incorrect calculation in divvy_up_power()
[ Upstream commit 4ecaa75771 ]

divvy_up_power() should use weighted_req_power instead of req_power to
calculate granted_power. Otherwise, granted_power may be unexpected as
the denominator total_req_power is a weighted sum.

This is a mistake made during the previous refactor.

Replace req_power with weighted_req_power in divvy_up_power()
calculation.

Fixes: 912e97c67c ("thermal: gov_power_allocator: Move memory allocation out of throttle()")
Signed-off-by: Yu-Che Cheng <giver@chromium.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250219-fix-power-allocator-calc-v1-1-48b860291919@chromium.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07 18:25:34 +01:00
Joe Hattori 5b80f2fe8a thermal: of: fix OF node leak in of_thermal_zone_find()
[ Upstream commit 9164e0912a ]

of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not
release the OF node reference obtained by it.

Add a of_node_put() call when the call is successful.

Fixes: 3fd6d6e2b4 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-17 13:40:48 +01:00
Barnabás Czémán a5289e7085 thermal/drivers/qcom/tsens-v1: Add support for MSM8937 tsens
[ Upstream commit e2ffb6c3a4 ]

Add support for tsens v1.4 block what can be found in
MSM8937 and MSM8917.

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
Link: https://lore.kernel.org/r/20241113-msm8917-v6-5-c348fb599fef@mainlining.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14 20:04:02 +01:00
Srinivas Pandruvada 3a3f789ee1 thermal: int3400: Fix reading of current_uuid for active policy
commit 7082503622 upstream.

When the current_uuid attribute is set to the active policy UUID,
reading back the same attribute is returning "INVALID" instead of
the active policy UUID on some platforms before Ice Lake.

In platforms before Ice Lake, firmware provides a list of supported
thermal policies. In this case, user space can select any of the
supported thermal policies via a write to attribute "current_uuid".

In commit c7ff297639 ("thermal: int340x: Update OS policy capability
handshake")', the OS policy handshake was updated to support Ice Lake
and later platforms and it treated priv->current_uuid_index=0 as
invalid. However, priv->current_uuid_index=0 is for the active policy,
only priv->current_uuid_index=-1 is invalid.

Fix this issue by updating the priv->current_uuid_index check.

Fixes: c7ff297639 ("thermal: int340x: Update OS policy capability handshake")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 5.18+ <stable@vger.kernel.org> # 5.18+
Link: https://patch.msgid.link/20241114200213.422303-1-srinivas.pandruvada@linux.intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09 10:41:08 +01:00
Rafael J. Wysocki 526c132124 thermal: testing: Initialize some variables annoteded with _free()
[ Upstream commit 0104dcdaad ]

Variables annotated with __free() need to be initialized if the function
can return before they get updated for the first time or the attempt to
free the memory pointed to by them upon function return may crash the
kernel.

Fix this issue in some places in the thermal testing code.

Fixes: f6a034f2df ("thermal: Introduce a debugfs-based testing facility")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12576267.O9o76ZdvQC@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:20 +01:00
Rafael J. Wysocki 6bb065a874 thermal: testing: Use DEFINE_FREE() and __free() to simplify code
[ Upstream commit 1c426fd09c ]

Use DEFINE_FREE() to define a __free function for dropping thermal
zone template reference counters and use it along with __free() to
simplify code in some places.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/4628747.LvFx2qVVIh@rjwysocki.net
[ rjw: Add variable initialization to address compiler warning ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 0104dcdaad ("thermal: testing: Initialize some variables annoteded with _free()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:20 +01:00
Rafael J. Wysocki 79cb9952e1 thermal: core: Fix race between zone registration and system suspend
[ Upstream commit cdf771ab47 ]

If the registration of a thermal zone takes place at the time when
system suspend is started, thermal_pm_notify() can run before the new
thermal zone is added to thermal_tz_list and its "suspended" flag will
not be set.  Consequently, if __thermal_zone_device_update() is called
for that thermal zone, it will not return early as expected which may
cause some destructive interference with the system suspend or resume
flow to occur.

To avoid that, make thermal_zone_init_complete() introduced previously
set the "suspended" flag for new thermal zones if it runs during system
suspend or resume.

Fixes: 4e814173a8 ("thermal: core: Fix thermal zone suspend-resume synchronization")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/8490245.NyiUUSuA9g@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:17 +01:00
Rafael J. Wysocki 39a5ad6b63 thermal: core: Mark thermal zones as initializing to start with
[ Upstream commit 7837fa8115 ]

After thermal_zone_device_register_with_trips() has called
device_register() and it has registered the new thermal zone device
with the driver core, user space may access its sysfs attributes and,
among other things, it may enable the thermal zone before it is ready.

To address this, introduce a new thermal zone state flag for
initialization and set it before calling device_register() in
thermal_zone_device_register_with_trips().  This causes
__thermal_zone_device_update() to return early until the new flag
is cleared.

To clear it when the thermal zone is ready, introduce a new
function called thermal_zone_init_complete() that will also invoke
__thermal_zone_device_update() after clearing that flag (both under the
thernal zone lock) and make thermal_zone_device_register_with_trips()
call the new function instead of checking need_update and calling
thermal_zone_device_update() when it is set.

After this change, if user space enables the thermal zone prematurely,
__thermal_zone_device_update() will return early for it until
thermal_zone_init_complete() is called.  In turn, if the thermal zone
is not enabled by user space before thermal_zone_init_complete() is
called, the __thermal_zone_device_update() call in it will return early
because the thermal zone has not been enabled yet, but that function
will be invoked again by thermal_zone_device_set_mode() when the thermal
zone is enabled and it will not return early this time.

The checking of need_update is not necessary any more because the
__thermal_zone_device_update() calls potentially triggered by cooling
device binding take place before calling thermal_zone_init_complete(),
so they all will return early, which means that
thermal_zone_init_complete() must call __thermal_zone_device_update()
in case the thermal zone is enabled prematurely by user space.

Fixes: 203d3d4aa4 ("the generic thermal sysfs driver")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/9360231.CDJkKcVGEf@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:17 +01:00
Rafael J. Wysocki f52dc3c757 thermal: core: Represent suspend-related thermal zone flags as bits
[ Upstream commit 26c9ab8090 ]

Instead of using two separate fields in struct thermal_zone_device for
representing flags related to thermal zone suspend, represent them
explicitly as bits in one u8 "state" field.

Subsequently, that field will be used for addressing race conditions
related to thermal zone initialization and exit.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/7733910.EvYhyI6sBW@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Stable-dep-of: 7837fa8115 ("thermal: core: Mark thermal zones as initializing to start with")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:17 +01:00
Rafael J. Wysocki 933ef9360a thermal: core: Rearrange PM notification code
[ Upstream commit 7ddca58857 ]

Move the code run for each thermal zone by the thermal PM notify
handler to separate functions.

This will help to make some subsequent changes look somewhat more
straightforward, among other things.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2299090.iZASKD2KPV@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Stable-dep-of: 7837fa8115 ("thermal: core: Mark thermal zones as initializing to start with")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:17 +01:00
Rafael J. Wysocki d01d9005fb thermal: core: Initialize thermal zones before registering them
[ Upstream commit 662f920f7e ]

Since user space can start interacting with a new thermal zone as soon
as device_register() called by thermal_zone_device_register_with_trips()
returns, it is better to initialize the thermal zone before calling
device_register() on it.

Fixes: d0df264fbd ("thermal/core: Remove pointless thermal_zone_device_reset() function")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3336146.44csPzL39Z@rjwysocki.net
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05 14:01:16 +01:00
Icenowy Zheng 725f31f300 thermal/of: support thermal zones w/o trips subnode
Although the current device tree binding of thermal zones require the
trips subnode, the binding in kernel v5.15 does not require it, and many
device trees shipped with the kernel, for example,
allwinner/sun50i-a64.dtsi and mediatek/mt8183-kukui.dtsi in ARM64, still
comply to the old binding and contain no trips subnode.

Allow the code to successfully register thermal zones w/o trips subnode
for DT binding compatibility now.

Furtherly, the inconsistency between DTs and bindings should be resolved
by either adding empty trips subnode or dropping the trips subnode
requirement.

Fixes: d0c75fa2c1 ("thermal/of: Initialize trip points separately")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
[wenst@chromium.org: Reworked logic and kernel log messages]
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20241018073139.1268995-1-wenst@chromium.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-11-04 15:38:29 +01:00
Dmitry Baryshkov f16beaaee2 thermal/drivers/qcom/lmh: Remove false lockdep backtrace
Annotate LMH IRQs with lockdep classes so that the lockdep doesn't
report possible recursive locking issue between LMH and GIC interrupts.

For the reference:

       CPU0
       ----
  lock(&irq_desc_lock_class);
  lock(&irq_desc_lock_class);

 *** DEADLOCK ***

Call trace:
 dump_backtrace+0x98/0xf0
 show_stack+0x18/0x24
 dump_stack_lvl+0x90/0xd0
 dump_stack+0x18/0x24
 print_deadlock_bug+0x258/0x348
 __lock_acquire+0x1078/0x1f44
 lock_acquire+0x1fc/0x32c
 _raw_spin_lock_irqsave+0x60/0x88
 __irq_get_desc_lock+0x58/0x98
 enable_irq+0x38/0xa0
 lmh_enable_interrupt+0x2c/0x38
 irq_enable+0x40/0x8c
 __irq_startup+0x78/0xa4
 irq_startup+0x78/0x168
 __enable_irq+0x70/0x7c
 enable_irq+0x4c/0xa0
 qcom_cpufreq_ready+0x20/0x2c
 cpufreq_online+0x2a8/0x988
 cpufreq_add_dev+0x80/0x98
 subsys_interface_register+0x104/0x134
 cpufreq_register_driver+0x150/0x234
 qcom_cpufreq_hw_driver_probe+0x2a8/0x388
 platform_probe+0x68/0xc0
 really_probe+0xbc/0x298
 __driver_probe_device+0x78/0x12c
 driver_probe_device+0x3c/0x160
 __device_attach_driver+0xb8/0x138
 bus_for_each_drv+0x84/0xe0
 __device_attach+0x9c/0x188
 device_initial_probe+0x14/0x20
 bus_probe_device+0xac/0xb0
 deferred_probe_work_func+0x8c/0xc8
 process_one_work+0x20c/0x62c
 worker_thread+0x1bc/0x36c
 kthread+0x120/0x124
 ret_from_fork+0x10/0x20

Fixes: 53bca371cd ("thermal/drivers/qcom: Add support for LMh driver")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241011-lmh-lockdep-v1-1-495cbbe6fef1@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-11-04 15:38:29 +01:00
Linus Torvalds e643edac70 Merge tag 'pm-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These address two issues in the TPMI module of the Intel RAPL power
  capping driver and one issue in the processor part of the Intel
  int340x thermal driver, update a CPU ID list and register definitions
  needed for RAPL PL4 support and remove some unused code.

  Specifics:

   - Fix the TPMI_RAPL_REG_DOMAIN_INFO register offset in the TPMI part
     of the Intel RAPL power capping driver, make it ignore minor
     hardware version mismatches (which only indicate exposing
     additional features) and update register definitions in it to
     enable PL4 support (Zhang Rui)

   - Add Arrow Lake-U to the list of processors supporting PL4 in the
     MSR part of the Intel RAPL power capping driver (Sumeet Pawnikar)

   - Remove excess pci_disable_device() calls from the processor part of
     the int340x thermal driver to address a warning triggered during
     module unload and remove unused CPU hotplug code related to RAPL
     support from it (Zhang Rui)"

* tag 'pm-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: int340x: processor: Add MMIO RAPL PL4 support
  thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug support
  powercap: intel_rapl_msr: Add PL4 support for Arrowlake-U
  powercap: intel_rapl_tpmi: Ignore minor version change
  thermal: intel: int340x: processor: Fix warning during module unload
  powercap: intel_rapl_tpmi: Fix bogus register reading
2024-10-11 11:41:20 -07:00
Linus Torvalds f8fafb690b Merge tag 'thermal-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
 "Address possible use-after-free scenarios during the processing of
  thermal netlink commands and during thermal zone removal (Rafael
  Wysocki)"

* tag 'thermal-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Free tzp copy along with the thermal zone
  thermal: core: Reference count the zone in thermal_zone_get_by_id()
2024-10-11 11:35:30 -07:00
Zhang Rui 3fb0eea8a1 thermal: intel: int340x: processor: Add MMIO RAPL PL4 support
Similar to the MSR RAPL interface, MMIO RAPL supports PL4 too, so add
MMIO RAPL PL4d support to the processor_thermal driver.

As a result, the powercap sysfs for MMIO RAPL will show a new "peak
power" constraint.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-7-rui.zhang@intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:39:33 +02:00
Zhang Rui bfc6819e4b thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug support
CPU0/package0 is always online and the MMIO RAPL driver runs on single
package systems only, so there is no need to handle CPU hotplug in it.

Always register a RAPL package device for package 0 and remove the
unnecessary CPU hotplug support.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-6-rui.zhang@intel.com
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:39:33 +02:00
Zhang Rui 99ca0b57e4 thermal: intel: int340x: processor: Fix warning during module unload
The processor_thermal driver uses pcim_device_enable() to enable a PCI
device, which means the device will be automatically disabled on driver
detach.  Thus there is no need to call pci_disable_device() again on it.

With recent PCI device resource management improvements, e.g. commit
f748a07a0b ("PCI: Remove legacy pcim_release()"), this problem is
exposed and triggers the warining below.

 [  224.010735] proc_thermal_pci 0000:00:04.0: disabling already-disabled device
 [  224.010747] WARNING: CPU: 8 PID: 4442 at drivers/pci/pci.c:2250 pci_disable_device+0xe5/0x100
 ...
 [  224.010844] Call Trace:
 [  224.010845]  <TASK>
 [  224.010847]  ? show_regs+0x6d/0x80
 [  224.010851]  ? __warn+0x8c/0x140
 [  224.010854]  ? pci_disable_device+0xe5/0x100
 [  224.010856]  ? report_bug+0x1c9/0x1e0
 [  224.010859]  ? handle_bug+0x46/0x80
 [  224.010862]  ? exc_invalid_op+0x1d/0x80
 [  224.010863]  ? asm_exc_invalid_op+0x1f/0x30
 [  224.010867]  ? pci_disable_device+0xe5/0x100
 [  224.010869]  ? pci_disable_device+0xe5/0x100
 [  224.010871]  ? kfree+0x21a/0x2b0
 [  224.010873]  pcim_disable_device+0x20/0x30
 [  224.010875]  devm_action_release+0x16/0x20
 [  224.010878]  release_nodes+0x47/0xc0
 [  224.010880]  devres_release_all+0x9f/0xe0
 [  224.010883]  device_unbind_cleanup+0x12/0x80
 [  224.010885]  device_release_driver_internal+0x1ca/0x210
 [  224.010887]  driver_detach+0x4e/0xa0
 [  224.010889]  bus_remove_driver+0x6f/0xf0
 [  224.010890]  driver_unregister+0x35/0x60
 [  224.010892]  pci_unregister_driver+0x44/0x90
 [  224.010894]  proc_thermal_pci_driver_exit+0x14/0x5f0 [processor_thermal_device_pci]
 ...
 [  224.010921] ---[ end trace 0000000000000000 ]---

Remove the excess pci_disable_device() calls.

Fixes: acd65d5d1c ("thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-3-rui.zhang@intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-07 20:10:56 +02:00
Rafael J. Wysocki 827a07525c thermal: core: Free tzp copy along with the thermal zone
The object pointed to by tz->tzp may still be accessed after being
freed in thermal_zone_device_unregister(), so move the freeing of it
to the point after the removal completion has been completed at which
it cannot be accessed any more.

Fixes: 3d439b1a2a ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/4623516.LvFx2qVVIh@rjwysocki.net
2024-10-04 19:05:08 +02:00
Rafael J. Wysocki a42a5839f4 thermal: core: Reference count the zone in thermal_zone_get_by_id()
There are places in the thermal netlink code where nothing prevents
the thermal zone object from going away while being accessed after it
has been returned by thermal_zone_get_by_id().

To address this, make thermal_zone_get_by_id() get a reference on the
thermal zone device object to be returned with the help of get_device(),
under thermal_list_lock, and adjust all of its callers to this change
with the help of the cleanup.h infrastructure.

Fixes: 1ce50e7d40 ("thermal: core: genetlink support for events/cmd/sampling")
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/6112242.lOV4Wx5bFT@rjwysocki.net
2024-10-04 19:05:08 +02:00
Al Viro 5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Al Viro cb787f4ac0 [tree-wide] finally take no_llseek out
no_llseek had been defined to NULL two years ago, in commit 868941b144
("fs: remove no_llseek")

To quote that commit,

  At -rc1 we'll need do a mechanical removal of no_llseek -

  git grep -l -w no_llseek | grep -v porting.rst | while read i; do
	sed -i '/\<no_llseek\>/d' $i
  done

  would do it.

Unfortunately, that hadn't been done.  Linus, could you do that now, so
that we could finally put that thing to rest? All instances are of the
form
	.llseek = no_llseek,
so it's obviously safe.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-27 08:18:43 -07:00
Rafael J. Wysocki 3bc5ed15bd Merge tag 'thermal-v6.12-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into
Merge thermal drivers changes for v6.12-rc1 from Daniel Lezcano:

"- Add power domain DT bindings for new Amlogic SoCs (Georges Stark)

 - Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr() in the ST
   driver and add a Kconfig dependency on THERMAL_OF subsystem for the
   STi driver (Raphael Gallais-Pou)

 - Simplify with dev_err_probe() the error code path in the probe
   functions for the brcmstb driver (Yan Zhen)

 - Remove trailing space after \n newline in the Renesas driver (Colin
   Ian King)

 - Add DT binding compatible string for the SA8255p with the tsens
   driver (Nikunj Kela)

 - Use the devm_clk_get_enabled() helpers to simplify the init routine
   in the sprd driver (Huan Yang)

 - Remove __maybe_unused notations for the functions by using the new
   RUNTIME_PM_OPS() and SYSTEM_SLEEP_PM_OPS() macros on the IMx and
   Qoriq drivers (Fabio Estevam)

 - Remove unused declarations in the header file as the functions were
   removed in a previous change on the ti-soc-thermal driver (Zhang
   Zekun)

 - Simplify with dev_err_probe()	the error code path in the probe
   functions for the imx_sc_thermal driver (Alexander Stein)"

* tag 'thermal-v6.12-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/imx_sc_thermal: Use dev_err_probe
  thermal/drivers/ti-soc-thermal: Remove unused declarations
  thermal/drivers/imx: Remove __maybe_unused notations
  thermal/drivers/qoriq: Remove __maybe_unused notations
  thermal/drivers/sprd: Use devm_clk_get_enabled() helpers
  dt-bindings: thermal: tsens: document support on SA8255p
  thermal/drivers/renesas: Remove trailing space after \n newline
  thermal/drivers/brcmstb_thermal: Simplify with dev_err_probe()
  thermal/drivers/sti: Depend on THERMAL_OF subsystem
  thermal/drivers/st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
  dt-bindings: thermal: amlogic,thermal: add optional power-domains
2024-09-10 10:54:15 +02:00