linux-stable-mirror

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2026-04-03 12:05:13 +02:00

Author	SHA1	Message	Date
Dan Carpenter	dca904ffd7	EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size() [ Upstream commit `72f1268361` ] The snprintf() can't really overflow because we're writing a max of 42 bytes to a PAGE_SIZE buffer. But my static checker complains because the limit calculation doesn't take the first 11 space characters that we wrote into the buffer into consideration. Fix this for the sake of correctness even though it doesn't affect runtime. Also delete an earlier "space -= n;" which was not used. Fixes: `68d086f89b` ("i5400_edac: improve debug messages to better represent the filled memory") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://patch.msgid.link/ccd06b91748e7ed8e33eeb2ff1e7b98700879304.1765290801.git.dan.carpenter@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:41 -05:00
Dan Carpenter	1fd7c29a15	EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size() [ Upstream commit `7b5c7e83ac` ] The snprintf() can't really overflow because we're writing a max of 42 bytes to a PAGE_SIZE buffer. But the limit calculation doesn't take the first 11 bytes that we wrote into consideration so the limit is not correct. Just fix it for correctness even though it doesn't affect runtime. Fixes: `64e1fdaf55` ("i5000_edac: Fix the logic that retrieves memory information") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://patch.msgid.link/07cd652c51e77aad5a8350e1a7cd9407e5bbe373.1765290801.git.dan.carpenter@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:41 -05:00
Sebastian Andrzej Siewior	670abc7e56	EDAC/altera: Remove IRQF_ONESHOT [ Upstream commit `5c858d6c66` ] Passing IRQF_ONESHOT ensures that the interrupt source is masked until the secondary (threaded) handler is done. If only a primary handler is used then the flag makes no sense because the interrupt can not fire (again) while its handler is running. The flag also prevents force-threading of the primary handler and the irq-core will warn about this. Remove IRQF_ONESHOT from irqflags. Fixes: `a29d64a45e` ("EDAC, altera: Add IRQ Flags to disable IRQ while handling") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260128095540.863589-11-bigeasy@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:39 -05:00
Haoxiang Li	df643bfe1d	EDAC/i3200: Fix a resource leak in i3200_probe1() commit `d42d5715dc` upstream. If edac_mc_alloc() fails, also unmap the window. [ bp: Use separate labels, turning it into the classic unwind pattern. ] Fixes: `dd8ef1db87` ("edac: i3200 memory controller driver") Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251223123202.1492038-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2026-01-23 11:18:46 +01:00
Haoxiang Li	4433ddc370	EDAC/x38: Fix a resource leak in x38_probe1() commit `0ff7c44106` upstream. If edac_mc_alloc() fails, also unmap the window. [ bp: Use separate labels, turning it into the classic unwind pattern. ] Fixes: `df8bc08c19` ("edac x38: new MC driver module") Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251223124350.1496325-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2026-01-23 11:18:45 +01:00
Niravkumar L Rabara	df0f4b13df	EDAC/altera: Use INTTEST register for Ethernet and USB SBE injection commit `281326be67` upstream. The current single-bit error injection mechanism flips bits directly in ECC RAM by performing write and read operations. When the ECC RAM is actively used by the Ethernet or USB controller, this approach sometimes trigger a false double-bit error. Switch both Ethernet and USB EDAC devices to use the INTTEST register (altr_edac_a10_device_inject_fops) for single-bit error injection, similar to the existing double-bit error injection method. Fixes: `064acbd4f4` ("EDAC, altera: Add Stratix10 peripheral support") Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251111081333.1279635-1-niravkumarlaxmidas.rabara@altera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-11-24 10:36:02 +01:00
Niravkumar L Rabara	0f64b37f19	EDAC/altera: Handle OCRAM ECC enable after warm reset commit `fd3ecda38f` upstream. The OCRAM ECC is always enabled either by the BootROM or by the Secure Device Manager (SDM) during a power-on reset on SoCFPGA. However, during a warm reset, the OCRAM content is retained to preserve data, while the control and status registers are reset to their default values. As a result, ECC must be explicitly re-enabled after a warm reset. Fixes: `17e47dc6db` ("EDAC/altera: Add Stratix10 OCRAM ECC support") Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251111080801.1279401-1-niravkumarlaxmidas.rabara@altera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-11-24 10:36:02 +01:00
Avadhut Naik	e72270986a	EDAC/mc_sysfs: Increase legacy channel support to 16 [ Upstream commit `6e1c2c6c2c` ] Newer AMD systems can support up to 16 channels per EDAC "mc" device. These are detected by the EDAC module running on the device, and the current EDAC interface is appropriately enumerated. The legacy EDAC sysfs interface however, provides device attributes for channels 0 through 11 only. Consequently, the last four channels, 12 through 15, will not be enumerated and will not be visible through the legacy sysfs interface. Add additional device attributes to ensure that all 16 channels, if present, are enumerated by and visible through the legacy EDAC sysfs interface. Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-11-02 22:15:21 +09:00
Qiuxu Zhuo	1652f14cf3	EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller [ Upstream commit `2e6fe1bbef` ] When loading the i10nm_edac driver on some Intel Granite Rapids servers, a call trace may appear as follows: UBSAN: shift-out-of-bounds in drivers/edac/skx_common.c:453:16 shift exponent -66 is negative ... __ubsan_handle_shift_out_of_bounds+0x1e3/0x390 skx_get_dimm_info.cold+0x47/0xd40 [skx_edac_common] i10nm_get_dimm_config+0x23e/0x390 [i10nm_edac] skx_register_mci+0x159/0x220 [skx_edac_common] i10nm_init+0xcb0/0x1ff0 [i10nm_edac] ... This occurs because some BIOS may disable a memory controller if there aren't any memory DIMMs populated on this memory controller. The DIMMMTR register of this disabled memory controller contains the invalid value ~0, resulting in the call trace above. Fix this call trace by skipping DIMM enumeration on a disabled memory controller. Fixes: `ba987eaaab` ("EDAC/i10nm: Add Intel Granite Rapids server support") Reported-by: Jose Jesus Ambriz Meza <jose.jesus.ambriz.meza@intel.com> Reported-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Closes: https://lore.kernel.org/all/20250730063155.2612379-1-acelan.kao@canonical.com/ Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Link: https://lore.kernel.org/r/20250806065707.3533345-1-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-10-15 11:59:55 +02:00
Salah Triki	589a319dcd	EDAC/altera: Delete an inappropriate dma_free_coherent() call commit `ff2a66d21f` upstream. dma_free_coherent() must only be called if the corresponding dma_alloc_coherent() call has succeeded. Calling it when the allocation fails leads to undefined behavior. Delete the wrong call. [ bp: Massage commit message. ] Fixes: `71bcada88b` ("edac: altera: Add Altera SDRAM EDAC support") Signed-off-by: Salah Triki <salah.triki@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/aIrfzzqh4IzYtDVC@pc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-09-19 16:35:45 +02:00
Shubhrajyoti Datta	694cd8ac4a	EDAC/synopsys: Clear the ECC counters on init [ Upstream commit `b1dc7f097b` ] Clear the ECC error and counter registers during initialization/probe to avoid reporting stale errors that may have occurred before EDAC registration. For that, unify the Zynq and ZynqMP ECC state reading paths and simplify the code. [ bp: Massage commit message. Fix an -Wsometimes-uninitialized warning as reported by Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507141048.obUv3ZUm-lkp@intel.com ] Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250713050753.7042-1-shubhrajyoti.datta@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-08-20 18:30:23 +02:00
Avadhut Naik	8ed96d8e05	EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMs commit `a3f3040657` upstream. Each Chip-Select (CS) of a Unified Memory Controller (UMC) on AMD Zen-based SOCs has an Address Mask and a Secondary Address Mask register associated with it. The amd64_edac module logs DIMM sizes on a per-UMC per-CS granularity during init using these two registers. Currently, the module primarily considers only the Address Mask register for computing DIMM sizes. The Secondary Address Mask register is only considered for odd CS. Additionally, if it has been considered, the Address Mask register is ignored altogether for that CS. For power-of-two DIMMs i.e. DIMMs whose total capacity is a power of two (32GB, 64GB, etc), this is not an issue since only the Address Mask register is used. For non-power-of-two DIMMs i.e., DIMMs whose total capacity is not a power of two (48GB, 96GB, etc), however, the Secondary Address Mask register is used in conjunction with the Address Mask register. However, since the module only considers either of the two registers for a CS, the size computed by the module is incorrect. The Secondary Address Mask register is not considered for even CS, and the Address Mask register is not considered for odd CS. Introduce a new helper function so that both Address Mask and Secondary Address Mask registers are considered, when valid, for computing DIMM sizes. Furthermore, also rename some variables for greater clarity. Fixes: `81f5090db8` ("EDAC/amd64: Support asymmetric dual-rank DIMMs") Closes: https://lore.kernel.org/dbec22b6-00f2-498b-b70d-ab6f8a5ec87e@natrix.lt Reported-by: Žilvinas Žaltiena <zilvinas@natrix.lt> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Žilvinas Žaltiena <zilvinas@natrix.lt> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250529205013.403450-1-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-06 11:01:42 +02:00
Avadhut Naik	031d274c7b	EDAC/amd64: Correct number of UMCs for family 19h models 70h-7fh commit `b2e673ae53` upstream. AMD's Family 19h-based Models 70h-7fh support 4 unified memory controllers (UMC) per processor die. The amd64_edac driver, however, assumes only 2 UMCs are supported since max_mcs variable for the models has not been explicitly set to 4. The same results in incomplete or incorrect memory information being logged to dmesg by the module during initialization in some instances. Fixes: `6c79e42169` ("EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh") Closes: https://lore.kernel.org/all/27dc093f-ce27-4c71-9e81-786150a040b6@reox.at/ Reported-by: reox <mailinglist@reox.at> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Link: https://lore.kernel.org/20250613005233.2330627-1-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-06-27 11:11:44 +01:00
Niravkumar L Rabara	4904bd8267	EDAC/altera: Use correct write width with the INTTEST register commit `e5ef4cd2a4` upstream. On the SoCFPGA platform, the INTTEST register supports only 16-bit writes. A 32-bit write triggers an SError to the CPU so do 16-bit accesses only. [ bp: AI-massage the commit message. ] Fixes: `c7b4be8db8` ("EDAC, altera: Add Arria10 OCRAM ECC support") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250527145707.25458-1-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-06-27 11:11:21 +01:00
Qiuxu Zhuo	29ce9e71e9	EDAC/{skx_common,i10nm}: Fix the loss of saved RRL for HBM pseudo channel 0 [ Upstream commit `eeed3e03f4` ] When enabling the retry_rd_err_log (RRL) feature during the loading of the i10nm_edac driver with the module parameter retry_rd_err_log=2 (Linux RRL control mode), the default values of the control bits of RRL are saved so that they can be restored during the unloading of the driver. In the current code, the RRL of pseudo channel 1 of HBM overwrites pseudo channel 0 during the loading of the driver, resulting in the loss of saved RRL for pseudo channel 0. This causes the RRL of pseudo channel 0 of HBM to be wrongly restored with the values from pseudo channel 1 when unloading the driver. Fix this issue by creating two separate groups of RRL control registers per channel to save default RRL settings of two {sub-,pseudo-}channels. Fixes: `acd4cf68fe` ("EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Feng Xu <feng.f.xu@intel.com> Link: https://lore.kernel.org/r/20250417150724.1170168-3-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-19 15:31:31 +02:00
Qiuxu Zhuo	a13e8343ff	EDAC/skx_common: Fix general protection fault [ Upstream commit `20d2d476b3` ] After loading i10nm_edac (which automatically loads skx_edac_common), if unload only i10nm_edac, then reload it and perform error injection testing, a general protection fault may occur: mce: [Hardware Error]: Machine check events logged Oops: general protection fault ... ... Workqueue: events mce_gen_pool_process RIP: 0010:string+0x53/0xe0 ... Call Trace: <TASK> ? die_addr+0x37/0x90 ? exc_general_protection+0x1e7/0x3f0 ? asm_exc_general_protection+0x26/0x30 ? string+0x53/0xe0 vsnprintf+0x23e/0x4c0 snprintf+0x4d/0x70 skx_adxl_decode+0x16a/0x330 [skx_edac_common] skx_mce_check_error.part.0+0xf8/0x220 [skx_edac_common] skx_mce_check_error+0x17/0x20 [skx_edac_common] ... The issue arose was because the variable 'adxl_component_count' (inside skx_edac_common), which counts the ADXL components, was not reset. During the reloading of i10nm_edac, the count was incremented by the actual number of ADXL components again, resulting in a count that was double the real number of ADXL components. This led to an out-of-bounds reference to the ADXL component array, causing the general protection fault above. Fix this issue by resetting the 'adxl_component_count' in adxl_put(), which is called during the unloading of {skx,i10nm}_edac. Fixes: `123b158635` ("EDAC, i10nm: make skx_common.o a separate module") Reported-by: Feng Xu <feng.f.xu@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Feng Xu <feng.f.xu@intel.com> Link: https://lore.kernel.org/r/20250417150724.1170168-2-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-06-19 15:31:31 +02:00
Arnd Bergmann	7d0c92af8d	EDAC/ie31200: work around false positive build warning [ Upstream commit `c29dfd661f` ] gcc-14 produces a bogus warning in some configurations: drivers/edac/ie31200_edac.c: In function 'ie31200_probe1.isra': drivers/edac/ie31200_edac.c:412:26: error: 'dimm_info' is used uninitialized [-Werror=uninitialized] 412 \| struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL]; \| ^~~~~~~~~ drivers/edac/ie31200_edac.c:412:26: note: 'dimm_info' declared here 412 \| struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL]; \| ^~~~~~~~~ I don't see any way the unintialized access could really happen here, but I can see why the compiler gets confused by the two loops. Instead, rework the two nested loops to only read the addr_decode registers and then keep only one instance of the dimm info structure. [Tony: Qiuxu pointed out that the "populate DIMM info" comment was left behind in the refactor and suggested moving it. I deleted the comment as unnecessry in front os a call to populate_dimm_info(). That seems pretty self-describing.] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/all/20250122065031.1321015-1-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-29 11:02:43 +02:00
Niravkumar L Rabara	833ef30f01	EDAC/altera: Set DDR and SDMMC interrupt mask before registration commit `6dbe3c5418` upstream. Mask DDR and SDMMC in probe function to avoid spurious interrupts before registration. Removed invalid register write to system manager. Fixes: `1166fde93d` ("EDAC, altera: Add Arria10 ECC memory init functions") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-3-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-05-09 09:50:30 +02:00
Niravkumar L Rabara	349dac4052	EDAC/altera: Test the correct error reg offset commit `4fb7b8fceb` upstream. Test correct structure member, ecc_cecnt_offset, before using it. [ bp: Massage commit message. ] Fixes: `73bcc942f4` ("EDAC, altera: Add Arria10 EDAC support") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-2-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-05-09 09:50:30 +02:00
Qiuxu Zhuo	385a026529	EDAC/ie31200: Fix the error path order of ie31200_init() [ Upstream commit `231e341036` ] The error path order of ie31200_init() is incorrect, fix it. Fixes: `709ed1bcef` ("EDAC/ie31200: Fallback if host bridge device is already initialized") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-4-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:11 +02:00
Qiuxu Zhuo	4294e94f43	EDAC/ie31200: Fix the DIMM size mask for several SoCs [ Upstream commit `3427befbbc` ] The DIMM size mask for {Sky, Kaby, Coffee} Lake is not bits{7:0}, but bits{5:0}. Fix it. Fixes: `953dee9bbd` ("EDAC, ie31200_edac: Add Skylake support") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-3-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:11 +02:00
Qiuxu Zhuo	67d079c0f2	EDAC/ie31200: Fix the size of EDAC_MC_LAYER_CHIP_SELECT layer [ Upstream commit `d59d844e31` ] The EDAC_MC_LAYER_CHIP_SELECT layer pertains to the rank, not the DIMM. Fix its size to reflect the number of ranks instead of the number of DIMMs. Also delete the unused macros IE31200_{DIMMS,RANKS}. Fixes: `7ee40b897d` ("ie31200_edac: Introduce the driver") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-2-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:11 +02:00
Qiuxu Zhuo	2c27c9e1d1	EDAC/{skx_common,i10nm}: Fix some missing error reports on Emerald Rapids [ Upstream commit `d9207cf776` ] When doing error injection to some memory DIMMs on certain Intel Emerald Rapids servers, the i10nm_edac missed error reports for some memory DIMMs. Certain BIOS configurations may hide some memory controllers, and the i10nm_edac doesn't enumerate these hidden memory controllers. However, the ADXL decodes memory errors using memory controller physical indices even if there are hidden memory controllers. Therefore, the memory controller physical indices reported by the ADXL may mismatch the logical indices enumerated by the i10nm_edac, resulting in missed error reports for some memory DIMMs. Fix this issue by creating a mapping table from memory controller physical indices (used by the ADXL) to logical indices (used by the i10nm_edac) and using it to convert the physical indices to the logical indices during the error handling process. Fixes: `c545f5e412` ("EDAC/i10nm: Skip the absent memory controllers") Reported-by: Kevin Chang <kevin1.chang@intel.com> Tested-by: Kevin Chang <kevin1.chang@intel.com> Reported-by: Thomas Chen <Thomas.Chen@intel.com> Tested-by: Thomas Chen <Thomas.Chen@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250214002728.6287-1-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:10 +02:00
Komal Bajaj	e28e7d7b20	EDAC/qcom: Correct interrupt enable register configuration commit `c158647c10` upstream. The previous implementation incorrectly configured the cmn_interrupt_2_enable register for interrupt handling. Using cmn_interrupt_2_enable to configure Tag, Data RAM ECC interrupts would lead to issues like double handling of the interrupts (EL1 and EL3) as cmn_interrupt_2_enable is meant to be configured for interrupts which needs to be handled by EL3. EL1 LLCC EDAC driver needs to use cmn_interrupt_0_enable register to configure Tag, Data RAM ECC interrupts instead of cmn_interrupt_2_enable. Fixes: `27450653f1` ("drivers: edac: Add EDAC driver support for QCOM SoCs") Signed-off-by: Komal Bajaj <quic_kbajaj@quicinc.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20241119064608.12326-1-quic_kbajaj@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-27 04:30:23 -08:00
Borislav Petkov (AMD)	06e2132058	EDAC/amd64: Simplify ECC check on unified memory controllers commit `747367340c` upstream. The intent of the check is to see whether at least one UMC has ECC enabled. So do that instead of tracking which ones are enabled in masks which are too small in size anyway and lead to not loading the driver on Zen4 machines with UMCs enabled over UMC8. Fixes: `e2be5955a8` ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh") Reported-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Avadhut Naik <avadhut.naik@amd.com> Reviewed-by: Avadhut Naik <avadhut.naik@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20241210212054.3895697-1-avadhut.naik@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-27 14:02:06 +01:00
Orange Kao	db60326f2c	EDAC/igen6: Avoid segmentation fault on module unload [ Upstream commit `fefaae9039` ] The segmentation fault happens because: During modprobe: 1. In igen6_probe(), igen6_pvt will be allocated with kzalloc() 2. In igen6_register_mci(), mci->pvt_info will point to &igen6_pvt->imc[mc] During rmmod: 1. In mci_release() in edac_mc.c, it will kfree(mci->pvt_info) 2. In igen6_remove(), it will kfree(igen6_pvt); Fix this issue by setting mci->pvt_info to NULL to avoid the double kfree. Fixes: `10590a9d4f` ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219360 Signed-off-by: Orange Kao <orange@aiven.io> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20241104124237.124109-2-orange@aiven.io Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-05 14:01:18 +01:00
Qiuxu Zhuo	a69ef9c0d4	EDAC/{skx_common,i10nm}: Fix incorrect far-memory error source indicator [ Upstream commit `a36667037a` ] The Granite Rapids CPUs with Flat2LM memory configurations may mistakenly report near-memory errors as far-memory errors, resulting in the invalid decoded ADXL results: EDAC skx: Bad imc -1 Fix this incorrect far-memory error source indicator by prefetching the decoded far-memory controller ID, and adjust the error source indicator to near-memory if the far-memory controller ID is invalid. Fixes: `ba987eaaab` ("EDAC/i10nm: Add Intel Granite Rapids server support") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com> Link: https://lore.kernel.org/r/20241015072236.24543-3-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-05 14:01:17 +01:00
Qiuxu Zhuo	b51babf88c	EDAC/skx_common: Differentiate memory error sources [ Upstream commit `2397f79573` ] The current skx_common determines whether the memory error source is the near memory of the 2LM system and then retrieves the decoded error results from the ADXL components (near-memory vs. far-memory) accordingly. However, some memory controllers may have limitations in correctly reporting the memory error source, leading to the retrieval of incorrect decoded parts from the ADXL. To address these limitations, instead of simply determining whether the memory error is from the near memory of the 2LM system, it is necessary to distinguish the memory error source details as follows: Memory error from the near memory of the 2LM system. Memory error from the far memory of the 2LM system. Memory error from the 1LM system. Not a memory error. This will enable the i10nm_edac driver to take appropriate actions for those memory controllers that have limitations in reporting the memory error source. Fixes: `ba987eaaab` ("EDAC/i10nm: Add Intel Granite Rapids server support") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com> Link: https://lore.kernel.org/r/20241015072236.24543-2-qiuxu.zhuo@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-05 14:01:17 +01:00
Priyanka Singh	1324bdc892	EDAC/fsl_ddr: Fix bad bit shift operations [ Upstream commit `9ec22ac4fe` ] Fix undefined behavior caused by left-shifting a negative value in the expression: cap_high ^ (1 << (bad_data_bit - 32)) The variable bad_data_bit ranges from 0 to 63. When it is less than 32, bad_data_bit - 32 becomes negative, and left-shifting by a negative value in C is undefined behavior. Fix this by combining cap_high and cap_low into a 64-bit variable. [ bp: Massage commit message, simplify error bits handling. ] Fixes: `ea2eb9a8b6` ("EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx") Signed-off-by: Priyanka Singh <priyanka.singh@nxp.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20241016-imx95_edac-v3-3-86ae6fc2756a@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-05 14:01:17 +01:00
David Thompson	000930193f	EDAC/bluefield: Fix potential integer overflow [ Upstream commit `1fe774a93b` ] The 64-bit argument for the "get DIMM info" SMC call consists of mem_ctrl_idx left-shifted 16 bits and OR-ed with DIMM index. With mem_ctrl_idx defined as 32-bits wide the left-shift operation truncates the upper 16 bits of information during the calculation of the SMC argument. The mem_ctrl_idx stack variable must be defined as 64-bits wide to prevent any potential integer overflow, i.e. loss of data from upper 16 bits. Fixes: `82413e562e` ("EDAC, mellanox: Add ECC support for BlueField DDR4") Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shravan Kumar Ramani <shravankr@nvidia.com> Link: https://lore.kernel.org/r/20240930151056.10158-1-davthompson@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-12-05 14:01:16 +01:00
Rajendra Nayak	0a97195d21	EDAC/qcom: Make irq configuration optional On most modern qualcomm SoCs, the configuration necessary to enable the Tag/Data RAM related irqs being propagated to the SoC irq controller is already done in firmware (in DSF or 'DDR System Firmware') On some like the x1e80100, these registers aren't even accesible to the kernel causing a crash when edac device is probed. Hence, make the irq configuration optional in the driver and mark x1e80100 as the SoC on which this should be avoided. Fixes: `af16b00578` ("arm64: dts: qcom: Add base X1E80100 dtsi and the QCP dts") Reported-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Rajendra Nayak <quic_rjendra@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20240903101510.3452734-1-quic_rjendra@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-10-05 22:17:08 -05:00
Linus Torvalds	7dfc15c473	Merge tag 'edac_updates_for_v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Drop a now obsolete ppc4xx_edac driver - Fix conversion to physical memory addresses on Intel's Elkhart Lake and Ice Lake hardware when the system address is above the (Top-Of-Memory) TOM address - Pay attention to the memory hole on Zynq UltraScale+ MPSoC DDR controllers when injecting errors for testing purposes - Add support for translating normalized error addresses reported by an AMD memory controller into system physical addresses using an UEFI mechanism called platform runtime mechanism (PRM). - The usual cleanups and fixes * tag 'edac_updates_for_v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Drop obsolete PPC4xx driver EDAC/sb_edac: Fix the compile warning of large frame size EDAC/{skx_common,i10nm}: Remove the AMAP register for determing DDR5 EDAC/{skx_common,skx,i10nm}: Move the common debug code to skx_common EDAC/igen6: Fix conversion of system address to physical memory address EDAC/synopsys: Fix error injection on Zynq UltraScale+ RAS/AMD/ATL: Translate normalized to system physical addresses using PRM ACPI: PRM: Add PRM handler direct call support	2024-09-16 06:36:37 +02:00
Borislav Petkov (AMD)	92f8358bce	Merge remote-tracking branches 'ras/edac-amd-atl', 'ras/edac-misc' and 'ras/edac-drivers' into edac-updates * ras/edac-amd-atl: RAS/AMD/ATL: Translate normalized to system physical addresses using PRM ACPI: PRM: Add PRM handler direct call support * ras/edac-misc: EDAC/synopsys: Fix error injection on Zynq UltraScale+ * ras/edac-drivers: EDAC: Drop obsolete PPC4xx driver EDAC/sb_edac: Fix the compile warning of large frame size EDAC/{skx_common,i10nm}: Remove the AMAP register for determing DDR5 EDAC/{skx_common,skx,i10nm}: Move the common debug code to skx_common EDAC/igen6: Fix conversion of system address to physical memory address Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>	2024-09-09 10:51:30 +02:00
Rob Herring (Arm)	a5f285d9cf	EDAC: Drop obsolete PPC4xx driver Since `47d13a269b` ("powerpc/40x: Remove 40x platforms.") support for PPC40x platforms has been removed. While the EDAC driver also mentions PPC440 and PPC460 processors, the driver refuses to probe on anything other than PPC405. It's unlikely support will ever be added at this point for these other old platforms, so the driver can be removed. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Link: https://lore.kernel.org/r/20240904192224.3060307-2-robh@kernel.org	2024-09-05 16:56:38 +02:00
Qiuxu Zhuo	43247abd09	EDAC/sb_edac: Fix the compile warning of large frame size Compiling sb_edac driver with GCC 11.4.0 and the W=1 option reported the following warning: drivers/edac/sb_edac.c: In function ‘sbridge_mce_output_error’: drivers/edac/sb_edac.c:3249:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=] As there is no concurrent invocation of sbridge_mce_output_error(), fix this warning by moving the large-size variables 'msg' and 'msg_full' from the stack to the pre-allocated data segment. [Tony: Fix checkpatch warnings for code alignment & use of strcpy()] Reported-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/all/20240829120903.84152-1-qiuxu.zhuo@intel.com	2024-09-03 15:09:22 -07:00
Qiuxu Zhuo	7a33c144c2	EDAC/{skx_common,i10nm}: Remove the AMAP register for determing DDR5 The configuration flag 'res_config->support_ddr5 = true' sufficiently indicates DDR5 memory support for Sapphire Rapids and Granite Rapids. Additionally, the i10nm_edac driver doesn't need to use the AMAP register for setting the 'fine_grain_bank' of each DIMM. Therefore, remove the AMAP register for determining DDR5. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/all/20240829061309.57738-1-qiuxu.zhuo@intel.com	2024-09-03 12:36:59 -07:00
Qiuxu Zhuo	8b93582353	EDAC/{skx_common,skx,i10nm}: Move the common debug code to skx_common Commit afdb82fd763c ("EDAC, i10nm: make skx_common.o a separate module") made skx_common.o a separate module. With skx_common.o now a separate module, move the common debug code setup_{skx,i10nm}_debug() and teardown_{skx,i10nm}_debug() in {skx,i10nm}_base.c to skx_common.c to reduce code duplication. Additionally, prefix these function names with 'skx' to maintain consistency with other names in the file. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/all/20240829055101.56245-1-qiuxu.zhuo@intel.com	2024-09-03 12:35:06 -07:00
Qiuxu Zhuo	0ad875f442	EDAC/igen6: Fix conversion of system address to physical memory address The conversion of system address to physical memory address (as viewed by the memory controller) by igen6_edac is incorrect when the system address is above the TOM (Total amount Of populated physical Memory) for Elkhart Lake and Ice Lake (Neural Network Processor). Fix this conversion. Fixes: `10590a9d4f` ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/stable/20240814061011.43545-1-qiuxu.zhuo%40intel.com	2024-09-03 12:27:19 -07:00
Shubhrajyoti Datta	35e6dbfe18	EDAC/synopsys: Fix error injection on Zynq UltraScale+ The Zynq UltraScale+ MPSoC DDR has a disjoint memory from 2GB to 32GB. The DDR host interface has a contiguous memory so while injecting errors, the driver should remove the hole else the injection fails as the address translation is incorrect. Introduce a get_mem_info() function pointer and set it for Zynq UltraScale+ platform to return host address. Fixes: `1a81361f75` ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller") Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240711100656.31376-1-shubhrajyoti.datta@amd.com	2024-08-01 16:27:46 +02:00
Linus Torvalds	1a251f52cf	minmax: make generic MIN() and MAX() macros available everywhere This just standardizes the use of MIN() and MAX() macros, with the very traditional semantics. The goal is to use these for C constant expressions and for top-level / static initializers, and so be able to simplify the min()/max() macros. These macro names were used by various kernel code - they are very traditional, after all - and all such users have been fixed up, with a few different approaches: - trivial duplicated macro definitions have been removed Note that 'trivial' here means that it's obviously kernel code that already included all the major kernel headers, and thus gets the new generic MIN/MAX macros automatically. - non-trivial duplicated macro definitions are guarded with #ifndef This is the "yes, they define their own versions, but no, the include situation is not entirely obvious, and maybe they don't get the generic version automatically" case. - strange use case #1 A couple of drivers decided that the way they want to describe their versioning is with #define MAJ 1 #define MIN 2 #define DRV_VERSION __stringify(MAJ) "." __stringify(MIN) which adds zero value and I just did my Alexander the Great impersonation, and rewrote that pointless Gordian knot as #define DRV_VERSION "1.2" instead. - strange use case #2 A couple of drivers thought that it's a good idea to have a random 'MIN' or 'MAX' define for a value or index into a table, rather than the traditional macro that takes arguments. These values were re-written as C enum's instead. The new function-line macros only expand when followed by an open parenthesis, and thus don't clash with enum use. Happily, there weren't really all that many of these cases, and a lot of users already had the pattern of using '#ifndef' guarding (or in one case just using '#undef MIN') before defining their own private version that does the same thing. I left such cases alone. Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-07-28 15:49:18 -07:00
Linus Torvalds	4477b39c32	minmax: add a few more MIN_T/MAX_T users Commit `3a7e02c040` ("minmax: avoid overly complicated constant expressions in VM code") added the simpler MIN_T/MAX_T macros in order to avoid some excessive expansion from the rather complicated regular min/max macros. The complexity of those macros stems from two issues: (a) trying to use them in situations that require a C constant expression (in static initializers and for array sizes) (b) the type sanity checking and MIN_T/MAX_T avoids both of these issues. Now, in the whole (long) discussion about all this, it was pointed out that the whole type sanity checking is entirely unnecessary for min_t/max_t which get a fixed type that the comparison is done in. But that still leaves min_t/max_t unnecessarily complicated due to worries about the C constant expression case. However, it turns out that there really aren't very many cases that use min_t/max_t for this, and we can just force-convert those. This does exactly that. Which in turn will then allow for much simpler implementations of min_t()/max_t(). All the usual "macros in all upper case will evaluate the arguments multiple times" rules apply. We should do all the same things for the regular min/max() vs MIN/MAX() cases, but that has the added complexity of various drivers defining their own local versions of MIN/MAX, so that needs another level of fixes first. Link: https://lore.kernel.org/all/b47fad1d0cf8449886ad148f8c013dae@AcuMS.aculab.com/ Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-07-28 13:41:14 -07:00
Linus Torvalds	222dfb8326	Merge tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - Make error checking of AMD SMN accesses more robust in the callers as they're the only ones who can interpret the results properly - The usual cleanups and fixes, left and right * tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kmsan: Fix hook for unaligned accesses x86/platform/iosf_mbi: Convert PCIBIOS_* return codes to errnos x86/pci/xen: Fix PCIBIOS_* return code handling x86/pci/intel_mid_pci: Fix PCIBIOS_* return code handling x86/of: Return consistent error type from x86_of_pci_irq_enable() hwmon: (k10temp) Rename _data variable hwmon: (k10temp) Remove unused HAVE_TDIE() macro hwmon: (k10temp) Reduce k10temp_get_ccd_support() parameters hwmon: (k10temp) Define a helper function to read CCD temperature x86/amd_nb: Enhance SMN access error checking hwmon: (k10temp) Check return value of amd_smn_read() EDAC/amd64: Check return value of amd_smn_read() EDAC/amd64: Remove unused register accesses tools/x86/kcpuid: Add missing dir via Makefile x86, arm: Add missing license tag to syscall tables files	2024-07-15 19:53:07 -07:00
Linus Torvalds	8028e290b6	Merge tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - The AMD memory controllers data fabric version 4.5 supports non-power-of-2 denormalization in the sense that certain bits of the system physical address cannot be reconstructed from the normalized address reported by the RAS hardware. Add support for handling such addresses - Switch the EDAC drivers to the new Intel CPU model defines - The usual fixes and cleanups all over the place * tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Add missing MODULE_DESCRIPTION() macros EDAC/dmc520: Use devm_platform_ioremap_resource() EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization RAS/AMD/ATL: Validate address map when information is gathered RAS/AMD/ATL: Expand helpers for adding and removing base and hole RAS/AMD/ATL: Read DRAM hole base early RAS/AMD/ATL: Add amd_atl pr_fmt() prefix RAS/AMD/ATL: Add a missing module description EDAC, i10nm: make skx_common.o a separate module EDAC/skx: Switch to new Intel CPU model defines EDAC/sb_edac: Switch to new Intel CPU model defines EDAC, pnd2: Switch to new Intel CPU model defines EDAC/i10nm: Switch to new Intel CPU model defines EDAC/ghes: Add missing newline to pr_info() statement RAS/AMD/ATL: Add missing newline to pr_info() statement EDAC/thunderx: Remove unused struct error_syndrome	2024-07-15 18:20:24 -07:00
Jeff Johnson	3afa157f43	EDAC: Add missing MODULE_DESCRIPTION() macros With ARCH=arm64 make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/edac/layerscape_edac_mod.o Add the missing invocation of the MODULE_DESCRIPTION() macro to all files which have a MODULE_LICENSE(). This includes mpc85xx_edac.c and four octeon_edac-*.c files which, although they did not produce a warning with the arm64 allmodconfig configuration, may cause this warning with other configurations. [ bp: s/module/driver/ for layerscape_edac ] Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240617-md-arm64-drivers-edac-v2-1-6d6c5dd1e5da@quicinc.com	2024-06-29 16:21:01 +02:00
Jai Arora	420c324d59	EDAC/dmc520: Use devm_platform_ioremap_resource() platform_get_resource() and devm_ioremap_resource() are wrapped up in the devm_platform_ioremap_resource() helper. Use the helper and get rid of the local variable for struct resource *. Signed-off-by: Jai Arora <jai.arora@samsung.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240618110226.97395-1-jai.arora@samsung.com	2024-06-23 10:48:55 +02:00
Qiuxu Zhuo	88150cd950	EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support Arrow Lake-U/H SoCs share same IBECC registers with Meteor Lake-P SoCs. Add Arrow Lake-U/H SoC compute die IDs for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240614030354.69180-1-qiuxu.zhuo@intel.com	2024-06-14 08:08:12 -07:00
Yazen Ghannam	5ac6293047	EDAC/amd64: Check return value of amd_smn_read() Check the return value of amd_smn_read() before saving a value. This ensures invalid values aren't saved. The struct umc instance is initialized to 0 during memory allocation. Therefore, a bad read will keep the value as 0 providing the expected Read-as-Zero behavior. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240606-fix-smn-bad-read-v4-2-ffde21931c3f@amd.com	2024-06-12 11:33:45 +02:00
Yazen Ghannam	f97a8b9170	EDAC/amd64: Remove unused register accesses A number of UMC registers are read only for the purpose of debug printing. They are not used in any calculations. Nor do they have any specific debug value. Remove them. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240606-fix-smn-bad-read-v4-1-ffde21931c3f@amd.com	2024-06-12 11:33:45 +02:00
Ilpo Järvinen	f8367a74ae	EDAC/igen6: Convert PCIBIOS_* return codes to errnos errcmd_enable_error_reporting() uses pci_{read,write}_config_word() that return PCIBIOS_* codes. The return code is then returned all the way into the probe function igen6_probe() that returns it as is. The probe functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from errcmd_enable_error_reporting(). Fixes: `10590a9d4f` ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-2-ilpo.jarvinen@linux.intel.com	2024-06-04 11:29:52 +02:00
Ilpo Järvinen	3ec8ebd8a5	EDAC/amd64: Convert PCIBIOS_* return codes to errnos gpu_get_node_map() uses pci_read_config_dword() that returns PCIBIOS_* codes. The return code is then returned all the way into the module init function amd64_edac_init() that returns it as is. The module init functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from gpu_get_node_map(). For consistency, convert also the other similar cases which return PCIBIOS_* codes even if they do not have any bugs at the moment. Fixes: `4251566ebc` ("EDAC/amd64: Cache and use GPU node map") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-1-ilpo.jarvinen@linux.intel.com	2024-06-04 11:24:16 +02:00

1 2 3 4 5 ...

2037 Commits