linux-stable-mirror

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2026-04-03 12:05:13 +02:00

Author	SHA1	Message	Date
Masami Hiramatsu (Google)	a142bbc11f	tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs [ Upstream commit `b9b55c8912` ] Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs. This is for the eBPF which needs this to keep the same pt_regs interface to access registers. Thus when replacing the pt_regs with ftrace_regs in fprobes (which is used by kprobe_multi eBPF event), this will be used. If the architecture defines its own ftrace_regs, this copies partial registers to pt_regs and returns it. If not, ftrace_regs is the same as pt_regs and ftrace_partial_regs() will return ftrace_regs::regs. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Florent Revest <revest@chromium.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Link: https://lore.kernel.org/173518996761.391279.4987911298206448122.stgit@devnote2 Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `aea2517999` ("x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path") Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:37 -05:00
Masami Hiramatsu (Google)	33d4e904e2	fgraph: Replace fgraph_ret_regs with ftrace_regs [ Upstream commit `a3ed4157b7` ] Use ftrace_regs instead of fgraph_ret_regs for tracing return value on function_graph tracer because of simplifying the callback interface. The CONFIG_HAVE_FUNCTION_GRAPH_RETVAL is also replaced by CONFIG_HAVE_FUNCTION_GRAPH_FREGS. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Florent Revest <revest@chromium.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/173518991508.391279.16635322774382197642.stgit@devnote2 Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `aea2517999` ("x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path") Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:36 -05:00
Steven Rostedt	a701884d13	ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs [ Upstream commit `e4cf33ca48` ] Most architectures use pt_regs within ftrace_regs making a lot of the accessor functions just calls to the pt_regs internally. Instead of duplication this effort, use a HAVE_ARCH_FTRACE_REGS for architectures that have their own ftrace_regs that is not based on pt_regs and will define all the accessor functions, and for the architectures that just use pt_regs, it will leave it undefined, and the default accessor functions will be used. Note, this will also make it easier to add new accessor functions to ftrace_regs as it will mean having to touch less architectures. Cc: <linux-arch@vger.kernel.org> Cc: "x86@kernel.org" <x86@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Naveen N Rao <naveen@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/20241010202114.2289f6fd@gandalf.local.home Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # powerpc Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `aea2517999` ("x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path") Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:36 -05:00
Steven Rostedt	550a16d87d	ftrace: Make ftrace_regs abstract from direct use [ Upstream commit `7888af4166` ] ftrace_regs was created to hold registers that store information to save function parameters, return value and stack. Since it is a subset of pt_regs, it should only be used by its accessor functions. But because pt_regs can easily be taken from ftrace_regs (on most archs), it is tempting to use it directly. But when running on other architectures, it may fail to build or worse, build but crash the kernel! Instead, make struct ftrace_regs an empty structure and have the architectures define __arch_ftrace_regs and all the accessor functions will typecast to it to get to the actual fields. This will help avoid usage of ftrace_regs directly. Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/ Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org> Cc: "x86@kernel.org" <x86@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Naveen N Rao <naveen@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `aea2517999` ("x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path") Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-03-04 07:19:36 -05:00
Han Gao	c3268c8022	riscv: compat: fix COMPAT_UTS_MACHINE definition commit `0ea05c4f75` upstream. The COMPAT_UTS_MACHINE for riscv was incorrectly defined as "riscv". Change it to "riscv32" to reflect the correct 32-bit compat name. Fixes: `06d0e37236` ("riscv: compat: Add basic compat data type implementation") Cc: stable@vger.kernel.org Signed-off-by: Han Gao <gaohan@iscas.ac.cn> Reviewed-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org> Link: https://patch.msgid.link/20260127190711.2264664-1-gaohan@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2026-02-06 16:55:44 +01:00
Guo Ren (Alibaba DAMO Academy)	92ff65c660	riscv: pgtable: Cleanup useless VA_USER_XXX definitions [ Upstream commit `5e5be092ff` ] These marcos are not used after commit `b5b4287acc` ("riscv: mm: Use hint address in mmap if available"). Cleanup VA_USER_XXX definitions in asm/pgtable.h. Fixes: `b5b4287acc` ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2026-01-17 16:31:24 +01:00
Anup Patel	0e0e4f3220	RISC-V: Define pgprot_dmacoherent() for non-coherent devices [ Upstream commit `ca525d53f9` ] The pgprot_dmacoherent() is used when allocating memory for non-coherent devices and by default pgprot_dmacoherent() is same as pgprot_noncached() unless architecture overrides it. Currently, there is no pgprot_dmacoherent() definition for RISC-V hence non-coherent device memory is being mapped as IO thereby making CPU access to such memory slow. Define pgprot_dmacoherent() to be same as pgprot_writecombine() for RISC-V so that CPU access non-coherent device memory as NOCACHE which is better than accessing it as IO. Fixes: `ff689fd21c` ("riscv: add RISC-V Svpbmt extension support") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Tested-by: Han Gao <rabenda.cn@gmail.com> Tested-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Link: https://lore.kernel.org/r/20250820152316.1012757-1-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-10-29 14:08:58 +01:00
Palmer Dabbelt	cfcde627f0	RISC-V: Remove unnecessary include from compat.h [ Upstream commit `8d4f1e05ff` ] Without this I get a bunch of build errors like In file included from ./include/linux/sched/task_stack.h:12, from ./arch/riscv/include/asm/compat.h:12, from ./arch/riscv/include/asm/pgtable.h:115, from ./include/linux/pgtable.h:6, from ./include/linux/mm.h:30, from arch/riscv/kernel/asm-offsets.c:8: ./include/linux/kasan.h:50:37: error: ‘MAX_PTRS_PER_PTE’ undeclared here (not in a function); did you mean ‘PTRS_PER_PTE’? 50 \| extern pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS]; \| ^~~~~~~~~~~~~~~~ \| PTRS_PER_PTE ./include/linux/kasan.h:51:8: error: unknown type name ‘pmd_t’; did you mean ‘pgd_t’? 51 \| extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD]; \| ^~~~~ \| pgd_t ./include/linux/kasan.h:51:37: error: ‘MAX_PTRS_PER_PMD’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 51 \| extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD]; \| ^~~~~~~~~~~~~~~~ \| PTRS_PER_PGD ./include/linux/kasan.h:52:8: error: unknown type name ‘pud_t’; did you mean ‘pgd_t’? 52 \| extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD]; \| ^~~~~ \| pgd_t ./include/linux/kasan.h:52:37: error: ‘MAX_PTRS_PER_PUD’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 52 \| extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD]; \| ^~~~~~~~~~~~~~~~ \| PTRS_PER_PGD ./include/linux/kasan.h:53:8: error: unknown type name ‘p4d_t’; did you mean ‘pgd_t’? 53 \| extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D]; \| ^~~~~ \| pgd_t ./include/linux/kasan.h:53:37: error: ‘MAX_PTRS_PER_P4D’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 53 \| extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D]; \| ^~~~~~~~~~~~~~~~ \| PTRS_PER_PGD Link: https://lore.kernel.org/r/20241126143250.29708-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-09-19 16:35:50 +02:00
Radim Krčmář	fecd903917	riscv: use lw when reading int cpu in asm_per_cpu commit `f4ea67a722` upstream. REG_L is wrong, because thread_info.cpu is 32-bit, not xlen-bit wide. The struct currently has a hole after cpu, so little endian accesses seemed fine. Fixes: `be97d0db5f` ("riscv: VMAP_STACK overflow detection thread-safe") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com> Link: https://lore.kernel.org/r/20250725165410.2896641-5-rkrcmar@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-09-09 18:58:26 +02:00
Sasha Levin	1fc00e1451	riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg [ Upstream commit `1898300abf` ] Sign extend also an unsigned compare value to match what lr.w is doing. Otherwise try_cmpxchg may spuriously return true when used on a u32 value that has the sign bit set, as it happens often in inode_set_ctime_current. Do this in three conversion steps. The first conversion to long is needed to avoid a -Wpointer-to-int-cast warning when arch_cmpxchg is used with a pointer type. Then convert to int and back to long to always sign extend the 32-bit value to 64-bit. Fixes: `6c58f25e69` ("riscv/atomic: Fix sign extension for RV64I") Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Xi Ruoyao <xry111@xry111.site> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/mvmed0k4prh.fsf@suse.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-06 11:01:49 +02:00
Nam Cao	fe30c30bf3	Revert "riscv: Define TASK_SIZE_MAX for __access_ok()" commit `890ba5be63` upstream. This reverts commit `ad5643cf2f` ("riscv: Define TASK_SIZE_MAX for __access_ok()"). This commit changes TASK_SIZE_MAX to be LONG_MAX to optimize access_ok(), because the previous TASK_SIZE_MAX (default to TASK_SIZE) requires some computation. The reasoning was that all user addresses are less than LONG_MAX, and all kernel addresses are greater than LONG_MAX. Therefore access_ok() can filter kernel addresses. Addresses between TASK_SIZE and LONG_MAX are not valid user addresses, but access_ok() let them pass. That was thought to be okay, because they are not valid addresses at hardware level. Unfortunately, one case is missed: get_user_pages_fast() happily accepts addresses between TASK_SIZE and LONG_MAX. futex(), for instance, uses get_user_pages_fast(). This causes the problem reported by Robert [1]. Therefore, revert this commit. TASK_SIZE_MAX is changed to the default: TASK_SIZE. This unfortunately reduces performance, because TASK_SIZE is more expensive to compute compared to LONG_MAX. But correctness first, we can think about optimization later, if required. Reported-by: <rtm@csail.mit.edu> Closes: https://lore.kernel.org/linux-riscv/77605.1750245028@localhost/ Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: `ad5643cf2f` ("riscv: Define TASK_SIZE_MAX for __access_ok()") Link: https://lore.kernel.org/r/20250619155858.1249789-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-06 11:01:39 +02:00
Samuel Holland	6427b5c0f0	riscv: Allow NOMMU kernels to access all of RAM [ Upstream commit `2c0391b29b` ] NOMMU kernels currently cannot access memory below the kernel link address. Remove this restriction by setting PAGE_OFFSET to the actual start of RAM, as determined from the devicetree. The kernel link address must be a constant, so keep using CONFIG_PAGE_OFFSET for that purpose. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Jesse Taube <mr.bossman075@gmail.com> Link: https://lore.kernel.org/r/20241026171441.3047904-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-29 11:02:05 +02:00
Andrew Jones	c9ffbc0792	riscv: Provide all alternative macros all the time [ Upstream commit `fb53a9aa5f` ] We need to provide all six forms of the alternative macros (ALTERNATIVE, ALTERNATIVE_2, _ALTERNATIVE_CFG, _ALTERNATIVE_CFG_2, __ALTERNATIVE_CFG, __ALTERNATIVE_CFG_2) for all four cases derived from the two ifdefs (RISCV_ALTERNATIVE, __ASSEMBLY__) in order to ensure all configs can compile. Define this missing ones and ensure all are defined to consume all parameters passed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504130710.3IKz6Ibs-lkp@intel.com/ Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250414120947.135173-2-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-02 07:59:27 +02:00
Björn Töpel	4715ab8435	riscv: Replace function-like macro by static inline function [ Upstream commit `121f34341d` ] The flush_icache_range() function is implemented as a "function-like macro with unused parameters", which can result in "unused variables" warnings. Replace the macro with a static inline function, as advised by Documentation/process/coding-style.rst. Fixes: `08f051eda3` ("RISC-V: Flush I$ when making a dirty page executable") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250419111402.1660267-1-bjorn@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-05-02 07:59:04 +02:00
Nathan Chancellor	3f1c81ae13	riscv: Avoid fortify warning in syscall_get_arguments() commit `adf53771a3` upstream. When building with CONFIG_FORTIFY_SOURCE=y and W=1, there is a warning because of the memcpy() in syscall_get_arguments(): In file included from include/linux/string.h:392, from include/linux/bitmap.h:13, from include/linux/cpumask.h:12, from arch/riscv/include/asm/processor.h:55, from include/linux/sched.h:13, from kernel/ptrace.c:13: In function 'fortify_memcpy_chk', inlined from 'syscall_get_arguments.isra' at arch/riscv/include/asm/syscall.h:66:2: include/linux/fortify-string.h:580:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 580 \| __read_overflow2_field(q_size_field, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors The fortified memcpy() routine enforces that the source is not overread and the destination is not overwritten if the size of either field and the size of the copy are known at compile time. The memcpy() in syscall_get_arguments() intentionally overreads from a1 to a5 in 'struct pt_regs' but this is bigger than the size of a1. Normally, this could be solved by wrapping a1 through a5 with struct_group() but there was already a struct_group() applied to these members in commit `bba547810c` ("riscv: tracing: Fix __write_overflow_field in ftrace_partial_regs()"). Just avoid memcpy() altogether and write the copying of args from regs manually, which clears up the warning at the expense of three extra lines of code. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Dmitry V. Levin <ldv@strace.io> Fixes: `e2c0cdfba7` ("RISC-V: User-facing API") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250409-riscv-avoid-fortify-warning-syscall_get_arguments-v1-1-7853436d4755@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-04-25 10:47:54 +02:00
WangYuli	999bd7bb21	riscv: KGDB: Do not inline arch_kgdb_breakpoint() [ Upstream commit `3af4bec9c1` ] The arch_kgdb_breakpoint() function defines the kgdb_compiled_break symbol using inline assembly. There's a potential issue where the compiler might inline arch_kgdb_breakpoint(), which would then define the kgdb_compiled_break symbol multiple times, leading to fail to link vmlinux.o. This isn't merely a potential compilation problem. The intent here is to determine the global symbol address of kgdb_compiled_break, and if this function is inlined multiple times, it would logically be a grave error. Link: https://lore.kernel.org/all/4b4187c1-77e5-44b7-885f-d6826723dd9a@sifive.com/ Link: https://lore.kernel.org/all/5b0adf9b-2b22-43fe-ab74-68df94115b9a@ghiti.fr/ Link: https://lore.kernel.org/all/23693e7f-4fff-40f3-a437-e06d827278a5@ghiti.fr/ Fixes: `fe89bd2be8` ("riscv: Add KGDB support") Co-developed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://lore.kernel.org/r/F22359AFB6FF9FD8+20250411073222.56820-1-wangyuli@uniontech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-25 10:47:48 +02:00
Juhan Jin	4119e80ce2	riscv: ftrace: Add parentheses in macro definitions of make_call_t0 and make_call_ra [ Upstream commit `5f1a58ed91` ] This patch adds parentheses to parameters caller and callee of macros make_call_t0 and make_call_ra. Every existing invocation of these two macros uses a single variable for each argument, so the absence of the parentheses seems okay. However, future invocations might use more complex expressions as arguments. For example, a future invocation might look like this: make_call_t0(a - b, c, call). Without parentheses in the macro definition, the macro invocation expands to: ... unsigned int offset = (unsigned long) c - (unsigned long) a - b; ... which is clearly wrong. The use of parentheses ensures arguments are correctly evaluated and potentially saves future users of make_call_t0 and make_call_ra debugging trouble. Fixes: `6724a76cff` ("riscv: ftrace: Reduce the detour code size to half") Signed-off-by: Juhan Jin <juhan.jin@foxmail.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/tencent_AE90AA59903A628E87E9F80E563DA5BA5508@qq.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-04-10 14:39:32 +02:00
Ryan Roberts	a684bad77e	mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear() commit `02410ac72a` upstream. In order to fix a bug, arm64 needs to be told the size of the huge page for which the huge_pte is being cleared in huge_ptep_get_and_clear(). Provide for this by adding an `unsigned long sz` parameter to the function. This follows the same pattern as huge_pte_clear() and set_huge_pte_at(). This commit makes the required interface modifications to the core mm as well as all arches that implement this function (arm64, loongarch, mips, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed in a separate commit. Cc: stable@vger.kernel.org Fixes: `66b3923a1a` ("arm64: hugetlb: add support for PTE contiguous bit") Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> # s390 Link: https://lore.kernel.org/r/20250226120656.2400136-2-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-03-13 13:02:17 +01:00
Andreas Schwab	ac354e54dc	riscv/futex: sign extend compare value in atomic cmpxchg commit `599c44cd21` upstream. Make sure the compare value in the lr/sc loop is sign extended to match what lr.w does. Fortunately, due to the compiler keeping the register contents sign extended anyway the lack of the explicit extension didn't result in wrong code so far, but this cannot be relied upon. Fixes: `b90edb3301` ("RISC-V: Add futex support.") Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/mvmfrkv2vhz.fsf@suse.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-03-07 18:25:44 +01:00
Atish Patra	6191b1a474	drivers/perf: riscv: Fix Platform firmware event data [ Upstream commit `fc58db9aeb` ] Platform firmware event data field is allowed to be 62 bits for Linux as uppper most two bits are reserved to indicate SBI fw or platform specific firmware events. However, the event data field is masked as per the hardware raw event mask which is not correct. Fix the platform firmware event data field with proper mask. Fixes: `f0c9363db2` ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-17 13:40:45 +01:00
Xu Lu	d2bd51954a	riscv: mm: Fix the out of bound issue of vmemmap address [ Upstream commit `f754f27e98` ] In sparse vmemmap model, the virtual address of vmemmap is calculated as: ((struct page )VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)). And the struct page's va can be calculated with an offset: (vmemmap + (pfn)). However, when initializing struct pages, kernel actually starts from the first page from the same section that phys_ram_base belongs to. If the first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then we get an va below VMEMMAP_START when calculating va for it's struct page. For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the first page in the same section is actually pfn 0x80000. During init_unavailable_range(), we will initialize struct page for pfn 0x80000 with virtual address ((struct page )VMEMMAP_START - 0x2000), which is below VMEMMAP_START as well as PCI_IO_END. This commit fixes this bug by introducing a new variable 'vmemmap_start_pfn' which is aligned with memory section size and using it to calculate vmemmap address instead of phys_ram_base. Fixes: `a11dd49dcb` ("riscv: Sparse-Memory/vmemmap out-of-bounds fix") Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-01-17 13:40:44 +01:00
Alexandre Ghiti	3abfc4130c	riscv: Fix IPIs usage in kfence_protect_page() commit `b3431a8bb3` upstream. flush_tlb_kernel_range() may use IPIs to flush the TLBs of all the cores, which triggers the following warning when the irqs are disabled: [ 3.455330] WARNING: CPU: 1 PID: 0 at kernel/smp.c:815 smp_call_function_many_cond+0x452/0x520 [ 3.456647] Modules linked in: [ 3.457218] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7-00010-g91d3de7240b8 #1 [ 3.457416] Hardware name: QEMU QEMU Virtual Machine, BIOS [ 3.457633] epc : smp_call_function_many_cond+0x452/0x520 [ 3.457736] ra : on_each_cpu_cond_mask+0x1e/0x30 [ 3.457786] epc : ffffffff800b669a ra : ffffffff800b67c2 sp : ff2000000000bb50 [ 3.457824] gp : ffffffff815212b8 tp : ff6000008014f080 t0 : 000000000000003f [ 3.457859] t1 : ffffffff815221e0 t2 : 000000000000000f s0 : ff2000000000bc10 [ 3.457920] s1 : 0000000000000040 a0 : ffffffff815221e0 a1 : 0000000000000001 [ 3.457953] a2 : 0000000000010000 a3 : 0000000000000003 a4 : 0000000000000000 [ 3.458006] a5 : 0000000000000000 a6 : ffffffffffffffff a7 : 0000000000000000 [ 3.458042] s2 : ffffffff815223be s3 : 00fffffffffff000 s4 : ff600001ffe38fc0 [ 3.458076] s5 : ff600001ff950d00 s6 : 0000000200000120 s7 : 0000000000000001 [ 3.458109] s8 : 0000000000000001 s9 : ff60000080841ef0 s10: 0000000000000001 [ 3.458141] s11: ffffffff81524812 t3 : 0000000000000001 t4 : ff60000080092bc0 [ 3.458172] t5 : 0000000000000000 t6 : ff200000000236d0 [ 3.458203] status: 0000000200000100 badaddr: ffffffff800b669a cause: 0000000000000003 [ 3.458373] [<ffffffff800b669a>] smp_call_function_many_cond+0x452/0x520 [ 3.458593] [<ffffffff800b67c2>] on_each_cpu_cond_mask+0x1e/0x30 [ 3.458625] [<ffffffff8000e4ca>] __flush_tlb_range+0x118/0x1ca [ 3.458656] [<ffffffff8000e6b2>] flush_tlb_kernel_range+0x1e/0x26 [ 3.458683] [<ffffffff801ea56a>] kfence_protect+0xc0/0xce [ 3.458717] [<ffffffff801e9456>] kfence_guarded_free+0xc6/0x1c0 [ 3.458742] [<ffffffff801e9d6c>] __kfence_free+0x62/0xc6 [ 3.458764] [<ffffffff801c57d8>] kfree+0x106/0x32c [ 3.458786] [<ffffffff80588cf2>] detach_buf_split+0x188/0x1a8 [ 3.458816] [<ffffffff8058708c>] virtqueue_get_buf_ctx+0xb6/0x1f6 [ 3.458839] [<ffffffff805871da>] virtqueue_get_buf+0xe/0x16 [ 3.458880] [<ffffffff80613d6a>] virtblk_done+0x5c/0xe2 [ 3.458908] [<ffffffff8058766e>] vring_interrupt+0x6a/0x74 [ 3.458930] [<ffffffff800747d8>] __handle_irq_event_percpu+0x7c/0xe2 [ 3.458956] [<ffffffff800748f0>] handle_irq_event+0x3c/0x86 [ 3.458978] [<ffffffff800786cc>] handle_simple_irq+0x9e/0xbe [ 3.459004] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459027] [<ffffffff804bf87c>] imsic_handle_irq+0xba/0x120 [ 3.459056] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459080] [<ffffffff804bdb76>] riscv_intc_aia_irq+0x24/0x34 [ 3.459103] [<ffffffff809d0452>] handle_riscv_irq+0x2e/0x4c [ 3.459133] [<ffffffff809d923e>] call_on_irq_stack+0x32/0x40 So only flush the local TLB and let the lazy kfence page fault handling deal with the faults which could happen when a core has an old protected pte version cached in its TLB. That leads to potential inaccuracies which can be tolerated when using kfence. Fixes: `47513f243b` ("riscv: Enable KFENCE for riscv64") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241209074125.52322-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-19 18:13:01 +01:00
Jesse Taube	c4ae2baea5	RISC-V: Check scalar unaligned access on all CPUs commit `8d20a739f1` upstream. Originally, the check_unaligned_access_emulated_all_cpus function only checked the boot hart. This fixes the function to check all harts. Fixes: `71c54b3d16` ("riscv: report misaligned accesses emulation to hwprobe") Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Evan Green <evan@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-1-5b33500160f8@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-05 14:02:52 +01:00
Alexandre Ghiti	cfb10de185	riscv: Fix kernel stack size when KASAN is enabled We use Kconfig to select the kernel stack size, doubling the default size if KASAN is enabled. But that actually only works if KASAN is selected from the beginning, meaning that if KASAN config is added later (for example using menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the default size, which is not enough for KASAN as reported in [1]. So fix this by moving the logic to compute the right kernel stack into a header. Fixes: `a7555f6b62` ("riscv: stack: Add config of thread stack size") Reported-by: syzbot+ba9eac24453387a9d502@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000eb301906222aadc2@google.com/ [1] Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240917150328.59831-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-01 13:08:11 -07:00
Linus Torvalds	97d8894b6f	Merge tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support using Zkr to seed KASLR - Support IPI-triggered CPU backtracing - Support for generic CPU vulnerabilities reporting to userspace - A few cleanups for missing licenses - The size limit on the XIP kernel has been removed - Support for tracing userspace stacks - Support for the Svvptc extension - Various cleanups and fixes throughout the tree * tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits) crash: Fix riscv64 crash memory reserve dead loop perf/riscv-sbi: Add platform specific firmware event handling tools: Optimize ring buffer for riscv tools: Add riscv barrier implementation RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE riscv: Enable bitops instrumentation riscv: Omit optimized string routines when using KASAN ACPI: RISCV: Make acpi_numa_get_nid() to be static riscv: Randomize lower bits of stack address selftests: riscv: Allow mmap test to compile on 32-bit riscv: Make riscv_isa_vendor_ext_andes array static riscv: Use LIST_HEAD() to simplify code riscv: defconfig: Disable RZ/Five peripheral support RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup riscv: avoid Imbalance in RAS riscv: cacheinfo: Add back init_cache_level() function riscv: Remove unused _TIF_WORK_MASK drivers/perf: riscv: Remove redundant macro check riscv: define ILLEGAL_POINTER_VALUE for 64bit ...	2024-09-24 10:59:17 -07:00
Linus Torvalds	617a814f14	Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. - "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. - "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ...	2024-09-21 07:29:05 -07:00
Mayuresh Chitale	f0c9363db2	perf/riscv-sbi: Add platform specific firmware event handling The SBI v2.0 specification pointed to by the link below reserves the event code 0xffff for platform specific firmware events. Update the driver to be able to parse and program such events. The platform specific firmware events must now be specified in the perf command as below: perf stat -e rCxxx ... where bits[63:62] = 0x3 of the event config indicate a platform specific firmware event and xxx indicate the actual event code which is passed as the event data. Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v2.0/riscv-sbi.pdf Link: https://lore.kernel.org/r/20240812051109.6496-1-mchitale@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-20 05:58:11 -07:00
Palmer Dabbelt	ad380f6a0a	RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t I recently ended up with a warning on some compilers along the lines of CC kernel/resource.o In file included from include/linux/ioport.h:16, from kernel/resource.c:15: kernel/resource.c: In function 'gfr_start': include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow] 49 \| ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); }) \| ^ include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique' 52 \| __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_)) \| ^~~~~~~~~~~~~~~~~ include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once' 161 \| #define min_t(type, x, y) __cmp_once(min, type, x, y) \| ^~~~~~~~~~ kernel/resource.c:1829:23: note: in expansion of macro 'min_t' 1829 \| end = min_t(resource_size_t, base->end, \| ^~~~~ kernel/resource.c: In function 'gfr_continue': include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow] 49 \| ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); }) \| ^ include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique' 52 \| __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_)) \| ^~~~~~~~~~~~~~~~~ include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once' 161 \| #define min_t(type, x, y) __cmp_once(min, type, x, y) \| ^~~~~~~~~~ kernel/resource.c:1847:24: note: in expansion of macro 'min_t' 1847 \| addr <= min_t(resource_size_t, base->end, \| ^~~~~ cc1: all warnings being treated as errors which looks like a real problem: our phys_addr_t is only 32 bits now, so having 34-bit masks is just going to result in overflows. Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240731162159.9235-2-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-20 01:32:39 -07:00
Palmer Dabbelt	5835437609	Merge patch series "riscv: Improve KASAN coverage to fix unit tests" Samuel Holland <samuel.holland@sifive.com> says: This series fixes two areas where uninstrumented assembly routines caused gaps in KASAN coverage on RISC-V, which were caught by KUnit tests. The KASAN KUnit test suite passes after applying this series. This series fixes the following test failures: # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520 KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524 KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred not ok 60 kasan_strings # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531 KASAN failure expected in "set_bit(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533 KASAN failure expected in "clear_bit(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535 KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536 KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537 KASAN failure expected in "change_bit(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543 KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545 KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546 KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548 KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred not ok 61 kasan_bitops_generic Samuel Holland (2): riscv: Omit optimized string routines when using KASAN riscv: Enable bitops instrumentation arch/riscv/include/asm/bitops.h \| 43 ++++++++++++++++++--------------- arch/riscv/include/asm/string.h \| 2 ++ arch/riscv/kernel/riscv_ksyms.c \| 3 --- arch/riscv/lib/Makefile \| 2 ++ arch/riscv/lib/strcmp.S \| 1 + arch/riscv/lib/strlen.S \| 1 + arch/riscv/lib/strncmp.S \| 1 + arch/riscv/purgatory/Makefile \| 2 ++ 8 files changed, 32 insertions(+), 23 deletions(-) * b4-shazam-merge: riscv: Enable bitops instrumentation riscv: Omit optimized string routines when using KASAN Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-19 01:10:44 -07:00
Samuel Holland	77514915b7	riscv: Enable bitops instrumentation Instead of implementing the bitops functions directly in assembly, provide the arch_-prefixed versions and use the wrappers from asm-generic to add instrumentation. This improves KASAN coverage and fixes the kasan_bitops_generic() unit test. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240801033725.28816-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-19 01:10:04 -07:00
Samuel Holland	58ff537109	riscv: Omit optimized string routines when using KASAN The optimized string routines are implemented in assembly, so they are not instrumented for use with KASAN. Fall back to the C version of the routines in order to improve KASAN coverage. This fixes the kasan_strings() unit test. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-19 01:10:00 -07:00
Hanjun Guo	21d98d658f	ACPI: RISCV: Make acpi_numa_get_nid() to be static acpi_numa_get_nid() is only called in acpi_numa.c for riscv, no need to add it in head file, so make it static and remove related functions in the asm/acpi.h. Spotted by doing some cleanup for arm64 ACPI. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Haibo Xu <haibo1.xu@intel.com> Link: https://lore.kernel.org/r/20240811031804.3347298-1-guohanjun@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-17 12:02:48 -07:00
Yunhui Cui	048e2906d4	riscv: Randomize lower bits of stack address Implement arch_align_stack() to randomize the lower bits of the stack address. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240625030502.68988-1-cuiyunhui@bytedance.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-17 08:05:10 -07:00
Linus Torvalds	11b3125073	Merge tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream version 20240827, add support for ACPI-based enumeration of interrupt controllers on RISC-V along with some related irqchip updates, clean up the ACPI device object sysfs interface, add some quirks for backlight handling and IRQ overrides, fix assorted issues and clean up code. Specifics: - Check return value in acpi_db_convert_to_package() (Pei Xiao) - Detect FACS and allow setting the waking vector on reduced-hardware ACPI platforms (Jiaqing Zhao) - Allow ACPICA to represent semaphores as integers (Adrien Destugues) - Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui) - Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to DBG2 (Sia Jee Heng) - Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose Marinho) - Correct the typo in struct acpi_mpam_msc_node member (Punit Agrawal) - Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to prevent a Stall() violation warning from being printed every time this takes place (Vasily Khoruzhick) - Allow PCC Data Type in MCTP resource (Adam Young) - Fix memory leaks on acpi_ps_get_next_namepath() and acpi_ps_get_next_field() failures (Armin Wolf) - Add support for supressing leading zeros in hex strings when converting them to integers and update integer-to-hex-string conversions in ACPICA (Armin Wolf) - Add support for Windows 11 22H2 _OSI string (Armin Wolf) - Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski) - Add extended linear address mode to HMAT MSCIS in ACPICA (Dave Jiang) - Handle empty connection_node in iasl (Aleksandrs Vinarskis) - Allow for more flexibility in _DSM args (Saket Dumbre) - Setup for ACPICA release 20240827 (Saket Dumbre) - Add ACPI device enumeration support for interrupt controller probing including taking dependencies into account (Sunil V L) - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L) - Add ACPI support for AIA in riscv-intc and add ACPI support to riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L) - Do not release locks during operation region accesses in the ACPI EC driver (Rafael Wysocki) - Fix up the _STR handling in the ACPI device object sysfs interface, make it represent the device object attributes as an attribute group and make it rely on driver core functionality for sysfs attrubute management (Thomas Weißschuh) - Extend error messages printed to the kernel log when acpi_evaluate_dsm() fails to include revision and function number (David Wang) - Add a new AMDI0015 platform device ID to the ACPi APD driver for AMD SoCs (Shyam Sundar S K) - Use the driver core for the async probing management in the ACPI battery driver (Thomas Weißschuh) - Remove redundant initalizations of a local variable to NULL from the ACPI battery driver (Ilpo Järvinen) - Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr Mishin) - Add support for setting the EPP register through the ACPI CPPC sysfs interface if it is in FFH (Mario Limonciello) - Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger) - Reduce the log level of a per-CPU message about idle states in the ACPI processor driver (Li RongQing) - Fix crash in exit_round_robin() in the ACPI processor aggregator device (PAD) driver (Seiji Nishikawa) - Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI backlight driver (Hans de Goede) - Make the DMI checks related to backlight handling on Lenovo Yoga Tab 3 X90F less strict (Hans de Goede) - Enforce native backlight handling on Apple MacbookPro9,2 (Esther Shimanovich) - Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim Khan, Werner Sembach) - Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones) - Define and use symbols for device and class name lengths in the ACPI bus type code and make the code use strscpy() instead of strcpy() in several places (Muhammad Qasim Abdul Majeed)" * tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits) ACPI: resource: Add another DMI match for the TongFang GMxXGxx ACPI: CPPC: Add support for setting EPP register in FFH ACPI: PM: Quirk ASUS ROG M16 to default to S3 sleep ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18 ACPI: battery: use driver core managed async probing ACPI: button: Use strscpy() instead of strcpy() ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB ACPI: CPPC: Fix MASK_VAL() usage irqchip/sifive-plic: Add ACPI support ACPICA: Setup for ACPICA release 20240827 ACPICA: Allow for more flexibility in _DSM args ACPICA: iasl: handle empty connection_node ACPICA: HMAT: Add extended linear address mode to MSCIS ACPICA: Avoid warning for Dump Functions ACPICA: Add support for Windows 11 22H2 _OSI string ACPICA: Update integer-to-hex-string conversions ACPICA: Add support for supressing leading zeros in hex strings ACPICA: Allow for supressing leading zeros when using acpi_ex_convert_to_ascii() ACPICA: Fix memory leak if acpi_ps_get_next_field() fails ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails ...	2024-09-16 07:41:48 +02:00
Palmer Dabbelt	7e340f4fad	Merge patch series "Svvptc extension to remove preventive sfence.vma" Alexandre Ghiti <alexghiti@rivosinc.com> says: In RISC-V, after a new mapping is established, a sfence.vma needs to be emitted for different reasons: - if the uarch caches invalid entries, we need to invalidate it otherwise we would trap on this invalid entry, - if the uarch does not cache invalid entries, a reordered access could fail to see the new mapping and then trap (sfence.vma acts as a fence). We can actually avoid emitting those (mostly) useless and costly sfence.vma by handling the traps instead: - for new kernel mappings: only vmalloc mappings need to be taken care of, other new mapping are rare and already emit the required sfence.vma if needed. That must be achieved very early in the exception path as explained in patch 3, and this also fixes our fragile way of dealing with vmalloc faults. - for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can take some gratuitous page faults (which are very unlikely though). Patch 1 and 2 introduce Svvptc extension probing. On our uarch that does not cache invalid entries and a 6.5 kernel, the gains are measurable: * Kernel boot: 6% * ltp - mmapstress01: 8% * lmbench - lat_pagefault: 20% * lmbench - lat_mmap: 5% Here are the corresponding numbers of sfence.vma emitted: * Ubuntu boot to login: Before: ~630k sfence.vma After: ~200k sfence.vma * ltp - mmapstress01 Before: ~45k After: ~6.3k * lmbench - lat_pagefault Before: ~665k After: 832 (!) * lmbench - lat_mmap Before: ~546k After: 718 (!) Thanks to Ved and Matt Evans for triggering the discussion that led to this patchset! * b4-shazam-merge: riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc riscv: Stop emitting preventive sfence.vma for new vmalloc mappings dt-bindings: riscv: Add Svvptc ISA extension description riscv: Add ISA extension parsing for Svvptc Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-15 20:58:24 -07:00
Jinjie Ruan	cea9d27705	riscv: Remove unused _TIF_WORK_MASK Since commit `f0bddf5058` ("riscv: entry: Convert to generic entry"), _TIF_WORK_MASK is no longer used, so remove it. Fixes: `f0bddf5058` ("riscv: entry: Convert to generic entry") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240711111508.1373322-1-ruanjinjie@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-15 20:15:50 -07:00
Alexandre Ghiti	7a21b2e370	riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc The preventive sfence.vma were emitted because new mappings must be made visible to the page table walker but Svvptc guarantees that it will happen within a bounded timeframe, so no need to sfence.vma for the uarchs that implement this extension, we will then take gratuitous (but very unlikely) page faults, similarly to x86 and arm64. This allows to drastically reduce the number of sfence.vma emitted: * Ubuntu boot to login: Before: ~630k sfence.vma After: ~200k sfence.vma * ltp - mmapstress01 Before: ~45k After: ~6.3k * lmbench - lat_pagefault Before: ~665k After: 832 (!) * lmbench - lat_mmap Before: ~546k After: 718 (!) Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-15 00:11:05 -07:00
Alexandre Ghiti	503638e0ba	riscv: Stop emitting preventive sfence.vma for new vmalloc mappings In 6.5, we removed the vmalloc fault path because that can't work (see [1] [2]). Then in order to make sure that new page table entries were seen by the page table walker, we had to preventively emit a sfence.vma on all harts [3] but this solution is very costly since it relies on IPI. And even there, we could end up in a loop of vmalloc faults if a vmalloc allocation is done in the IPI path (for example if it is traced, see [4]), which could result in a kernel stack overflow. Those preventive sfence.vma needed to be emitted because: - if the uarch caches invalid entries, the new mapping may not be observed by the page table walker and an invalidation may be needed. - if the uarch does not cache invalid entries, a reordered access could "miss" the new mapping and traps: in that case, we would actually only need to retry the access, no sfence.vma is required. So this patch removes those preventive sfence.vma and actually handles the possible (and unlikely) exceptions. And since the kernel stacks mappings lie in the vmalloc area, this handling must be done very early when the trap is taken, at the very beginning of handle_exception: this also rules out the vmalloc allocations in the fault path. Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1] Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2] Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3] Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-15 00:11:04 -07:00
Alexandre Ghiti	a6efe33cc5	riscv: Add ISA extension parsing for Svvptc Add support to parse the Svvptc string in the riscv,isa string. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-15 00:11:02 -07:00
Paolo Bonzini	0cdcc99eea	Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.12 - Fix sbiret init before forwarding to userspace - Don't zero-out PMU snapshot area before freeing data - Allow legacy PMU access from guest - Fix to allow hpmcounter31 from the guest	2024-09-15 02:43:17 -04:00
Palmer Dabbelt	9ea7b92b77	Merge patch series "remove size limit on XIP kernel" Nam Cao <namcao@linutronix.de> says: Hi, For XIP kernel, the writable data section is always at offset specified in XIP_OFFSET, which is hard-coded to 32MB. Unfortunately, this means the read-only section (placed before the writable section) is restricted in size. This causes build failure if the kernel gets too large. This series remove the use of XIP_OFFSET one by one, then remove this macro entirely at the end, with the goal of lifting this size restriction. Also some cleanup and documentation along the way. * b4-shazam-merge riscv: remove limit on the size of read-only section for XIP kernel riscv: drop the use of XIP_OFFSET in create_kernel_page_table() riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa() riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET riscv: replace misleading va_kernel_pa_offset on XIP kernel riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel riscv: cleanup XIP_FIXUP macro riscv: change XIP's kernel_map.size to be size of the entire kernel ... Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:23:05 -07:00
Nam Cao	b635a84bde	riscv: remove limit on the size of read-only section for XIP kernel XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. This causes build failures if the kernel gets too big [1]. Remove this limit. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1] Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:23:02 -07:00
Nam Cao	75fdf791df	riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa() XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, remove the use of XIP_OFFSET in kernel_mapping_va_to_pa(). The macro XIP_OFFSET is used in this case to check if the virtual address is mapped to Flash or to RAM. The same check can be done with kernel_map.xiprom_sz. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/644c13d9467525a06f5d63d157875a35b2edb4bc.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:23:00 -07:00
Nam Cao	23311f57ee	riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop using XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET. Instead, use __data_loc and _sdata to do the same thing. While at it, also add a description for XIP_FIXUP_FLASH_OFFSET. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/7b3319657edd1822f3457e7e7c07aaa326cc2f87.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:22:59 -07:00
Nam Cao	e4eac34fed	riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop using XIP_OFFSET in XIP_FIXUP_OFFSET. Instead, use CONFIG_PHYS_RAM_BASE and _sdata to do the same thing. While at it, also add a description for XIP_FIXUP_OFFSET. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/dba0409518b14ee83b346e099b1f7f934daf7b74.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:22:58 -07:00
Nam Cao	5cf0896721	riscv: replace misleading va_kernel_pa_offset on XIP kernel On XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike "normal" kernel, it is not the virtual-physical address offset of kernel mapping, it is the offset of kernel mapping's first virtual address to first physical address in DRAM, which is not meaningful because the kernel's first physical address is not in DRAM. For XIP kernel, there are 2 different offsets because the read-only part of the kernel resides in ROM while the rest is in RAM. The offset to ROM is in kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored anywhere: it is calculated on-the-fly. Remove this confusing "va_kernel_pa_offset" and add "va_kernel_xip_data_pa_offset" as its replacement. This new variable is the offset of virtual mapping of the kernel's data portion to the corresponding physical addresses. With the introduction of this new variable, also rename va_kernel_xip_pa_offset -> va_kernel_xip_text_pa_offset to make it clear that this one is about the .text section. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:22:57 -07:00
Nam Cao	aa3457f22f	riscv: cleanup XIP_FIXUP macro The XIP_FIXUP macro is used to fix addresses early during boot before MMU: generated code "thinks" the data section is in ROM while it is actually in RAM. So this macro corrects the addresses in the data section. This macro determines if the address needs to be fixed by checking if it is within the range starting from ROM address up to the size of (2 * XIP_OFFSET). This means if the kernel size is bigger than (2 * XIP_OFFSET), some addresses would not be fixed up. XIP kernel can still work if the above scenario does not happen. But this macro is obviously incorrect. Rewrite this macro to only fix up addresses within the data section. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/95f50a4ec8204ec4fcbf2a80c9addea0e0609e3b.1717789719.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-12 07:22:55 -07:00
Rafael J. Wysocki	45de40574f	Merge branch 'acpi-riscv' Merge ACPI and irqchip updates related to external interrupt controller support on RISC-V: - Add ACPI device enumeration support for interrupt controller probing including taking dependencies into account (Sunil V L). - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L). - Add ACPI support for AIA in riscv-intc and add ACPI support to riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L). * acpi-riscv: irqchip/sifive-plic: Add ACPI support irqchip/riscv-aplic: Add ACPI support irqchip/riscv-imsic: Add ACPI support irqchip/riscv-imsic-state: Create separate function for DT irqchip/riscv-intc: Add ACPI support for AIA ACPI: RISC-V: Implement function to add implicit dependencies ACPI: RISC-V: Initialize GSI mapping structures ACPI: RISC-V: Implement function to reorder irqchip probe entries ACPI: RISC-V: Implement PCI related functionality ACPI: pci_link: Clear the dependencies after probe ACPI: bus: Add RINTC IRQ model for RISC-V ACPI: scan: Define weak function to populate dependencies ACPI: scan: Add RISC-V interrupt controllers to honor list ACPI: scan: Refactor dependency creation ACPI: bus: Add acpi_riscv_init() function ACPI: scan: Add a weak arch_sort_irqchip_probe() to order the IRQCHIP probe arm64: PCI: Migrate ACPI related functions to pci-acpi.c	2024-09-11 21:44:22 +02:00
Mike Rapoport (Microsoft)	46bcce5031	arch, mm: move definition of node_data to generic code Every architecture that supports NUMA defines node_data in the same way: struct pglist_data *node_data[MAX_NUMNODES]; No reason to keep multiple copies of this definition and its forward declarations, especially when such forward declaration is the only thing in include/asm/mmzone.h for many architectures. Add definition and declaration of node_data to generic code and drop architecture-specific versions. Link: https://lkml.kernel.org/r/20240807064110.1003856-8-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64 Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU] Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Rob Herring (Arm) <robh@kernel.org> Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-09-03 21:15:28 -07:00
Alexandre Ghiti	1ff95eb2be	riscv: Fix RISCV_ALTERNATIVE_EARLY RISCV_ALTERNATIVE_EARLY will issue sbi_ecall() very early in the boot process, before the first memory mapping is setup so we can't have any instrumentation happening here. In addition, when the kernel is relocatable, we must also not issue any relocation this early since they would have been patched virtually only. So, instead of disabling instrumentation for the whole kernel/sbi.c file and compiling it with -fno-pie, simply move __sbi_ecall() and __sbi_base_ecall() into their own file where this is fixed. Reported-by: Conor Dooley <conor.dooley@microchip.com> Closes: https://lore.kernel.org/linux-riscv/20240813-pony-truck-3e7a83e9759e@spud/ Reported-by: syzbot+cfbcb82adf6d7279fd35@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-riscv/00000000000065062c061fcec37b@google.com/ Fixes: `1745cfafeb` ("riscv: don't use global static vars to store alternative data") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240829165048.49756-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-09-03 07:57:55 -07:00

1 2 3 4 5 ...

1375 Commits