mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2026-04-03 12:05:13 +02:00
[ Upstream commit 398edaa12f ]

Historically SVE state was discarded deterministically early in the
syscall entry path, before ptrace is notified of syscall entry. This
permitted ptrace to modify SVE state before and after the "real" syscall
logic was executed, with the modified state being retained.

This behaviour was changed by commit:

  8c845e2731 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

That commit was intended to speed up workloads that used SVE by
opportunistically leaving SVE enabled when returning from a syscall.
The syscall entry logic was modified to truncate the SVE state without
disabling userspace access to SVE, and fpsimd_save_user_state() was
modified to discard userspace SVE state whenever
in_syscall(current_pt_regs()) is true, i.e. when
current_pt_regs()->syscallno != NO_SYSCALL.

Leaving SVE enabled opportunistically resulted in a couple of changes
to userspace visible behaviour which weren't described at the time, but
are logical consequences of opportunistically leaving SVE enabled:

* Signal handlers can observe the type of saved state in the signal's
  sve_context record. When the kernel only tracks FPSIMD state, the
  'vq' field is 0 and there is no space allocated for register
  contents. When the kernel tracks SVE state, the 'vq' field is
  non-zero and the register contents are saved into the record.

  As a result of the above commit, 'vq' (and the presence of SVE
  register state) is non-deterministically zero or non-zero for a
  period of time after a syscall. The effective register state is
  still deterministic.

  Hopefully no-one relies on this being deterministic. In general,
  handlers for asynchronous events cannot expect a deterministic state.

* Similarly to signal handlers, ptrace requests can observe the type of
  saved state in the NT_ARM_SVE and NT_ARM_SSVE regsets, as this is
  exposed in the header flags. As a result of the above commit, this is
  now in a non-deterministic state after a syscall.
  The effective register state is still deterministic.

  Hopefully no-one relies on this being deterministic. In general,
  debuggers would have to handle this changing at arbitrary points
  during program flow.

Discarding the SVE state within fpsimd_save_user_state() resulted in
other changes to userspace visible behaviour which are not desirable:

* A ptrace tracer can modify (or create) a tracee's SVE state at
  syscall entry or syscall exit. As a result of the above commit, the
  tracee's SVE state can be discarded non-deterministically after
  modification, rather than being retained as it previously was.

  Note that for co-operative tracer/tracee pairs, the tracer may
  (re)initialise the tracee's state arbitrarily after the tracee sends
  itself an initial SIGSTOP via a syscall, so this affects realistic
  design patterns.

* The current_pt_regs()->syscallno field can be modified via ptrace,
  and can be altered even when the tracee is not really in a syscall,
  causing non-deterministic discarding to occur in situations where
  this was not previously possible.

Further, using current_pt_regs()->syscallno in this way is unsound:

* There are data races between readers and writers of the
  current_pt_regs()->syscallno field.

  The current_pt_regs()->syscallno field is written in interruptible
  task context using plain C accesses, and is read in irq/softirq
  context using plain C accesses. These accesses are subject to data
  races, with the usual concerns with tearing, etc.

* Writes to current_pt_regs()->syscallno are subject to compiler
  reordering.

  As current_pt_regs()->syscallno is written with plain C accesses,
  the compiler is free to move those writes arbitrarily relative to
  anything which doesn't access the same memory location.

  In theory this could break signal return, where prior to restoring
  the SVE state, restore_sigframe() calls forget_syscall(). If the
  write were hoisted after restore of some SVE state, that state could
  be discarded unexpectedly.
  In practice that reordering cannot happen in the absence of LTO (as
  cross compilation-unit function calls happen to prevent this
  reordering), and that reordering appears to be unlikely in the
  presence of LTO.

Additionally, since commit:

  f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")

... DAIF is unmasked before el0_svc_common() sets regs->syscallno to
the real syscall number. Consequently state may be saved in SVE format
prior to this point.

Considering all of the above, current_pt_regs()->syscallno should not
be used to infer whether the SVE state can be discarded. Luckily we can
instead use cpu_fp_state::to_save to track when it is safe to discard
the SVE state:

* At syscall entry, after the live SVE register state is truncated, set
  cpu_fp_state::to_save to FP_STATE_FPSIMD to indicate that only the
  FPSIMD portion is live and needs to be saved.

* At syscall exit, once the task's state is guaranteed to be live, set
  cpu_fp_state::to_save to FP_STATE_CURRENT to indicate that TIF_SVE
  must be considered to determine which state needs to be saved.

* Whenever state is modified, it must be saved+flushed prior to
  manipulation. The state will be truncated if necessary when it is
  saved, and reloading the state will set fp_state::to_save to
  FP_STATE_CURRENT, preventing subsequent discarding.

This permits SVE state to be discarded *only* when it is known to have
been truncated (and the non-FPSIMD portions must be zero), and ensures
that SVE state is retained after it is explicitly modified.
For backporting, note that this fix depends on the following commits:

* b2482807fb ("arm64/sme: Optimise SME exit on syscall entry")
* f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")
* 929fa99b12 ("arm64/fpsimd: signal: Always save+flush state early")

Fixes: 8c845e2731 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Fixes: f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
442 lines
12 KiB
C
/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * Copyright (C) 2012 ARM Ltd.
 */
#ifndef __ASM_FP_H
#define __ASM_FP_H

#include <asm/errno.h>
#include <asm/percpu.h>
#include <asm/ptrace.h>
#include <asm/processor.h>
#include <asm/sigcontext.h>
#include <asm/sysreg.h>

#ifndef __ASSEMBLY__

#include <linux/bitmap.h>
#include <linux/build_bug.h>
#include <linux/bug.h>
#include <linux/cache.h>
#include <linux/init.h>
#include <linux/stddef.h>
#include <linux/types.h>

/* Masks for extracting the FPSR and FPCR from the FPSCR */
#define VFP_FPSCR_STAT_MASK	0xf800009f
#define VFP_FPSCR_CTRL_MASK	0x07f79f00
/*
 * The VFP state has 32x64-bit registers and a single 32-bit
 * control/status register.
 */
#define VFP_STATE_SIZE		((32 * 8) + 4)

static inline unsigned long cpacr_save_enable_kernel_sve(void)
{
	unsigned long old = read_sysreg(cpacr_el1);
	unsigned long set = CPACR_EL1_FPEN_EL1EN | CPACR_EL1_ZEN_EL1EN;

	write_sysreg(old | set, cpacr_el1);
	isb();
	return old;
}

static inline unsigned long cpacr_save_enable_kernel_sme(void)
{
	unsigned long old = read_sysreg(cpacr_el1);
	unsigned long set = CPACR_EL1_FPEN_EL1EN | CPACR_EL1_SMEN_EL1EN;

	write_sysreg(old | set, cpacr_el1);
	isb();
	return old;
}

static inline void cpacr_restore(unsigned long cpacr)
{
	write_sysreg(cpacr, cpacr_el1);
	isb();
}

/*
 * When we defined the maximum SVE vector length we defined the ABI so
 * that the maximum vector length included all the reserved for future
 * expansion bits in ZCR rather than those just currently defined by
 * the architecture. Using this length to allocate worst size buffers
 * results in excessively large allocations, and this effect is even
 * more pronounced for SME due to ZA. Define more suitable VLs for
 * these situations.
 */
#define ARCH_SVE_VQ_MAX ((ZCR_ELx_LEN_MASK >> ZCR_ELx_LEN_SHIFT) + 1)
#define SME_VQ_MAX	((SMCR_ELx_LEN_MASK >> SMCR_ELx_LEN_SHIFT) + 1)

struct task_struct;

extern void fpsimd_save_state(struct user_fpsimd_state *state);
extern void fpsimd_load_state(struct user_fpsimd_state *state);

extern void fpsimd_thread_switch(struct task_struct *next);
extern void fpsimd_flush_thread(void);

extern void fpsimd_signal_preserve_current_state(void);
extern void fpsimd_preserve_current_state(void);
extern void fpsimd_restore_current_state(void);
extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
extern void fpsimd_kvm_prepare(void);

struct cpu_fp_state {
	struct user_fpsimd_state *st;
	void *sve_state;
	void *sme_state;
	u64 *svcr;
	u64 *fpmr;
	unsigned int sve_vl;
	unsigned int sme_vl;
	enum fp_type *fp_type;
	enum fp_type to_save;
};

DECLARE_PER_CPU(struct cpu_fp_state, fpsimd_last_state);

extern void fpsimd_bind_state_to_cpu(struct cpu_fp_state *fp_state);

extern void fpsimd_flush_task_state(struct task_struct *target);
extern void fpsimd_save_and_flush_cpu_state(void);

static inline bool thread_sm_enabled(struct thread_struct *thread)
{
	return system_supports_sme() && (thread->svcr & SVCR_SM_MASK);
}

static inline bool thread_za_enabled(struct thread_struct *thread)
{
	return system_supports_sme() && (thread->svcr & SVCR_ZA_MASK);
}

/* Maximum VL that SVE/SME VL-agnostic software can transparently support */
#define VL_ARCH_MAX 0x100

/* Offset of FFR in the SVE register dump */
static inline size_t sve_ffr_offset(int vl)
{
	return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
}

static inline void *sve_pffr(struct thread_struct *thread)
{
	unsigned int vl;

	if (system_supports_sme() && thread_sm_enabled(thread))
		vl = thread_get_sme_vl(thread);
	else
		vl = thread_get_sve_vl(thread);

	return (char *)thread->sve_state + sve_ffr_offset(vl);
}

static inline void *thread_zt_state(struct thread_struct *thread)
{
	/* The ZT register state is stored immediately after the ZA state */
	unsigned int sme_vq = sve_vq_from_vl(thread_get_sme_vl(thread));
	return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
}

extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
			   int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
extern void sve_set_vq(unsigned long vq_minus_1);
extern void sme_set_vq(unsigned long vq_minus_1);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);

struct arm64_cpu_capabilities;
extern void cpu_enable_fpsimd(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sme(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);

/*
 * Helpers to translate bit indices in sve_vq_map to VQ values (and
 * vice versa). This allows find_next_bit() to be used to find the
 * _maximum_ VQ not exceeding a certain value.
 */
static inline unsigned int __vq_to_bit(unsigned int vq)
{
	return SVE_VQ_MAX - vq;
}

static inline unsigned int __bit_to_vq(unsigned int bit)
{
	return SVE_VQ_MAX - bit;
}


struct vl_info {
	enum vec_type type;
	const char *name;		/* For display purposes */

	/* Minimum supported vector length across all CPUs */
	int min_vl;

	/* Maximum supported vector length across all CPUs */
	int max_vl;
	int max_virtualisable_vl;

	/*
	 * Set of available vector lengths,
	 * where length vq encoded as bit __vq_to_bit(vq):
	 */
	DECLARE_BITMAP(vq_map, SVE_VQ_MAX);

	/* Set of vector lengths present on at least one cpu: */
	DECLARE_BITMAP(vq_partial_map, SVE_VQ_MAX);
};

#ifdef CONFIG_ARM64_SVE

extern void sve_alloc(struct task_struct *task, bool flush);
extern void fpsimd_release_task(struct task_struct *task);
extern void fpsimd_sync_to_sve(struct task_struct *task);
extern void fpsimd_force_sync_to_sve(struct task_struct *task);
extern void sve_sync_to_fpsimd(struct task_struct *task);
extern void sve_sync_from_fpsimd_zeropad(struct task_struct *task);

extern int vec_set_vector_length(struct task_struct *task, enum vec_type type,
				 unsigned long vl, unsigned long flags);

extern int sve_set_current_vl(unsigned long arg);
extern int sve_get_current_vl(void);

static inline void sve_user_disable(void)
{
	sysreg_clear_set(cpacr_el1, CPACR_EL1_ZEN_EL0EN, 0);
}

static inline void sve_user_enable(void)
{
	sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_ZEN_EL0EN);
}

#define sve_cond_update_zcr_vq(val, reg)		\
	do {						\
		u64 __zcr = read_sysreg_s((reg));	\
		u64 __new = __zcr & ~ZCR_ELx_LEN_MASK;	\
		__new |= (val) & ZCR_ELx_LEN_MASK;	\
		if (__zcr != __new)			\
			write_sysreg_s(__new, (reg));	\
	} while (0)

/*
 * Probing and setup functions.
 * Calls to these functions must be serialised with one another.
 */
enum vec_type;

extern void __init vec_init_vq_map(enum vec_type type);
extern void vec_update_vq_map(enum vec_type type);
extern int vec_verify_vq_map(enum vec_type type);
extern void __init sve_setup(void);

extern __ro_after_init struct vl_info vl_info[ARM64_VEC_MAX];

static inline void write_vl(enum vec_type type, u64 val)
{
	u64 tmp;

	switch (type) {
#ifdef CONFIG_ARM64_SVE
	case ARM64_VEC_SVE:
		tmp = read_sysreg_s(SYS_ZCR_EL1) & ~ZCR_ELx_LEN_MASK;
		write_sysreg_s(tmp | val, SYS_ZCR_EL1);
		break;
#endif
#ifdef CONFIG_ARM64_SME
	case ARM64_VEC_SME:
		tmp = read_sysreg_s(SYS_SMCR_EL1) & ~SMCR_ELx_LEN_MASK;
		write_sysreg_s(tmp | val, SYS_SMCR_EL1);
		break;
#endif
	default:
		WARN_ON_ONCE(1);
		break;
	}
}

static inline int vec_max_vl(enum vec_type type)
{
	return vl_info[type].max_vl;
}

static inline int vec_max_virtualisable_vl(enum vec_type type)
{
	return vl_info[type].max_virtualisable_vl;
}

static inline int sve_max_vl(void)
{
	return vec_max_vl(ARM64_VEC_SVE);
}

static inline int sve_max_virtualisable_vl(void)
{
	return vec_max_virtualisable_vl(ARM64_VEC_SVE);
}

/* Ensure vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX before calling this function */
static inline bool vq_available(enum vec_type type, unsigned int vq)
{
	return test_bit(__vq_to_bit(vq), vl_info[type].vq_map);
}

static inline bool sve_vq_available(unsigned int vq)
{
	return vq_available(ARM64_VEC_SVE, vq);
}

size_t sve_state_size(struct task_struct const *task);

#else /* ! CONFIG_ARM64_SVE */

static inline void sve_alloc(struct task_struct *task, bool flush) { }
static inline void fpsimd_release_task(struct task_struct *task) { }
static inline void sve_sync_to_fpsimd(struct task_struct *task) { }
static inline void sve_sync_from_fpsimd_zeropad(struct task_struct *task) { }

static inline int sve_max_virtualisable_vl(void)
{
	return 0;
}

static inline int sve_set_current_vl(unsigned long arg)
{
	return -EINVAL;
}

static inline int sve_get_current_vl(void)
{
	return -EINVAL;
}

static inline int sve_max_vl(void)
{
	return -EINVAL;
}

static inline bool sve_vq_available(unsigned int vq) { return false; }

static inline void sve_user_disable(void) { BUILD_BUG(); }
static inline void sve_user_enable(void) { BUILD_BUG(); }

#define sve_cond_update_zcr_vq(val, reg) do { } while (0)

static inline void vec_init_vq_map(enum vec_type t) { }
static inline void vec_update_vq_map(enum vec_type t) { }
static inline int vec_verify_vq_map(enum vec_type t) { return 0; }
static inline void sve_setup(void) { }

static inline size_t sve_state_size(struct task_struct const *task)
{
	return 0;
}

#endif /* ! CONFIG_ARM64_SVE */

#ifdef CONFIG_ARM64_SME

static inline void sme_user_disable(void)
{
	sysreg_clear_set(cpacr_el1, CPACR_EL1_SMEN_EL0EN, 0);
}

static inline void sme_user_enable(void)
{
	sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_SMEN_EL0EN);
}

static inline void sme_smstart_sm(void)
{
	asm volatile(__msr_s(SYS_SVCR_SMSTART_SM_EL0, "xzr"));
}

static inline void sme_smstop_sm(void)
{
	asm volatile(__msr_s(SYS_SVCR_SMSTOP_SM_EL0, "xzr"));
}

static inline void sme_smstop(void)
{
	asm volatile(__msr_s(SYS_SVCR_SMSTOP_SMZA_EL0, "xzr"));
}

extern void __init sme_setup(void);

static inline int sme_max_vl(void)
{
	return vec_max_vl(ARM64_VEC_SME);
}

static inline int sme_max_virtualisable_vl(void)
{
	return vec_max_virtualisable_vl(ARM64_VEC_SME);
}

extern void sme_alloc(struct task_struct *task, bool flush);
extern unsigned int sme_get_vl(void);
extern int sme_set_current_vl(unsigned long arg);
extern int sme_get_current_vl(void);
extern void sme_suspend_exit(void);

/*
 * Return how many bytes of memory are required to store the full SME
 * specific state for task, given task's currently configured vector
 * length.
 */
static inline size_t sme_state_size(struct task_struct const *task)
{
	unsigned int vl = task_get_sme_vl(task);
	size_t size;

	size = ZA_SIG_REGS_SIZE(sve_vq_from_vl(vl));

	if (system_supports_sme2())
		size += ZT_SIG_REG_SIZE;

	return size;
}

#else

static inline void sme_user_disable(void) { BUILD_BUG(); }
static inline void sme_user_enable(void) { BUILD_BUG(); }

static inline void sme_smstart_sm(void) { }
static inline void sme_smstop_sm(void) { }
static inline void sme_smstop(void) { }

static inline void sme_alloc(struct task_struct *task, bool flush) { }
static inline void sme_setup(void) { }
static inline unsigned int sme_get_vl(void) { return 0; }
static inline int sme_max_vl(void) { return 0; }
static inline int sme_max_virtualisable_vl(void) { return 0; }
static inline int sme_set_current_vl(unsigned long arg) { return -EINVAL; }
static inline int sme_get_current_vl(void) { return -EINVAL; }
static inline void sme_suspend_exit(void) { }

static inline size_t sme_state_size(struct task_struct const *task)
{
	return 0;
}

#endif /* ! CONFIG_ARM64_SME */

/* For use by EFI runtime services calls only */
extern void __efi_fpsimd_begin(void);
extern void __efi_fpsimd_end(void);

#endif

#endif