linux-stable-mirror/include/linux/sched/exec_state.h
Christian Brauner (Amutable) 6b1c66c9cc exec_state: relocate dumpable information
The dumpable flag captured at execve() is consulted by
__ptrace_may_access() and by several /proc owner and visibility checks.
It lives on mm_struct today, but exit_mm() detaches the mm from the
task long before the task itself is reaped.

exec_state is anchored to the execve() that established the current
privilege domain.  CLONE_VM siblings refcount-share the parent's
exec_state via copy_exec_state(); non-CLONE_VM clones allocate a
fresh exec_state inheriting the parent's dumpable mode and user_ns
reference via task_exec_state_copy().  execve() allocates a fresh
instance (via alloc_task_exec_state() in begin_new_exec()) and
installs it under task_lock + exec_update_lock with
task_exec_state_replace().  init_task uses a static instance.

The dumpable mode now lives on task->exec_state->dumpable.
task->mm->flags no longer carries dumpability; MMF_DUMPABLE_MASK is
removed, but MMF_DUMPABLE_BITS is reserved so MMF_DUMP_FILTER_* bit
positions remain stable for the /proc/<pid>/coredump_filter ABI. The
task->user_dumpable cache bit and its assignment in exit_mm() are
removed; readers go through get_dumpable(task) directly.
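
Why the reserved bits matter for the ABI can be illustrated with a small bit-layout sketch. The MODEL_* constants and helper names are hypothetical, and the widths (two reserved low bits, nine filter bits) are assumptions chosen for illustration rather than values taken from this patch.

```c
#include <assert.h>

/* Illustrative layout; the widths below are assumptions, not patch values. */
#define MODEL_DUMPABLE_BITS     2  /* low bits stay reserved, now unused */
#define MODEL_DUMP_FILTER_SHIFT MODEL_DUMPABLE_BITS
#define MODEL_DUMP_FILTER_BITS  9
#define MODEL_DUMP_FILTER_MASK \
	(((1UL << MODEL_DUMP_FILTER_BITS) - 1) << MODEL_DUMP_FILTER_SHIFT)

/* Value a reader of coredump_filter would see for these flags. */
static unsigned long model_filter_read(unsigned long mm_flags)
{
	return (mm_flags & MODEL_DUMP_FILTER_MASK) >> MODEL_DUMP_FILTER_SHIFT;
}

/* Store a new filter value without touching bits outside the filter. */
static unsigned long model_filter_write(unsigned long mm_flags,
					unsigned long val)
{
	assert((val >> MODEL_DUMP_FILTER_BITS) == 0);
	mm_flags &= ~MODEL_DUMP_FILTER_MASK;
	mm_flags |= val << MODEL_DUMP_FILTER_SHIFT;
	return mm_flags;
}
```

Because the shift is still derived from the reserved width, a filter value written by userspace maps to the same flag bits as before, even though the low bits no longer encode dumpability.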

coredump_params gains a snapshot field cprm.dumpable, populated from
get_dumpable(current) at vfs_coredump() entry, replacing the previous
__get_dumpable(cprm->mm_flags) consumers in fs/coredump.c and
fs/pidfs.c.

The user namespace recorded at execve() is consulted by
__ptrace_may_access() and by /proc/PID/* owner derivation. Move the
captured user_ns onto task_exec_state, which stays attached to the task
past exit_mm() and across exit_files().

bprm grows a user_ns field staged in bprm_mm_init() with the caller's
user_ns, narrowed by would_dump() to the closest privileged ancestor,
and consumed by exec_mmap() via alloc_task_exec_state(bprm->user_ns).
free_bprm() releases the staging reference.
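
The stage, narrow, release sequence for the bprm reference can be sketched as a userspace model. All names here are hypothetical, the privileged_over_file flag stands in for whatever check would_dump() really performs, and the consuming step in exec_mmap() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_ns {
	struct model_ns *parent;      /* NULL for the root namespace */
	int refs;
	bool privileged_over_file;    /* hypothetical stand-in for the real check */
};

struct model_bprm {
	struct model_ns *user_ns;     /* staging reference */
};

static struct model_ns *model_ns_get(struct model_ns *ns)
{
	ns->refs++;
	return ns;
}

static void model_ns_put(struct model_ns *ns)
{
	assert(ns->refs > 0);
	ns->refs--;
}

/* Stage the caller's namespace, as described for bprm_mm_init(). */
static void model_stage(struct model_bprm *bprm, struct model_ns *caller_ns)
{
	bprm->user_ns = model_ns_get(caller_ns);
}

/* Walk up to the closest namespace that is privileged over the file. */
static void model_narrow(struct model_bprm *bprm)
{
	struct model_ns *old = bprm->user_ns, *ns = old;

	while (ns->parent && !ns->privileged_over_file)
		ns = ns->parent;
	if (ns != old) {
		bprm->user_ns = model_ns_get(ns);
		model_ns_put(old);
	}
}

/* Release the staging reference, as described for free_bprm(). */
static void model_free(struct model_bprm *bprm)
{
	if (bprm->user_ns) {
		model_ns_put(bprm->user_ns);
		bprm->user_ns = NULL;
	}
}
```

The model keeps exactly one staging reference alive at any time: narrowing takes the new reference before dropping the old one, and the release step drops whichever namespace is staged at that point.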

mm_struct loses ->user_ns entirely.  The initializers in init-mm and
efi_mm and the implicit one in mm_init()/dup_mm()/mm_alloc() are
removed, as is the matching put_user_ns() in __mmdrop(). The
kthread_use_mm() WARN_ON_ONCE(!mm->user_ns) is no longer meaningful and
goes too.

Reviewed-by: Jann Horn <jannh@google.com>
Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-4-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2026-05-26 11:02:01 +02:00

// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2026 Christian Brauner <brauner@kernel.org> */
#ifndef _LINUX_SCHED_EXEC_STATE_H
#define _LINUX_SCHED_EXEC_STATE_H

#include <linux/init.h>
#include <linux/rcupdate.h>
#include <linux/refcount.h>
#include <linux/sched/coredump.h>
#include <linux/user_namespace.h>

struct task_exec_state {
	refcount_t count;		/* shared by CLONE_VM siblings */
	enum task_dumpable dumpable;	/* dumpable mode of this exec */
	struct user_namespace *user_ns;	/* user_ns recorded at execve() */
	struct rcu_head rcu;		/* for RCU-deferred freeing */
};

extern struct task_exec_state init_task_exec_state;

struct task_exec_state *alloc_task_exec_state(struct user_namespace *user_ns);
void put_task_exec_state(struct task_exec_state *exec_state);
struct task_exec_state *task_exec_state_rcu(const struct task_struct *tsk);
struct task_exec_state *task_exec_state_replace(struct task_struct *tsk,
						struct task_exec_state *exec_state);
int task_exec_state_copy(struct task_struct *tsk);
void __init exec_state_init(void);

DEFINE_FREE(put_task_exec_state, struct task_exec_state *, put_task_exec_state(_T))

#endif /* _LINUX_SCHED_EXEC_STATE_H */
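
The DEFINE_FREE() line at the end of the header hooks put_task_exec_state() into the kernel's scope-based cleanup helpers. A rough userspace analogue of that pattern, assuming GCC or Clang and using only hypothetical model_* names, might look like this:

```c
#include <assert.h>
#include <stdlib.h>

struct model_es {
	int count;
};

static int model_live;	/* live objects, tracked only for demonstration */

static struct model_es *model_es_alloc(void)
{
	struct model_es *es = calloc(1, sizeof(*es));

	if (es) {
		es->count = 1;
		model_live++;
	}
	return es;
}

static void model_es_put(struct model_es *es)
{
	if (es && --es->count == 0) {
		free(es);
		model_live--;
	}
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void model_es_cleanup(struct model_es **p)
{
	model_es_put(*p);
}

#define MODEL_FREE_ES __attribute__((cleanup(model_es_cleanup)))

/* The reference is dropped on every return path without explicit puts. */
static int model_use(int fail_early)
{
	struct model_es *es MODEL_FREE_ES = model_es_alloc();

	if (!es)
		return -1;
	if (fail_early)
		return -2;	/* cleanup still runs here */
	return 0;
}
```

The point of the pattern is that early returns cannot leak the reference; the compiler runs the cleanup handler whenever the annotated variable goes out of scope.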