Files
linux-stable-mirror/include/linux/mmu_notifier.h
T
Linus Torvalds 334fbe734e Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00

649 lines
22 KiB
C

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_MMU_NOTIFIER_H
#define _LINUX_MMU_NOTIFIER_H
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/mm_types.h>
#include <linux/mmap_lock.h>
#include <linux/srcu.h>
#include <linux/interval_tree.h>
struct mmu_notifier_subscriptions;
struct mmu_notifier;
struct mmu_notifier_range;
struct mmu_interval_notifier;
/**
* enum mmu_notifier_event - reason for the mmu notifier callback
* @MMU_NOTIFY_UNMAP: either munmap() that unmap the range or a mremap() that
* move the range
*
* @MMU_NOTIFY_CLEAR: clear page table entry (many reasons for this like
* madvise() or replacing a page by another one, ...).
*
* @MMU_NOTIFY_PROTECTION_VMA: update is due to protection change for the range
* ie using the vma access permission (vm_page_prot) to update the whole range
* is enough no need to inspect changes to the CPU page table (mprotect()
* syscall)
*
* @MMU_NOTIFY_PROTECTION_PAGE: update is due to change in read/write flag for
* pages in the range so to mirror those changes the user must inspect the CPU
* page table (from the end callback).
*
* @MMU_NOTIFY_SOFT_DIRTY: soft dirty accounting (still same page and same
* access flags). User should soft dirty the page in the end callback to make
* sure that anyone relying on soft dirtiness catch pages that might be written
* through non CPU mappings.
*
* @MMU_NOTIFY_RELEASE: used during mmu_interval_notifier invalidate to signal
* that the mm refcount is zero and the range is no longer accessible.
*
* @MMU_NOTIFY_MIGRATE: used during migrate_vma_collect() invalidate to signal
* a device driver to possibly ignore the invalidation if the
* owner field matches the driver's device private pgmap owner.
*
* @MMU_NOTIFY_EXCLUSIVE: conversion of a page table entry to device-exclusive.
* The owner is initialized to the value provided by the caller of
* make_device_exclusive(), such that this caller can filter out these
* events.
*/
enum mmu_notifier_event {
MMU_NOTIFY_UNMAP = 0,
MMU_NOTIFY_CLEAR,
MMU_NOTIFY_PROTECTION_VMA,
MMU_NOTIFY_PROTECTION_PAGE,
MMU_NOTIFY_SOFT_DIRTY,
MMU_NOTIFY_RELEASE,
MMU_NOTIFY_MIGRATE,
MMU_NOTIFY_EXCLUSIVE,
};
#define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
struct mmu_notifier_ops {
/*
* Called either by mmu_notifier_unregister or when the mm is
* being destroyed by exit_mmap, always before all pages are
* freed. This can run concurrently with other mmu notifier
* methods (the ones invoked outside the mm context) and it
* should tear down all secondary mmu mappings and freeze the
* secondary mmu. If this method isn't implemented you've to
* be sure that nothing could possibly write to the pages
* through the secondary mmu by the time the last thread with
* tsk->mm == mm exits.
*
* As side note: the pages freed after ->release returns could
* be immediately reallocated by the gart at an alias physical
* address with a different cache model, so if ->release isn't
* implemented because all _software_ driven memory accesses
* through the secondary mmu are terminated by the time the
* last thread of this mm quits, you've also to be sure that
* speculative _hardware_ operations can't allocate dirty
* cachelines in the cpu that could not be snooped and made
* coherent with the other read and write operations happening
* through the gart alias address, so leading to memory
* corruption.
*/
void (*release)(struct mmu_notifier *subscription,
struct mm_struct *mm);
/*
* clear_flush_young is called after the VM is
* test-and-clearing the young/accessed bitflag in the
* pte. This way the VM will provide proper aging to the
* accesses to the page through the secondary MMUs and not
* only to the ones through the Linux pte.
* Start-end is necessary in case the secondary MMU is mapping the page
* at a smaller granularity than the primary MMU.
*/
bool (*clear_flush_young)(struct mmu_notifier *subscription,
struct mm_struct *mm,
unsigned long start,
unsigned long end);
/*
* clear_young is a lightweight version of clear_flush_young. Like the
* latter, it is supposed to test-and-clear the young/accessed bitflag
* in the secondary pte, but it may omit flushing the secondary tlb.
*/
bool (*clear_young)(struct mmu_notifier *subscription,
struct mm_struct *mm,
unsigned long start,
unsigned long end);
/*
* test_young is called to check the young/accessed bitflag in
* the secondary pte. This is used to know if the page is
* frequently used without actually clearing the flag or tearing
* down the secondary mapping on the page.
*/
bool (*test_young)(struct mmu_notifier *subscription,
struct mm_struct *mm,
unsigned long address);
/*
* invalidate_range_start() and invalidate_range_end() must be
* paired and are called only when the mmap_lock and/or the
* locks protecting the reverse maps are held. If the subsystem
* can't guarantee that no additional references are taken to
* the pages in the range, it has to implement the
* invalidate_range() notifier to remove any references taken
* after invalidate_range_start().
*
* Invalidation of multiple concurrent ranges may be
* optionally permitted by the driver. Either way the
* establishment of sptes is forbidden in the range passed to
* invalidate_range_begin/end for the whole duration of the
* invalidate_range_begin/end critical section.
*
* invalidate_range_start() is called when all pages in the
* range are still mapped and have at least a refcount of one.
*
* invalidate_range_end() is called when all pages in the
* range have been unmapped and the pages have been freed by
* the VM.
*
* The VM will remove the page table entries and potentially
* the page between invalidate_range_start() and
* invalidate_range_end(). If the page must not be freed
* because of pending I/O or other circumstances then the
* invalidate_range_start() callback (or the initial mapping
* by the driver) must make sure that the refcount is kept
* elevated.
*
* If the driver increases the refcount when the pages are
* initially mapped into an address space then either
* invalidate_range_start() or invalidate_range_end() may
* decrease the refcount. If the refcount is decreased on
* invalidate_range_start() then the VM can free pages as page
* table entries are removed. If the refcount is only
* dropped on invalidate_range_end() then the driver itself
* will drop the last refcount but it must take care to flush
* any secondary tlb before doing the final free on the
* page. Pages will no longer be referenced by the linux
* address space but may still be referenced by sptes until
* the last refcount is dropped.
*
* If blockable argument is set to false then the callback cannot
* sleep and has to return with -EAGAIN if sleeping would be required.
* 0 should be returned otherwise. Please note that notifiers that can
* fail invalidate_range_start are not allowed to implement
* invalidate_range_end, as there is no mechanism for informing the
* notifier that its start failed.
*/
int (*invalidate_range_start)(struct mmu_notifier *subscription,
const struct mmu_notifier_range *range);
void (*invalidate_range_end)(struct mmu_notifier *subscription,
const struct mmu_notifier_range *range);
/*
* arch_invalidate_secondary_tlbs() is used to manage a non-CPU TLB
* which shares page-tables with the CPU. The
* invalidate_range_start()/end() callbacks should not be implemented as
* invalidate_secondary_tlbs() already catches the points in time when
* an external TLB needs to be flushed.
*
* This requires arch_invalidate_secondary_tlbs() to be called while
* holding the ptl spin-lock and therefore this callback is not allowed
* to sleep.
*
* This is called by architecture code whenever invalidating a TLB
* entry. It is assumed that any secondary TLB has the same rules for
* when invalidations are required. If this is not the case architecture
* code will need to call this explicitly when required for secondary
* TLB invalidation.
*/
void (*arch_invalidate_secondary_tlbs)(
struct mmu_notifier *subscription,
struct mm_struct *mm,
unsigned long start,
unsigned long end);
/*
* These callbacks are used with the get/put interface to manage the
* lifetime of the mmu_notifier memory. alloc_notifier() returns a new
* notifier for use with the mm.
*
* free_notifier() is only called after the mmu_notifier has been
* fully put, calls to any ops callback are prevented and no ops
* callbacks are currently running. It is called from a SRCU callback
* and cannot sleep.
*/
struct mmu_notifier *(*alloc_notifier)(struct mm_struct *mm);
void (*free_notifier)(struct mmu_notifier *subscription);
};
/*
* The notifier chains are protected by mmap_lock and/or the reverse map
* semaphores. Notifier chains are only changed when all reverse maps and
* the mmap_lock locks are taken.
*
* Therefore notifier chains can only be traversed when either
*
* 1. mmap_lock is held.
* 2. One of the reverse map locks is held (i_mmap_rwsem or anon_vma->rwsem).
* 3. No other concurrent thread can access the list (release)
*/
struct mmu_notifier {
struct hlist_node hlist;
const struct mmu_notifier_ops *ops;
struct mm_struct *mm;
struct rcu_head rcu;
unsigned int users;
};
/**
* struct mmu_interval_notifier_finish - mmu_interval_notifier two-pass abstraction
* @link: Lockless list link for the notifiers pending pass list
* @notifier: The mmu_interval_notifier for which the finish pass is called.
*
* Allocate, typically using GFP_NOWAIT in the interval notifier's start pass.
* Note that with a large number of notifiers implementing two passes,
* allocation with GFP_NOWAIT will become increasingly likely to fail, so consider
* implementing a small pool instead of using kmalloc() allocations.
*
* If the implementation needs to pass data between the start and the finish passes,
* the recommended way is to embed struct mmu_interval_notifier_finish into a larger
* structure that also contains the data needed to be shared. Keep in mind that
* a notifier callback can be invoked in parallel, and each invocation needs its
* own struct mmu_interval_notifier_finish.
*
* If allocation fails, then the &mmu_interval_notifier_ops->invalidate_start op
* needs to implements the full notifier functionality. Please refer to its
* documentation.
*/
struct mmu_interval_notifier_finish {
struct llist_node link;
struct mmu_interval_notifier *notifier;
};
/**
* struct mmu_interval_notifier_ops - callback for range notification
* @invalidate: Upon return the caller must stop using any SPTEs within this
* range. This function can sleep. Return false only if sleeping
* was required but mmu_notifier_range_blockable(range) is false.
* @invalidate_start: Similar to @invalidate, but intended for two-pass notifier
* callbacks where the call to @invalidate_start is the first
* pass and any struct mmu_interval_notifier_finish pointer
* returned in the @finish parameter describes the finish pass.
* If *@finish is %NULL on return, then no final pass will be
* called, and @invalidate_start needs to implement the full
* notifier, behaving like @invalidate. The value of *@finish
* is guaranteed to be %NULL at function entry.
* @invalidate_finish: Called as the second pass for any notifier that returned
* a non-NULL *@finish from @invalidate_start. The @finish
* pointer passed here is the same one returned by
* @invalidate_start.
*/
struct mmu_interval_notifier_ops {
bool (*invalidate)(struct mmu_interval_notifier *interval_sub,
const struct mmu_notifier_range *range,
unsigned long cur_seq);
bool (*invalidate_start)(struct mmu_interval_notifier *interval_sub,
const struct mmu_notifier_range *range,
unsigned long cur_seq,
struct mmu_interval_notifier_finish **finish);
void (*invalidate_finish)(struct mmu_interval_notifier_finish *finish);
};
struct mmu_interval_notifier {
struct interval_tree_node interval_tree;
const struct mmu_interval_notifier_ops *ops;
struct mm_struct *mm;
struct hlist_node deferred_item;
unsigned long invalidate_seq;
};
#ifdef CONFIG_MMU_NOTIFIER
#ifdef CONFIG_LOCKDEP
extern struct lockdep_map __mmu_notifier_invalidate_range_start_map;
#endif
struct mmu_notifier_range {
struct mm_struct *mm;
unsigned long start;
unsigned long end;
unsigned flags;
enum mmu_notifier_event event;
void *owner;
};
static inline int mm_has_notifiers(struct mm_struct *mm)
{
return unlikely(mm->notifier_subscriptions);
}
struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
struct mm_struct *mm);
static inline struct mmu_notifier *
mmu_notifier_get(const struct mmu_notifier_ops *ops, struct mm_struct *mm)
{
struct mmu_notifier *ret;
mmap_write_lock(mm);
ret = mmu_notifier_get_locked(ops, mm);
mmap_write_unlock(mm);
return ret;
}
void mmu_notifier_put(struct mmu_notifier *subscription);
void mmu_notifier_synchronize(void);
extern int mmu_notifier_register(struct mmu_notifier *subscription,
struct mm_struct *mm);
extern int __mmu_notifier_register(struct mmu_notifier *subscription,
struct mm_struct *mm);
extern void mmu_notifier_unregister(struct mmu_notifier *subscription,
struct mm_struct *mm);
unsigned long
mmu_interval_read_begin(struct mmu_interval_notifier *interval_sub);
int mmu_interval_notifier_insert(struct mmu_interval_notifier *interval_sub,
struct mm_struct *mm, unsigned long start,
unsigned long length,
const struct mmu_interval_notifier_ops *ops);
int mmu_interval_notifier_insert_locked(
struct mmu_interval_notifier *interval_sub, struct mm_struct *mm,
unsigned long start, unsigned long length,
const struct mmu_interval_notifier_ops *ops);
void mmu_interval_notifier_remove(struct mmu_interval_notifier *interval_sub);
/**
* mmu_interval_set_seq - Save the invalidation sequence
* @interval_sub: The subscription passed to invalidate
* @cur_seq: The cur_seq passed to the invalidate() callback
*
* This must be called unconditionally from the invalidate callback of a
* struct mmu_interval_notifier_ops under the same lock that is used to call
* mmu_interval_read_retry(). It updates the sequence number for later use by
* mmu_interval_read_retry(). The provided cur_seq will always be odd.
*
* If the caller does not call mmu_interval_read_begin() or
* mmu_interval_read_retry() then this call is not required.
*/
static inline void
mmu_interval_set_seq(struct mmu_interval_notifier *interval_sub,
unsigned long cur_seq)
{
WRITE_ONCE(interval_sub->invalidate_seq, cur_seq);
}
/**
* mmu_interval_read_retry - End a read side critical section against a VA range
* @interval_sub: The subscription
* @seq: The return of the paired mmu_interval_read_begin()
*
* This MUST be called under a user provided lock that is also held
* unconditionally by op->invalidate() when it calls mmu_interval_set_seq().
*
* Each call should be paired with a single mmu_interval_read_begin() and
* should be used to conclude the read side.
*
* Returns: true if an invalidation collided with this critical section, and
* the caller should retry.
*/
static inline bool
mmu_interval_read_retry(struct mmu_interval_notifier *interval_sub,
unsigned long seq)
{
return interval_sub->invalidate_seq != seq;
}
/**
* mmu_interval_check_retry - Test if a collision has occurred
* @interval_sub: The subscription
* @seq: The return of the matching mmu_interval_read_begin()
*
* This can be used in the critical section between mmu_interval_read_begin()
* and mmu_interval_read_retry().
*
* This call can be used as part of loops and other expensive operations to
* expedite a retry.
* It can be called many times and does not have to hold the user
* provided lock.
*
* Returns: true indicates an invalidation has collided with this critical
* region and a future mmu_interval_read_retry() will return true.
* False is not reliable and only suggests a collision may not have
* occurred.
*/
static inline bool
mmu_interval_check_retry(struct mmu_interval_notifier *interval_sub,
unsigned long seq)
{
/* Pairs with the WRITE_ONCE in mmu_interval_set_seq() */
return READ_ONCE(interval_sub->invalidate_seq) != seq;
}
extern void __mmu_notifier_subscriptions_destroy(struct mm_struct *mm);
extern void __mmu_notifier_release(struct mm_struct *mm);
bool __mmu_notifier_clear_flush_young(struct mm_struct *mm,
unsigned long start, unsigned long end);
bool __mmu_notifier_clear_young(struct mm_struct *mm,
unsigned long start, unsigned long end);
bool __mmu_notifier_test_young(struct mm_struct *mm,
unsigned long address);
extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r);
extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r);
extern void __mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
unsigned long start, unsigned long end);
extern bool
mmu_notifier_range_update_to_read_only(const struct mmu_notifier_range *range);
static inline bool
mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
{
return (range->flags & MMU_NOTIFIER_RANGE_BLOCKABLE);
}
static inline void mmu_notifier_release(struct mm_struct *mm)
{
if (mm_has_notifiers(mm))
__mmu_notifier_release(mm);
}
static inline bool mmu_notifier_clear_flush_young(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
if (mm_has_notifiers(mm))
return __mmu_notifier_clear_flush_young(mm, start, end);
return false;
}
static inline bool mmu_notifier_clear_young(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
if (mm_has_notifiers(mm))
return __mmu_notifier_clear_young(mm, start, end);
return false;
}
static inline bool mmu_notifier_test_young(struct mm_struct *mm,
unsigned long address)
{
if (mm_has_notifiers(mm))
return __mmu_notifier_test_young(mm, address);
return false;
}
static inline void
mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
{
might_sleep();
lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
if (mm_has_notifiers(range->mm)) {
range->flags |= MMU_NOTIFIER_RANGE_BLOCKABLE;
__mmu_notifier_invalidate_range_start(range);
}
lock_map_release(&__mmu_notifier_invalidate_range_start_map);
}
/*
* This version of mmu_notifier_invalidate_range_start() avoids blocking, but it
* can return an error if a notifier can't proceed without blocking, in which
* case you're not allowed to modify PTEs in the specified range.
*
* This is mainly intended for OOM handling.
*/
static inline int __must_check
mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
{
int ret = 0;
lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
if (mm_has_notifiers(range->mm)) {
range->flags &= ~MMU_NOTIFIER_RANGE_BLOCKABLE;
ret = __mmu_notifier_invalidate_range_start(range);
}
lock_map_release(&__mmu_notifier_invalidate_range_start_map);
return ret;
}
static inline void
mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range)
{
if (mmu_notifier_range_blockable(range))
might_sleep();
if (mm_has_notifiers(range->mm))
__mmu_notifier_invalidate_range_end(range);
}
static inline void mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
if (mm_has_notifiers(mm))
__mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
}
static inline void mmu_notifier_subscriptions_init(struct mm_struct *mm)
{
mm->notifier_subscriptions = NULL;
}
static inline void mmu_notifier_subscriptions_destroy(struct mm_struct *mm)
{
if (mm_has_notifiers(mm))
__mmu_notifier_subscriptions_destroy(mm);
}
static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
enum mmu_notifier_event event,
unsigned flags,
struct mm_struct *mm,
unsigned long start,
unsigned long end)
{
range->event = event;
range->mm = mm;
range->start = start;
range->end = end;
range->flags = flags;
}
static inline void mmu_notifier_range_init_owner(
struct mmu_notifier_range *range,
enum mmu_notifier_event event, unsigned int flags,
struct mm_struct *mm, unsigned long start,
unsigned long end, void *owner)
{
mmu_notifier_range_init(range, event, flags, mm, start, end);
range->owner = owner;
}
#else /* CONFIG_MMU_NOTIFIER */
struct mmu_notifier_range {
unsigned long start;
unsigned long end;
};
static inline void _mmu_notifier_range_init(struct mmu_notifier_range *range,
unsigned long start,
unsigned long end)
{
range->start = start;
range->end = end;
}
#define mmu_notifier_range_init(range,event,flags,mm,start,end) \
_mmu_notifier_range_init(range, start, end)
#define mmu_notifier_range_init_owner(range, event, flags, mm, start, \
end, owner) \
_mmu_notifier_range_init(range, start, end)
static inline bool
mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
{
return true;
}
static inline int mm_has_notifiers(struct mm_struct *mm)
{
return 0;
}
static inline void mmu_notifier_release(struct mm_struct *mm)
{
}
static inline bool mmu_notifier_clear_flush_young(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
return false;
}
static inline bool mmu_notifier_clear_young(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
return false;
}
static inline bool mmu_notifier_test_young(struct mm_struct *mm,
unsigned long address)
{
return false;
}
static inline void
mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
{
}
static inline int
mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
{
return 0;
}
static inline
void mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range)
{
}
static inline void mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
}
static inline void mmu_notifier_subscriptions_init(struct mm_struct *mm)
{
}
static inline void mmu_notifier_subscriptions_destroy(struct mm_struct *mm)
{
}
#define mmu_notifier_range_update_to_read_only(r) false
static inline void mmu_notifier_synchronize(void)
{
}
#endif /* CONFIG_MMU_NOTIFIER */
#endif /* _LINUX_MMU_NOTIFIER_H */