path: root/sys/vm
* Make UMA and malloc(9) return non-executable memory in most cases.
  Jonathan T. Looney, 2018-06-13 (7 files, -23/+113)

  Most kernel memory that is allocated after boot does not need to be executable. There are a few exceptions. For example, kernel modules do need executable memory, but they don't use UMA or malloc(9). The BPF JIT compiler also needs executable memory and did use malloc(9) until r317072.

  (Note that a side effect of r316767 was that the "small allocation" path in UMA on amd64 already returned non-executable memory. This meant that some calls to malloc(9) or the UMA zone(9) allocator could return executable memory, while others could return non-executable memory. This change makes the behavior consistent.)

  This change makes malloc(9) return non-executable memory unless the new M_EXEC flag is specified. After this change, the UMA zone(9) allocator will always return non-executable memory, and a KASSERT will catch attempts to use the M_EXEC flag to allocate executable memory using uma_zalloc() or its variants.

  Allocations that do need executable memory have various choices. They may use the M_EXEC flag to malloc(9), or they may use a different VM interface to obtain executable pages. Now that malloc(9) again allows executable allocations, this change also reverts most of r317072.

  PR:           228927
  Reviewed by:  alc, kib, markj, jhb (previous version)
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D15691
  Notes: svn path=/head/; revision=335068
* uma: implement provisional api for per-cpu zones
  Mateusz Guzik, 2018-06-08 (2 files, -0/+42)

  Per-cpu zone allocations are very rarely done compared to regular zones. The intent is to avoid pessimizing the latter case with per-cpu specific code.

  In particular, contrary to the claim in r334824, M_ZERO is sometimes used for such zones. But the zeroing method is completely different, and branching on it in the fast path for regular zones is a waste of time.

  Notes: svn path=/head/; revision=334858
* uma: fix up r334824
  Mateusz Guzik, 2018-06-08 (1 file, -1/+2)

  Turns out there is code which ends up passing M_ZERO to counters. Since counters zero unconditionally on their own, just drop the flag in that place.

  Notes: svn path=/head/; revision=334830
* uma: remove M_ZERO support for pcpu zones
  Mateusz Guzik, 2018-06-08 (1 file, -6/+3)

  Nothing in the tree uses it, and pcpu zones have a fundamentally different use case than the regular zones - they are not supposed to be allocated and freed all the time. This reduces pollution in the allocation fast path.

  Notes: svn path=/head/; revision=334824
* UMA memory debugging enabled with INVARIANTS consists of two things:
  Gleb Smirnoff, 2018-06-08 (1 file, -20/+124)

  trashing freed memory and checking that allocated memory is properly trashed, and also of keeping a bitset of freed items. Trashing/checking creates a lot of CPU cache poisoning, while keeping debugging bitsets consistent creates a lot of contention on UMA zone lock(s). The performance difference between an INVARIANTS kernel and a normal one is mostly attributed to UMA debugging, rather than to all the KASSERT checks in the kernel.

  Add a loader tunable vm.debug.divisor that allows either turning off UMA debugging completely, or turning it on only for a fraction of allocations, while still running all KASSERTs in the kernel. That allows running INVARIANTS kernels in production environments without reducing load by orders of magnitude, while still doing useful extra checks.

  The default value is 1, meaning debug every allocation. A value of 0 disables UMA debugging completely. Values above 1 enable debugging only for every N-th item. It isn't possible to follow the number strictly, but the amount of debugging is still reduced by roughly (N-1)/N.

  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D15199
  Notes: svn path=/head/; revision=334819
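The tunable described above is set from the loader; a sketch of the knob in /boot/loader.conf (the tunable name vm.debug.divisor is from the commit message; the particular values chosen here are illustrative):

```
# /boot/loader.conf -- fraction of UMA allocations to trash/check
vm.debug.divisor=1     # default: debug every allocation
#vm.debug.divisor=0    # turn UMA debugging off entirely
#vm.debug.divisor=64   # debug roughly one allocation in 64
```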
* Fix a typo in vm_domain_set().
  Jonathan T. Looney, 2018-06-07 (1 file, -1/+1)

  When a domain crosses into the severe range, we need to set the domain bit in the vm_severe_domains bitset (instead of clearing it).

  Reviewed by:  jeff, markj
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=334783
* Reimplement brk() and sbrk() to avoid the use of _end.
  Mark Johnston, 2018-06-04 (1 file, -4/+10)

  Previously, libc.so would initialize its notion of the break address using _end, a special symbol emitted by the static linker following the bss section. Compatibility issues between lld and ld.bfd could cause the wrong definition of _end (libc.so's definition rather than that of the executable) to be used, breaking the brk()/sbrk() interface.

  Avoid this problem and future interoperability issues by simply not relying on _end. Instead, modify the break() system call to return the kernel's view of the current break address, and have libc initialize its state using an extra syscall upon the first use of the interface.

  As a side effect, this appears to fix brk()/sbrk() usage in executables run with rtld direct exec, since the kernel and libc.so no longer maintain separate views of the process' break address.

  PR:          228574
  Reviewed by: kib (previous version)
  MFC after:   2 months
  Differential Revision: https://reviews.freebsd.org/D15663
  Notes: svn path=/head/; revision=334626
* Correct the description of vm_pageout_scan_inactive() after r334508.
  Mark Johnston, 2018-06-04 (1 file, -3/+2)

  Reported by: alc
  Notes: svn path=/head/; revision=334622
* Use a single, consistent approach to returning success versus failure in vm_map_madvise().
  Alan Cox, 2018-06-04 (2 files, -13/+10)

  Previously, vm_map_madvise() used a traditional Unix-style "return (0);" to indicate success in the common case, but Mach-style return values in the edge cases. Since KERN_SUCCESS equals zero, the only problem with this inconsistency was stylistic.

  vm_map_madvise() has exactly two callers in the entire source tree, and only one of them cares about the return value. That caller, kern_madvise(), can be simplified if vm_map_madvise() consistently uses Unix-style return values.

  Since vm_map_madvise() uses the variable modify_map as a Boolean, make it one.

  Eliminate a redundant error check from kern_madvise(). Add a comment explaining where the check is performed.

  Explicitly note that exec_release_args_kva() doesn't care about vm_map_madvise()'s return value. Since MADV_FREE is passed as the behavior, the return value will always be zero.

  Reviewed by: kib, markj
  MFC after:   7 days
  Notes: svn path=/head/; revision=334621
* Align UMA data to 128 byte cacheline size
  Justin Hibbits, 2018-06-04 (1 file, -1/+1)

  Suggested by: mjg
  Notes: svn path=/head/; revision=334618
* Remove the "pass" variable from the page daemon control loop.
  Mark Johnston, 2018-06-02 (1 file, -48/+41)

  It serves little purpose after r308474 and r329882. As a side effect, the removal fixes a bug in r329882 which caused the page daemon to periodically invoke lowmem handlers even in the absence of memory pressure.

  Reviewed by: jeff
  Differential Revision: https://reviews.freebsd.org/D15491
  Notes: svn path=/head/; revision=334508
* Only check for MAP_32BIT when available.
  Konstantin Belousov, 2018-06-01 (1 file, -1/+4)

  Reported by:  mmacy
  Sponsored by: The FreeBSD Foundation
  MFC after:    10 days
  Notes: svn path=/head/; revision=334507
* Only a small subset of mmap(2)'s flags should be used in combination with the flag MAP_GUARD.
  Alan Cox, 2018-06-01 (1 file, -2/+2)

  Rather than enumerating the flags that are not allowed, enumerate the flags that are allowed. The list of allowed flags is much shorter and less likely to change.

  (As an aside, one of the previously enumerated flags, MAP_PREFAULT, was not even a legal flag for mmap(2). However, because of an earlier check within kern_mmap(), this misuse of MAP_PREFAULT was harmless.)

  Reviewed by: kib
  MFC after:   10 days
  Notes: svn path=/head/; revision=334499
* Typo.
  Mark Johnston, 2018-05-30 (1 file, -1/+1)

  PR:           228533
  Submitted by: Jakub Piecuch <j.piecuch96@gmail.com>
  MFC after:    1 week
  Notes: svn path=/head/; revision=334389
* Addendum to r334233.
  Alan Cox, 2018-05-28 (1 file, -1/+1)

  In vm_fault_populate(), since the page lock is held, we must use vm_page_xunbusy_maybelocked() rather than vm_page_xunbusy() to unbusy the page.

  Reviewed by: kib
  X-MFC with:  r334233
  Notes: svn path=/head/; revision=334287
* Eliminate duplicate assertions.
  Alan Cox, 2018-05-28 (1 file, -6/+8)

  We assert at the start of vm_fault_hold() that the map entry is wired if the caller passes the flag VM_FAULT_WIRE. Eliminate the same assertion, but spelled differently, at the end of vm_fault_hold() and vm_fault_populate(). Repeat the assertion only if the map is unlocked and the map lookup must be repeated.

  Reviewed by: kib
  MFC after:   10 days
  Differential Revision: https://reviews.freebsd.org/D15582
  Notes: svn path=/head/; revision=334274
* Use pmap_enter(..., psind=1) in vm_fault_populate() on amd64.
  Alan Cox, 2018-05-26 (1 file, -18/+38)

  While superpage mappings were already being created by automatic promotion in vm_fault_populate(), this change reduces the cost of creating those mappings. Essentially, one pmap_enter(..., psind=1) call takes the place of 512 pmap_enter(..., psind=0) calls, and that one pmap_enter(..., psind=1) call eliminates the allocation of a page table page.

  Reviewed by: kib
  MFC after:   10 days
  Differential Revision: https://reviews.freebsd.org/D15572
  Notes: svn path=/head/; revision=334233
* Make vadvise compat freebsd11.
  Brooks Davis, 2018-05-25 (1 file, -13/+4)

  The vadvise syscall (aka ovadvise) is undocumented and has always been implemented as returning EINVAL. Put the syscall under COMPAT11 and provide a userspace implementation.

  Reviewed by:  kib
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D15557
  Notes: svn path=/head/; revision=334223
* Eliminate an unused parameter from vm_fault_populate().
  Alan Cox, 2018-05-24 (1 file, -4/+4)

  Reviewed by: kib
  MFC after:   10 days
  Notes: svn path=/head/; revision=334180
* Update r334154 with review feedback from D15490.
  Mark Johnston, 2018-05-24 (1 file, -18/+21)

  An old revision was committed by accident.

  Differential Revision: https://reviews.freebsd.org/D15490
  Notes: svn path=/head/; revision=334179
* Don't implement break(2) at all on aarch64 and riscv.
  Brooks Davis, 2018-05-24 (1 file, -5/+4)

  This should have been done when they were removed from libc, but was overlooked in the runup to 11.0. No users should exist.

  Approved by:  andrew
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D15539
  Notes: svn path=/head/; revision=334168
* Split the active and inactive queue scans into separate subroutines.
  Mark Johnston, 2018-05-24 (1 file, -179/+215)

  The scans are largely independent, so this helps make the code marginally neater, and makes it easier to incorporate feedback from the active queue scan into the page daemon control loop.

  Improve some comments while here. No functional change intended.

  Reviewed by: alc, kib
  Differential Revision: https://reviews.freebsd.org/D15490
  Notes: svn path=/head/; revision=334154
* Ensure that "m" is initialized in vm_page_alloc_freelist_domain().
  Mark Johnston, 2018-05-22 (1 file, -3/+1)

  While here, remove a superfluous comment.

  Coverity CID: 1383559
  MFC after:    3 days
  Notes: svn path=/head/; revision=334057
* Use the canonical check for reservation support.
  Mark Johnston, 2018-05-19 (1 file, -4/+3)

  Notes: svn path=/head/; revision=333903
* Don't increment addl_page_shortage for wired pages.
  Mark Johnston, 2018-05-18 (1 file, -2/+1)

  Such pages are dequeued as they're encountered during the inactive queue scan, so by the time we get to the active queue scan, they should have already been subtracted from the inactive queue length.

  Reviewed by: alc
  Differential Revision: https://reviews.freebsd.org/D15479
  Notes: svn path=/head/; revision=333799
* Fix a race in vm_page_pagequeue_lockptr().
  Mark Johnston, 2018-05-17 (2 files, -3/+4)

  The value of m->queue must be cached after comparing it with PQ_NONE, since it may be concurrently changing.

  Reported by: glebius
  Reviewed by: jeff
  Differential Revision: https://reviews.freebsd.org/D15462
  Notes: svn path=/head/; revision=333703
* Fix powerpc64 LINT
  Matt Macy, 2018-05-17 (1 file, -1/+4)

  vm_object_reserve() == true is impossible on power. Make conditional on VM_LEVEL_0_ORDER being defined.

  Reviewed by: jeff
  Approved by: sbruno
  Notes: svn path=/head/; revision=333701
* Get rid of vm_pageout_page_queued().
  Mark Johnston, 2018-05-13 (1 file, -23/+7)

  vm_page_queue(), added in r333256, generalizes vm_pageout_page_queued(), so use it instead. No functional change intended.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D15402
  Notes: svn path=/head/; revision=333581
* uma: increase alignment to 128 bytes on amd64
  Mateusz Guzik, 2018-05-11 (1 file, -1/+1)

  Current UMA internals are not suited for efficient operation in multi-socket environments. In particular there is very common use of MAXCPU arrays and other fields which are not always properly aligned and are not local for target threads (apart from the first node, of course). Turns out the existing UMA_ALIGN macro can be used to mostly work around the problem until the code gets fixed. The current setting of 64 bytes runs into trouble when the adjacent cache line prefetcher gets to work.

  An example 128-way benchmark doing a lot of malloc/frees has the following instruction samples:

  before:
      kernel`lf_advlockasync+0x43b    32940
      kernel`malloc+0xe5              42380
      kernel`bzero+0x19               47798
      kernel`spinlock_exit+0x26       60423
      kernel`0xffffffff80             78238
      0x0                            136947
      kernel`uma_zfree_arg+0x46      159594
      kernel`uma_zalloc_arg+0x672    180556
      kernel`uma_zfree_arg+0x2a      459923
      kernel`uma_zalloc_arg+0x5ec    489910

  after:
      kernel`bzero+0xd                46115
      kernel`lf_advlockasync+0x25f    46134
      kernel`lf_advlockasync+0x38a    49078
      kernel`fget_unlocked+0xd1       49942
      kernel`lf_advlockasync+0x43b    55392
      kernel`copyin+0x4a              56963
      kernel`bzero+0x19               81983
      kernel`spinlock_exit+0x26       91889
      kernel`0xffffffff80            136357
      0x0                            239424

  See the review for more details.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D15346
  Notes: svn path=/head/; revision=333484
* Fix some races introduced in r332974.
  Mark Johnston, 2018-05-04 (4 files, -33/+36)

  With r332974, when performing a synchronized access of a page's "queue" field, one must first check whether the page is logically dequeued. If so, then the page lock does not prevent the page from being removed from its page queue. Introduce vm_page_queue(), which returns the page's logical queue index. In some cases, direct access to the "queue" field is still required, but such accesses should be confined to sys/vm.

  Reported and tested by: pho
  Reviewed by:  kib
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D15280
  Notes: svn path=/head/; revision=333256
* Eliminate some vm object relocks in vm fault.
  Konstantin Belousov, 2018-04-29 (1 file, -9/+13)

  For the vm_fault_prefault() call from vm_fault_soft_fast(), extend the scope of the object rlock to avoid re-taking it inside vm_fault_prefault(). This causes pmap_enter_quick() to sometimes be called with the shadow object lock as well as the page lock held, but this looks innocent.

  Noted and measured by: mjg
  Reviewed by:  alc, markj (as part of the larger patch)
  Tested by:    pho (as part of the larger patch)
  Sponsored by: The FreeBSD Foundation
  MFC after:    1 week
  Differential revision: https://reviews.freebsd.org/D15122
  Notes: svn path=/head/; revision=333091
* uma: whack main zone counter update in the slow path
  Mateusz Guzik, 2018-04-27 (1 file, -8/+0)

  Cached counters are typically zero at this point so it performs avoidable atomics. Everything reading them also reads the cached ones, thus there is really no point.

  Reviewed by: jeff
  Notes: svn path=/head/; revision=333052
* vm: move vm_cnt to __read_mostly now that it is not written to
  Mateusz Guzik, 2018-04-27 (1 file, -1/+1)

  While here whack unused locking keys for the struct.

  Discussed with: jeff
  Notes: svn path=/head/; revision=333051
* Improve VM page queue scalability.
  Mark Johnston, 2018-04-24 (7 files, -548/+889)

  Currently both the page lock and a page queue lock must be held in order to enqueue, dequeue or requeue a page in a given page queue. The queue locks are a scalability bottleneck in many workloads.

  This change reduces page queue lock contention by batching queue operations. To detangle the page and page queue locks, per-CPU batch queues are used to reference pages with pending queue operations. The requested operation is encoded in the page's aflags field with the page lock held, after which the page is enqueued for a deferred batch operation. Page queue scans are similarly optimized to minimize the amount of work performed with a page queue lock held.

  Reviewed by:  kib, jeff (previous versions)
  Tested by:    pho
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D14893
  Notes: svn path=/head/; revision=332974
* Add a UMA zone flag to disable the use of buckets.
  Mark Johnston, 2018-04-24 (2 files, -5/+10)

  This allows the creation of zones which don't do any caching in front of the keg. If the zone is a cache zone, this means that UMA will not attempt any memory allocations when allocating an item from the backend. This is intended for use after a panic by netdump, but likely has other applications.

  Reviewed by:  kib
  MFC after:    2 weeks
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D15184
  Notes: svn path=/head/; revision=332968
* Initialize marker pages in vm_page_domain_init().
  Mark Johnston, 2018-04-19 (4 files, -42/+44)

  They were previously initialized by the corresponding page daemon threads, but for vmd_inacthead this may be too late if vm_page_deactivate_noreuse() is called during boot.

  Reported and tested by: cperciva
  Reviewed by: alc, kib
  MFC after:   1 week
  Notes: svn path=/head/; revision=332771
* Ensure that m and skip_m belong to the same object.
  Mark Johnston, 2018-04-17 (1 file, -0/+2)

  Pages allocated from a given reservation may belong to different objects. It is therefore possible for vm_page_ps_test() to be called with the base page's object unlocked. Check for this case before asserting that the object lock is held.

  Reported by: jhb
  Reviewed by: kib
  MFC after:   1 week
  Notes: svn path=/head/; revision=332658
* Handle Skylake-X errata SKZ63.
  Konstantin Belousov, 2018-04-07 (2 files, -16/+25)

  SKZ63: Processor May Hang When Executing Code In an HLE Transaction Region.

  Problem: Under certain conditions, if the processor acquires an HLE (Hardware Lock Elision) lock via the XACQUIRE instruction in the Host Physical Address range between 40000000H and 403FFFFFH, it may hang with an internal timeout error (MCACOD 0400H) logged into IA32_MCi_STATUS.

  Move the pages from the range into the blacklist. Add a tunable to not waste 4M if local DoS is not the issue.

  Reviewed by:  markj
  Sponsored by: The FreeBSD Foundation
  MFC after:    1 week
  Differential revision: https://reviews.freebsd.org/D15001
  Notes: svn path=/head/; revision=332182
* Move most of the contents of opt_compat.h to opt_global.h.
  Brooks Davis, 2018-04-06 (4 files, -6/+0)

  opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more, so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options.

  Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures.

  Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files.

  Reviewed by:  kib, cem, jhb, jtl
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D14941
  Notes: svn path=/head/; revision=332122
* Ensure the background laundering threshold is positive after a scan.
  Mark Johnston, 2018-04-02 (1 file, -3/+5)

  The division added in r331732 meant that we wouldn't attempt a background laundering until at least v_free_target - v_free_min clean pages had been freed by the page daemon since the last laundering. If the inactive queue is depleted but not completely empty (e.g., because it contains busy pages), it can thus take a long time to meet this threshold.

  Restore the pre-r331732 behaviour of using a non-zero background laundering threshold if at least one inactive queue scan has elapsed since the last attempt at background laundering.

  Submitted by: tijl (original version)
  Notes: svn path=/head/; revision=331879
* Use UMA_SLAB_SPACE macro. No functional change here.
  Gleb Smirnoff, 2018-04-02 (1 file, -1/+1)

  Notes: svn path=/head/; revision=331873
* In uma_startup_count() handle a special case when the zone would fit into a single slab, but with the alignment adjustment it won't.
  Gleb Smirnoff, 2018-04-02 (1 file, -1/+3)

  Again, when there is only one item in a slab, alignment can be ignored. See the previous revision of this file for more info.

  PR: 227116
  Notes: svn path=/head/; revision=331872
* Handle a special case when a slab can fit only one allocation, and the zone has a large alignment.
  Gleb Smirnoff, 2018-04-02 (1 file, -1/+9)

  With alignment taken into account, uk_rsize will be greater than the space in a slab. However, since we have only one item per slab, it is always naturally aligned.

  Code that will panic before this change with a 4k page:

      z = uma_zcreate("test", 3984, NULL, NULL, NULL, NULL, 31, 0);
      uma_zalloc(z, M_WAITOK);

  A practical scenario to hit the panic is a machine with 56 CPUs and 2 NUMA domains, which yields a zone size of 3984.

  PR:        227116
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=331871
* Add a uma cache of free pages in the DEFAULT freepool.
  Jeff Roberson, 2018-04-01 (4 files, -8/+109)

  This gives us per-cpu alloc and free of pages. The cache is filled with as few trips to the phys allocator as possible by the use of a new vm_phys_alloc_npages() function which allocates as many as N pages.

  This code was originally by markj with the import function rewritten by me.

  Reviewed by:  markj, kib
  Tested by:    pho
  Sponsored by: Netflix, Dell/EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D14905
  Notes: svn path=/head/; revision=331863
* Add the flag ZONE_NOBUCKETCACHE.
  Jeff Roberson, 2018-04-01 (2 files, -1/+11)

  This flag instructs UMA not to keep a cache of fully populated buckets. This will be used in a follow-on commit.

  The flag idea was originally from markj.

  Reviewed by:  markj, kib
  Tested by:    pho
  Sponsored by: Netflix, Dell/EMC Isilon
  Notes: svn path=/head/; revision=331862
* Make vm_map_max/min/pmap KBI stable.
  Konstantin Belousov, 2018-03-30 (2 files, -0/+30)

  There are out of tree consumers of vm_map_min() and vm_map_max(), and I believe there are consumers of vm_map_pmap(), although the latter is arguably less in need of a KBI-stable interface. For the consumers' benefit, make modules using this KPI not depend on the struct vm_map layout.

  Reviewed by:  alc, markj
  Sponsored by: The FreeBSD Foundation
  MFC after:    1 week
  Differential revision: https://reviews.freebsd.org/D14902
  Notes: svn path=/head/; revision=331760
* Fix the background laundering mechanism after r329882.
  Mark Johnston, 2018-03-29 (2 files, -20/+19)

  Rather than using the number of inactive queue scans as a metric for how many clean pages are being freed by the page daemon, have the page daemon keep a running counter of the number of pages it has freed, and have the laundry thread use that when computing the background laundering threshold.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D14884
  Notes: svn path=/head/; revision=331732
* Implement several enhancements to NUMA policies.
  Jeff Roberson, 2018-03-29 (4 files, -26/+76)

  Add a new "interleave" allocation policy which stripes pages across domains with a stride or width, keeping contiguity within a multi-page region.

  Move the kernel to the dedicated numbered cpuset #2, making it possible to assign kernel threads and memory policy separately from user. This also eliminates the need for the complicated interrupt binding code.

  Add a sysctl API for viewing and manipulating domainsets.

  Refactor some of the cpuset_t manipulation code using the generic bitset type so that it can be used for both. This probably belongs in a dedicated subr file.

  Attempt to improve the include situation.

  Reviewed by:    kib
  Discussed with: jhb (cpuset parts)
  Tested by:      pho (before review feedback)
  Sponsored by:   Netflix, Dell/EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D14839
  Notes: svn path=/head/; revision=331723
* Move vm_ndomains to vm.h where it can be used with a single header include rather than requiring a half-dozen.
  Jeff Roberson, 2018-03-27 (2 files, -1/+2)

  Many non-vm files may want to know the number of valid domains.

  Sponsored by: Netflix, Dell/EMC Isilon
  Notes: svn path=/head/; revision=331605
* Allow specifying for vm_fault_quick_hold_pages() that nofault mode should be honored.
  Konstantin Belousov, 2018-03-26 (2 files, -1/+14)

  We must not sleep or acquire any MI VM locks if TDP_NOFAULTING is specified. On the other hand, there were some callers in the tree which set TDP_NOFAULTING for a larger scope than needed. I fixed the code which I wrote, but I suspect that linuxkpi and out of tree drm drivers might abuse this still.

  So only enable the mode for vm_fault_quick_hold_pages(), where vm_fault_hold() is not called, when specifically asked by the user. I decided to use a vm_prot_t flag to not change the KPI. Since the number of flags in vm_prot_t is limited, I reused the same flag which was already consumed for vm_map_lookup().

  Reported and tested by: pho (as part of the larger patch)
  Reviewed by:  markj
  Sponsored by: The FreeBSD Foundation
  MFC after:    1 week
  Differential revision: https://reviews.freebsd.org/D14825
  Notes: svn path=/head/; revision=331557