path: root/sys/vm
Commit message | Author | Age | Files | Lines
* Move kstack_cache_entry into the private header, and make the | Konstantin Belousov | 2011-12-16 | 1 | -7/+2
  stack cache list header accessible outside vm_glue.c.
  MFC after: 1 week
  Notes: svn path=/head/; revision=228567
* - The previous commit (r228449) accidentally moved the vm.stats.vm.* sysctls | Eitan Adler | 2011-12-14 | 1 | -47/+50
  to vm.stats.sys. Move them back.
  Noticed by: pho
  Reviewed by: bde (earlier version)
  Approved by: bz
  MFC after: 1 week
  Pointy hat to: me
  Notes: svn path=/head/; revision=228498
* Document a large number of currently undocumented sysctls. While here | Eitan Adler | 2011-12-13 | 1 | -108/+63
  fix some style(9) issues and reduce redundancy.
  PR: kern/155491
  PR: kern/155490
  PR: kern/155489
  Submitted by: Galimov Albert <wtfcrap@mail.ru>
  Approved by: bde
  Reviewed by: jhb
  MFC after: 1 week
  Notes: svn path=/head/; revision=228449
* Fix printf. | Konstantin Belousov | 2011-12-12 | 1 | -1/+1
  Submitted by: az
  MFC after: 1 week
  Notes: svn path=/head/; revision=228432
* Introduce vm_reserv_alloc_contig() and teach vm_page_alloc_contig() how to | Alan Cox | 2011-12-05 | 3 | -71/+269
  use superpage reservations. So, for the first time, kernel virtual memory that is allocated by contigmalloc(), kmem_alloc_attr(), and kmem_alloc_contig() can be promoted to superpages. In fact, even a series of small contigmalloc() allocations may collectively result in a promoted superpage.
  Eliminate some duplication of code in vm_reserv_alloc_page().
  Change the type of vm_reserv_reclaim_contig()'s first parameter in order that it be consistent with other vm_*_contig() functions.
  Tested by: marius (sparc64)
  Notes: svn path=/head/; revision=228287
* Rename vm_page_set_valid() to vm_page_set_valid_range(). | Konstantin Belousov | 2011-11-30 | 3 | -6/+6
  vm_page_set_valid() is the most reasonable name for the m->valid accessor.
  Reviewed by: attilio, alc
  Notes: svn path=/head/; revision=228156
* Hide the internals of vm_page_lock(9) from the loadable modules. | Konstantin Belousov | 2011-11-29 | 2 | -0/+49
  Since the address of the vm_page lock mutex depends on the kernel options, it is easy for a module to get out of sync with the kernel. No vm_page_lockptr() accessor is provided for modules. It can be added later if needed, unless a proper KPI is developed to serve the needs.
  Reviewed by: attilio, alc
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=228133
* Introduce the same mutex-wise fix in r227758 for sx locks. | Attilio Rao | 2011-11-21 | 1 | -29/+13
  The functions that offer file and line specifications are:
  - sx_assert_
  - sx_downgrade_
  - sx_slock_
  - sx_slock_sig_
  - sx_sunlock_
  - sx_try_slock_
  - sx_try_xlock_
  - sx_try_upgrade_
  - sx_unlock_
  - sx_xlock_
  - sx_xlock_sig_
  - sx_xunlock_
  Now vm_map locking is fully converted and no longer needs to know the specifics of the locking procedures.
  Reviewed by: kib
  MFC after: 1 month
  Notes: svn path=/head/; revision=227788
* Introduce macro stubs in the mutex implementation that will be always | Attilio Rao | 2011-11-20 | 1 | -15/+12
  defined and will allow consumers, willing to provide options, file and line to locking requests, to not worry about options redefining the interfaces. This is typically useful when there is the need to build another locking interface on top of the mutex one.
  The introduced functions that consumers can use are:
  - mtx_lock_flags_
  - mtx_unlock_flags_
  - mtx_lock_spin_flags_
  - mtx_unlock_spin_flags_
  - mtx_assert_
  - thread_lock_flags_
  Spare notes:
  - Likely we can get rid of all the 'INVARIANTS' specification in the ppbus code by using the same macro as done in this patch (but this is left to the ppbus maintainer)
  - all the other locking interfaces may require a similar cleanup, where the most notable case is sx which will allow a further cleanup of vm_map locking facilities
  - The patch should be fully compatible with older branches, thus an MFC is planned (in fact it uses all the underlying mechanisms already present).
  Comments review by: eadler, Ben Kaduk
  Discussed with: kib, jhb
  MFC after: 1 month
  Notes: svn path=/head/; revision=227758
* Eliminate end-of-line white space. | Alan Cox | 2011-11-17 | 1 | -2/+2
  Notes: svn path=/head/; revision=227606
* Refactor the code that performs physically contiguous memory allocation, | Alan Cox | 2011-11-16 | 5 | -109/+222
  yielding a new public interface, vm_page_alloc_contig(). This new function addresses some of the limitations of the current interfaces, contigmalloc() and kmem_alloc_contig(). For example, the physically contiguous memory that is allocated with those interfaces can only be allocated to the kernel vm object and must be mapped into the kernel virtual address space. It also provides functionality that vm_phys_alloc_contig() doesn't, such as wiring the returned pages. Moreover, unlike that function, it respects the low water marks on the paging queues and wakes up the page daemon when necessary. That said, at present, this new function can't be applied to all types of vm objects. However, that restriction will be eliminated in the coming weeks.
  From a design standpoint, this change also addresses an inconsistency between vm_phys_alloc_contig() and the other vm_phys_alloc*() functions. Specifically, vm_phys_alloc_contig() manipulated vm_page fields that other functions in vm/vm_phys.c didn't. Moreover, vm_phys_alloc_contig() knew about vnodes and reservations. Now, vm_page_alloc_contig() is responsible for these things.
  Reviewed by: kib
  Discussed with: jhb
  Notes: svn path=/head/; revision=227568
* Update the device pager interface, while keeping the compatibility | Konstantin Belousov | 2011-11-15 | 3 | -75/+175
  layer for old KPI and KBI. The new interface should be used together with the d_mmap_single cdevsw method.
  A device pager can be allocated with the cdev_pager_allocate(9) function, which takes a struct cdev_pager_ops containing the constructor, destructor and page fault handler methods supplied by the driver.
  The constructor and destructor, called at pager allocation and deallocation time, allow the driver to handle per-object private data.
  The pager handler is called to handle a page fault on the vm map entry backed by the driver pager. The driver shall return either the vm_page_t which should be mapped, or an error code (which does not cause a kernel panic anymore). The page handler interface has a placeholder to specify the access mode causing the fault, but currently PROT_READ is always passed there.
  Sponsored by: The FreeBSD Foundation
  Reviewed by: alc
  MFC after: 1 month
  Notes: svn path=/head/; revision=227530
* Remove the condition that is always true. | Konstantin Belousov | 2011-11-15 | 1 | -1/+1
  Submitted by: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=227529
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. | Ed Schouten | 2011-11-07 | 3 | -3/+4
  The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
  Notes: svn path=/head/; revision=227309
* Wake up the page daemon in vm_page_alloc_freelist() if it couldn't | Alan Cox | 2011-11-06 | 1 | -20/+36
  allocate the requested page because too few pages are cached or free.
  Document the VM_ALLOC_COUNT() option to vm_page_alloc() and vm_page_alloc_freelist().
  Make style changes to vm_page_alloc() and vm_page_alloc_freelist(), such as using a variable name that more closely corresponds to the comments.
  Notes: svn path=/head/; revision=227127
* Remove redundant definitions. The chunk was missed from r227102. | Konstantin Belousov | 2011-11-05 | 1 | -10/+0
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=227103
* Provide typedefs for the type of bit mask for the page bits. | Konstantin Belousov | 2011-11-05 | 3 | -30/+33
  Use the defined types instead of int when manipulating masks. Supposedly, it could fix support for 32KB page size in the machine-independent VM layer.
  Reviewed by: alc
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=227102
* Simplify the implementation of the failure case in kmem_alloc_attr(). | Alan Cox | 2011-11-04 | 1 | -8/+7
  Notes: svn path=/head/; revision=227072
* Add the posix_fadvise(2) system call. It is somewhat similar to | John Baldwin | 2011-11-04 | 2 | -0/+56
  madvise(2) except that it operates on a file descriptor instead of a memory region. It is currently only supported on regular files.
  Just as with madvise(2), the advice given to posix_fadvise(2) can be divided into two types. The first type provides hints about data access patterns, which are used in the file read and write routines to modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are thus filesystem independent. Note that to ease implementation (and since this API is only advisory anyway), only a single non-normal range is allowed per file descriptor.
  The second type of hints are used to hint to the OS that data will or will not be used. These hints are implemented via a new VOP_ADVISE(). A default implementation is provided which does nothing for the WILLNEED request and attempts to move any clean pages to the cache page queue for the DONTNEED request. This latter case required two other changes. First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests vinvalbuf() to only flush clean buffers for the vnode from the buffer cache and to not remove any backing pages from the vnode. This is used to ensure clean pages are not wired into the buffer cache before attempting to move them to the cache page queue. The second change adds a new vm_object_page_cache() method. This method is somewhat similar to vm_object_page_remove() except that instead of freeing each page in the specified range, it attempts to move clean pages to the cache queue if possible.
  To preserve the ABI of struct file, the f_cdevpriv pointer is now reused in a union to point to the currently active advice region if one is present for regular files.
  Reviewed by: jilles, kib, arch@
  Approved by: re (kib)
  MFC after: 1 month
  Notes: svn path=/head/; revision=227070
* Add support for VM_ALLOC_WIRED and VM_ALLOC_ZERO to vm_page_alloc_freelist() | Alan Cox | 2011-11-02 | 1 | -9/+42
  and use these new options in the mips pmap.
  Wake up the page daemon in vm_page_alloc_freelist() if the number of free and cached pages becomes too low.
  Tidy up vm_page_alloc_init(). In particular, add a comment about an important restriction on its use.
  Tested by: jchandra@
  Notes: svn path=/head/; revision=227012
* Eliminate vm_phys_bootstrap_alloc(). It was a failed attempt at | Alan Cox | 2011-10-30 | 6 | -57/+76
  eliminating duplicated code in the various pmap implementations.
  Micro-optimize vm_phys_free_pages().
  Introduce vm_phys_free_contig(). It is a fast routine for freeing an arbitrary number of physically contiguous pages. In particular, it doesn't require the number of pages to be a power of two.
  Use "u_long" instead of "unsigned long".
  Bruce Evans (bde@) has convinced me that the "boundary" parameters to kmem_alloc_contig(), vm_phys_alloc_contig(), and vm_reserv_reclaim_contig() should be of type "vm_paddr_t" and not "u_long". Make this change.
  Notes: svn path=/head/; revision=226928
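One way to free a run whose length is not a power of two is to cut it into naturally aligned power-of-two chunks and hand each chunk to a buddy-style free routine. The sketch below illustrates that idea in user space; it is not the kernel's code, and SKETCH_MAX_ORDER, struct chunk and split_contig() are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the largest order a buddy allocator tracks. */
#define SKETCH_MAX_ORDER 10

struct chunk {
	unsigned long pfn;	/* first page frame number of the chunk */
	int order;		/* the chunk covers 1 << order pages */
};

/* Index of the lowest set bit of x; the caller guarantees x != 0. */
static int
lowest_bit(unsigned long x)
{
	int i = 0;

	while ((x & 1UL) == 0) {
		x >>= 1;
		i++;
	}
	return (i);
}

/* Index of the highest set bit of x; the caller guarantees x != 0. */
static int
highest_bit(unsigned long x)
{
	int i = 0;

	while ((x >>= 1) != 0)
		i++;
	return (i);
}

/*
 * Split the run [pfn, pfn + npages) into chunks that are each a power of
 * two in size and aligned to that size.  At most max_out chunks are
 * written to out; the number of chunks written is returned.
 */
size_t
split_contig(unsigned long pfn, unsigned long npages, struct chunk *out,
    size_t max_out)
{
	size_t n = 0;

	while (npages > 0 && n < max_out) {
		/* The chunk may not be larger than what remains... */
		int order = highest_bit(npages);

		/* ...nor larger than the alignment of its first page... */
		if (pfn != 0 && lowest_bit(pfn) < order)
			order = lowest_bit(pfn);
		/* ...nor larger than the allocator's maximum order. */
		if (order > SKETCH_MAX_ORDER)
			order = SKETCH_MAX_ORDER;

		out[n].pfn = pfn;
		out[n].order = order;
		n++;
		pfn += 1UL << order;
		npages -= 1UL << order;
	}
	return (n);
}
```

For a run of 10 pages starting at page frame 3, this yields chunks of 1, 4, 4 and 1 pages at frames 3, 4, 8 and 12.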
* Use "u_long" instead of "unsigned long". | Alan Cox | 2011-10-28 | 2 | -5/+4
  Notes: svn path=/head/; revision=226891
* Tidy up the comment at the head of vm_page_alloc, and mention that the | Alan Cox | 2011-10-27 | 1 | -6/+8
  returned page has the flag VPO_BUSY set.
  Notes: svn path=/head/; revision=226848
* Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to | Alan Cox | 2011-10-27 | 1 | -1/+1
  vm_page_alloc(). While I'm here, for the sake of consistency, always specify the allocation class, such as VM_ALLOC_NORMAL, as the first of the flags.
  Notes: svn path=/head/; revision=226843
* contigmalloc(9) and contigfree(9) are now implemented in terms of other | Alan Cox | 2011-10-27 | 1 | -28/+0
  more general VM system interfaces. So, their implementation can now reside in kern_malloc.c alongside the other functions that are declared in malloc.h.
  Notes: svn path=/head/; revision=226824
* Speed up vm_page_cache() and vm_page_remove() by checking for a few | Alan Cox | 2011-10-25 | 1 | -18/+72
  common cases that can be handled in constant time. The insight being that a page's parent in the vm object's tree is very often its predecessor or successor in the vm object's ordered memq.
  Tested by: jhb
  MFC after: 10 days
  Notes: svn path=/head/; revision=226740
* VM_NRESERVLEVEL is used in this file but opt_vm.h is not included, | Attilio Rao | 2011-10-22 | 1 | -0/+1
  thus the stub switch won't be correctly handled. Include opt_vm.h.
  Submitted by: jeff
  MFC after: 3 days
  Notes: svn path=/head/; revision=226642
* Control the execution permission of the readable segments for | Konstantin Belousov | 2011-10-15 | 1 | -1/+1
  i386 binaries on amd64 and ia64 with a sysctl, instead of unconditionally enabling it.
  Reviewed by: marcel
  Notes: svn path=/head/; revision=226388
* Fix a typo in a comment. | John Baldwin | 2011-10-14 | 1 | -1/+1
  Notes: svn path=/head/; revision=226366
* In sys_obreak() and when compiling for amd64 or ia64, when the process | Marcel Moolenaar | 2011-10-13 | 1 | -2/+12
  is ILP32 (i.e. i386) grant execute permissions by default. The JDK 1.4.x depends on being able to execute from the heap on i386.
  Notes: svn path=/head/; revision=226343
* Make memguard(9) capable of guarding uma(9) allocations. | Gleb Smirnoff | 2011-10-12 | 4 | -14/+84
  Notes: svn path=/head/; revision=226313
* Style nit. | Konstantin Belousov | 2011-09-29 | 1 | -1/+0
  Submitted by: jhb
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225856
* Fix grammar. | Konstantin Belousov | 2011-09-28 | 2 | -5/+5
  Submitted by: bf
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225843
* Use the trick of performing the atomic operation on the contained aligned | Konstantin Belousov | 2011-09-28 | 3 | -49/+50
  word to handle the dirty mask updates in vm_page_clear_dirty_mask().
  Remove the vm page queue lock around the vm_page_dirty() call in vm_fault_hold(), the sole purpose of which was to protect the dirty field on architectures which do not provide short or byte-wide atomics.
  Reviewed by: alc, attilio
  Tested by: flo (sparc64)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225840
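The trick of updating a narrow field through the aligned word that contains it can be illustrated in user space. This is a sketch only, assuming a GCC- or Clang-compatible compiler for the __atomic_fetch_and builtin; clear_byte_bits() is a hypothetical name. Unlike a shift-based version, it builds the word-wide mask with memcpy() so that it needs no byte-order conditionals.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Atomically clear "bits" in the byte at *field by performing a 32-bit
 * atomic AND on the aligned word that contains that byte.  The other
 * three bytes of the word are ANDed with 0xff and so keep their values.
 * The caller must ensure that the whole containing word is valid memory.
 */
void
clear_byte_bits(uint8_t *field, uint8_t bits)
{
	uintptr_t addr = (uintptr_t)field;
	size_t off = addr & (sizeof(uint32_t) - 1);	/* byte index in word */
	uint32_t *word = (uint32_t *)(addr - off);	/* aligned container */
	uint8_t lanes[sizeof(uint32_t)] = { 0 };
	uint32_t mask;

	/* Place the bits in the lane of the target byte, in memory order. */
	lanes[off] = bits;
	memcpy(&mask, lanes, sizeof(mask));

	(void)__atomic_fetch_and(word, ~mask, __ATOMIC_RELAXED);
}
```

The update is a single atomic read-modify-write on the word, so concurrent updates to neighbouring bytes of the same word are not lost.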
* Use the explicitly-sized types for the dirty and valid masks. | Konstantin Belousov | 2011-09-28 | 1 | -8/+8
  Requested by: attilio
  Reviewed by: alc
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225838
* In order to maximize the re-usability of kernel code in user space this | Kip Macy | 2011-09-16 | 3 | -19/+19
  patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal to kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
  Reviewed by: rwatson
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225617
* Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic | Konstantin Belousov | 2011-09-06 | 8 | -89/+106
  flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
  Document the changes to flags field to only require the page lock.
  Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
  Reviewed by: alc, attilio
  Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225418
* Update some comments in swap_pager.c. | Konstantin Belousov | 2011-08-22 | 1 | -30/+17
  Reviewed and most wording by: alc
  MFC after: 1 week
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225089
* Apply the limit to avoid the overflows in the radix tree subr_blist.c | Konstantin Belousov | 2011-08-22 | 1 | -10/+12
  after the conversion of the swap device size to the page size units, not before. That lifts the limit on the usable swap partition size from 32GB to 256GB, which is less depressing for modern systems.
  Submitted by: Alexander V. Chernikov <melifaro ipfw ru>
  Reviewed by: alc
  Approved by: re (bz)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225076
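The jump from 32GB to 256GB follows from applying the same unit-count cap to larger units. The arithmetic can be checked with the sketch below; the cap of 0x40000000 / 16 units, the 512-byte device block and the 4096-byte page are assumptions chosen because they reproduce the figures quoted in the message, and all names are hypothetical.

```c
#include <stdint.h>

/* Assumed cap on the number of units the radix tree may manage. */
#define SKETCH_BLIST_LIMIT	0x40000000ULL
#define SKETCH_META_RADIX	16ULL

/*
 * Largest usable swap size, in bytes, when the cap is applied to units
 * of "unit_size" bytes each.
 */
uint64_t
max_swap_bytes(uint64_t unit_size)
{
	return ((SKETCH_BLIST_LIMIT / SKETCH_META_RADIX) * unit_size);
}
```

Under these assumptions, max_swap_bytes(512) is 32 GiB (cap applied to device blocks, as before the change) and max_swap_bytes(4096) is 256 GiB (cap applied to pages, as after it), a factor of eight.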
* Second-to-last commit implementing Capsicum capabilities in the FreeBSD | Robert Watson | 2011-08-11 | 1 | -4/+22
  kernel for FreeBSD 9.0:
  Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op.
  Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions.
  In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit.
  Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent.
  Approved by: re (bz)
  Submitted by: jonathan
  Sponsored by: Google Inc
  Notes: svn path=/head/; revision=224778
* - Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag | Konstantin Belousov | 2011-08-09 | 4 | -33/+37
  to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of vm page queue lock).
  - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use just VPO_UNMANAGED to decide whether the page is unmanaged.
  Reviewed by: alc
  Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips)
  Sponsored by: The FreeBSD Foundation
  Approved by: re (bz)
  Notes: svn path=/head/; revision=224746
* Fix an error in kmem_alloc_attr(). Unless "tries" is updated, | Alan Cox | 2011-08-07 | 1 | -0/+1
  kmem_alloc_attr() could get stuck in a loop.
  Approved by: re (kib)
  MFC after: 3 days
  Notes: svn path=/head/; revision=224689
* Implement the linprocfs swaps file, providing information about the | Konstantin Belousov | 2011-08-01 | 2 | -21/+40
  configured swap devices in the Linux-compatible format.
  Based on the submission by: Robert Millan <rmh debian org>
  PR: kern/159281
  Reviewed by: bde
  Approved by: re (kensmith)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=224582
* Fix a race in the device pager allocation. If another thread won and | Konstantin Belousov | 2011-07-30 | 1 | -2/+9
  allocated the device pager for the given handle, then the object fictitious pages list and the object membership in the global object list still need to be initialized. Otherwise, dev_pager_dealloc() will traverse uninitialized pointers.
  Reported and tested by: pho
  Reviewed by: jhb
  Approved by: re (kensmith)
  MFC after: 1 week
  Notes: svn path=/head/; revision=224522
* Extract the code to translate VM error into errno, into an exported | Konstantin Belousov | 2011-07-10 | 2 | -0/+8
  function vm_mmap_to_errno(). It is useful for the drivers that implement mmap(2)-like functionality, to be able to return error codes consistent with mmap(2).
  Sponsored by: The FreeBSD Foundation
  No objections from: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=223914
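A translation helper of this kind is typically a small switch from internal status codes to errno values. The sketch below shows the shape of such a function in user space; the enum, its members and the particular mapping are illustrative assumptions, not the kernel's definitions, and sketch_vm_to_errno() is a hypothetical name.

```c
#include <errno.h>

/* Hypothetical VM status codes, standing in for the kernel's own. */
enum sketch_vm_status {
	SKETCH_VM_SUCCESS,
	SKETCH_VM_INVALID_ADDRESS,
	SKETCH_VM_PROTECTION_FAILURE,
	SKETCH_VM_NO_SPACE,
	SKETCH_VM_OTHER
};

/*
 * Translate a VM status into an errno value of the kind mmap(2) reports.
 * The mapping chosen here is an assumption made for illustration.
 */
int
sketch_vm_to_errno(enum sketch_vm_status rv)
{
	switch (rv) {
	case SKETCH_VM_SUCCESS:
		return (0);
	case SKETCH_VM_INVALID_ADDRESS:
	case SKETCH_VM_NO_SPACE:
		return (ENOMEM);
	case SKETCH_VM_PROTECTION_FAILURE:
		return (EACCES);
	default:
		return (EINVAL);
	}
}
```

Keeping the translation in one exported function lets every mmap(2)-like driver path report the same error for the same failure.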
* Style. | Konstantin Belousov | 2011-07-10 | 1 | -1/+1
  MFC after: 3 days
  Notes: svn path=/head/; revision=223913
* Add a facility to disable processing page faults. When activated, | Konstantin Belousov | 2011-07-09 | 2 | -0/+18
  uiomove generates EFAULT if any accessed address is not mapped, as opposed to handling the fault.
  Sponsored by: The FreeBSD Foundation
  Reviewed by: alc (previous version)
  Notes: svn path=/head/; revision=223889
* All the racct_*() calls need to happen with the proc locked. Fixing this | Edward Tomasz Napierala | 2011-07-06 | 6 | -0/+42
  won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
  Notes: svn path=/head/; revision=223825
* Handle a race between device_pager and devsw in a more graceful manner: | Attilio Rao | 2011-07-06 | 1 | -2/+4
  return an error code rather than panic the kernel.
  Sponsored by: Sandvine Incorporated
  Reviewed by: kib
  Tested by: pho
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=223823
* Initialize marker pages as held rather than fictitious/wired. Marking the | Alan Cox | 2011-07-02 | 1 | -2/+8
  page as held is more useful as a safety precaution in case someone forgets to check for PG_MARKER.
  Reviewed by: kib
  Notes: svn path=/head/; revision=223729