path: root/sys/vm/vnode_pager.c
Commit message | Author | Age | Files | Lines
* Add vnode_pager_clean_{a,}sync(9) | Konstantin Belousov | 2024-01-18 | 1 | -0/+27
    (cherry picked from commit b068bb09a1a82d9fef0e939ad6135443a959e290)
* vnode_pager_generic_putpages(): rename maxblksz local to max_offset | Konstantin Belousov | 2024-01-18 | 1 | -7/+7
    (cherry picked from commit ed1a88a3116a59b4fd37912099a575b4c8f559dc)
* vnode_pager_generic_putpages(): correctly handle clean block at EOF | Konstantin Belousov | 2024-01-18 | 1 | -1/+2
    PR: 276191
    (cherry picked from commit bdb46c21a3e68d4395d6e0b6a205187e655532b0)
* sys: Remove $FreeBSD$: one-line .c pattern | Warner Losh | 2023-08-23 | 1 | -2/+0
    Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
    Similar commit in current:
    (cherry picked from commit 685dc743dc3b)
* vnode_pager_input: return runningbufspace back | Konstantin Belousov | 2023-03-31 | 1 | -1/+10
    (cherry picked from commit 28f957b8b3a22086927451fee89789fdf596260b)
* vm_object_kvme_type(): reimplement by embedding kvme_type into pagerops | Konstantin Belousov | 2021-05-22 | 1 | -0/+2
    (cherry picked from commit 00a3fe968b840ee197c32dfe4107dab730bd9915)
* Constify vm_pager-related virtual tables. | Konstantin Belousov | 2021-05-22 | 1 | -1/+1
    (cherry picked from commit d474440ab33c683b0e3f55e8e854f055615db6ec)
* Add pgo_getvp method | Konstantin Belousov | 2021-05-22 | 1 | -0/+8
    (cherry picked from commit 192112b74fed56ca652cf1d70c11ba7e17bc1ce2)
* Add pgo_mightbedirty method | Konstantin Belousov | 2021-05-22 | 1 | -0/+1
    (cherry picked from commit c23c555bc15ce1523b95fb8da99ae77c0bb0977e)
* vm_pager: add pgo_set_writeable_dirty method | Konstantin Belousov | 2021-05-22 | 1 | -0/+1
    (cherry picked from commit 180bcaa46c5d297d137749258b23593d578d76a5)
* Make MAXPHYS tunable. Bump MAXPHYS to 1M. | Konstantin Belousov | 2020-11-28 | 1 | -7/+8
    Replace MAXPHYS by the runtime variable maxphys. It is initialized from
    MAXPHYS by default, but can also be adjusted with the tunable
    kern.maxphys.

    Make the b_pages[] array in struct buf flexible. Size b_pages[] for
    buffer cache buffers exactly to atop(maxbcachebuf) (currently it is
    sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to
    atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers,
    among them vmapbuf(), to use unaligned buffers still sized to maxphys,
    esp. when such buffers come from userspace (*). Overall, we save a
    significant amount of otherwise wasted memory in b_pages[] for buffer
    cache buffers, while bumping MAXPHYS to the desired high value.

    Eliminate all direct uses of the MAXPHYS constant in kernel and driver
    sources, except the place which initializes maxphys. Some random (and
    arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
    straight. Some drivers, which use MAXPHYS to size embedded structures,
    get a private MAXPHYS-like constant; their conversion is out of scope
    for this work.

    Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis
    were either submitted by, or based on changes by, mav.

    Suggested by: mav (*)
    Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    Differential revision: https://reviews.freebsd.org/D27225
    Notes: svn path=/head/; revision=368124
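The "+1" for pbufs in the MAXPHYS commit above can be illustrated with a small user-space sketch. It assumes a 4 KiB page size and uses hypothetical names (`SKETCH_PAGE_SIZE`, `sketch_atop`, `pages_spanned`) rather than the kernel's own macros; it only counts how many pages a byte range touches for a given start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE is machine-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for atop(): bytes to whole pages, rounding down. */
static size_t
sketch_atop(uint64_t bytes)
{
	return ((size_t)(bytes / SKETCH_PAGE_SIZE));
}

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t
pages_spanned(uint64_t addr, uint64_t len)
{
	uint64_t first, last;

	assert(len > 0);
	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	return ((size_t)(last - first + 1));
}
```

Under these assumptions, a page-aligned 1 MiB buffer touches `sketch_atop(1048576)` pages, while the same-sized buffer starting 512 bytes into a page touches one more, which is the case the extra b_pages[] slot covers.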
* vm_ooffset_t is now unsigned | Eric van Gyzen | 2020-09-18 | 1 | -3/+0
    vm_ooffset_t is now unsigned. Remove some tests for negative values,
    or make other adjustments accordingly.
    Reported by: Coverity
    Reviewed by: kib markj
    Sponsored by: Dell EMC Isilon
    Differential Revision: https://reviews.freebsd.org/D26214
    Notes: svn path=/head/; revision=365886
* vm: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -1/+0
    Notes: svn path=/head/; revision=365074
* vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error | Mateusz Guzik | 2020-08-19 | 1 | -2/+2
    Most consumers pass NULL.
    Notes: svn path=/head/; revision=364372
* Atomically update vm_object vnp_size, where atomic is available. | Konstantin Belousov | 2020-08-16 | 1 | -0/+4
    This will be used later, where it matters on 32bit arches.
    Reviewed by: markj
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    Differential revision: https://reviews.freebsd.org/D25968
    Notes: svn path=/head/; revision=364285
* Remove most lingering references to the page lock in comments. | Mark Johnston | 2020-08-04 | 1 | -1/+1
    Finish updating comments to reflect new locking protocols introduced
    over the past year. In particular, vm_page_lock is now effectively
    unused.
    Reviewed by: kib
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D25868
    Notes: svn path=/head/; revision=363840
* Fix vnode_pager handling of read ahead/behind pages when a disk read fails. | Chuck Silvers | 2020-07-17 | 1 | -0/+15
    Rather than marking the read ahead/behind pages valid even though they
    were not initialized, free them using the new function
    vm_page_free_invalid().
    Reviewed by: markj, kib
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D25430
    Notes: svn path=/head/; revision=363296
* Revert my change from r361855 in favor of a better fix. | Chuck Silvers | 2020-07-17 | 1 | -24/+22
    Reviewed by: markj, kib
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D25430
    Notes: svn path=/head/; revision=363294
* Don't mark pages as valid if reading the contents from disk fails. | Chuck Silvers | 2020-06-06 | 1 | -22/+24
    Instead, just skip marking pages valid if the read fails. Future
    attempts to access such pages will notice that they are not marked
    valid and try to read them from disk again.
    Reviewed by: kib, markj
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D25138
    Notes: svn path=/head/; revision=361855
* VOP_GETPAGES_ASYNC(): consistently call iodone() callback in case of error. | Konstantin Belousov | 2020-03-30 | 1 | -2/+6
    Reviewed by: glebius, markj
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D24038
    Notes: svn path=/head/; revision=359466
* Don't convert all lower-layer errors to EIO. | Warner Losh | 2020-02-20 | 1 | -3/+8
    Don't convert all lower layer errors to EIO. Instead, pass the actual
    error up the stack. This will allow the upper layers that look for
    ENXIO to react properly to that signal from the lower layers and, for
    UFS, unmount the filesystem.
    Reviewed by: kib@
    Differential Revision: https://reviews.freebsd.org/D23755
    Notes: svn path=/head/; revision=358143
* Don't spam the console with an additional, and useless, error message. | Warner Losh | 2020-02-20 | 1 | -2/+0
    There's no need to spam the console with this error message. If there's
    an I/O error, the disk/cam driver will report it at the lower levels.
    If that's an actual problem, the upper layers will report that.
    Reviewed by: kib@
    Differential Revision: https://reviews.freebsd.org/D23756
    Notes: svn path=/head/; revision=358134
* Fix up various vnode-related asserts which did not dump the used vnode | Mateusz Guzik | 2020-02-03 | 1 | -1/+1
    Notes: svn path=/head/; revision=357446
* Don't hold the object lock while calling getpages. | Jeff Roberson | 2020-01-19 | 1 | -4/+1
    The vnode pager does not want the object lock held. Moving this out
    allows further object lock scope reduction in callers. While here add
    some missing paging in progress calls and an assert. The object handle
    is now protected explicitly with pip.
    Reviewed by: kib, markj
    Differential Revision: https://reviews.freebsd.org/D23033
    Notes: svn path=/head/; revision=356902
* It has not been possible to recursively terminate a vnode object for some time | Jeff Roberson | 2020-01-19 | 1 | -25/+13
    now. Eliminate the dead code that supports it.
    Approved by: kib, markj
    Differential Revision: https://reviews.freebsd.org/D22908
    Notes: svn path=/head/; revision=356887
* vm: add missing CTLFLAG_MPSAFE annotations | Mateusz Guzik | 2020-01-12 | 1 | -3/+3
    This covers all vm/* files.
    Notes: svn path=/head/; revision=356653
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 1 | -1/+1
    Filesystems which want to use it in limited capacity can employ the
    VOP_UNLOCK_FLAGS macro.
    Reviewed by: kib (previous version)
    Differential Revision: https://reviews.freebsd.org/D21427
    Notes: svn path=/head/; revision=356337
* vfs: introduce v_irflag and make v_type smaller | Mateusz Guzik | 2019-12-08 | 1 | -4/+4
    The current vnode layout is not smp-friendly by having frequently read
    data avoidably sharing cachelines with very frequently modified fields.
    In particular v_iflag inspected for VI_DOOMED can be found in the same
    line with v_usecount. Instead make it available in the same cacheline
    as the v_op, v_data and v_type which all get read all the time.

    v_type is avoidably 4 bytes while the necessary data will easily fit
    in 1. Shrinking it frees up 3 bytes, 2 of which get used here to
    introduce a new flag field with a new value: VIRF_DOOMED.
    Reviewed by: kib, jeff
    Differential Revision: https://reviews.freebsd.org/D22715
    Notes: svn path=/head/; revision=355537
* Use atomics in more cases for object references. We now can completely | Jeff Roberson | 2019-11-27 | 1 | -9/+15
    omit the object lock if we are above a certain threshold. Hold only a
    single vnode reference when the vnode object has any ref > 0. This
    allows us to only lock the object and vnode on 0-1 and 1-0 transitions.
    Differential Revision: https://reviews.freebsd.org/D22452
    Notes: svn path=/head/; revision=355122
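The "lock only on 0-1 and 1-0 transitions" idea from the commit above can be sketched in user space with C11 atomics. This is a simplified model and not the kernel code: `struct sketch_obj`, `sketch_ref()`, `sketch_unref()` and the `slow_path_hits` counter are hypothetical, and the places where the real code would take the object and vnode locks are only marked by comments, so the slow paths here are not safe against concurrent callers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object with an atomically updated reference count. */
struct sketch_obj {
	atomic_uint ref_count;
	unsigned slow_path_hits;	/* transitions that would need the lock */
};

/* Take a reference; only the 0 -> 1 transition takes the slow path. */
static void
sketch_ref(struct sketch_obj *obj)
{
	unsigned old = atomic_load(&obj->ref_count);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->ref_count, &old,
		    old + 1))
			return;
	}
	/* 0 -> 1: a real implementation would lock object and vnode here. */
	obj->slow_path_hits++;
	atomic_store(&obj->ref_count, 1);
}

/* Drop a reference; only the 1 -> 0 transition takes the slow path. */
static void
sketch_unref(struct sketch_obj *obj)
{
	unsigned old = atomic_load(&obj->ref_count);

	assert(old > 0);
	while (old > 1) {
		if (atomic_compare_exchange_weak(&obj->ref_count, &old,
		    old - 1))
			return;
	}
	/* 1 -> 0: a real implementation would lock and drop the vnode ref. */
	obj->slow_path_hits++;
	atomic_store(&obj->ref_count, 0);
}
```

In this model, taking three references and dropping them again reaches the slow path only twice, once on the way up from zero and once on the way back down.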
* Remove unnecessary object locking from the vnode pager. Recent changes to | Jeff Roberson | 2019-11-19 | 1 | -24/+7
    busy/valid/dirty locking make these acquires redundant.
    Reviewed by: kib, markj
    Differential Revision: https://reviews.freebsd.org/D22186
    Notes: svn path=/head/; revision=354870
* Use atomics and a shared object lock to protect the object reference count. | Jeff Roberson | 2019-10-29 | 1 | -5/+6
    Certain consumers still need to guarantee a stable reference so we can
    not switch entirely to atomics yet. Exclusive lock holders can still
    modify and examine the refcount without using the ref api.
    Reviewed by: kib
    Tested by: pho
    Sponsored by: Netflix, Intel
    Differential Revision: https://reviews.freebsd.org/D21598
    Notes: svn path=/head/; revision=354157
* Check for bogus_page in vnode_pager_generic_getpages_done(). | Mark Johnston | 2019-10-23 | 1 | -0/+2
    We now assert that a page is busy when updating its validity-tracking
    state, but bogus_page is not busied during a getpages operation.
    Reported by: syzkaller
    Reviewed by: alc, kib
    Discussed with: jeff
    MFC after: 1 week
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D22124
    Notes: svn path=/head/; revision=353957
* Assert that vnode_pager_setsize() is called with the vnode exclusively locked | Konstantin Belousov | 2019-10-22 | 1 | -1/+10
    except for filesystems that set the MNTK_VMSETSIZE_BUG flag. Set the
    flag for ZFS.
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 2 weeks
    Differential revision: https://reviews.freebsd.org/D21883
    Notes: svn path=/head/; revision=353892
* Add VV_VMSIZEVNLOCK flag. | Konstantin Belousov | 2019-10-22 | 1 | -1/+5
    The flag specifies that the vm_fault() handler should check the vnode's
    vm_object size under the vnode lock. It is converted into the object's
    OBJ_SIZEVNLOCK flag in vnode_pager_alloc().
    Tested by: pho
    Reviewed by: markj
    Sponsored by: The FreeBSD Foundation
    MFC after: 2 weeks
    Differential revision: https://reviews.freebsd.org/D21883
    Notes: svn path=/head/; revision=353890
* (4/6) Protect page valid with the busy lock. | Jeff Roberson | 2019-10-15 | 1 | -7/+12
    Atomics are used for page busy and valid state when the shared busy is
    held. The details of the locking protocol and valid and dirty
    synchronization are in the updated vm_page.h comments.
    Reviewed by: kib, markj
    Tested by: pho
    Sponsored by: Netflix, Intel
    Differential Revision: https://reviews.freebsd.org/D21594
    Notes: svn path=/head/; revision=353539
* vm pager: writemapping accounting for OBJT_SWAP | Kyle Evans | 2019-09-03 | 1 | -2/+8
    Currently writemapping accounting is only done for vnode_pager which
    does some accounting on the underlying vnode.

    Extend this to allow accounting to be possible for any of the pager
    types. New pageops are added to update/release writecount that need to
    be implemented for any pager wishing to do said accounting, and we
    implement these methods now for both vnode_pager (unchanged) and
    swap_pager.

    The primary motivation for this is to allow other systems with
    OBJT_SWAP objects to check if their objects have any write mappings
    and reject operations with EBUSY if so. posixshm will be the first to
    do so in order to reject adding write seals to the shmfd if any
    writable mappings exist.
    Reviewed by: kib, markj
    Differential Revision: https://reviews.freebsd.org/D21456
    Notes: svn path=/head/; revision=351795
* Rework v_object lifecycle for vnodes. | Konstantin Belousov | 2019-08-29 | 1 | -38/+13
    The current implementation of vnode_create_vobject() and
    vnode_destroy_vobject() is written so that it is prepared to handle
    vm object destruction for a live vnode. Practically, no filesystems
    use this, except for some remnants that were present in UFS till
    today. One of the consequences of that model is that each filesystem
    must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier; as a
    result, all of them get rid of the v_object in reclaim.

    Move the call to vnode_destroy_vobject() to vgonel() before
    VOP_RECLAIM(). This makes v_object stable: either the object is NULL,
    or it is a valid vm object till the vnode reclamation. Remove code
    from vnode_create_vobject() to handle races with the parallel
    destruction.
    Reviewed by: markj
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    Differential revision: https://reviews.freebsd.org/D21412
    Notes: svn path=/head/; revision=351598
* Move OBJT_VNODE specific code from vm_object_terminate() to | Konstantin Belousov | 2019-08-25 | 1 | -0/+15
    vnode_destroy_vobject().
    Reviewed by: alc, jeff (previous version), markj
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D21357
    Notes: svn path=/head/; revision=351478
* Permit vm_pager_has_page() to run with a shared lock. Introduce | Jeff Roberson | 2019-08-19 | 1 | -3/+4
    VM_OBJECT_DROP/VM_OBJECT_PICKUP to handle functions that are called
    with uncertain lock state.
    Reviewed by: kib, markj
    Tested by: pho
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D21310
    Notes: svn path=/head/; revision=351235
* Fix style(9) violations involving division by PAGE_SIZE. | Doug Moore | 2019-07-06 | 1 | -2/+2
    Reviewed by: alc
    Approved by: markj (mentor)
    Differential Revision: https://reviews.freebsd.org/D20847
    Notes: svn path=/head/; revision=349791
* Include ktr.h in more compilation units | Conrad Meyer | 2019-05-21 | 1 | -0/+1
    Similar to r348026, exhaustive search for uses of CTRn() and cross
    reference ktr.h includes. Where it was obvious that an OS compat
    header of some kind included ktr.h indirectly, .c files were left
    alone. Some of these files clearly got ktr.h via header pollution in
    some scenarios, or tinderbox would not be passing prior to this
    revision, but go ahead and explicitly include it in files using it
    anyway.

    Like r348026, these CUs did not show up in tinderbox as missing the
    include.
    Reported by: peterj (arm64/mp_machdep.c)
    X-MFC-With: r347984
    Sponsored by: Dell EMC Isilon
    Notes: svn path=/head/; revision=348064
* Switch to use shared vnode locks for text files during image activation. | Konstantin Belousov | 2019-05-05 | 1 | -7/+18
    kern_execve() locks the text vnode exclusive to be able to set and
    clear the VV_TEXT flag. VV_TEXT is mutually exclusive with the
    v_writecount > 0 condition.

    The change removes VV_TEXT, replacing it with the condition
    v_writecount <= -1, and puts v_writecount under the vnode interlock.
    Each text reference decrements v_writecount. To clear the text
    reference when the segment is unmapped, it is recorded in the
    vm_map_entry backed by the text file as the MAP_ENTRY_VN_TEXT flag,
    and v_writecount is incremented on the map entry removal.

    The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
    v_writecount does not contradict the desired change. vn_writecheck()
    is now racy and its use was eliminated everywhere except access.
    Atomic check for writeability and increment of v_writecount is
    performed by the VOP. vn_truncate() now increments v_writecount around
    the VOP_SETATTR() call, lack of which is arguably a bug on its own.

    nullfs bypasses v_writecount to the lower vnode always, so nullfs
    vnode has its own v_writecount correct, and lower vnode gets all
    references, since object->handle is always lower vnode.

    On the text vnode's vm object dealloc, the v_writecount value is reset
    to zero, and deadfs vop_unset_text short-circuits the operation.
    Reclamation of lowervp always reclaims all nullfs vnodes referencing
    lowervp first, so no stray references are left.
    Reviewed by: markj, trasz
    Tested by: mjg, pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 month
    Differential revision: https://reviews.freebsd.org/D19923
    Notes: svn path=/head/; revision=347151
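The signed v_writecount convention described in the commit above can be modelled with a short user-space sketch. Everything here is hypothetical (`struct sketch_vnode` and the `sketch_*` helpers are not the kernel VOPs), the vnode interlock is omitted, and returning ETXTBSY on a conflict is an assumption of the sketch, since the commit message does not name the error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the vnode field described above. */
struct sketch_vnode {
	int v_writecount;	/* > 0: writers, < 0: text references */
};

/* Add inc (> 0) write references unless the vnode is in use as text. */
static int
sketch_add_writecount(struct sketch_vnode *vp, int inc)
{
	assert(inc > 0);
	if (vp->v_writecount < 0)
		return (ETXTBSY);	/* assumed error: mapped as text */
	vp->v_writecount += inc;
	return (0);
}

/* Record one text reference by decrementing the counter. */
static int
sketch_set_text(struct sketch_vnode *vp)
{
	if (vp->v_writecount > 0)
		return (ETXTBSY);	/* assumed error: open for writing */
	vp->v_writecount--;
	return (0);
}

/* Drop one text reference, as on removal of the text map entry. */
static void
sketch_unset_text(struct sketch_vnode *vp)
{
	if (vp->v_writecount < 0)
		vp->v_writecount++;
}
```

In this model a single counter encodes both states, so a vnode with text references refuses new writers and a vnode with writers refuses new text references, without needing a separate VV_TEXT-style flag.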
* Fix incorrect assertion in vnode_pager_generic_getpages() | Jason A. Harmening | 2019-02-26 | 1 | -1/+1
    Reviewed by: kib, glebius
    MFC after: 1 week
    Notes: svn path=/head/; revision=344561
* For 32-bit machines rollback the default number of vnode pager pbufs | Gleb Smirnoff | 2019-02-15 | 1 | -1/+11
    back to the level before r343030. For 64-bit machines reduce it
    slightly, too. Together with r343030 I bumped the limit up to the
    value we use at Netflix to serve 100 Gbit/s of sendfile traffic, and
    it probably isn't a good default.

    Provide a loader tunable to change vnode pager pbufs count. Document
    it.
    Notes: svn path=/head/; revision=344188
* Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. | Gleb Smirnoff | 2019-01-15 | 1 | -23/+22
    o In vm_pager_bufferinit() create pbuf_zone and start accounting for
      how many pbufs we are going to have. In various subsystems that are
      going to utilize pbufs create private zones via call to
      pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and
      sets a limit on the created zone. After startup preallocate pbufs
      according to the requirements of all pbuf zones.

      Subsystems that used to have a private limit with the old allocator
      now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS
      cluster, FFS, swap, vnode pager.

      The following subsystems use the shared pbuf zone: cam(4), nvme(4),
      physio(9), aio(4). They should have their private limits, but
      changing that is out of scope of this commit.

    o Fetch tunable value of kern.nswbuf from init_param2() and while
      here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that
      was holding only this option. Default values aren't touched by this
      commit, but they probably should be reviewed with respect to modern
      hardware.

    This change removes a tight bottleneck from sendfile(2) operation,
    that uses pbufs in vnode pager. Other pagers also would benefit from
    faster allocation.
    Together with: gallatin
    Tested by: pho
    Notes: svn path=/head/; revision=343030
* Implement several enhancements to NUMA policies. | Jeff Roberson | 2018-03-29 | 1 | -0/+9
    Add a new "interleave" allocation policy which stripes pages across
    domains with a stride or width keeping contiguity within a multi-page
    region.

    Move the kernel to the dedicated numbered cpuset #2 making it possible
    to assign kernel threads and memory policy separately from user. This
    also eliminates the need for the complicated interrupt binding code.

    Add a sysctl API for viewing and manipulating domainsets. Refactor
    some of the cpuset_t manipulation code using the generic bitset type
    so that it can be used for both. This probably belongs in a dedicated
    subr file.

    Attempt to improve the include situation.
    Reviewed by: kib
    Discussed with: jhb (cpuset parts)
    Tested by: pho (before review feedback)
    Sponsored by: Netflix, Dell/EMC Isilon
    Differential Revision: https://reviews.freebsd.org/D14839
    Notes: svn path=/head/; revision=331723
* Use per-domain locks for vm page queue free. Move paging control from | Jeff Roberson | 2018-02-06 | 1 | -1/+1
    global to per-domain state. Protect reservations with the free lock
    from the domain that they belong to. Refactor to make vm domains more
    of a first class object.
    Reviewed by: markj, kib, gallatin
    Tested by: pho
    Sponsored by: Netflix, Dell/EMC Isilon
    Differential Revision: https://reviews.freebsd.org/D14000
    Notes: svn path=/head/; revision=328954
* On pageout, in vnode generic pager, for partially dirty page, only | Konstantin Belousov | 2018-02-02 | 1 | -0/+2
    clear dirty bits for completely invalid blocks. Otherwise we might not
    write out the last chunk that is shorter than 512 bytes, if the file
    end is not aligned on a disk block boundary. This became important
    after r324794.
    PR: 225586
    Reported by: tris_vern@hotmail.com
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 3 days
    Notes: svn path=/head/; revision=328773
* spdx: initial adoption of licensing ID tags. | Pedro F. Giffuni | 2017-11-18 | 1 | -0/+2
    The Software Package Data Exchange (SPDX) group provides a
    specification to make it easier for automated tools to detect and
    summarize well known opensource licenses. We are gradually adopting
    the specification, noting that the tags are considered only advisory
    and do not, in any way, supersede or replace the license texts.

    Special thanks to Wind River for providing access to "The Duke of
    Highlander" tool: an older (2014) run over FreeBSD tree was useful as
    a starting point.

    Initially, only tag files that use BSD 4-Clause "Original" license.
    RelNotes: yes
    Differential Revision: https://reviews.freebsd.org/D13133
    Notes: svn path=/head/; revision=325966
* Take the vm object lock in read mode in vnode_generic_putpages(). | Konstantin Belousov | 2017-10-20 | 1 | -5/+13
    Only upgrade it to write mode if we need to clear dirty bits of the
    partially valid page after EOF.
    Suggested and reviewed by: alc
    Sponsored by: The FreeBSD Foundation
    MFC after: 3 weeks
    Notes: svn path=/head/; revision=324807