path: root/sys/ufs
Commit message | Author | Age | Files | Lines
* Retire two unused background fsck sysctls. | John Baldwin | 2020-04-21 | 2 | -186/+3
  These two sysctls were added to support UFS softupdates journalling
  with snapshots. However, the changes to fsck to use them were never
  committed and there have never been any in-tree uses of these sysctls.

  More details from Kirk:

  When journalling got added to soft updates, its journal rollback freed
  blocks that it thought were no longer in use. But it does not take
  snapshots into account (i.e., if a snapshot is still using it, then it
  cannot be freed). So I added the needed logic to fsck by having the
  free go through the kernel's blkfree code so it could grab blocks that
  were still needed by snapshots. That is done using the setbufoutput
  hack. I never got that code working reliably, so it is still sitting
  in my work directory. Which also explains why you still cannot take
  snapshots on filesystems running with journalling...

  In looking over my use of this feature, and in particular the troubles
  I was having with it, I conclude that it may be better to extract the
  code from the kernel that handles freeing blocks claimed by snapshots
  and putting it into fsck directly. My original intent was that it is
  complex and at the time changing, so only having to maintain it in one
  place was appealing. But at this point it has not changed in years and
  the hacks like setinode and setbufoutput to be able to use the kernel
  code is sufficiently ugly, that I am leaning towards just extracting it.

  Reviewed by: mckusick
  MFC after: 1 week
  Sponsored by: DARPA
  Differential Revision: https://reviews.freebsd.org/D24484

  Notes: svn path=/head/; revision=360170
* ufs: apply suspension for non-forced rw unmounts. | Konstantin Belousov | 2020-04-10 | 1 | -4/+2
  Forced rw unmounts and remounts from rw to ro already suspend the
  filesystem, which closes races with writers instantiating new vnodes
  while unmount flushes the queue. The original intent of excluding
  non-forced unmounts from this regime was to allow such unmounts to
  fail if a writer was active, but this did not work well.

  A similar change, but one causing every unmount to suspend, even one
  involving only a ro filesystem, was proposed in D24088, but I believe
  that suspending ro is undesirable, and definitely spends CPU time.

  Reported by: markj
  Discussed with: chs, mckusick
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week

  Notes: svn path=/head/; revision=359766
* Fixing the soft update macros in -r359612 triggered a previously | Kirk McKusick | 2020-04-09 | 1 | -0/+1
  hidden bug in the file truncation code. Until that bug is tracked
  down and fixed, revert to the old behavior.

  Reported by: Peter Holm
  Reviewed by: kib, Chuck Silvers

  Notes: svn path=/head/; revision=359760
* Revert -r359612 as it can cause other panics. | Kirk McKusick | 2020-04-06 | 1 | -5/+5
  An updated version will be made when the issue has been resolved.

  Reported by: Peter Holm

  Notes: svn path=/head/; revision=359668
* When shrinking the size of a directory it is sometimes necessary to | Kirk McKusick | 2020-04-03 | 1 | -5/+5
  sync it to disk before shrinking it. Complete the sync before getting
  the buffer for the block to be updated to do the shrink to avoid
  panicking with a recursive lock on one of the directory's buffers.

  Reviewed by: Chuck Silvers (chs)
  MFC after: 3 days
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=359613
* Convert DOINGSOFTDEP, MOUNTEDSOFTDEP, DOINGSUJ, and MOUNTEDSUJ to being | Kirk McKusick | 2020-04-03 | 1 | -4/+5
  boolean expressions so that their values are not lost when assigned
  to `bool' or `int' variables.

  Reviewed by: Chuck Silvers (chs)
  MFC after: 3 days
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=359612
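The kind of value loss this commit message describes can be illustrated with a minimal sketch. The flag value and the macro and function names below are hypothetical and chosen only for illustration; they are not the definitions used in the tree. The sketch assumes a flag that sits above bit 31 of a 64-bit flags word and a common ABI where narrowing to `int` keeps only the low 32 bits.

```c
#include <stdint.h>

/* Hypothetical flag that lives above bit 31 of a 64-bit flags word. */
#define EX_FLAG_SUJ 0x0000000100000000ULL

/* Mask-style test: evaluates to the raw masked bits (0 or 0x100000000). */
#define EX_DOING_MASK(flags) ((flags) & EX_FLAG_SUJ)

/* Boolean-style test: always evaluates to exactly 0 or 1. */
#define EX_DOING_BOOL(flags) (((flags) & EX_FLAG_SUJ) != 0)

/*
 * Narrowing the raw mask to int keeps only the low 32 bits on common
 * ABIs, so a set flag reads back as 0.
 */
int
ex_mask_as_int(uint64_t flags)
{
	return ((int)EX_DOING_MASK(flags));
}

/* The boolean form is already 0 or 1, so it survives the assignment. */
int
ex_bool_as_int(uint64_t flags)
{
	return (EX_DOING_BOOL(flags));
}
```

With the flag set, the mask-style helper is expected to return 0 on such ABIs while the boolean-style helper returns 1, which is the difference the conversion to boolean expressions is meant to remove.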
* VOP_GETPAGES_ASYNC(): consistently call iodone() callback in case of error. | Konstantin Belousov | 2020-03-30 | 1 | -7/+14
  Reviewed by: glebius, markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D24038

  Notes: svn path=/head/; revision=359466
* When mounting a UFS filesystem, return EINTEGRITY rather than EIO | Kirk McKusick | 2020-03-11 | 1 | -1/+1
  when a superblock check-hash error is detected. This change
  distinguishes a mount that failed due to media hardware failures
  (EIO) from a mount that failed due to media errors (EINTEGRITY) that
  can be corrected by running fsck(8).

  Sponsored by: Netflix

  Notes: svn path=/head/; revision=358899
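The distinction between the two error codes can be sketched as follows. This is an illustration only: the function name and its parameters are hypothetical, and the fallback definition of `EINTEGRITY` is a placeholder so the sketch also builds on systems whose `<errno.h>` does not provide it.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EINTEGRITY
/* Placeholder so the sketch builds on systems that lack EINTEGRITY. */
#define EINTEGRITY 97
#endif

/*
 * Hypothetical superblock check: a failed read is a hardware-level
 * problem (EIO), while a check-hash mismatch on data that was read
 * successfully is an integrity problem (EINTEGRITY).
 */
int
ex_check_superblock(int read_failed, uint32_t stored_hash,
    uint32_t computed_hash)
{
	if (read_failed)
		return (EIO);
	if (stored_hash != computed_hash)
		return (EINTEGRITY);
	return (0);
}
```

A caller that sees the integrity error can suggest running fsck(8), whereas an I/O error points at the device itself.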
* Use the devfs vnode rather than the mntfs vnode for permissions checks. | Chuck Silvers | 2020-03-09 | 1 | -3/+3
  I missed this one in r358714.

  Reported by: pho
  Reviewed by: mckusick
  Approved by: imp (mentor)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=358812
* fd: use smr for managing struct pwd | Mateusz Guzik | 2020-03-08 | 1 | -1/+3
  This has a side effect of eliminating filedesc slock/sunlock during
  path lookup, which in turn removes contention vs concurrent
  modifications to the fd table.

  Reviewed by: markj, kib
  Differential Revision: https://reviews.freebsd.org/D23889

  Notes: svn path=/head/; revision=358734
* Add a new "mntfs" pseudo file system which provides private device vnodes for | Chuck Silvers | 2020-03-06 | 3 | -13/+43
  file systems to safely access their disk devices, and adapt FFS to
  use it. Also add a new BO_NOBUFS flag to allow enforcing that file
  systems using mntfs vnodes do not accidentally use the original devfs
  vnode to create buffers.

  Reviewed by: kib, mckusick
  Approved by: imp (mentor)
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D23787

  Notes: svn path=/head/; revision=358714
* fd: move vnodes out of filedesc into a dedicated structure | Mateusz Guzik | 2020-03-01 | 1 | -4/+6
  The new structure is copy-on-write. With the assumption that path
  lookups are significantly more frequent than chdirs and chrooting
  this is a win.

  This provides stable root and jail root vnodes without the need to
  reference them on lookup, which in turn means less work on globally
  shared structures. Note this also happens to fix a bug where jail
  vnode was never referenced, meaning subsequent access on lookup could
  run into use-after-free.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23884

  Notes: svn path=/head/; revision=358503
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 4 | -42/+67
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that
  are still not MPSAFE (or already are but aren’t properly marked). Use
  it in preparation for a general review of all nodes.

  This is a non-functional change that adds annotations to SYSCTL_NODE
  and SYSCTL_PROC nodes using one of the soon-to-be-required flags.

  Mark all obvious cases as MPSAFE. All entries that haven't been
  marked as MPSAFE before are by default marked as NEEDGIANT.

  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718

  Notes: svn path=/head/; revision=358333
* Additional KASSERTs to ensure the consistency of the soft updates | Kirk McKusick | 2020-02-18 | 1 | -1/+8
  indirdep structure. No functional change.

  Tested by: Peter Holm (as part of a larger patch)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=358085
* Add rudimentary support for UFS to probe whether a block device supports the | Scott Long | 2020-02-16 | 3 | -1/+11
  BIO_SPEEDUP command. Add complementary support to the CAM periphs
  that support it. This is a redo of r357710.

  Notes: svn path=/head/; revision=358009
* ufs: use faster lockmgr entry points in ffs_lock | Mateusz Guzik | 2020-02-15 | 1 | -6/+3
  Notes: svn path=/head/; revision=357981
* Revert r357710 and 357711 until they can be debugged | Scott Long | 2020-02-10 | 3 | -12/+1
  Notes: svn path=/head/; revision=357730
* Missed a file in r357710, add it here. | Scott Long | 2020-02-10 | 1 | -0/+1
  Notes: svn path=/head/; revision=357711
* Add rudimentary support for UFS to probe whether a block device supports the | Scott Long | 2020-02-10 | 2 | -1/+11
  BIO_SPEEDUP command. Add complementary support to the CAM periphs
  that support it.

  Notes: svn path=/head/; revision=357710
* With INVARIANTS, track all softdep dependency structures centrally | Chuck Silvers | 2020-02-03 | 2 | -1/+20
  so that we can find them in dumps.

  Approved by: mckusick (mentor)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=357456
* Fix up various vnode-related asserts which did not dump the used vnode | Mateusz Guzik | 2020-02-03 | 1 | -2/+1
  Notes: svn path=/head/; revision=357446
* vfs: replace VOP_MARKATIME with VOP_MMAPPED | Mateusz Guzik | 2020-02-01 | 1 | -10/+13
  The routine is only provided by ufs and is only used on mmap and exec.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23422

  Notes: svn path=/head/; revision=357361
* ufs: drop ufs_markatime from ufs_fifoops | Mateusz Guzik | 2020-02-01 | 1 | -1/+0
  The routine is only called on mmap and exec, both of which are
  invalid for this type.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23421

  Notes: svn path=/head/; revision=357360
* ufs: add the missing vn_need_pageq_flush call to ufs_need_inactive | Mateusz Guzik | 2020-01-30 | 1 | -0/+2
  Notes: svn path=/head/; revision=357286
* ufs: add vgone calls for unconstructed vnodes in the error path | Mateusz Guzik | 2020-01-26 | 2 | -1/+10
  This mostly eliminates the requirement that vput never unlocks the
  vnode before calling VOP_INACTIVE. Note it may still be present for
  other filesystems. See r356126 for an example bug.

  Note vput stopped doing early unlock in r357070 thus this change does
  not affect correctness as it is.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23215

  Notes: svn path=/head/; revision=357129
* vfs: stop handling VI_OWEINACT in vget | Mateusz Guzik | 2020-01-24 | 1 | -8/+0
  vget is almost always called with LK_SHARED, meaning the flag (if
  present) is almost guaranteed to get cleared. Stop handling it in the
  first place and instead let the thread which wanted to do inactive
  handle the bumped usecount.

  Reviewed by: jeff
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D23184

  Notes: svn path=/head/; revision=357071
* We only want to send the speedup to the lower layers when there's a shortage. | Warner Losh | 2020-01-17 | 1 | -8/+13
  Only send a speedup when there's a shortage. While this is a little
  racy, lost races aren't a big deal for this function. If there's a
  shortage just popping up after we check these values, then we'll
  catch it next time. If there's a shortage that's just clearing up, we
  may do some work at the lower layers a little sooner than we
  otherwise would have. Since shortages are relatively rare events,
  both races are acceptable.

  Reviewed by: chs
  Differential Revision: https://reviews.freebsd.org/D23182

  Notes: svn path=/head/; revision=356820
* Use buf to send speedup | Warner Losh | 2020-01-17 | 1 | -5/+18
  It turns out there's a problem with using g_io to send the speedup.
  It leads to a race when there's a resource shortage when a disk fails.

  Instead, send BIO_SPEEDUP via struct buf. This is pretty
  straightforward, except we need to transfer the bio_flags from
  b_ioflags for BIO_SPEEDUP commands in g_vfs_strategy.

  Reviewed by: kirk, chs
  Differential Revision: https://reviews.freebsd.org/D23117

  Notes: svn path=/head/; revision=356819
* When sync'ing a mount point, the mount point's vnodes were scanned | Kirk McKusick | 2020-01-14 | 1 | -11/+13
  twice. Once to update the changed inodes, and a second time to update
  changed quota information. This change merges these two scans into a
  single scan which does both inode and quota updates.

  MFC after: 7 days

  Notes: svn path=/head/; revision=356739
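The idea of folding two scans into one can be sketched in a few lines. The structure and function names here are hypothetical stand-ins, not the kernel's data structures; the sketch only shows one traversal handling both kinds of pending update.

```c
#include <stddef.h>

/* Hypothetical per-node state: which kinds of update are pending. */
struct ex_node {
	int inode_dirty;
	int quota_dirty;
};

/* Counters so a caller can see how much work one pass performed. */
struct ex_stats {
	size_t visits;
	size_t inode_updates;
	size_t quota_updates;
};

/*
 * Single pass: each node is visited once and both kinds of pending
 * update are handled during that visit, instead of walking the whole
 * list once per kind of update.
 */
void
ex_sync_single_pass(struct ex_node *nodes, size_t n, struct ex_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		st->visits++;
		if (nodes[i].inode_dirty) {
			nodes[i].inode_dirty = 0;
			st->inode_updates++;
		}
		if (nodes[i].quota_dirty) {
			nodes[i].quota_dirty = 0;
			st->quota_updates++;
		}
	}
}
```

For a list of n nodes the merged form performs n visits, where two separate scans would perform 2n.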
* Fix a long standing bug in journaled soft-updates. The dirrem structure | Jeff Roberson | 2020-01-14 | 1 | -4/+10
  needs to handle file removal, directory removal, file move, directory
  move, etc. The code in handle_workitem_remove() needs to propagate
  any completed journal entries to the write that will render the
  change stable. In the case of a moved directory this means the new
  parent. However, for an overwrite that frees a directory (DIRCHG) we
  must move the jsegdep to the removed inode to be released when it is
  stable in the cg bitmap or the unlinked inode list. This case was
  previously unhandled and caused a panic.

  Reported by: mckusick, pho
  Reviewed by: mckusick
  Tested by: pho

  Notes: svn path=/head/; revision=356714
* ufs: relax an overzealous assert added in r356671 | Mateusz Guzik | 2020-01-13 | 2 | -1/+7
  Part of i_flag can persist across a drop to hold count of 0, at which
  point the vnode is taken off the lazy list. Then whoever locks and
  unlocks the vnode can trip on the assert.

  This trips over kyua running a test untarring character devices to ufs.

  Reported by: lwhsu

  Notes: svn path=/head/; revision=356683
* vfs: rework vnode list management | Mateusz Guzik | 2020-01-13 | 1 | -2/+2
  The current notion of an active vnode is eliminated.

  Vnodes transition between 0<->1 hold counts all the time and the
  associated traversal between different lists induces significant
  scalability problems in certain workloads.

  Introduce a global list containing all allocated vnodes. They get
  unlinked only when UMA reclaims memory and are only requeued when
  hold count reaches 0.

  Sample result from an incremental make -s -j 104 bzImage on tmpfs:
  stock:   118.55s user 3649.73s system 7479% cpu 50.382 total
  patched: 122.38s user 1780.45s system 6242% cpu 30.480 total

  Reviewed by: jeff
  Tested by: pho (in a larger patch, previous version)
  Differential Revision: https://reviews.freebsd.org/D22997

  Notes: svn path=/head/; revision=356672
* ufs: use lazy list instead of active list for syncer | Mateusz Guzik | 2020-01-13 | 4 | -12/+78
  Quota code is temporarily regressed to do a full vnode scan.

  Reviewed by: jeff
  Tested by: pho (in a larger patch, previous version)
  Differential Revision: https://reviews.freebsd.org/D22996

  Notes: svn path=/head/; revision=356671
* ufs: add a setter for inode i_flag field | Mateusz Guzik | 2020-01-13 | 12 | -96/+103
  This will be used later to add vnodes to the lazy list.

  Reviewed by: kib (previous version), jeff
  Tested by: pho (in a larger patch)
  Differential Revision: https://reviews.freebsd.org/D22994

  Notes: svn path=/head/; revision=356669
* When a read error occurs while fetching a directory block to delete | Kirk McKusick | 2020-01-11 | 1 | -13/+34
  or rename an entry in it, properly reset the link count of the inode
  associated with the entry that was to have been changed.

  Tested by: Peter Holm
  MFC after: 7 days

  Notes: svn path=/head/; revision=356627
* vfs: drop thread argument from vinactive | Mateusz Guzik | 2020-01-05 | 1 | -3/+1
  Notes: svn path=/head/; revision=356363
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 11 | -80/+80
  Filesystems which want to use it in limited capacity can employ the
  VOP_UNLOCK_FLAGS macro.

  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D21427

  Notes: svn path=/head/; revision=356337
* ufs: do not leave non-reclaimed vnodes with zero i_mode around. | Konstantin Belousov | 2019-12-27 | 1 | -2/+11
  After a recent change, vput() relocks even the exclusively locked
  vnode before inactivating it. Before that, UFS could safely
  instantiate a vnode for a cleared inode, then the last vput() after
  ffs_vgetf() noted that ip->i_mode == 0 and recycled it. Now, it is
  possible for other threads to note the half-constructed vnode, e.g.
  to insert it into the hash, which lets other threads use it even
  though the mode is zero, before inactivation and reclaim.

  Handle the found cases in SU code, by explicitly doing reclaim.
  Assert that other places get a fully constructed inode from
  ffs_vgetf(), which cannot be cleared before dependencies are resolved.

  Reported and tested by: pho
  Reviewed by: mckusick
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week

  Notes: svn path=/head/; revision=356126
* Drop a sleepable lock when we plan on sleeping | Warner Losh | 2019-12-18 | 1 | -2/+6
  g_io_speedup waits for the completion of the speedup request before
  proceeding using biowait(), but check_clear_deps is called with the
  softdeps lock held (which is non-sleepable). It's safe to drop this
  lock around the call to speedup, so do that.

  Submitted by: Peter Holm
  Reviewed by: kib@

  Notes: svn path=/head/; revision=355882
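The locking pattern described in this message, releasing a non-sleepable lock around a call that may sleep and reacquiring it afterwards, can be sketched as below. Everything here is a hypothetical model: the lock is just a flag with assertions standing in for the kernel's lock checks, and the function names do not correspond to kernel symbols.

```c
#include <assert.h>

/* Toy model of a non-sleepable lock: just tracks whether it is held. */
struct ex_mtx {
	int held;
};

void
ex_mtx_lock(struct ex_mtx *m)
{
	assert(!m->held);
	m->held = 1;
}

void
ex_mtx_unlock(struct ex_mtx *m)
{
	assert(m->held);
	m->held = 0;
}

/* Stand-in for a request that may sleep; it must not run with the lock held. */
int
ex_sleeping_request(const struct ex_mtx *m)
{
	assert(!m->held);
	return (1);
}

/*
 * Entered and left with the lock held. The lock is dropped only around
 * the call that may sleep, then taken again before returning.
 */
int
ex_check_and_speedup(struct ex_mtx *m)
{
	int done;

	assert(m->held);
	ex_mtx_unlock(m);
	done = ex_sleeping_request(m);
	ex_mtx_lock(m);
	return (done);
}
```

In real code, any state protected by the lock may have changed while it was dropped, so a caller using this pattern generally needs to recheck that state after the lock is reacquired.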
* Add BIO_SPEEDUP signalling to UFS | Warner Losh | 2019-12-17 | 1 | -2/+16
  When we have a resource shortage in UFS, send down a BIO_SPEEDUP to
  give the CAM I/O scheduler a heads up that we have a resource
  shortage and that it should bias its decisions knowing that.

  Reviewed by: kirk, kib
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D18351

  Notes: svn path=/head/; revision=355836
* vfs: flatten vop vectors | Mateusz Guzik | 2019-12-16 | 2 | -0/+6
  This eliminates the following loop from all VOP calls:

      while(vop != NULL && \
          vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
              vop = vop->vop_default;

  Reviewed by: jeff
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22738

  Notes: svn path=/head/; revision=355790
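A simplified sketch of what flattening an operations vector means is given below. It is not the kernel implementation: the structure, its two operation slots, and all function names are hypothetical, and the real loop quoted above tests different fields. The sketch only shows the general technique of resolving the default chain once, ahead of time, so that calls no longer need to walk it.

```c
#include <stddef.h>

typedef int (*ex_op_t)(int);

/* Hypothetical operations vector with a fallback to a default vector. */
struct ex_vops {
	struct ex_vops	*vop_default;	/* fallback vector, or NULL */
	ex_op_t		 vop_read;
	ex_op_t		 vop_write;
};

/* Sample operations used to populate vectors. */
int ex_double(int x) { return (2 * x); }
int ex_negate(int x) { return (-x); }
int ex_incr(int x) { return (x + 1); }

/* Per-call resolution: follow vop_default until an entry is found. */
ex_op_t
ex_resolve_read(const struct ex_vops *v)
{
	while (v != NULL && v->vop_read == NULL)
		v = v->vop_default;
	return (v != NULL ? v->vop_read : NULL);
}

ex_op_t
ex_resolve_write(const struct ex_vops *v)
{
	while (v != NULL && v->vop_write == NULL)
		v = v->vop_default;
	return (v != NULL ? v->vop_write : NULL);
}

/*
 * One-time flattening: store the resolved entries in the vector itself
 * so that later calls can use v->vop_read and v->vop_write directly.
 */
void
ex_flatten(struct ex_vops *v)
{
	v->vop_read = ex_resolve_read(v);
	v->vop_write = ex_resolve_write(v);
}
```

After `ex_flatten()` runs on a vector, entries it did not define itself point at the inherited implementation, while entries it did define are left untouched.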
* UFS: implement VOP_INACTIVE() | Konstantin Belousov | 2019-12-10 | 3 | -0/+36
  The checks literally repeat the conditions that make ufs_inactive()
  take some action.

  Reviewed by: jeff
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D22616

  Notes: svn path=/head/; revision=355584
* vfs: introduce v_irflag and make v_type smaller | Mateusz Guzik | 2019-12-08 | 5 | -7/+7
  The current vnode layout is not smp-friendly by having frequently
  read data avoidably sharing cachelines with very frequently modified
  fields. In particular v_iflag inspected for VI_DOOMED can be found in
  the same line with v_usecount. Instead make it available in the same
  cacheline as the v_op, v_data and v_type which all get read all the
  time.

  v_type is avoidably 4 bytes while the necessary data will easily fit
  in 1. Shrinking it frees up 3 bytes, 2 of which get used here to
  introduce a new flag field with a new value: VIRF_DOOMED.

  Reviewed by: kib, jeff
  Differential Revision: https://reviews.freebsd.org/D22715

  Notes: svn path=/head/; revision=355537
* Currently the breadn_flags() and getblkx() interfaces are passed | Kirk McKusick | 2019-12-03 | 4 | -53/+22
  the vnode, logical block number, and size of data block that is being
  requested. They then use the VOP_BMAP function to calculate the
  mapping from logical block number to physical block number from which
  to access the data. This change expands the interface to also pass
  the physical block number in cases where the VOP_BMAP function may no
  longer work, for example when a file is being truncated.

  No functional change.

  Reviewed by: kib
  Tested by: Peter Holm
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=355371
* As part of creating a snapshot, set fs->fs_fmod to 0 in the snapshot image | Chuck Silvers | 2019-11-28 | 1 | -0/+1
  because nothing ever changes this field for read-only mounts and we
  want to verify that it is still 0 when we unmount.

  Reviewed by: mckusick
  Approved by: mckusick (mentor)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=355150
* In ffs_freefile(), use a separate variable to hold the inode number within | Chuck Silvers | 2019-11-25 | 1 | -8/+8
  the cg rather than reusing "ino" for this purpose. This reduces the
  diff for an upcoming change that improves handling of I/O errors.

  No functional change.

  Reviewed by: mckusick
  Approved by: mckusick (mentor)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=355098
* Add some KASSERTs. Reacquire a mutex after a kernel printf rather | Kirk McKusick | 2019-11-20 | 1 | -2/+8
  than holding it during the printf. White space cleanup.

  Sponsored by: Netflix

  Notes: svn path=/head/; revision=354872
* In ufs_dir_dd_ino(), always initialize *dd_vp since the caller expects it. | Chuck Silvers | 2019-11-12 | 1 | -1/+1
  Reviewed by: kib, mckusick
  Approved by: imp (mentor)
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=354632
* Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY | Jeff Roberson | 2019-10-29 | 1 | -2/+2
  flag and use the same system.

  This enables further fault locking improvements by allowing more
  faults to proceed with a shared lock.

  Reviewed by: kib
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22116

  Notes: svn path=/head/; revision=354158
* After the unlink() of one name of a file with multiple links, a | Kirk McKusick | 2019-10-24 | 1 | -0/+2
  stat() of one of the remaining names of the file does not show an
  updated ctime (inode modification time) until several seconds after
  the unlink() completes. The problem only occurs when the filesystem
  is running with soft updates enabled. When running with soft updates,
  the ctime is not updated until the soft updates background process
  has settled all the needed I/O operations.

  This commit causes the ctime to be updated immediately during the
  unlink(). A side effect of this change is that the ctime is updated
  again when soft updates has finished its processing because that is
  the time that is correct from the perspective of programs that look
  at the disk (like dump). This change does not cause any extra I/O to
  be done, it just ensures that stat() updates the ctime before handing
  it back.

  PR: 241373
  Reported by: Alan Somers
  Tested by: Alan Somers
  MFC after: 3 days
  Sponsored by: Netflix

  Notes: svn path=/head/; revision=354050