src - FreeBSD source tree

	Commit message	Author	Age	Files	Lines
*	Do not allow snapshots on UFS filesystems using gjournal.	Kirk McKusick	2024-07-25	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \|	The gjournal implementation does not properly handle the freeing of blocks that may be part of a snapshot. Adding this support to gjournal would require considerable effort. For now we simply do not allow snapshots to be taken on filesystems using gjournal. Reported by: ant_mail@inbox.ru PR: 280216 MFC after: 1 week
*	sys: Remove ancient SCCS tags.	Warner Losh	2023-11-27	1	-2/+0
	Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
*	sys: Remove $FreeBSD$: one-line .c pattern	Warner Losh	2023-08-16	1	-2/+0
	Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
*	Set UFS/FFS file type to snapshot before changing its block pointers.	Kirk McKusick	2023-08-12	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A UFS/FFS snapshot file is identified with the SF_SNAPSHOT flag to identify it as a snapshot. This flag needs to be set before setting some of its block pointers to the special values BLK_SNAP and BLK_NOCOPY. If the snapshot creation fails and we call VOP_REMOVE(), the SF_SNAPSHOT flag will let the remove routine know that the special block pointer values need to be rolled back before attempting deletion of the file. Also ensure that an fsck is required after setting superblock values in the ffs_checkcgintegrity() routine. Reported-by: Peter Holm Tested-by: Peter Holm MFC-after: 1 week Sponsored-by: The FreeBSD Foundation
*	Remove a partial UFS/FFS snapshot if it fails to build successfully.	Kirk McKusick	2023-08-09	1	-8/+23
\| \| \| \| \| \| \| \| \| \| \|	When taking a UFS/FFS snapshot, it may not succeed for example if the filesystem is too full to hold it. When a snapshot is unable to be successfully taken, the partial snapshot should be removed. Reported-by: Peter Holm Tested-by: Peter Holm MFC-after: 1 week Sponsored-by: The FreeBSD Foundation
*	UFS/FFS: Migrate to modern uintXX_t from u_intXX_t.	Kirk McKusick	2023-07-27	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	As per https://lists.freebsd.org/archives/freebsd-scsi/2023-July/000257.html move to the modern uintXX_t. While here also migrate u_char to uint8_t. Where other kernel interfaces allow, migrate u_long to uint64_t. No functional changes intended. MFC-after: 1 week Sponsored-by: The FreeBSD Foundation
*	vfs: use __enum_uint8 for vtype and vstate	Mateusz Guzik	2023-07-05	1	-2/+2
\| \| \| \| \| \|	This whacks hackery around only reading v_type once. Bump __FreeBSD_version to 1400093
*	Write out corrected superblock when creating a UFS/FFS snapshot.	Kirk McKusick	2023-06-13	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \|	When taking a snapshot on a UFS version 1 filesystem we need to call ffs_oldfscompat_write() to unwind any in-memory changes that were made to the superblock before writing it. The cause of this bug was that the trimmed down maximum file size was not being reverted. PR: 271352 Tested-by: Peter Holm MFC-after: 1 week Sponsored-by: The FreeBSD Foundation
*	spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD	Warner Losh	2023-05-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
*	Enable taking snapshots on UFS/FFS filesystems using journaled soft updates.	Kirk McKusick	2022-11-13	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \|	All the needed infrastructure updates have been made to allow snapshots to be taken on UFS/FFS filesystems that are using journaled soft updates. The most immediate benefit is the ability to use a snapshot to take a consistent filesystem dump on a live filesystem using the -L option to dump(8). Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D36491
*	Fix an incorrectly placed parenthesis.	Kirk McKusick	2022-09-29	1	-1/+1
\| \| \| \| \| \| \| \| \|	While syntactically correct and even looking correct, it was definitely not providing the desired result. And it has been this way for nearly twenty years. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
*	Fix unused variable warning in ffs_snapshot.c	Dimitry Andric	2022-07-26	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	With clang 15, the following -Werror warning is produced: sys/ufs/ffs/ffs_snapshot.c:204:7: error: variable 'redo' set but not used [-Werror,-Wunused-but-set-variable] long redo = 0, snaplistsize = 0; ^ The 'redo' variable is only used when DIAGNOSTIC is defined. Ensure it is only declared and set in that case. MFC after: 3 days
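The fix described above follows a common pattern: a variable that is only read in diagnostic builds is declared and updated under the same preprocessor guard. A minimal sketch, assuming a hypothetical helper `count_passes()` that stands in for the real loop (it is not the code in ffs_snapshot.c):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a loop that counts work items.
 * 'redo' exists only when DIAGNOSTIC is defined, so a build without
 * DIAGNOSTIC never sees a set-but-unused variable.
 */
static long
count_passes(int dirty_groups)
{
	long snaplistsize = 0;
#ifdef DIAGNOSTIC
	long redo = 0;
#endif

	assert(dirty_groups >= 0);
	while (dirty_groups > 0) {
		dirty_groups--;
		snaplistsize++;
#ifdef DIAGNOSTIC
		redo++;
#endif
	}
#ifdef DIAGNOSTIC
	printf("redid %ld groups\n", redo);
#endif
	return (snaplistsize);
}
```

Compiling with and without `-DDIAGNOSTIC` should give the same return value; only the diagnostic output differs.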
*	Rewrite function definitions in the UFS/FFS code base with identifier lists.	Kirk McKusick	2022-07-13	1	-145/+110
\| \| \| \| \| \| \| \| \| \| \| \| \|	The K&R style in UFS and other places in the tree's days are numbered as this syntax is removed in C2x proposal N2432: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf Though running to nearly 6000 lines of diffs this update should cause no functional change to the code. Requested by: Warner Losh MFC after: 2 weeks
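For readers unfamiliar with the two definition styles, the sketch below shows the shape of the conversion. `blk_in_range()` is a hypothetical helper, not a function from the UFS code; the old form is kept in a comment because compilers that follow the C2x change reject it.

```c
#include <assert.h>

/*
 * Old K&R identifier-list definition (comment only):
 *
 *	static int
 *	blk_in_range(lbn, nblocks)
 *		long lbn;
 *		long nblocks;
 *	{
 *		...
 *	}
 */

/* Same function as a prototype-style definition. */
static int
blk_in_range(long lbn, long nblocks)
{
	assert(nblocks >= 0);
	return (lbn >= 0 && lbn < nblocks);
}
```

The body is unchanged by such a rewrite; only the parameter declarations move into the parentheses, which is why no functional change is expected.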
*	vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)	Mateusz Guzik	2022-03-24	1	-4/+4
*	ffs_snapblkfree(): add a comment explaining lockmgr invocation	Konstantin Belousov	2022-01-31	1	-0/+6
\| \| \| \| \| \| \|	Reviewed by: markj, mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072
*	ufs: handle LoR between snap lock and vnode lock	Kirk McKusick	2022-01-28	1	-33/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a filesystem is mounted all of its associated snapshots must be activated. It first allocates a snapshot lock (snaplk) that will be shared by all the snapshot vnodes associated with the filesystem. As part of each snapshot file activation, it must replace its own ufs vnode lock with the snaplk. In this way acquiring the snaplk gives exclusive access to all the snapshots for the filesystem. A write to a ufs vnode first acquires the ufs vnode lock for the file to be written then acquires the snaplk. Once it has the snaplk, it can check all the snapshots to see if any of them needs to make a copy of the block that is about to be written. This ffs_copyonwrite() code path establishes the ufs vnode followed by snaplk locking order. When a filesystem is unmounted it has to release all of its snapshot vnodes. Part of doing the release is to revert the snapshot vnode from using the snaplk to using its original vnode lock. While holding the snaplk, the vnode lock has to be acquired, the vnode updated to reference it, then the snaplk released. Acquiring the vnode lock while holding the snaplk violates the ufs vnode then snaplk order. Because the vnode lock is unused, using LK_EXCLUSIVE \| LK_NOWAIT to acquire it will always succeed and the LK_NOWAIT prevents the reverse lock order from being recorded. This change was made in January 2021 (173779b98f) to avoid an LOR violation in ffs_snapshot_unmount(). The same LOR issue was recently found again when removing a snapshot in ffs_snapremove() which must also revert the snaplk to the original vnode lock as part of freeing it. The unwind in ffs_snapremove() deals with the case in which the snaplk is held as a recursive lock holding multiple references. Specifically an equal number of references are made on the vnode lock. 
This change factors out the lock reversion operations into a new function revert_snaplock() which handles both the recursive locks and avoids the LOR. The new revert_snaplock() function is then used in both ffs_snapshot_unmount() and in ffs_snapremove(). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D33946
*	ufs: Avoid subobject overflow in snapshot expunge code	Jessica Clarke	2022-01-02	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code here tries to be smart and zeroes out both di_db and di_ib with a single bzero call, thereby overrunning the di_db subobject. This is fine on most architectures, if a little dodgy. However, on CHERI, the compiler can optionally restrict the bounds on pointers to subobjects to just that subobject, in order to mitigate intra-object buffer overflows, and this is enabled in CheriBSD's pure-capability kernels. Instead, use separate bzero calls for each array, and let the compiler optimise it as it sees fit; even if it's not generating inline zeroing code, Clang will happily optimise two consecutive bzero's to a single larger call. Reviewed by: mckusick Differential Revision: https://reviews.freebsd.org/D33651
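The approach can be sketched as follows, assuming a cut-down structure `mini_dinode` (hypothetical; the real on-disk inode has many more fields) and userland `memset()` in place of the kernel's `bzero()`:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	NDADDR	12	/* direct block pointers, count assumed for the sketch */
#define	NIADDR	3	/* indirect block pointers, count assumed for the sketch */

/* Hypothetical cut-down inode holding only the two pointer arrays. */
struct mini_dinode {
	int64_t	di_db[NDADDR];
	int64_t	di_ib[NIADDR];
};

static void
clear_block_pointers(struct mini_dinode *dip)
{
	assert(dip != NULL);
	/*
	 * One call per array: each write stays inside the bounds of the
	 * subobject it was derived from, instead of running from di_db
	 * through di_ib with a single length.
	 */
	memset(dip->di_db, 0, sizeof(dip->di_db));
	memset(dip->di_ib, 0, sizeof(dip->di_ib));
}
```

The result is the same as a single spanning call on conventional architectures; the difference only matters when pointer bounds are narrowed to the subobject.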
*	vfs: remove the unused thread argument from NDINIT*	Mateusz Guzik	2021-11-25	1	-1/+1
\| \| \| \| \| \|	See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.
*	ffs_snapshot: do not assert that um_devvp is locked	Konstantin Belousov	2021-11-12	1	-2/+0
	It is not, and the lock is not needed there. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32761
*	Avoid "consumer not attached in g_io_request" panic when disk lost	Kirk McKusick	2021-09-28	1	-11/+6
	while using a UFS snapshot.

	The UFS filesystem supports snapshots. Each snapshot is a file whose contents are a frozen image of the disk partition on which the filesystem resides. Each time an existing block in the filesystem is modified, the filesystem checks whether that block was in use at the time that the snapshot was taken. If so, and if it has not already been copied, a new block is allocated from among the blocks that were not in use at the time that the snapshot was taken and placed in the snapshot file to replace the entry that has not yet been copied. The previous contents of the block are copied to the newly allocated snapshot file block, and the write to the original is then allowed to proceed.

	The block allocation is done using the usual UFS_BALLOC() routine which allocates the needed block in the snapshot and returns a buffer that is set up to write data into the newly allocated block. In usual filesystem operation, the contents for the new block are copied from user space into the buffer and the buffer is then written to the file using bwrite(), bawrite(), or bdwrite(). In the case of a snapshot the new block must be filled from the disk block that is about to be rewritten. The snapshot routine has a function readblock() that it uses to read the `about to be rewritten' disk block.

		/*
		 * Read the specified block into the given buffer.
		 */
		static int
		readblock(snapvp, bp, lbn)
			struct vnode *snapvp;
			struct buf *bp;
			ufs2_daddr_t lbn;
		{
			struct inode *ip;
			struct bio *bip;
			struct fs *fs;

			ip = VTOI(snapvp);
			fs = ITOFS(ip);

			bip = g_alloc_bio();
			bip->bio_cmd = BIO_READ;
			bip->bio_offset = dbtob(fsbtodb(fs, blkstofrags(fs, lbn)));
			bip->bio_data = bp->b_data;
			bip->bio_length = bp->b_bcount;
			bip->bio_done = NULL;

			g_io_request(bip, ITODEVVP(ip)->v_bufobj.bo_private);
			bp->b_error = biowait(bip, "snaprdb");
			g_destroy_bio(bip);
			return (bp->b_error);
		}

	When the underlying disk fails, its GEOM module is removed. Subsequent attempts to access it should return the ENXIO error. The functionality of checking for the lost disk and returning ENXIO is handled by the g_vfs_strategy() routine:

		void
		g_vfs_strategy(struct bufobj *bo, struct buf *bp)
		{
			struct g_vfs_softc *sc;
			struct g_consumer *cp;
			struct bio *bip;

			cp = bo->bo_private;
			sc = cp->geom->softc;

			/*
			 * If the provider has orphaned us, just return ENXIO.
			 */
			mtx_lock(&sc->sc_mtx);
			if (sc->sc_orphaned || sc->sc_enxio_active) {
				mtx_unlock(&sc->sc_mtx);
				bp->b_error = ENXIO;
				bp->b_ioflags |= BIO_ERROR;
				bufdone(bp);
				return;
			}
			sc->sc_active++;
			mtx_unlock(&sc->sc_mtx);

			bip = g_alloc_bio();
			bip->bio_cmd = bp->b_iocmd;
			bip->bio_offset = bp->b_iooffset;
			bip->bio_length = bp->b_bcount;
			bdata2bio(bp, bip);
			if ((bp->b_flags & B_BARRIER) != 0) {
				bip->bio_flags |= BIO_ORDERED;
				bp->b_flags &= ~B_BARRIER;
			}
			if (bp->b_iocmd == BIO_SPEEDUP)
				bip->bio_flags |= bp->b_ioflags;
			bip->bio_done = g_vfs_done;
			bip->bio_caller2 = bp;
			g_io_request(bip, cp);
		}

	Only after checking that the device is present does it construct the "bio" request and call g_io_request(). When readblock() constructs its own "bio" request and calls g_io_request() directly it panics with "consumer not attached in g_io_request" when the underlying device no longer exists.

	The fix is to have readblock() call g_vfs_strategy() rather than constructing its own "bio" request:

		/*
		 * Read the specified block into the given buffer.
		 */
		static int
		readblock(snapvp, bp, lbn)
			struct vnode *snapvp;
			struct buf *bp;
			ufs2_daddr_t lbn;
		{
			struct inode *ip;
			struct fs *fs;

			ip = VTOI(snapvp);
			fs = ITOFS(ip);

			bp->b_iocmd = BIO_READ;
			bp->b_iooffset = dbtob(fsbtodb(fs, blkstofrags(fs, lbn)));
			bp->b_iodone = bdone;
			g_vfs_strategy(&ITODEVVP(ip)->v_bufobj, bp);
			bufwait(bp);
			return (bp->b_error);
		}

	Here it uses the buffer that will eventually be written to the disk. The g_vfs_strategy() routine uses four parts of the buffer: b_bcount, b_iocmd, b_iooffset, and b_data. The b_bcount field is already correctly set for the buffer. It is safe to set the b_iocmd and b_iooffset fields as they are set correctly when the later write is done. The write path will also clear the B_DONE flag that our use of the buffer will set. The b_iodone callback has to be set to bdone(), which does just the notification that the I/O is done in bufdone(). The rest of bufdone(), which includes things like processing the softdeps associated with the buffer, should not be done until the buffer has been written. Bufdone() will set b_iodone back to NULL after using it, so the full bufdone() processing will be done when the buffer is written. The final change from the previous version of readblock() is that it used the b_data for the destination of the read while g_vfs_strategy() uses the bdata2bio() function to take advantage of VMIO when it is available.

	Differential revision: https://reviews.freebsd.org/D32150 Reviewed by: kib, chs MFC after: 1 week Sponsored by: Netflix
*	Eliminate snaplk / bufwait LOR when creating UFS snapshots	Kirk McKusick	2021-09-19	1	-21/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of creating a new UFS snapshot, it has to have its individual vnode lock replaced with the filesystem's snapshot lock. The lock order for regular vnodes with respect to buffer locks is that they must first acquire the vnode lock, then a buffer lock. The order for the snapshot lock is reversed: a buffer lock must be acquired before the snapshot lock. When creating a new snapshot, the snapshot file must retain its vnode lock until it has allocated all the blocks that it needs before switching to the snapshot lock. This update moves one final piece of the initial snapshot block allocation so that it is done before the newly created snapshot is switched to use the snapshot lock. Reported by: Witness code MFC after: 1 week Sponsored by: Netflix
*	UFS snapshots: properly set the vm object size.	Konstantin Belousov	2021-02-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Citing Kirk: The previous code [before 8563de2f2799b2cb -- kib] did not call vnode_pager_setsize() but worked because later in ffs_snapshot() it does a UFS_WRITE() to output the snaplist. Previously the UFS_WRITE() allocated the extra block at the end of the file which caused it to do the needed vnode_pager_setsize(). But the new code had already allocated the extra block, so UFS_WRITE() did not extend the size and thus did not do the vnode_pager_setsize(). PR: 253158 Reported by: Harald Schmalzbauer <bugzilla.freebsd@omnilan.de> Reviewed by: mckusick Tested by: cy Sponsored by: The FreeBSD Foundation MFC after: 1 week
*	Fix bug 253158 - Panic: snapacct_ufs2: bad block - mksnap_ffs(8) crash	Kirk McKusick	2021-02-12	1	-67/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The panic reported in 253158 arises because the /mnt/.snap/.factory snapshot allocated the last block in the filesystem. The snapshot code allocates the last block in the filesystem as a way of setting its length to be the size of the filesystem. Part of taking a snapshot is to remove all the earlier snapshots from the image of the newest snapshot so that newer snapshots will not claim the blocks of the earlier snapshots. The panic occurs when the new snapshot finds that both it and an earlier snapshot claim the same block. The fix is to set the size of the snapshot to be one block after the last block in the filesystem. This block can never be allocated since it is not a valid block in the filesystem. This extra block is used as a place to store the initial list of blocks that the snapshot has already copied and is used to avoid a deadlock in and speed up the ffs_copyonwrite() function. Reported by: Harald Schmalzbauer Tested by: Peter Holm PR: 253158 Sponsored by: Netflix
*	Stop ignoring ERELOOKUP from VOP_INACTIVE()	Konstantin Belousov	2021-02-12	1	-1/+7
	When possible, relock the vnode and retry inactivation. Only vunref() is required not to drop the vnode lock, so handle it specially by not retrying. This is a part of the efforts to ensure that an unlinked, unreferenced vnode does not prevent its inode from being reused. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
*	ffs_snapshot: use VOP_VPUT_PAIR after VOP_CREATE.	Konstantin Belousov	2021-02-12	1	-2/+7
	If the snapshot embryo was reclaimed under us, return error outright. Reviewed by: chs, mckusick Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
*	Eliminate lock order reversal in UFS when unmounting filesystems	Kirk McKusick	2021-01-16	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	with snapshots. Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of mounting a UFS filesystem with snapshots, each of the vnodes describing a snapshot has its individual lock replaced with the snapshot lock. When the filesystem is unmounted the vnode's original lock is returned replacing the snapshot lock. The lock order reversal happens because vnode locks must be acquired before snapshot locks. When unmounting we must lock both the snapshot lock and the vnode lock before swapping them so that the vnode will be continuously locked during the swap. For each vnode representing a snapshot, we must first acquire the snapshot lock to ensure exclusive access to it and its original lock. We then face a lock order reversal when we try to acquire the original vnode lock. The problem is eliminated by doing a non-blocking exclusive lock on the original lock which will always succeed since there are no users of that lock. Sponsored by: Netflix
*	ffs: Avoid out-of-bounds accesses in the fs_active bitmap	Mark Johnston	2020-12-23	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use a bitmap to track which cylinder groups have changed between snapshot creation and filesystem suspension. The "legs" of the bitmap are four bytes wide (see ACTIVESET()) so we must round up the allocation size to a multiple of four bytes. I believe this bug is harmless since UMA/kmem_* will both pad the allocation and zero the full allocation. Note that malloc() does inline zeroing when the allocation size is known at compile-time. Reported by: pho (using KASAN) Reviewed by: kib, mckusick MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27731
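The size calculation can be illustrated with a small sketch. `active_map_bytes()` is a hypothetical name, and the code only shows the rounding arithmetic described in the message (bits to bytes, then bytes up to a multiple of four); it is not a copy of the kernel change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	NBBY	8	/* bits per byte */

/*
 * Bytes to allocate for a bitmap with one bit per cylinder group when
 * the map is accessed in 4-byte words.
 */
static size_t
active_map_bytes(unsigned int ncg)
{
	/* Round the bit count up to whole bytes. */
	size_t bytes = ((size_t)ncg + NBBY - 1) / NBBY;
	/* Round the byte count up to a whole number of 4-byte words. */
	size_t len = (bytes + sizeof(uint32_t) - 1) & ~(sizeof(uint32_t) - 1);

	assert(len % sizeof(uint32_t) == 0);
	return (len);
}
```

With 33 cylinder groups, for example, five bytes would hold the bits, but a word-wide access to the second word touches bytes 4 through 7, so eight bytes are needed.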
*	Handle LoR in flush_pagedep_deps().	Konstantin Belousov	2020-11-14	1	-4/+10
	When operating in SU or SU+J mode, ffs_syncvnode() might need to instantiate other vnode by inode number while owning syncing vnode lock. Typically this other vnode is the parent of our vnode, but due to renames occurring right before fsync (or during fsync when we drop the syncing vnode lock, see below) it might be no longer parent. More, the called function flush_pagedep_deps() needs to lock other vnode while owning the lock for vnode which owns the buffer, for which the dependencies are flushed. This creates another instance of the same LoR as was fixed in softdep_sync(). Put the generic code for safe relocking into new SU helper get_parent_vp() and use it in flush_pagedep_deps(). The case for safe relocking of two vnodes with undefined lock order was extracted into vn helper vn_lock_pair(). Due to call sequence ffs_syncvnode()->softdep_sync_buf()->flush_pagedep_deps(), ffs_syncvnode() indicates with ERELOOKUP that passed vnode was unlocked in process, and can return ENOENT if the passed vnode reclaimed. All callers of the function were inspected. Because UFS namei lookups store auxiliary information about directory entry in in-memory directory inode, and this information is then used by UFS code that creates/removes directory entry in the actual mutating VOPs, it is critical that directory vnode lock is not dropped between lookup and VOP. For softdep_prelink(), which ensures that later link/unlink operation can proceed without overflowing the journal, calls were moved to the place where it is safe to drop processing VOP because mutations are not yet applied. Then, ERELOOKUP causes restart of the whole VFS operation (typically VFS syscall) at top level, including the re-lookup of the involved paths. 
[Note that we already do the same restart for failing calls to vn_start_write(), so formally this patch does not introduce new behavior.] Similarly, unsafe calls to fsync in snapshot creation code were plugged. A possible view on these failures is that it does not make sense to continue creating snapshot if the snapshot vnode was reclaimed due to forced unmount. It is possible that relock/ERELOOKUP situation occurs in ffs_truncate() called from ufs_inactive(). In this case, dropping the vnode lock is not safe. Detect the situation with VI_DOINGINACT and reschedule inactivation by setting VI_OWEINACT. ufs_inactive() rechecks VI_OWEINACT and avoids reclaiming vnode if truncation failed this way. In ffs_truncate(), allocation of the EOF block for partial truncation is re-done after vnode is synced, since we cannot leave the buffer locked through ffs_syncvnode(). In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136 Notes: svn path=/head/; revision=367672
*	Move the pointers stored in the superblock into a separate	Kirk McKusick	2020-06-19	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fs_summary_info structure. This change was originally done by the CheriBSD project as they need larger pointers that do not fit in the existing superblock. This cleanup of the superblock eases the task of the commit that immediately follows this one. Suggested by: brooks Reviewed by: kib PR: 246983 Sponsored by: Netflix Notes: svn path=/head/; revision=362358
*	Further evaluation of the POSIX spec for fdatasync() shows that it	Kirk McKusick	2020-06-05	1	-1/+1
	requires that new data on growing files be accessible. Thus, the fdatasync() system call must update the on-disk inode when the size of the file has changed. This commit adds another inode update flag, IN_SIZEMOD, that gets set any time that the file size changes. If either the IN_IBLKDATA or the IN_SIZEMOD flag is set when fdatasync() is called, the associated inode is synchronously written to disk. We could have overloaded the IN_IBLKDATA flag to also track size changes since the only (current) use case for these flags is fdatasync(), but it does seem useful for possible future uses to separately track the file size changes and the inode block pointer changes. Reviewed by: kib MFC with: -r361785 Differential revision: https://reviews.freebsd.org/D25072 Notes: svn path=/head/; revision=361814
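The decision described in this message reduces to a flag test. A minimal sketch, where the flag names come from the message but the numeric values and the helper `fdatasync_needs_inode_write()` are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel's actual assignments may differ. */
#define	IN_ACCESS	0x0001	/* access time update pending */
#define	IN_IBLKDATA	0x0002	/* inode block pointers changed */
#define	IN_SIZEMOD	0x0004	/* file size changed */

static_assert((IN_IBLKDATA & IN_SIZEMOD) == 0, "flags must be distinct bits");

/*
 * Hypothetical helper: true when a data-only sync must also write the
 * inode, i.e. when block pointers or the size changed.
 */
static bool
fdatasync_needs_inode_write(unsigned int i_flag)
{
	return ((i_flag & (IN_IBLKDATA | IN_SIZEMOD)) != 0);
}
```

Keeping the two conditions as separate bits costs nothing in the test itself and leaves room for callers that care about only one of them.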
*	vfs: stop handling VI_OWEINACT in vget	Mateusz Guzik	2020-01-24	1	-8/+0
	vget is almost always called with LK_SHARED, meaning the flag (if present) is almost guaranteed to get cleared. Stop handling it in the first place and instead let the thread which wanted to do inactive handle the bumped usecount. Reviewed by: jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D23184 Notes: svn path=/head/; revision=357071
*	ufs: add a setter for inode i_flag field	Mateusz Guzik	2020-01-13	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	This will be used later to add vnodes to the lazy list. Reviewed by: kib (previous version), jeff Tested by: pho (in a larger patch) Differential Revision: https://reviews.freebsd.org/D22994 Notes: svn path=/head/; revision=356669
*	vfs: drop thread argument from vinactive	Mateusz Guzik	2020-01-05	1	-3/+1
\| \| \| \|	Notes: svn path=/head/; revision=356363
*	vfs: drop the mostly unused flags argument from VOP_UNLOCK	Mateusz Guzik	2020-01-03	1	-13/+13
\| \| \| \| \| \| \| \| \| \| \|	Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427 Notes: svn path=/head/; revision=356337
*	As part of creating a snapshot, set fs->fs_fmod to 0 in the snapshot image	Chuck Silvers	2019-11-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	because nothing ever changes this field for read-only mounts and we want to verify that it is still 0 when we unmount. Reviewed by: mckusick Approved by: mckusick (mentor) Sponsored by: Netflix Notes: svn path=/head/; revision=355150
*	Update ffs_getcg() function to accept a flags parameter to be passed	Kirk McKusick	2019-10-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	to breadn_flags() in preparation for later need when doing forcible unmount when disk dies or is removed. No functional change. Sponsored by: Netflix Notes: svn path=/head/; revision=353099
*	ufs: Remove redundant brelse() after r294954	Conrad Meyer	2019-09-06	1	-1/+0
\| \| \| \| \| \| \| \| \|	Same automation. No functional change. Notes: svn path=/head/; revision=351929
*	Separate kernel crc32() implementation to its own header (gsb_crc32.h) and	Xin LI	2019-06-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	rename the source to gsb_crc32.c. This is a prerequisite of unifying kernel zlib instances. PR: 229763 Submitted by: Yoshihiro Ota <ota at j.email.ne.jp> Differential Revision: https://reviews.freebsd.org/D20193 Notes: svn path=/head/; revision=349151
*	Convert use of UFS-specific #ifdef DEBUG to DIAGNOSTIC or INVARIANTS	Kirk McKusick	2019-05-28	1	-7/+16
\| \| \| \| \| \| \| \| \|	as appropriate. No functional change intended. Suggested-by: markj Notes: svn path=/head/; revision=348329
*	When loading an inode from disk, verify that its mode is valid.	Kirk McKusick	2018-12-27	1	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If invalid, return EINVAL. Note that inode check-hashes greatly reduce the chance that these errors will go undetected. Reported by: Christopher Krah <krah@protonmail.com> Reported as: FS-5-UFS-2: Denial Of Service in nmount-3 (ffs_read) Reviewed by: kib MFC after: 1 week Sponsored by: Netflix M sys/fs/ext2fs/ext2_vnops.c M sys/kern/vfs_subr.c M sys/ufs/ffs/ffs_snapshot.c M sys/ufs/ufs/ufs_vnops.c Notes: svn path=/head/; revision=342548
*	Allocate v_object for the new snapshot vnode.	Konstantin Belousov	2018-12-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	The vnode is not opened, so it ends up with the malloced buffers otherwise. Reported and tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=342381
*	Continuing efforts to provide hardening of FFS. This change adds a	Kirk McKusick	2018-12-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	check hash to the filesystem inodes. Access attempts to files associated with an inode with an invalid check hash will fail with EINVAL (Invalid argument). Access is reestablished after an fsck is run to find and validate the inodes with invalid check-hashes. This check avoids a class of filesystem panics related to corrupted inodes. The hash is done using crc32c. Note this check-hash is for the inode itself and not any of its indirect blocks. Check-hash validation may be extended to also cover indirect block pointers, but that will be a separate (and more costly) feature. Check hashes are added only to UFS2 and not to UFS1 as UFS1 is primarily used in embedded systems with small memories and low-powered processors which need as light-weight a filesystem as possible. Reviewed by: kib Tested by: Peter Holm Sponsored by: Netflix Notes: svn path=/head/; revision=341836
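How a check hash guards an inode can be sketched in userland. The CRC-32C routine below is the standard bitwise form (reflected polynomial 0x82F63B78); the structure `mini_inode` and the helpers `mini_inode_seal()` and `mini_inode_check()` are hypothetical, and the kernel's own routine and seeding conventions may differ from this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), initial value and final XOR of all ones. */
static uint32_t
crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len-- > 0) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return (crc ^ 0xFFFFFFFFu);
}

/* Hypothetical inode: an opaque body followed by its check hash. */
struct mini_inode {
	uint8_t		body[32];
	uint32_t	ckhash;
};

/* Recompute and store the hash, as would happen before a write. */
static void
mini_inode_seal(struct mini_inode *ip)
{
	assert(ip != NULL);
	ip->ckhash = crc32c(ip->body, sizeof(ip->body));
}

/* Verify on load: 0 if the hash matches, EINVAL otherwise. */
static int
mini_inode_check(const struct mini_inode *ip)
{
	assert(ip != NULL);
	return (crc32c(ip->body, sizeof(ip->body)) == ip->ckhash ? 0 : EINVAL);
}
```

The hash covers only the inode body here, which mirrors the note that indirect blocks are outside the check.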
*	Calculate updated superblock check-hash before writing it into the snapshot.	Kirk McKusick	2018-11-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This corrects a bug that prevented snapshots from being mounted due to a superblock check-hash failure. Reported by: Brennan Vincent <brennan@umanwizard.com> Tested by: Peter Holm (pho@) Sponsored by: Netflix Notes: svn path=/head/; revision=340924
*	In preparation for adding inode check-hashes, clean up and	Kirk McKusick	2018-11-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	document the libufs interface for fetching and storing inodes. The undocumented getino / putino interface has been replaced with a new getinode / putinode interface. Convert the utilities that had been using the undocumented interface to use the new documented interface. No functional change (as for now the libufs library does not do inode check-hashes). Reviewed by: kib Tested by: Peter Holm Sponsored by: Netflix Notes: svn path=/head/; revision=340411
*	Replace the TRIM consolidation framework originally added in -r337396	Kirk McKusick	2018-08-18	1	-3/+3
	driven by problems found with the algorithms being tested for TRIM consolidation. Reported by: Peter Holm Suggested by: kib Reviewed by: kib Sponsored by: Netflix Notes: svn path=/head/; revision=338031
*	Revert -r337396. It is being replaced with a revised interface that	Kirk McKusick	2018-08-18	1	-3/+3
\| \| \| \| \| \| \|	resulted from testing and further reviews. Notes: svn path=/head/; revision=338029
*	Put in place the framework for consolidating contiguous blocks into	Kirk McKusick	2018-08-06	1	-3/+3
	a smaller number of larger TRIM requests. The hope had been to have the full TRIM consolidation in place for 12.0, but the algorithms are still under development and need further testing. With this framework in place it will be possible to easily add TRIM consolidation once the optimal strategy has been found. The only functional change with this patch is the elimination of TRIM requests for blocks that are freed before they have been likely to have been written. Reviewed by: kib Discussed with: Warner Losh and Chuck Silvers Sponsored by: Netflix Notes: svn path=/head/; revision=337396
*	Make timespecadd(3) and friends public	Alan Somers	2018-07-30	1	-1/+1
	The timespecadd(3) family of macros were imported from NetBSD back in r35029. However, they were initially guarded by #ifdef _KERNEL. In the meantime, we have grown at least 28 syscalls that use timespecs in some way, leading many programs both inside and outside of the base system to redefine those macros. It's better just to make the definitions public. Our kernel currently defines two-argument versions of timespecadd and timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define three-argument versions. Solaris also defines a three-argument version, but only in its kernel. This revision changes our definition to match the common three-argument version. Bump __FreeBSD_version due to the breaking KPI change. Discussed with: cem, jilles, ian, bde Differential Revision: https://reviews.freebsd.org/D14725 Notes: svn path=/head/; revision=336914
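The three-argument form takes two inputs and a separate result, rather than accumulating into the first operand. A sketch written as a function with a hypothetical name (`ts_add()`) so that it does not collide with any system macro:

```c
#include <assert.h>
#include <time.h>

/*
 * Three-argument addition: *vsp = *tsp + *usp.
 * Inputs are assumed to be normalized (0 <= tv_nsec < 1e9).
 */
static void
ts_add(const struct timespec *tsp, const struct timespec *usp,
    struct timespec *vsp)
{
	assert(tsp != NULL && usp != NULL && vsp != NULL);
	vsp->tv_sec = tsp->tv_sec + usp->tv_sec;
	vsp->tv_nsec = tsp->tv_nsec + usp->tv_nsec;
	if (vsp->tv_nsec >= 1000000000L) {
		/* Carry one second out of the nanosecond field. */
		vsp->tv_sec++;
		vsp->tv_nsec -= 1000000000L;
	}
}
```

A caller that relied on the old two-argument behaviour can pass the same pointer as first and third argument, which is why the change is a KPI break rather than a loss of functionality.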
*	Revert r327781, r328093, r328056:	Pedro F. Giffuni	2018-01-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	ufs\|ext2fs: Revert uses of mallocarray(9). These aren't really useful: drop them. Variable unsigning will be brought again later. Notes: svn path=/head/; revision=328340
*	ufs: use mallocarray(9).	Pedro F. Giffuni	2018-01-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Basic use of mallocarray to prevent overflows: static analyzers are also likely to perform additional checks. Since mallocarray expects unsigned parameters, unsign some related variables to minimize sign conversions. Reviewed by: mckusick Notes: svn path=/head/; revision=328093
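The overflow that an array allocator guards against is the product of element count and element size wrapping around. A userland sketch with a hypothetical helper `alloc_array()`; it returns NULL on overflow, which is an assumption of the sketch and not a statement about how the kernel's mallocarray(9) reacts:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for nmemb elements of 'size' bytes each, refusing the
 * request if nmemb * size cannot be represented in a size_t.
 */
static void *
alloc_array(size_t nmemb, size_t size)
{
	size_t total;

	if (size != 0 && nmemb > SIZE_MAX / size)
		return (NULL);		/* product would overflow */
	total = nmemb * size;
	assert(size == 0 || total / size == nmemb);
	return (malloc(total));
}
```

Checking with a division before multiplying avoids relying on the wrapped result, and it works for any unsigned operand width.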