| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Add a special permission to the jail to adjust and to set the host time.
This can be useful if we want to compartmentalize the NTP daemon
from the rest of the system.
Reviewed by: olce, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D45545
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce the allocuio() and freeuio() functions to allocate and
deallocate struct uio. This hides the actual allocator interface, so it
is easier to modify the sub-allocation layout of struct uio and the
corresponding iovec array.
Obtained from: CheriBSD
Reviewed by: kib, markj
MFC after: 2 weeks
Sponsored by: CHaOS, EPSRC grant EP/V000292/1
Differential Revision: https://reviews.freebsd.org/D43711
|
|
|
|
|
| |
Submitted by: Igor Ostapenko <igor.ostapenko_pm.me>
Differential Revision: <https://reviews.freebsd.org/D43565>
|
|
|
|
|
|
|
|
|
|
| |
when the parameter allow.mlock was added a way for jails to check
if the parameter was set or now has not been added, this change
covers it.
MFC After: 3 days
Reviewed by: jamie@
Differential Revision: https://reviews.freebsd.org/D43314
|
|
|
|
|
|
| |
Reviewed by: zlei, jamie
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43142
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, a prison in "dying" state (removed but still holding
resources) can be brought back to alive state via "jail -d", or
the JAIL_DYING flag to jail_set(2). This seemed like a good idea
at the time.
Its main use was to improve support for specifying the jid when
creating a jail, which also seemed like a good idea at the time.
But resurrecting a jail that was partway through thr process of
shutting down is trouble waiting to happen.
This patch deprecates that flag, leaving it as a no-op for creating
jails (but still useful for looking at dying jails). It sill allows
creating a new jail with the same jid as a dying one, but will renumber
the old one in that case. That's imperfect, but allows for current
behavior.
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D28150
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use priv_check_cred() with a new privilege (PRIV_SEEJAILPROC) instead of
explicitly testing for UID 0 (the former has been the rule for almost 20
years).
As a consequence, cr_canseejailproc() now abides by the
'security.bsd.suser_enabled' sysctl and MAC policies.
Update the MAC policies Biba and LOMAC, and prison_priv_check() so that
they don't deny this privilege. This preserves the existing behavior
(the 'root' user is not restricted, even when jailed, unless
'security.bsd.suser_enabled' is not 0) and is consistent with what is
done for the related policies/privileges (PRIV_SEEOTHERGIDS,
PRIV_SEEOTHERUIDS).
Reviewed by: emaste (earlier version), mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40626
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit privileged accounts in a jail could not access to the
filesystem extended attributes in the system namespace. To control access to
the system namespace in a per-jail basis add a new configuration parameter
allow.extattr which is off by default.
Reported by: zirias
Tested by: zirias
Obtained from: HardenedBSD
Reviewed by: kevans, jamie
Differential revision: https://reviews.freebsd.org/D41643
MFC after: 1 week
Relnotes: yes
|
|
|
|
| |
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
|
|
|
|
|
|
|
|
| |
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change allow to open Netlink sockets in the non-vnet jails, even for
unpriviledged processes.
The security model largely follows the existing one. To be more specific:
* by default, every `NETLINK_ROUTE` command is **NOT** allowed in non-VNET
jail UNLESS `RTNL_F_ALLOW_NONVNET_JAIL` flag is specified in the command
handler.
* All notifications are **disabled** for non-vnet jails (requests to
subscribe for the notifications are ignored). This will change to be more
fine-grained model once the first netlink provider requiring this gets
committed.
* Listing interfaces (RTM_GETLINK) is **allowed** w/o limits (**including**
interfaces w/o any addresses attached to the jail). The value of this is
questionable, but it follows the existing approach.
* Listing ARP/NDP neighbours is **forbidden**. This is a **change** from the
current approach - currently we list static ARP/ND entries belonging to the
addresses attached to the jail.
* Listing interface addresses is **allowed**, but the addresses are filtered
to match only ones attached to the jail.
* Listing routes is **allowed**, but the routes are filtered to provide only
host routes matching the addresses attached to the jail.
* By default, every `NETLINK_GENERIC` command is **allowed** in non-VNET jail
(as sub-families may be unrelated to network at all).
It is the goal of the family author to implement the restriction if
necessary.
Differential Revision: https://reviews.freebsd.org/D39206
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
|
| |
these functions exclusively return (0) and (1), so convert them to bool
We also convert some networking related jail functions from int to bool
some of which were returning an error that was never used.
Differential Revision: https://reviews.freebsd.org/D29659
Reviewed by: imp, jamie (earlier version)
Pull Request: https://github.com/freebsd/freebsd-src/pull/663
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The consensus was that VNET_NFSD was not needed.
This patch removes it from kern_jail.c.
With this patch, support for the "allow.nfsd"
jail parameter is enabled in the kernel for
kernels built with "options VIMAGE".
Reviewed by: markj
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D38808
|
|
|
|
|
|
|
| |
No functional change intended.
Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D37890
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation utilize off-by-one struct prison_ip to access the
IPv[46] addresses. It is error prone and hence comes the regression fix
21ad3e27fabc and ddbf879d79d4. Use flexible array member so that compiler
will catch such errors and it will also be easier to review.
No functional change intended.
Reviewed by: melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D37874
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there are multiple instances of mountd(8) (in different
prisons), there will be confusion if they manipulate the
exports of the same file system. This patch adds mnt_exjail
to "struct mount" so that the credentials (and, therefore,
the prison) that did the exports for that file system can
be recorded. If another prison has already exported the
file system, vfs_export() will fail with an error.
If mnt_exjail == NULL, the file system has not been exported.
mnt_exjail is checked by the NFS server, so that exports done
from within a different prison will not be used.
The patch also implements vfs_exjail_destroy(), which is
called from prison_cleanup() to release all the mnt_exjail
credential references, so that the prison can be removed.
Mainly to avoid doing a scan of the mountlist for the case
where there were no exports done from within the prison,
a count of how many file systems have been exported from
within the prison is kept in pr_exportcnt.
Reviewed by: markj
Discussed with: jamie
Differential Revision: https://reviews.freebsd.org/D38371
MFC after: 3 months
|
|
|
|
|
|
|
|
|
|
|
| |
`prison_ip_restrict()` is called in loop FOREACH_PRISON_DESCENDANT_LOCKED.
While under low memory, it is still possible that in subsequent rounds
`prison_ip_restrict()` succeed and `redo_ip[46]` flip over from true to
false, thus leave some prisons's IPv[46] addresses unrestricted.
Reviewed by: jamie
Fixes: 8bce8d28abe6 jail: Avoid multipurpose return value of function prison_ip_restrict()
Differential Revision: https://reviews.freebsd.org/D38697
|
|
|
|
|
|
|
|
|
|
|
| |
There's no reason to use one over the other here, let's prefer the
interface that's used elsewhere in the kernel.
No functional change intended.
Reviewed by: mjg
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38360
|
|
|
|
|
|
| |
This reverts commit 7926a01ed7ae7cefd81ef4cc2142c35b84d81913.
A new patch in D38371 is being considered for doing this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mountd(8) basically does the following:
getmntinfo()
for each mount
delete_exports
using nmount(2) to do the creation/deletion of individual exports.
For prison0 (and for other prisons if enforce_statfs == 0) getmntinfo()
returns all mount points, including ones being used within other prisons.
This can cause confusion if the same file system is specified in the
exports(5) file for multiple prisons.
This patch adds a perminent identifier to each prison
and marks which prison did the exports in a field of
the mount structure called mnt_exjail. This field can
then be compared to the perminent identifier for the
prison that the thread's credentials is in.
Also required was a new function called prison_isalive_permid()
which returns if the prison is alive, so that the check can be
ignored for prisons that have been removed.
This prepares the system to allow mountd(8) to run in multiple
prisons, including prison0.
Future commits will complete the modifications to allow mountd(8)
to run in vnet prisons. Until then, these changes should not affect
semantics.
Reviewed by: markj
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D38144
|
|
|
|
|
|
|
|
|
|
| |
Since mountd(8) will not be able to do exports
when running in a vnet prison if enforce_statfs is
set to 0, add a check for this to prison_check_nfsd().
Reviewed by: jamie, markj
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D38189
|
|
|
|
|
|
|
|
|
|
|
| |
Currently function prison_ip_restrict() returns true if the replacement
buffer was used, or no buffer provided and allocation fails and should
redo. The logic is confusing and cause possibly infinite loop from
eb8dcdeac22d .
Reviewed by: jamie, glebius
Approved by: kp (mentor)
Differential Revision: https://reviews.freebsd.org/D37918
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And possibly infinite loop calling prison_ip_restrict() in
kern_jail_set() [2].
[1] It is possible that prisons do not have any IPv4 or IPv6 addresses.
[2] If prison_ip_restrict() is not provided with prison_ip, when it
allocates prison_ip successfully, then it should return false to
indicate not redo prison_ip_restrict() later.
Reviewed by: glebius
Approved by: kp (mentor)
Fixes: eb8dcdeac22d jail: network epoch protection for IP address lists
Differential Revision: https://reviews.freebsd.org/D37906
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix wrong IPv[46] addresses inherited from parent jail
* Properly restrict the child jail's IPv[46] addresses
Reviewed by: melifaro, glebius
Approved by: kp (mentor)
Fixes: eb8dcdeac22d jail: network epoch protection for IP address lists
Differential Revision: https://reviews.freebsd.org/D37871
Differential Revision: https://reviews.freebsd.org/D37872
|
|
|
|
|
|
|
| |
Reviewed by: melifaro, jamie
Approved by: kp (mentor)
Fixes: eb8dcdeac22d jail: network epoch protection for IP address lists
Differential Revision: https://reviews.freebsd.org/D37732
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds "allow.nfsd" to the jail code based on a
new kernel build option VNET_NFSD. This will not work
until future patches fix nmount(2) to allow mountd to
run in a vnet prison and the NFS server code is patched
so that global variables are in a vnet.
The jail(8) man page will be patched in a future commit.
Reviewed by: jamie
MFC after: 4 months
Differential Revision: https://reviews.freebsd.org/D37637
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit brings back the driver from FreeBSD commit
f187d6dfbf633665ba6740fe22742aec60ce02a2 plus subsequent fixes from
upstream.
Relative to upstream this commit includes a few other small fixes such
as additional INET and INET6 #ifdef's, #include cleanups, and updates
for recent API changes in main.
Reviewed by: pauamma, gbe, kevans, emaste
Obtained from: git@git.zx2c4.com:wireguard-freebsd @ 3cc22b2
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36909
|
|
|
|
|
|
|
| |
Separate if_me privileges from if_gif.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D36691
|
|
|
|
|
|
| |
- s/paramter/parameter/
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
| |
It allows iteration over processes belonging to given jail instead of
having to walk the entire allproc list.
Note the iteration can miss processes which remains bug-compatible
with previous code.
Reviewed by: jamie (previous version), markj (previous version)
Differential Revision: https://reviews.freebsd.org/D34522
|
|
|
|
|
|
|
| |
- s/occured/occurred/
- s/the the/the/
MFC after: 3 days
|
|
|
|
|
|
|
|
|
| |
Add shm_remove_prison(), that removes all POSIX shared memory segments
belonging to a prison. Call it from prison_cleanup() so a prison
won't be stuck in a dying state due to the resources still held.
PR: 257555
Reported by: grembo
|
|
|
|
|
|
|
| |
Currently, when a jail starts dying, either by losing its last user
reference or by being explicitly killed,
osd_jail_call(...PR_METHOD_REMOVE...) is called. Encapsulate this
into a function prison_cleanup() that can then do other cleanup.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenVPN Data Channel Offload (DCO) moves OpenVPN data plane processing
(i.e. tunneling and cryptography) into the kernel, rather than using tap
devices.
This avoids significant copying and context switching overhead between
kernel and user space and improves OpenVPN throughput.
In my test setup throughput improved from around 660Mbit/s to around
2Gbit/s.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34340
|
|
|
|
|
|
| |
- s/a a/a/
MFC after: 3 days
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now struct prison has two pointers (IPv4 and IPv6) of struct
prison_ip type. Each points into epoch context, address count
and variable size array of addresses. These structures are
freed with network epoch deferred free and are not edited in
place, instead a new structure is allocated and set.
While here, the change also generalizes a lot (but not enough)
of IPv4 and IPv6 processing. E.g. address family agnostic helpers
for kern_jail_set() are provided, that reduce v4-v6 copy-paste.
The fast-path prison_check_ip[46]_locked() is also generalized
into prison_ip_check() that can be executed with network epoch
protection only.
Reviewed by: jamie
Differential revision: https://reviews.freebsd.org/D33339
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit b6be9566d236 stopped prison0_init writing outside of the
preloaded hostuuid's bounds. However, the preloaded data will not
(normally) have a NUL in it, and so validate_uuid will walk off the end
of the buffer in its call to sscanf. Previously if there was any
whitespace in the string we'd at least know there's a NUL one past the
end due to the off-by-one error, but now no such byte is guaranteed.
Fix this by copying to a temporary buffer and explicitly adding a NUL.
Whilst here, change the strlcpy call to use a far less suspicious
argument for dstsize; in practice it's fine, but it's an unusual pattern
and not necessary.
Found by: CHERI
Reviewed by: emaste, kevans, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33616
|
|
|
|
|
|
| |
See b4a58fbf640409a1 ("vfs: remove cn_thread")
Bump __FreeBSD_version to 1400043.
|
|
|
|
|
|
| |
- s/phyiscal/physical/
MFC after: 3 days
|
|
|
|
|
|
| |
- s/erorr/error/
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, this will still hash the default (all zero) hostuuid and
potentially arrive at a MAC address that has a high chance of collision
if another interface of the same name appears in the same broadcast
domain on another host without a hostuuid, e.g., some virtual machine
setups.
Instead of using the default hostuuid, just treat it as a failure and
generate a random LA unicast MAC address.
Reviewed by: bz, gbe, imp, kbowling, kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29788
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a module of type "hostuuid" is provided by the loader,
prison0_init strips any trailing whitespace and ASCII control
characters by (a) adjusting the buffer length, and (b) zeroing out
the characters in question, before storing it as the system's
hostuuid.
The buffer length adjustment was correct, but the zeroing overwrote
one byte higher in memory than intended -- in the typical case,
zeroing one byte past the end of the hostuuid buffer. Due to the
layout of buffers passed by the boot loader to the kernel, this will
be the first byte of a subsequent buffer.
This was *probably* harmless; prison0_init runs after preloaded kernel
modules have been linked and after the preloaded /boot/entropy cache
has been processed, so in both cases having the first byte overwritten
will not cause problems. We cannot however rule out the possibility
that other objects which are preloaded by the loader could suffer from
having the first byte overwritten.
Since the zeroing does not in fact serve any purpose, remove it and
trim trailing whitespace and ASCII control characters by adjusting
the buffer length alone.
Fixes: c3188289 Preload hostuuid for early-boot use
Reviewed by: kevans, markj
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the preloaded hostuuid value is invalid and verbose booting is
enabled, a warning is printed. This printf had two bugs:
1. It was missing a trailing \n character.
2. The malformed UUID is printed with %s even though it is not known
to be NUL-terminated.
This commit adds the missing \n and uses %.*s with the (already known)
length of the preloaded UUID to ensure that we don't read past the end
of the buffer.
Reported by: kevans
Fixes: c3188289 Preload hostuuid for early-boot use
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
| |
After length decisions, we've decided that the if_wg(4) driver and
related work is not yet ready to live in the tree. This driver has
larger security implications than many, and thus will be held to
more scrutiny than other drivers.
Please also see the related message sent to the freebsd-hackers@
and freebsd-arch@ lists by Kyle Evans <kevans@FreeBSD.org> on
2021/03/16, with the subject line "Removing WireGuard Support From Base"
for additional context.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the culmination of about a week of work from three developers to
fix a number of functional and security issues. This patch consists of
work done by the following folks:
- Jason A. Donenfeld <Jason@zx2c4.com>
- Matt Dunwoodie <ncon@noconroy.net>
- Kyle Evans <kevans@FreeBSD.org>
Notable changes include:
- Packets are now correctly staged for processing once the handshake has
completed, resulting in less packet loss in the interim.
- Various race conditions have been resolved, particularly w.r.t. socket
and packet lifetime (panics)
- Various tests have been added to assure correct functionality and
tooling conformance
- Many security issues have been addressed
- if_wg now maintains jail-friendly semantics: sockets are created in
the interface's home vnet so that it can act as the sole network
connection for a jail
- if_wg no longer fails to remove peer allowed-ips of 0.0.0.0/0
- if_wg now exports via ioctl a format that is future proof and
complete. It is additionally supported by the upstream
wireguard-tools (which we plan to merge in to base soon)
- if_wg now conforms to the WireGuard protocol and is more closely
aligned with security auditing guidelines
Note that the driver has been rebased away from using iflib. iflib
poses a number of challenges for a cloned device trying to operate in a
vnet that are non-trivial to solve and adds complexity to the
implementation for little gain.
The crypto implementation that was previously added to the tree was a
super complex integration of what previously appeared in an old out of
tree Linux module, which has been reduced to crypto.c containing simple
boring reference implementations. This is part of a near-to-mid term
goal to work with FreeBSD kernel crypto folks and take advantage of or
improve accelerated crypto already offered elsewhere.
There's additional test suite effort underway out-of-tree taking
advantage of the aforementioned jail-friendly semantics to test a number
of real-world topologies, based on netns.sh.
Also note that this is still a work in progress; work going further will
be much smaller in nature.
MFC after: 1 month (maybe)
|
|
|
|
|
|
|
|
| |
do_jail_attach() now only uses the PD_XXX flags that refer to lock
status, so make sure that something else like PD_KILL doesn't slip
through.
Add a KASSERT() in prison_deref() to catch any further PD_KILL misuse.
|
|
|
|
| |
I had locked allprison_lock without immediately setting PD_LIST_LOCKED.
|
|
|
|
|
|
|
|
|
| |
Make sure PD_KILL isn't passed to do_jail_attach, where it might end
up trying to kill the caller's prison (even prison0).
Fix the child jail loop in prison_deref_kill, which was doing the
post-order part during the pre-order part. That's not a system-
killer, but make jails not always die correctly.
|
|
|
|
| |
Reported by: arichardson
|