path: root/sys/kern
Commit message / Author / Age / Files / Lines
* msgbuf: Allow microsecond granularity timestampsWarner Losh2022-05-071-2/+10
| Today, kern.msgbuf_show_timestamp=1 gives 1-second granularity timestamps on dmesg lines. With kern.msgbuf_show_timestamp=2, timestamps are produced with microsecond granularity. For example:
|
| old (== 1):
| [13] Dual Console: Video Primary, Serial Secondary
| [14] lo0: link state changed to UP
| [15] bxe0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
| [15] bxe0: link state changed to UP
|
| new (== 2):
| [13.807015] Dual Console: Video Primary, Serial Secondary
| [14.544150] lo0: link state changed to UP
| [15.272044] bxe0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
| [15.272052] bxe0: link state changed to UP
|
| Sponsored by: Netflix
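The two timestamp formats in this commit message can be illustrated with a small userland sketch. The helper name `format_msgbuf_stamp` and its parameters are hypothetical and are not taken from the kernel source; the sketch only reproduces the prefix shapes shown in the examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel's code): render a dmesg-style
 * timestamp prefix. Mode 1 prints whole seconds, mode 2 adds
 * microseconds, any other mode yields an empty prefix.
 */
static int
format_msgbuf_stamp(char *buf, size_t len, int mode, long sec, long usec)
{
    assert(buf != NULL && len > 0);
    switch (mode) {
    case 1:
        return (snprintf(buf, len, "[%ld] ", sec));
    case 2:
        /* Zero-pad to six digits so 44150 us prints as .044150. */
        return (snprintf(buf, len, "[%ld.%06ld] ", sec, usec));
    default:
        buf[0] = '\0';
        return (0);
    }
}
```

The zero padding matters: without it, a message logged 44150 microseconds into a second would sort and read incorrectly.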
* Correctly measure system load averages > 1024Alan Somers2022-05-062-5/+6
| The old fixed-point arithmetic used for calculating load averages had an overflow at 1024. So on systems with extremely high load, the observed load average would actually fall back to 0 and shoot up again, creating a kind of sawtooth graph.
|
| Fix this by using 64-bit math internally, while still reporting the load average to userspace as a 32-bit number.
|
| Sponsored by: Axcient
| Reviewed by: imp
| Differential Revision: https://reviews.freebsd.org/D35134
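The overflow described here can be reproduced with a simplified userland model of one load-average update step. This is a sketch under stated assumptions, not the kernel's code: the names `loadav_step32` and `loadav_step64` are hypothetical, `FSHIFT` is set to 11 so that the 32-bit product wraps near a load of 1024, and the decay factor passed in the usage below is an illustrative value.

```c
#include <stdint.h>

#define FSHIFT 11               /* assumed fixed-point fraction bits */
#define FSCALE (1u << FSHIFT)   /* fixed-point representation of 1.0 */

/* One update step using 32-bit intermediates: wraps for large loads. */
static uint32_t
loadav_step32(uint32_t ldavg, uint32_t cexp, uint32_t nrun)
{
    return ((cexp * ldavg + nrun * FSCALE * (FSCALE - cexp)) >> FSHIFT);
}

/* Same step with 64-bit intermediates; the result is still 32 bits. */
static uint32_t
loadav_step64(uint32_t ldavg, uint32_t cexp, uint32_t nrun)
{
    uint64_t sum;

    sum = (uint64_t)cexp * ldavg +
        (uint64_t)nrun * FSCALE * (FSCALE - cexp);
    return ((uint32_t)(sum >> FSHIFT));
}
```

With a steady load of 2000 runnable threads and a decay factor of 1884 (about 0.92 in this fixed-point format), the 64-bit variant keeps the average at `2000 * FSCALE`, while the 32-bit variant silently wraps to a smaller value.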
* cpufreq: Remove unused devclass argument to DRIVER_MODULE.John Baldwin2022-05-061-2/+3
|
* sysvsem: Add a timeout argument to the semop.Dmitry Chagin2022-05-061-12/+38
| For future use in the Linux emulation layer for the semtimedop syscall, split the sys_semop syscall into two counterparts and add a struct timespec *timeout argument to the latter.
|
| Reviewed by: jhb, kib
| Differential revision: https://reviews.freebsd.org/D35121
| MFC after: 2 weeks
* mbuf: do not restore dying interfacesKristof Provost2022-05-051-2/+7
| When we remove an interface it is first removed from the interface list V_ifnet (by if_unlink_ifnet()) and marked as IFF_DYING. We then wait for any possible references to stop being used (i.e. epoch_wait/epoch_drain_callbacks) before we tear it fully down.
|
| However, the index in ifindex_table is not removed, so m_rcvif_restore() can still find the (now dying) interface. This results in panics, for example when dummynet restores the rcvif pointer and passes a packet to ip6_input() we can panic because the AF_INET6 domain has already been removed (so we end up dereferencing a NULL pointer there).
|
| Check that the interface is not dying before we restore it, which is equivalent to checking its presence in V_ifnet, and thus ensures that future accesses (while in NET_EPOCH) are safe.
|
| Reviewed by: glebius
| Sponsored by: Rubicon Communications, LLC ("Netgate")
| Differential Revision: https://reviews.freebsd.org/D34076
|
| (cherry picked from commit 703e533da5e2e4743d38bbf4605fec041bc69976)
* ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvifGleb Smirnoff2022-05-051-0/+22
| | | | | | | | | | | Supplement ifindex table with generation count and use it to serialize & restore an ifnet pointer. Reviewed by: kp Differential revision: https://reviews.freebsd.org/D33266 Fun note: git show e6abef09187a (cherry picked from commit e1882428dcbbafd2814d7e17b977a8f686784b39)
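The idea of pairing an interface index with a generation count, together with the dying-interface check from the follow-up commit listed above it, can be modelled in userland. Everything in this sketch is hypothetical (`fake_ifnet`, `rcvif_token`, `attach`, `detach`, `serialize`, `restore`, the table size and the flag value); it illustrates the technique and does not reproduce the kernel KPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE      8
#define FAKE_IFF_DYING  0x1     /* stand-in for the kernel's IFF_DYING */

struct fake_ifnet {
    int flags;
};

struct slot {
    struct fake_ifnet *ifp;
    uint16_t gen;               /* bumped each time the slot is reused */
};

struct rcvif_token {
    uint16_t index;
    uint16_t gen;
};

static struct slot table[TABLE_SIZE];

static void
attach(uint16_t idx, struct fake_ifnet *ifp)
{
    assert(idx < TABLE_SIZE);
    table[idx].ifp = ifp;
    table[idx].gen++;
}

static void
detach(uint16_t idx)
{
    assert(idx < TABLE_SIZE);
    table[idx].ifp = NULL;
}

/* Record the index and the generation that was current at the time. */
static struct rcvif_token
serialize(uint16_t idx)
{
    struct rcvif_token tok;

    assert(idx < TABLE_SIZE);
    tok.index = idx;
    tok.gen = table[idx].gen;
    return (tok);
}

/* Return the interface only if the token is current and it is not dying. */
static struct fake_ifnet *
restore(struct rcvif_token tok)
{
    struct slot *s;

    if (tok.index >= TABLE_SIZE)
        return (NULL);
    s = &table[tok.index];
    if (s->ifp == NULL || s->gen != tok.gen)
        return (NULL);
    if ((s->ifp->flags & FAKE_IFF_DYING) != 0)
        return (NULL);
    return (s->ifp);
}
```

A stale token is rejected either because the slot was reused (generation mismatch) or because the interface has been marked as dying but is still present in the table.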
* Revert "mbuf: do not restore dying interfaces"Marko Zec2022-05-031-27/+0
| | | | | | | | | | This reverts commit 703e533da5e2e4743d38bbf4605fec041bc69976. Revert "ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif" This reverts commit e1882428dcbbafd2814d7e17b977a8f686784b39. Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
* subr_unit.c: make userspace tests buildableKonstantin Belousov2022-04-281-0/+2
| | | | | | | by defining a placeholder for UNR_NO_MTX Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Fix another race between fork(2) and PROC_REAP_KILL subtreeKonstantin Belousov2022-04-271-14/+87
| where we might not yet see a new child when signalling a process. Ensure that this cannot happen by stopping the entire reaper subtree, which ensures that the child is not inside a syscall, in particular fork(2).
|
| Reported and tested by: pho
| Reviewed by: markj
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D35014
* Fix a race between fork(2) and PROC_REAP_KILL subtreeKonstantin Belousov2022-04-271-4/+30
| | | | | | | | | | | by repeating iteration over the subtree until there are no new processes to signal. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* kern_procctl: add possibility to take stop_all_proc_block() around execKonstantin Belousov2022-04-271-1/+22
| stop_all_proc_block() must be taken before proctree_lock.
|
| Reviewed by: markj
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D35014
* Add stop_all_proc_block(9)Konstantin Belousov2022-04-271-0/+19
| It allows more than one consumer of thread_single(SINGLE_ALLPROC) by serializing them.
|
| Reviewed by: markj
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D35014
* reap_kill(): split children and subtree killers into helpersKonstantin Belousov2022-04-271-27/+44
| | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* reap_kill(): rename the reap variable to reaperKonstantin Belousov2022-04-271-5/+5
| | | | | | | | Suggested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* reap_kill(): de-inline LIST_FOREACH(), twiceKonstantin Belousov2022-04-271-4/+3
| | | | | | | | Suggested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* reaper_abandon_children(): upgrade proctree_lock assert to exclusiveKonstantin Belousov2022-04-271-1/+1
| | | | | | | | | | | p_reapsibling linkage is protected by proctree_lock, and it is modified there. Suggested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* unr(9): allow to avoid internal lockingKonstantin Belousov2022-04-271-14/+29
| | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* init_unrhdr(): make it usable by initializing everythingKonstantin Belousov2022-04-271-0/+2
| | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D35014
* Add a __witness_used for variables only used under #ifdef WITNESS.John Baldwin2022-04-271-3/+3
| | | | | | | __diagused is now solely used for variables only used under INVARIANTS. Reviewed by: mjg Differential Revision: https://reviews.freebsd.org/D35085
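The pattern behind such annotations can be sketched as a macro that expands to an "unused" attribute only when the corresponding option is off. The names `MY_WITNESS`, `MY_UNUSED`, `my_witness_used` and `locked_op` are hypothetical stand-ins; the kernel's actual definitions are not shown in this log.

```c
/* Portable spelling of the "unused" attribute for this sketch. */
#if defined(__GNUC__)
#define MY_UNUSED __attribute__((__unused__))
#else
#define MY_UNUSED
#endif

/*
 * When the (hypothetical) MY_WITNESS option is enabled the variable is
 * really used, so no annotation is needed. Otherwise mark it unused to
 * silence set-but-not-used warnings.
 */
#ifdef MY_WITNESS
#define my_witness_used
#else
#define my_witness_used MY_UNUSED
#endif

static int
locked_op(int x)
{
    int depth my_witness_used = x * 2;

#ifdef MY_WITNESS
    /* Only debugging builds consult the variable. */
    if (depth < 0)
        return (-1);
#endif
    return (x + 1);
}
```

Keeping one macro per option means a variable annotated for one debugging facility still triggers a warning if it becomes unused under a different one.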
* sigtimedwait: Prevent timeout math overflows.Dmitry Chagin2022-04-251-29/+20
| Our kern_sigtimedwait() calculates the absolute sleep timo value as 'uptime+timeout'. So, when the user specifies a big timeout value (LONG_MAX), the calculated timo can be less than the current uptime value. In that case kern_sigtimedwait() returns EAGAIN instead of EINTR if an unblocked signal was caught.
|
| While here, switch to a high-precision sleep method.
|
| Reviewed by: mav, kib
| In collaboration with: mav
| Differential revision: https://reviews.freebsd.org/D34981
| MFC after: 2 weeks
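The wraparound in 'uptime+timeout' can be avoided in several ways; one generic option is a saturating addition, sketched below. The helper `deadline_saturating` is hypothetical and is not necessarily how the commit solves the problem, since the commit also switches to a different sleep method.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute an absolute deadline from the current
 * time and a relative timeout, clamping instead of wrapping around.
 */
static int64_t
deadline_saturating(int64_t now, int64_t timeout)
{
    assert(now >= 0 && timeout >= 0);
    /* now + timeout would exceed INT64_MAX: clamp to "never". */
    if (timeout > INT64_MAX - now)
        return (INT64_MAX);
    return (now + timeout);
}
```

The comparison is arranged so that the overflow test itself cannot overflow: `INT64_MAX - now` is always representable when `now` is non-negative.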
* Add timespecvalid_interval macro and use it.Dmitry Chagin2022-04-254-22/+10
| | | | | | Reviewed by: jhb, imp (early rev) Differential revision: https://reviews.freebsd.org/D34848 MFC after: 2 weeks
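The log does not show the macro body. As an assumption, a validity check for a timespec used as an interval would typically reject negative seconds and out-of-range nanoseconds; the function `ts_valid_interval` below is a hypothetical illustration of that idea, not the macro added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical check: an interval is valid when seconds are
 * non-negative and nanoseconds lie in [0, 1000000000).
 */
static bool
ts_valid_interval(const struct timespec *ts)
{
    assert(ts != NULL);
    return (ts->tv_sec >= 0 && ts->tv_nsec >= 0 &&
        ts->tv_nsec < 1000000000L);
}
```

Centralizing the check in one place is what lets several callers drop their own hand-written range tests, which matches the net line reduction reported for this commit.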
* KTLS: Move OCF function pointers out of ktls_session.John Baldwin2022-04-221-3/+3
| | | | | | | | | | | Instead, create a switch structure private to ktls_ocf.c and store a pointer to the switch in the ocf_session. This will permit adding an additional function pointer needed for NIC TLS RX without further bloating ktls_session. Reviewed by: hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D35011
* busdma_bounce: Batch bounce page free operations when possible.John Baldwin2022-04-211-35/+34
| | | | | Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D34968
* busdma_bounce: Add free_bounce_pages helper function.John Baldwin2022-04-211-0/+11
| | | | | | | | | Deduplicate code to iterate over the bpages list in a bus_dmamap_t freeing bounce pages during bus_dmamap_unload. Reviewed by: imp Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34967
* busdma_bounce: Make the map waiting list per-bounce-zone.John Baldwin2022-04-211-5/+8
| | | | | | | | | | | | | | | When pages are freed to a bounce zone, only maps waiting for pages for that zone can make forward progress. If a map for a different bounce zone is at the head of the global list, then requests that could otherwise make forward progress will be stalled waiting on the other bounce zone. If bounce zones shared bounce pages then a global list would still make sense to prevent "later" requests from starving an earlier request but that is not a concern with per-zone bounce page pools. Reviewed by: imp Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34966
* busdma_bounce: Use a simple kproc to invoke deferred requests.John Baldwin2022-04-211-28/+39
| | | | | | | | | Rather than using a software interrupt with a single handler, just create a dedicated kernel process woken up with a simple wakeup(). Reviewed by: imp Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34965
* Run softclock threads at a hardware ithread priority.John Baldwin2022-04-211-1/+1
| | | | | | | | | Add a new PI_SOFTCLOCK for use by softclock threads. Currently this maps to PI_AV which is the second-highest ithread priority. Reviewed by: mav, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D33693
* cpufreq_curr_sysctl: Use devclass_find to lookup cpufreq devclass.John Baldwin2022-04-211-1/+1
| | | | | Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D35002
* callout: fix using shared rmlocksKristof Provost2022-04-201-1/+1
| | | | | | | | | | | | | 15b1eb142c changed the callout code to store the CALLOUT_SHAREDLOCK flag in c_iflags (where it used to be c_flags), but failed to update the check in softclock_call_cc(). This resulted in the callout code always taking the write lock, even if a read lock had been requested (with the CALLOUT_SHAREDLOCK flag in callout_init_rm()). Reviewed by: markj MFC after: 1 week Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D34959
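The class of bug described here, testing a flag in the wrong field after it moved, can be shown with a reduced model. The structure `fake_callout`, the flag value `FAKE_SHAREDLOCK` and both helper functions are hypothetical; only the field names c_flags and c_iflags come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_SHAREDLOCK 0x0020  /* illustrative flag value */

struct fake_callout {
    int c_flags;    /* where the flag used to be stored */
    int c_iflags;   /* where the flag is stored now */
};

/* Stale check: looks in the old field, so it never sees the flag. */
static bool
wants_read_lock_buggy(const struct fake_callout *c)
{
    assert(c != NULL);
    return ((c->c_flags & FAKE_SHAREDLOCK) != 0);
}

/* Updated check: looks in the field where the flag is actually set. */
static bool
wants_read_lock_fixed(const struct fake_callout *c)
{
    assert(c != NULL);
    return ((c->c_iflags & FAKE_SHAREDLOCK) != 0);
}
```

Because the stale check always evaluates to false, the code falls through to the exclusive path, which is functionally safe but defeats the purpose of requesting a shared lock.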
* devclass_add_driver: Permit NULL to be passed in dcp.John Baldwin2022-04-191-1/+4
| | | | | | | | | This permits a driver module structure that doesn't want to store a pointer to the new driver's devclass. Reviewed by: imp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D34962
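Accepting NULL for an optional output pointer is a common C idiom, sketched here with hypothetical names (`fake_devclass`, `add_driver`); the real devclass_add_driver() signature and body are not shown in this log.

```c
#include <stddef.h>

struct fake_devclass {
    const char *name;
};

static struct fake_devclass example_class = { "example" };

/*
 * Hypothetical registration routine: the caller may pass NULL when it
 * has no use for a pointer to the resulting class.
 */
static int
add_driver(struct fake_devclass **dcp)
{
    struct fake_devclass *dc = &example_class;

    /* Only store the result if the caller asked for it. */
    if (dcp != NULL)
        *dcp = dc;
    return (0);
}
```

Callers that do not need the class pointer no longer have to declare a dummy variable just to satisfy the interface.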
* signals: plug a set-but-not-used varMateusz Guzik2022-04-191-3/+2
| | | | Sponsored by: Rubicon Communications, LLC ("Netgate")
* destroy_dev_sched*: Don't hold Giant for all deferred destroy_dev.John Baldwin2022-04-181-6/+28
| Rather than using taskqueue_swi_giant which holds Giant for all deferred destroy_dev calls, create a separate queue for destroyed devices with D_NEEDGIANT set in the corresponding cdevsw. The task for this queue holds Giant while destroying deferred devices while the task for the default queue does not hold Giant.
|
| In addition, switch to taskqueue_thread for destroy_dev_sched. Deferred destroy_dev requests don't need to run at an SWI priority.
|
| Reviewed by: imp, markj
| MFC after: 2 weeks
| Differential Revision: https://reviews.freebsd.org/D34915
* Revert rest of a5970a529c2d95271: use vrefact() when working on fp->f_vnodeKonstantin Belousov2022-04-152-4/+4
| Now that O_PATH-opened file descriptors take use references instead of hold references, the vrefact() changes from that revision can be reverted.
|
| Reviewed by: markj
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D34906
* sysent: regen after 52a1d90c8bfe, posix_fadvise in capmodeEd Maste2022-04-141-1/+1
|
* Allow posix_fadvise in capability modeEd Maste2022-04-141-1/+1
| | | | | | | | | | | | | | posix_fadvise operates only on a provided fd. Noted by Mathieu <sigsys@gmail.com> in review D34761. No new CAP_ rights are added for posix_fadvise(), as 'advice' in general only influences when I/O happens; the fd must have existing CAP_ rights for actual data access. Reviewed by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34903
* Mostly revert a5970a529c2d95271: Make files opened with O_PATH to not block ↵Konstantin Belousov2022-04-132-3/+1
| | | | | | | | | | | | non-forced unmount Problem is that open(O_PATH) on nullfs -o nocache is broken then, because there is no reference on the vnode after the open syscall exits. Reported and tested by: ambrisko Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week
* kern: Move variables only used for MAC under #ifdef MAC.John Baldwin2022-04-132-2/+7
|
* sched_ule: Inline value of ts in sched_thread_priority.John Baldwin2022-04-131-3/+1
| | | | | This avoids a set but unused warning in kernels without SMP where TDQ_CPU() doesn't use its argument.
* sched_4bsd: ts is only used in sched_bind for SMP.John Baldwin2022-04-131-3/+3
|
* sched_4bsd: Remove unused variables.John Baldwin2022-04-121-4/+0
|
* realloc(9): Move slab and zone under #ifndef DEBUG_REDZONE.John Baldwin2022-04-121-2/+2
|
* tty: Remove an incorrect assertion from ttyinq_line_iterate()Mark Johnston2022-04-121-1/+0
| | | | | | | | | | We may legitimately have tib == NULL if we're at the very end of the queue. PR: 215373 Reported by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation
* kdb: set kdb_why when entered via reboot and panicTom Jones2022-04-121-1/+3
| | | | | | | | Reviewed by: jhb Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. X-NetApp-PR: #74 Differential Revision: https://reviews.freebsd.org/D34551
* getdirentries: return ENOENT for unlinked but still open directory.Dmitry Chagin2022-04-112-0/+5
| To be more compatible with IEEE Std 1003.1-2008 (“POSIX.1”).
|
| Reviewed by: mjg, Pau Amma (doc)
| Differential revision: https://reviews.freebsd.org/D34680
| MFC after: 2 weeks
* Add sysctl KERN_LOCKFKonstantin Belousov2022-04-092-1/+151
| reporting the snapshot of the active advisory locks.
|
| A new VFS ops method vfs_report_lockf is provided in the mount point op table. If it is NULL, as it is currently for all existing filesystems, the vfs_report_lockf() function is used, which gathers information from the standard implementation inside kern/kern_lockf.c.
|
| Filesystems implementing their own locking (NFSv4 as an example) can provide a custom implementation.
|
| Reviewed by: markj, rmacklem
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D34756
* kern_lockf.c: remove no longer needed UFS headersKonstantin Belousov2022-04-091-5/+0
| | | | | | | Reviewed by: markj, rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34756
* lockf: remove lf_inode from struct lockf_entryKonstantin Belousov2022-04-091-17/+3
| The UFS-specific struct inode cannot be used in generic advisory lock code. It was probably used as a shortcut for debugging, as the remnants of the code around it indicate. Use the somewhat more verbose and less concentrated, but universal, VOP_PRINT() where needed.
|
| Reviewed by: markj, rmacklem
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential revision: https://reviews.freebsd.org/D34756
* jail: Remove a double word in a source code commentGordon Bergling2022-04-091-1/+1
| | | | | | - s/a a/a/ MFC after: 3 days
* kern: Remove a double word in a source code commentGordon Bergling2022-04-091-1/+1
| | | | | | - s/for for/for/ MFC after: 3 days
* kern: Fix a typo in a source code commentGordon Bergling2022-04-092-2/+2
| | | | | | - s/is is/is/ MFC after: 3 days