Commit message | Author | Age | Files | Lines
* amd64: Remove proc0_tf, the bootstrap trapframe | Mark Johnston | 2021-09-25 | 1 | -2/+0
  It no longer serves any purpose as thread0's td_frame field is now initialized during fpuinitstate(). No functional change intended.
  Reviewed by: kib
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D32057
* amd64: Avoid copying td_frame from kernel procs | Mark Johnston | 2021-09-25 | 1 | -30/+18
  When creating a new thread, we unconditionally copy td_frame from the creating thread. For threads which never return to user mode, this is unnecessary since td_frame just points to the base of the stack or a random interrupt frame. If KASAN is configured this copying may also trigger false positives since the td_frame region may contain poisoned stack regions.
  It was not noticed before since thread0 used a dummy proc0_tf trapframe, and kernel procs are generally created by thread0. Since commit df8dd6025af88a99d34f549fa9591a9b8f9b75b1, though, we call cpu_thread_alloc(&thread0) when initializing FPU state, which reinitializes thread0.td_frame.
  Work around the problem by not copying the frame unless the copying thread came from user mode. While here, de-duplicate the copying and remove redundant (re)initialization of td_frame.
  Reported by: syzbot+2ec89312bffbf38d9aec@syzkaller.appspotmail.com
  Reviewed by: kib
  Fixes: df8dd6025af8
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D32057
* cam: Avoid waking up doneq threads if we're dumping | Mark Johnston | 2021-09-25 | 1 | -1/+1
  Depending on the state of the target doneq thread at the time of the panic, the wakeup can hang indefinitely in thread_lock_block_wait(). That function should likely be modified to return immediately if the scheduler is stopped, but it is also preferable to avoid wakeups in general after a panic.
  Reported by: pho
  Reviewed by: mav, imp
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D32126
* x86 bounce_bus_dmamem_alloc(): use malloc_aligned() only when possible | Konstantin Belousov | 2021-09-25 | 1 | -0/+2
  malloc_domainset_aligned() requires that alignment is less than page size. Fall back to other allocation methods, most likely kmem_alloc_contig(), when malloc_aligned() cannot fulfill the driver request.
  Reported by: Loic F <loic.f@hardenedbsd.org>
  Reviewed by: markj
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D32127
* malloc_aligned(9): allow zero size and alignment | Konstantin Belousov | 2021-09-25 | 1 | -1/+3
  For alignment we do not need to do anything to make it operational. For size, upgrade a zero-sized request to one byte so that we do not request an insane amount of memory for a placeholder.
  Reviewed by: markj
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D32127
* net80211(4): Fix a few common typos in source code comments | Gordon Bergling | 2021-09-25 | 3 | -3/+3
  - s/annoucement/announcement/
  - s/setings/settings/
  MFC after: 1 week
* ubsan: Fix a typo in an error message | Gordon Bergling | 2021-09-25 | 1 | -1/+1
  - s/asumption/assumption/
  Obtained from: NetBSD
  MFC after: 1 week
* hostname: avoid strcpy() overlap in -d flag handling | Kyle Evans | 2021-09-25 | 1 | -3/+4
  We don't need the strcpy() anyways, just use a pointer to the hostname buffer and move it forward for `hostname -d`.
  Sponsored by: Klara, Inc.
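The pointer-advance idea can be sketched as below. This is a minimal illustration, not the code from hostname(1): the helper name `domain_part` is hypothetical, and returning an empty string when the name contains no dot is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the text after the first '.'
 * in `hostname`, or to the terminating NUL if there is no dot.
 * Nothing is copied, so source and destination can never overlap.
 */
static const char *
domain_part(const char *hostname)
{
	const char *dot;

	assert(hostname != NULL);
	dot = strchr(hostname, '.');
	if (dot == NULL)
		return (hostname + strlen(hostname));	/* assumed: empty result */
	return (dot + 1);
}
```

Because the result points into the caller's buffer, it stays valid only as long as that buffer does.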
* [fib_algo][dxr] Split unused range chunk list in multiple buckets | Marko Zec | 2021-09-25 | 1 | -15/+32
  Traversing a single list of unused range chunks in search for a block of optimal size was suboptimal.
  The experience with real-world BGP workloads has shown that on average unused range chunks are tiny, mostly in length from 1 to 4 or 5, when DXR is configured with K = 20 which is the current default (D16X4R).
  Therefore, introduce a limited number of buckets to accommodate descriptors of empty blocks of fixed (small) size, so that those can be found in O(1) time. If no empty chunks of the requested size can be found in fixed-size buckets, the search continues in an unsorted list of empty chunks of variable lengths, which should only happen infrequently.
  This change should permit us to manage significantly more empty range chunks without sacrificing the speed of incremental range table updating.
  MFC after: 3 days
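The bucketing scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the DXR code: all names (`chunk_pool`, `pool_put`, `pool_get`, `SMALL_MAX`) are hypothetical, and the slow path uses a simple first-fit scan where the real implementation may choose differently.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_MAX 4	/* hypothetical: largest size with its own bucket */

struct chunk {
	struct chunk *next;
	int base;	/* start index of the unused range */
	int size;	/* length of the unused range */
};

struct chunk_pool {
	struct chunk *bucket[SMALL_MAX + 1];	/* bucket[n] holds chunks of size n */
	struct chunk *large;			/* unsorted list, variable sizes */
};

/* File a free chunk under its size bucket, or on the variable-size list. */
static void
pool_put(struct chunk_pool *p, struct chunk *c)
{
	struct chunk **head;

	assert(c->size > 0);
	head = (c->size <= SMALL_MAX) ? &p->bucket[c->size] : &p->large;
	c->next = *head;
	*head = c;
}

/* Fast path: O(1) pop from the matching bucket. Slow path: first-fit scan. */
static struct chunk *
pool_get(struct chunk_pool *p, int size)
{
	struct chunk **pp, *c;

	if (size > 0 && size <= SMALL_MAX && p->bucket[size] != NULL) {
		c = p->bucket[size];
		p->bucket[size] = c->next;
		return (c);
	}
	for (pp = &p->large; (c = *pp) != NULL; pp = &c->next) {
		if (c->size >= size) {
			*pp = c->next;
			return (c);
		}
	}
	return (NULL);
}
```

With mostly tiny chunks, nearly every request is served by the fast path, and the linear scan runs only for the rare sizes that have no bucket or whose bucket is empty.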
* Make CPU children explicitly share parent unit numbers. | Alexander Motin | 2021-09-25 | 12 | -16/+22
  Before this, the device unit number match was coincidental and broke if I disabled some CPU device(s). Aside from cosmetics, for some drivers (which may be considered broken) it caused talking to the wrong CPUs.
* loader printf: Profile with TSLOG | Colin Percival | 2021-09-25 | 1 | -1/+4
  Now that the loader tslog code doesn't call printf, we can profile printf using TSLOG. On an EC2 c5.xlarge instance, we spend roughly 45 ms here (out of roughly 500 ms), presumably due to the time spent writing output to the console.
  MFC after: 1 week
  Sponsored by: https://www.patreon.com/cperciva
* loader tslog: Don't use sprintf | Colin Percival | 2021-09-25 | 1 | -7/+43
  Instead, append the log entry "manually".
  MFC after: 1 week
  Sponsored by: https://www.patreon.com/cperciva
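Appending "manually" could look roughly like the sketch below, which builds a line from a string and a decimal number without any printf-family call. The helper names and the sample record are hypothetical and do not reflect the actual tslog record format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Append the NUL-terminated string `s` to `buf` at offset `pos`.
 * Truncates silently when the buffer is full; returns the new offset.
 */
static size_t
append_str(char *buf, size_t cap, size_t pos, const char *s)
{
	assert(pos < cap);
	while (*s != '\0' && pos + 1 < cap)
		buf[pos++] = *s++;
	buf[pos] = '\0';
	return (pos);
}

/* Append `v` in decimal by generating digits from the least significant end. */
static size_t
append_u64(char *buf, size_t cap, size_t pos, uint64_t v)
{
	char tmp[21];	/* 20 digits for 2^64 - 1, plus the NUL */
	size_t i = sizeof(tmp) - 1;

	tmp[i] = '\0';
	do {
		tmp[--i] = (char)('0' + v % 10);
		v /= 10;
	} while (v != 0);
	return (append_str(buf, cap, pos, &tmp[i]));
}
```

A caller would chain the helpers, for example `n = append_str(line, sizeof(line), 0, "ENTER "); n = append_u64(line, sizeof(line), n, 12345);`, which leaves `ENTER 12345` in `line`.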
* makesyscalls: sprinkle some assert() on standard function calls | Kyle Evans | 2021-09-25 | 1 | -30/+34
  Improves our error reporting, ensuring that we aren't just ignoring errors in the common case.
  Note specifically the boundary where we have to change up our error handling approach. It's fine to error() out up until we create the tempdir, then the rest should try to handle it gracefully and abort(). A future change will clean this up further by pcall'ing all of the bits that cannot currently error() without cleaning up.
* makesyscalls: rip out arbitrary command execution | Kyle Evans | 2021-09-25 | 1 | -33/+2
  This was previously needed only for CloudABI, which used it to generate its capenabled from syscalls.master. CloudABI was removed in cf0ee8738e31, so we don't need to support this anymore.
  Others looking to do similar things should come up with a more integrated technique, such as a .conf flag or pattern/glob support. brooks suggests that it could be done in modern makesyscalls.lua by adding a config flag to specify always-on/initial flags (CAPENABLED).
  Reviewed by: brooks, imp
  MFC after: never
  Differential Revision: https://reviews.freebsd.org/D32095
* makesyscalls: stop trying to remove . and .. in cleanup | Kyle Evans | 2021-09-25 | 1 | -1/+3
  lfs.dir() will include these entries, but os.remove() cannot remove them for obvious reasons.
* acpi_cpu: Make device unit numbers match OS CPU IDs. | Alexander Motin | 2021-09-25 | 2 | -68/+22
  There are already APIC ID, ACPI ID and OS ID for each CPU. In a perfect world all of those may match, but at least for SuperMicro server boards none of them do. Plus none of them match the CPU devices listing order by ACPI.
  Previous code used the ACPI device listing order to number cpuX devices. It looked nice from the NewBus perspective, but introduced a fourth, different set of IDs. It was an extremely confusing one, since in some places the device unit numbers were treated as OS CPU IDs (coretemp), but not in others (sysctl dev.cpu.X.%location).
* e1000: Rename 'struct adapter' to 'struct e1000_sc' | Kevin Bowling | 2021-09-25 | 4 | -997/+987
  Rename the 'struct adapter' to 'struct e1000_sc' to avoid type ambiguity in things like kgdb.
  Reviewed by: jhb, markj
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D32129
* bus: Cleanup device_probe_child() | Alexander Motin | 2021-09-25 | 1 | -21/+29
  When a device driver's probe method returns 0, i.e. absolute priority, do not remove its class from the device just to set it back a few lines later; doing so may change the device unit number, etc., after which we'd better call the probe again.
  If during the search we found some driver with absolute priority, we do not need to set the device driver and class since we haven't removed them before.
  It should not happen, but if the second probe method call failed, remove the driver and possibly the class from the device as it was when we started.
  Reviewed by: imp, jhb
  Differential Revision: https://reviews.freebsd.org/D32125
* mount: add libxo(3) support | Cameron Katri | 2021-09-24 | 3 | -55/+114
  Adds --libxo to mount(8).
  Differential Revision: https://reviews.freebsd.org/D30341
* bus: Fix LINT / BUS_DEBUG build | Warner Losh | 2021-09-24 | 1 | -1/+1
  Fix 0389e9be63c5e for LINT build. Removed an arg only from code under BUS_DEBUG w/o rebuilding LINT...
  Sponsored by: Netflix
  Fixes: 0389e9be63c5e24ecedbb366c5682ddc2ff4de60
* ps: fix `ps -aa` | Math Ieu | 2021-09-24 | 1 | -6/+1
  Passing the -a flag multiple times made ps show no processes.
  Differential Revision: https://reviews.freebsd.org/D27215
* opencrypto: Disallow requests which pass VERIFY_DIGEST without a MAC | Mark Johnston | 2021-09-24 | 1 | -1/+1
  Otherwise we can end up comparing the computed digest with an uninitialized kernel buffer.
  In cryptoaead_op() we already unconditionally fail the request if a pointer to a digest buffer is not specified.
  Based on a patch by Simran Kathpalia.
  Reported by: syzkaller
  Reviewed by: jhb
  MFC after: 1 week
  Pull Request: https://github.com/freebsd/freebsd-src/pull/529
  Differential Revision: https://reviews.freebsd.org/D32124
* loader: dev_net.c should use __func__ with printf | Toomas Soome | 2021-09-24 | 1 | -14/+16
  We have printf calls with the function name hardwired into the string, sometimes with the wrong name. Use __func__ instead.
  MFC after: 1 week
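A minimal sketch of the pattern follows. The function name and message are hypothetical, and `snprintf` stands in for the loader's `printf` only so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical function, for illustration only. */
static int
net_open_sketch(char *msg, size_t len)
{
	assert(msg != NULL);
	/*
	 * __func__ (C99) names the enclosing function, so the prefix
	 * stays correct even if the function is renamed or the line is
	 * copied elsewhere.
	 */
	return (snprintf(msg, len, "%s: missing device", __func__));
}
```

A hardwired prefix such as `"net_open: ..."` silently goes stale after a rename or a copy-paste, which is the failure mode the commit message describes.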
* ipfilter: Locking sysctls here is not required | Cy Schubert | 2021-09-24 | 1 | -2/+0
  Data structures touched by sysctls are locked at a finer granularity in ipfilter, therefore higher level locks are redundant.
  MFC after: 3 days
* ipfilter: Avoid null if-then-else blocks | Cy Schubert | 2021-09-24 | 2 | -12/+8
  When WITHOUT_INET6 is selected we generate null if-then-else blocks due to incorrect placement of #if statements. Move the #if statements, reducing unnecessary runtime comparisons WITHOUT_INET6.
  MFC after: 1 week
* cxgbe: Mark received packets as initialized for KMSAN | Mark Johnston | 2021-09-24 | 1 | -0/+2
  The KMSAN runtime needs to have its shadow maps updated when devices update host memory, otherwise it assumes that device-populated memory is uninitialized. For most drivers this is handled transparently by busdma, but cxgbe doesn't make use of dma maps for receive buffers and so requires special treatment.
  Reported by: mjg
  Tested by: mjg
  Reviewed by: np
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D32102
* read builtin: Empty variables on timeout | Bryan Drewery | 2021-09-24 | 4 | -0/+29
  This matches how a non-timeout error is handled.
  Reviewed by: jilles
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D31876
* bus: retire DF_REBID | Warner Losh | 2021-09-24 | 2 | -36/+6
  I did DF_REBID to allow for 'hoover' drivers that would attach to otherwise unattached devices in the tree. This notion didn't catch on as it was tricky to make work well and it was easier to just publish a /dev node of some flavor by the parent device. It's been nothing but dead weight for a long time.
  Reviewed by: mav
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D32056
* pidfile test: guarantee nul termination of the read pid string | Konstantin Belousov | 2021-09-24 | 1 | -1/+2
  PR: 258701
  Based on the submission by: sigsys@gmail.com
  MFC after: 1 week
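The usual way to guarantee termination is to read at most one byte less than the buffer size and write the NUL explicitly. The sketch below shows that general pattern; the helper name `read_cstring` is hypothetical and this is not the code from the pidfile test.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read at most cap - 1 bytes from fd into buf and always NUL-terminate.
 * Returns the number of bytes read, or -1 on error (buf is then empty).
 */
static ssize_t
read_cstring(int fd, char *buf, size_t cap)
{
	ssize_t n;

	assert(cap > 0);
	n = read(fd, buf, cap - 1);	/* reserve one byte for the NUL */
	if (n < 0) {
		buf[0] = '\0';
		return (-1);
	}
	buf[n] = '\0';
	return (n);
}
```

After this call the buffer is safe to hand to `strtol()`, `strcmp()` and other functions that expect a C string, regardless of what the file contained.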
* tests/sys/sys: Raise WARNS | Mark Johnston | 2021-09-24 | 4 | -5/+3
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
* UPDATING: new entry about dummynet | Kristof Provost | 2021-09-24 | 1 | -1/+7
  Dummynet no longer requires ipfw, so any users relying on this dependency to load ipfw will need to explicitly load ipfw.
  While here fix a typo in the date of the previous entry.
  Sponsored by: Rubicon Communications, LLC ("Netgate")
* cxgbe: fix LINT-NOIP builds | Kristof Provost | 2021-09-24 | 1 | -6/+0
  The -NOIP builds fail because cxgbe_tls_tag_free() has no prototype (if neither INET nor INET6 are defined). The function isn't actually used in that case, so we can just remove the stub implementation.
  Sponsored by: Rubicon Communications, LLC ("Netgate")
* pf.conf.5: document dummynet support | Kristof Provost | 2021-09-24 | 1 | -4/+35
  MFC after: 2 weeks
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31907
* man dummynet: dummynet can also be used with pf | Kristof Provost | 2021-09-24 | 1 | -2/+4
  MFC after: 2 weeks
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31906
* netpfil tests: extend dummynet tests to pf | Kristof Provost | 2021-09-24 | 2 | -12/+29
  Now that pf can also use dummynet we should extend the existing dummynet tests to also test it when used with pf.
  Reviewed by: donner
  MFC after: 2 weeks
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31905
* pf: support dummynet | Kristof Provost | 2021-09-24 | 8 | -4/+256
  Allow pf to use dummynet pipes and queues.
  We re-use the currently unused IPFW_IS_DUMMYNET flag to allow dummynet to tell us that a packet is being re-injected after being delayed. This is needed to avoid endlessly looping the packet between pf and dummynet.
  MFC after: 2 weeks
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31904
* dummynet: Does not depend on ipfw | Kristof Provost | 2021-09-24 | 1 | -1/+0
  Allow the dummynet module to be loaded without ipfw, as a first step towards making pf use it for packet scheduling.
  Reviewed by: donner
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31903
* man dummynet: point to dnctl instead of ipfw | Kristof Provost | 2021-09-24 | 1 | -3/+4
  Dummynet configuration is ideally done through dnctl now. While ipfw still works, dnctl is preferred now that dummynet can also be used with pf.
  MFC after: 2 weeks
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D31902
* ipsec: Add support for PMTUD for IPv6 tunnels | Bartlomiej Grzesik | 2021-09-24 | 2 | -34/+133
  Discard and send ICMPv6 Packet Too Big to the sender when we try to encapsulate and forward a packet whose total length exceeds the PMTU. Logic is based on the IPv4 implementation. Common code was moved to a separate function.
  Differential revision: https://reviews.freebsd.org/D31771
  Obtained from: Semihalf
  Sponsored by: Stormshield
* ipsec: If no PMTU in hostcache assume it's equal to link's MTU | Bartlomiej Grzesik | 2021-09-24 | 1 | -4/+18
  If we fail to find a PMTU in hostcache, we assume it's equal to the link's MTU.
  This patch prevents packets larger than the link's MTU from being dropped silently if there is no PMTU in hostcache.
  Differential revision: https://reviews.freebsd.org/D31770
  Obtained from: Semihalf
  Sponsored by: Stormshield
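The fallback rule can be sketched as below. All names are hypothetical, a cached value of 0 is assumed to mean "no hostcache entry", and IPsec encapsulation overhead is deliberately ignored, so this shows only the decision, not the kernel logic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: cached_pmtu == 0 means the hostcache has no entry. */
static uint32_t
effective_pmtu(uint32_t cached_pmtu, uint32_t link_mtu)
{
	assert(link_mtu > 0);
	return (cached_pmtu != 0 ? cached_pmtu : link_mtu);
}

/*
 * Nonzero when the packet exceeds the effective PMTU, i.e. when the
 * sender should be told "too big" instead of the packet being dropped
 * silently.
 */
static int
needs_too_big(uint32_t pkt_len, uint32_t cached_pmtu, uint32_t link_mtu)
{
	return (pkt_len > effective_pmtu(cached_pmtu, link_mtu));
}
```

Without the fallback, a missing hostcache entry leaves no limit to compare against, which is how oversized packets could previously disappear without any notification.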
* ipsec: Add PMTUD support for IPsec IPv4 over IPv6 tunnel | Bartlomiej Grzesik | 2021-09-24 | 1 | -6/+12
  Add support for checking PMTU for IPv4 packets encapsulated in IPv6 tunnels.
  Differential revision: https://reviews.freebsd.org/D31769
  Sponsored by: Stormshield
  Obtained from: Semihalf
* unionfs: lock newly-created vnodes before calling insmntque() | Jason A. Harmening | 2021-09-24 | 1 | -18/+68
  This fixes an insta-panic when attempting to use unionfs with DEBUG_VFS_LOCKS. Note that unionfs still has a long way to go before it's generally stable or usable.
  Reviewed by: kib (prior version), markj
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D31917
* kqueue: Add EV_KEEPUDATA flag | Nathaniel Wesley Filardo | 2021-09-24 | 4 | -2/+86
  When this flag is set, operations that update an existing kevent will not change the udata field. This can be used to NOTE_TRIGGER or EV_{EN,DIS}ABLE events without overwriting the stashed pointer.
  Reviewed by: Domagoj Stolfa <domagoj.stolfa@gmail.com>
  Obtained from: CheriBSD
  Sponsored by: Microsoft
  Differential Revision: https://reviews.freebsd.org/D30286
* libsysdecode: Permit _ in VM_PROT_(.*) names. | Nathaniel Wesley Filardo | 2021-09-24 | 1 | -1/+1
  CheriBSD defines additional protection flags which use underscores such as VM_PROT_READ_CAP and VM_PROT_WRITE_CAP.
  Obtained from: CheriBSD
  Sponsored by: Microsoft
  Differential Revision: https://reviews.freebsd.org/D30017
* aio_aqueue(): avoid ucred leak on failure path | Konstantin Belousov | 2021-09-24 | 1 | -1/+3
  PR: 258698
  Submitted by: sigsys@gmail.com
  MFC after: 1 week
* nvme: Use shared timeout rather than timeout per transaction | Warner Losh | 2021-09-23 | 3 | -66/+131
  Keep track of the approximate time commands are 'due' and the next deadline for a command. Twice a second, wake up to see if any commands have entered timeout. If so, quiesce and then enter a recovery mode half the timeout further in the future to allow the ISR to complete. Once we exit recovery mode, we go back to operations as normal.
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D28583
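The core idea, recording a deadline per command and scanning from one periodic check instead of arming a timer per command, can be sketched as follows. The structure and function names are hypothetical, time is an abstract tick count, and the quiesce and recovery-mode steps from the commit message are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NTRACKERS 4	/* hypothetical queue depth */

struct tracker {
	int      active;	/* nonzero while a command is outstanding */
	uint64_t deadline;	/* tick at which the command counts as late */
};

struct queue {
	struct tracker tr[NTRACKERS];
};

/* Submitting only records the deadline; no per-command timer is armed. */
static void
submit(struct queue *q, int slot, uint64_t now, uint64_t timeout)
{
	assert(slot >= 0 && slot < NTRACKERS);
	q->tr[slot].active = 1;
	q->tr[slot].deadline = now + timeout;
}

/* Completion simply clears the slot; there is no timer to cancel. */
static void
complete(struct queue *q, int slot)
{
	assert(slot >= 0 && slot < NTRACKERS);
	q->tr[slot].active = 0;
}

/* Called from the single periodic check; returns how many commands are late. */
static int
check_timeouts(const struct queue *q, uint64_t now)
{
	int i, expired = 0;

	for (i = 0; i < NTRACKERS; i++)
		if (q->tr[i].active && now >= q->tr[i].deadline)
			expired++;
	return (expired);
}
```

The trade-off is coarser detection, since a late command is noticed only at the next periodic check, in exchange for a submit and complete path that no longer arms or cancels a timer per command.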
* pf: fix pagefault in pf_getstatus() | Kristof Provost | 2021-09-23 | 1 | -0/+3
  We can't copyout() while holding a lock, in case it triggers a page fault. Release the lock before copyout, which is safe because we've already copied all the data into the nvlist.
  PR: 258601
  Reviewed by: mjg
  MFC after: 1 week
  Sponsored by: Modirum MDPay
  Differential Revision: https://reviews.freebsd.org/D32076
* e1000: fix K1 configuration | Wenzhuo Lu | 2021-09-23 | 3 | -1/+52
  This patch is for the following updates to the K1 configurations:
  - Tx idle period for entering K1 should be 128 ns.
  - Minimum Tx idle period in K1 should be 256 ns.
  Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
  PR: 258153
  Reviewed by: erj
  Tested by: iron.udjin@gmail.com
  Approved by: imp
  Obtained from: DPDK (6f934fa24dfd437c90ead96bc7598ee77a117ede)
  MFC after: 1 week
* man: reset OPTIND before parsing args | Kyle Evans | 2021-09-23 | 1 | -0/+4
  From jilles: POSIX requires that a script set `OPTIND=1` before using different sets of parameters with `getopts`, or the results will be unspecified.
  The specific problem observed here is that we would execute `man -f` or `man -k` without cleaning up state from man_parse_args()'s `getopts` loop. FreeBSD's /bin/sh seems to reset OPTIND to 1 after we hit the second getopts loop, rendering the following shift harmless; other /bin/sh implementations will leave it at what we came into the loop at (e.g., bash as /bin/sh), shifting off any keywords that we had.
  Input from: jilles
  Reviewed by: allanjude, bapt, imp
  Sponsored by: Klara, Inc.
  Differential Revision: https://reviews.freebsd.org/D32063
* x86: Add NUMA nodes into CPU topology. | Alexander Motin | 2021-09-23 | 3 | -13/+72
  Depending on hardware, NUMA nodes may match last level caches, or they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC). This information is provided by ACPI instead of CPUID, and it is provided for each CPU individually instead of mask widths, but this code should be able to properly handle all the above cases.
  This change should immediately allow idle stealing in sched_ule(4) to prefer load from NUMA-local CPUs to remote ones when the node does not match LLC. Later we may think of how to better handle it on sched_pickcpu() side.
  MFC after: 1 month