path: root/sys/dev/hyperv
Commit message | Author | Age | Files | Lines
* Introduce flag IFF_NEEDSEPOCH that marks Ethernet interfaces that | Gleb Smirnoff | 2020-01-23 | 1 | -1/+2
| | | | | | | | | | | supposedly may call into ether_input() without network epoch. They all need to be reviewed before 13.0-RELEASE. Some may need to be fixed. The flag is not planned to be used in the kernel for a long time. Notes: svn path=/head/; revision=357010
* storvsc: port a Linux patch, properly set residual data length on errors | Andriy Gapon | 2020-01-14 | 2 | -2/+7
| | | | | | | | | | | | | | | | | | This change is based on Linux commit 40630f462824ee. csio.resid should account for transfer_len only for success and SRB_STATUS_DATA_OVERRUN condition. I am not sure how exactly this change works, but I have a report from a user that they see lots of checksum errors when running a pool scrub concurrently with iozone -l 1 -s 100G. After applying this patch the problem cannot be reproduced. Reviewed by: nobody Sponsored by: CyberSecure Differential Revision: https://reviews.freebsd.org/D22312 Notes: svn path=/head/; revision=356730
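The rule stated above (the residual accounts for transfer_len only on success or data overrun) can be sketched in userland C roughly as follows. This is a minimal sketch, not the driver code: the enum values, the helper name compute_resid(), and the choice to report the whole request as residual on any other status are assumptions made for illustration.

```c
#include <stdint.h>

/* Illustrative status codes; the real SRB_STATUS_* values belong to the driver. */
enum srb_status_sketch {
	SKETCH_SRB_SUCCESS,
	SKETCH_SRB_DATA_OVERRUN,
	SKETCH_SRB_ERROR
};

/*
 * Hypothetical helper: residual for a request of dxfer_len bytes when the
 * host reports that transfer_len bytes were moved.
 */
static uint32_t
compute_resid(enum srb_status_sketch status, uint32_t dxfer_len,
    uint32_t transfer_len)
{
	if (status == SKETCH_SRB_SUCCESS ||
	    status == SKETCH_SRB_DATA_OVERRUN) {
		/* Clamp so an oversized transfer_len cannot underflow. */
		if (transfer_len > dxfer_len)
			return (0);
		return (dxfer_len - transfer_len);
	}
	/* Assumption: on any other status, treat nothing as transferred. */
	return (dxfer_len);
}
```

With this shape, a failed request never reports a partially valid buffer to the upper layer, which would be consistent with the checksum errors going away, although the commit message itself does not confirm the mechanism.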
* Fix spelling. | Hans Petter Selasky | 2019-12-30 | 1 | -1/+1
| | | | | | | | | PR: 242891 MFC after: 1 week Sponsored by: Mellanox Technologies Notes: svn path=/head/; revision=356201
* Revert r355806: kbd drivers: don't double register keyboard drivers | Kyle Evans | 2019-12-26 | 1 | -3/+0
| | | | | | | | | | | r356087 made it rather innocuous to double-register built-in keyboard drivers; we now set a flag to indicate that it's been registered and only act once on a registration anyways. There is no misleading here, as the follow-up kbd_delete_driver will actually remove the driver as needed now that the linker set isn't also consulted after kbdinit. Notes: svn path=/head/; revision=356091
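The "set a flag and only act once" behaviour described above might look roughly like the following userland sketch. The structure, the counter, and both function names are hypothetical stand-ins, not the kbd KPI.

```c
#include <stdbool.h>

/* Hypothetical driver record; the real keyboard driver structure differs. */
struct kbd_driver_sketch {
	const char *name;
	bool registered;	/* set once the driver is on the list */
};

static int kbd_driver_count;	/* number of distinct registered drivers */

/* Registering twice is harmless: the second call is a no-op. */
static void
kbd_add_driver_sketch(struct kbd_driver_sketch *drv)
{
	if (drv->registered)
		return;
	drv->registered = true;
	kbd_driver_count++;
}

/* Removal clears the flag, so a later registration works again. */
static void
kbd_delete_driver_sketch(struct kbd_driver_sketch *drv)
{
	if (!drv->registered)
		return;
	drv->registered = false;
	kbd_driver_count--;
}
```

Because the flag makes registration idempotent, registering both via the linker set and at MOD_LOAD time no longer needs a guard in each driver, which is why the earlier change could be reverted.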
* kbd drivers: don't double register keyboard drivers | Kyle Evans | 2019-12-16 | 1 | -0/+3
| | | | | | | | | | | | | | | | | | Keyboard drivers are generally registered via linker set. In these cases, they're also available as kmods which use KPI for registering/unregistering keyboard drivers outside of the linker set. For built-in modules, we still fire off MOD_LOAD and maybe even MOD_UNLOAD if an error occurs, leading to registration via linker set and at MOD_LOAD time. This is a minor optimization at best, but it keeps the internal kbd driver tidy as a future change will merge the linker set driver list into its internal keyboard_drivers list via SYSINIT and simplify driver lookup by removing the need to consult the linker set. Notes: svn path=/head/; revision=355806
* kbd: provide default implementations of get_fkeystr/diag | Kyle Evans | 2019-12-16 | 1 | -2/+0
| | | | | | | | | Most keyboard drivers are using the genkbd implementations as it is; formally use them for any that aren't set and make genkbd_get_fkeystr/genkbd_diag private. Notes: svn path=/head/; revision=355796
* keyboard switch definitions: standardize on c99 initializers | Kyle Evans | 2019-12-16 | 1 | -19/+19
| | | | | | | | | A future change will provide default implementations for some of these where it makes sense and most of them are already using the genkbd implementation (e.g. get_fkeystr, diag). Notes: svn path=/head/; revision=355794
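For readers unfamiliar with the idiom, C99 designated initializers name each member explicitly, and any member that is not named is zero-initialized. The sketch below shows why that pairs well with default implementations; the struct, the demo functions, and the fallback helper are hypothetical and much smaller than the real keyboard switch.

```c
#include <stddef.h>

/* Hypothetical, cut-down switch table. */
struct kbdsw_sketch {
	int (*probe)(int unit);
	int (*init)(int unit);
	int (*diag)(int level);
};

static int
demo_probe(int unit)
{
	return (unit >= 0 ? 0 : -1);
}

static int
demo_init(int unit)
{
	(void)unit;
	return (0);
}

/* Members are named, so their order no longer has to match the struct. */
static const struct kbdsw_sketch demo_sw = {
	.init = demo_init,
	.probe = demo_probe,
	/* .diag is not named, so it is initialized to NULL. */
};

/* Stand-in for a generic default used when a driver sets no method. */
static int
generic_diag_sketch(int level)
{
	return (level);
}

static int
call_diag_sketch(const struct kbdsw_sketch *sw, int level)
{
	return (sw->diag != NULL ? sw->diag(level) :
	    generic_diag_sketch(level));
}
```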
* kbd drivers: use kbdd_* indirection for diag invocation | Kyle Evans | 2019-12-16 | 1 | -1/+1
| | | | | | | | | | | These invocations were directly calling genkbd_diag(), rather than indirecting back through kbdd_diag/kbdsw. While they're functionally equivalent, invoking kbdd_diag where feasible (i.e. not in a diag implementation) makes it easier to visually identify locking needs in these other drivers. Notes: svn path=/head/; revision=355793
* hyperv/storvsc: stash a pointer to hv_storvsc_request in ccb | Andriy Gapon | 2019-11-19 | 1 | -0/+1
| | | | | | | | | | | A SIM-private field is used for that. The pointer can be useful when examining a state of a queued ccb. E.g., a ccb on a da_softc.pending_ccbs. MFC after: 2 weeks Notes: svn path=/head/; revision=354849
* hyperv/vmbus: Fix the wrong size in ndis_offload structure | Wei Hu | 2019-07-09 | 1 | -2/+2
| | | | | | | | | Submitted by: whu MFC after: 2 weeks Sponsored by: Microsoft Notes: svn path=/head/; revision=349857
* hyperv/vmbus: Update VMBus version 4.0 and 5.0 support. | Wei Hu | 2019-07-09 | 5 | -2/+47
| | | | | | | | | | | | | Add VMBus protocol versions 4.0 and 5.0 to support Windows 10 and newer Hyper-V hosts. For VMBus 4.0 and newer Hyper-V, the netvsc gpadl teardown must be done after vmbus close. Submitted by: whu MFC after: 2 weeks Sponsored by: Microsoft Notes: svn path=/head/; revision=349856
* Distinguish _CID match and _HID match and make lower priority probe | Takanori Watanabe | 2018-10-26 | 1 | -6/+7
| | | | | | | | | | when _CID matches. Reviewed by: jhb, imp Differential Revision: https://reviews.freebsd.org/D16468 Notes: svn path=/head/; revision=339754
* Do not drop UDP traffic when TXCSUM_IPV6 flag is on | Wei Hu | 2018-10-22 | 1 | -1/+2
| | | | | | | | | | | | | PR: 231797 Submitted by: whu Reviewed by: dexuan Obtained from: Kevin Morse MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://bugs.freebsd.org/bugzilla/attachment.cgi?id=198333&action=diff Notes: svn path=/head/; revision=339585
* Eliminate the arena parameter to kmem_free(). Implicitly this corrects an | Alan Cox | 2018-08-25 | 1 | -2/+1
| | | | | | | | | | | | | | | | | | | | | | error in the function hypercall_memfree(), where the wrong arena was being passed to kmem_free(). Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are mapped in kmem with execute permissions. Use this flag to determine which arena the kmem virtual addresses are returned to. Eliminate UMA_SLAB_KRWX. The introduction of VPO_KMEM_EXEC makes it redundant. Update the nearby comment for UMA_SLAB_KERNEL. Reviewed by: kib, markj Discussed with: jeff Approved by: re (marius) Differential Revision: https://reviews.freebsd.org/D16845 Notes: svn path=/head/; revision=338318
* Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter | Alan Cox | 2018-08-21 | 1 | -2/+2
| | | | | | | | | | | | became unused in FreeBSD 12.x as a side-effect of the NUMA-related changes.) Reviewed by: kib, markj Discussed with: jeff, re@ Differential Revision: https://reviews.freebsd.org/D16825 Notes: svn path=/head/; revision=338143
* Fix build of hyperv with base gcc on i386 | Dimitry Andric | 2018-08-04 | 1 | -6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Base gcc fails to compile `sys/dev/hyperv/pcib/vmbus_pcib.c` for i386, with the following -Werror warnings: cc1: warnings being treated as errors /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'new_pcichild_device': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:567: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_on_channel_callback': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:940: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_protocol_negotiation': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1012: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_enter_d0': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1073: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_send_resources_allocated': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1125: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_map_msi': /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1730: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] This is because on i386, several casts from `uint64_t` to a pointer reduce the value from 64 bit to 32 bit. For gcc, this can be fixed by an intermediate cast to uintptr_t. Note that I am assuming the incoming values will always fit into 32 bit! Differential Revision: https://reviews.freebsd.org/D15753 MFC after: 3 days Notes: svn path=/head/; revision=337322
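The intermediate cast mentioned above can be illustrated with a small standalone sketch. The helper names are hypothetical and are not taken from vmbus_pcib.c; only the cast pattern itself is the point.

```c
#include <stdint.h>

/*
 * Hypothetical helper: turn a 64-bit value back into a pointer. Going
 * through uintptr_t makes the narrowing explicit on 32-bit targets such
 * as i386, so gcc does not warn about a cast between differently sized
 * types. As in the commit message, this assumes the value fits.
 */
static inline void *
u64_to_ptr_sketch(uint64_t v)
{
	return ((void *)(uintptr_t)v);
}

/* The reverse direction: widen a pointer to a 64-bit value. */
static inline uint64_t
ptr_to_u64_sketch(const void *p)
{
	return ((uint64_t)(uintptr_t)p);
}
```

A round trip through both helpers yields the original pointer on both 32-bit and 64-bit targets, because the value started out as a pointer and therefore always fits in uintptr_t.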
* Use SMAP on amd64. | Konstantin Belousov | 2018-07-29 | 1 | -0/+1
| | | | | | | | | | | | | | Ifuncs selectors dispatch copyin(9) family to the suitable variant, to set rflags.AC around userspace access. Rflags.AC bit is cleared in all kernel entry points unconditionally even on machines not supporting SMAP. Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D13838 Notes: svn path=/head/; revision=336876
* hyperv/hn: Fix panic in hypervisor code upon device detach event | Dexuan Cui | 2018-07-17 | 1 | -0/+7
| | | | | | | | | | Submitted by: hselasky Reviewed by: dexuan MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D16139 Notes: svn path=/head/; revision=336426
* hyperv: Fix boot-up after malloc() returns memory of NX by default now | Dexuan Cui | 2018-07-07 | 1 | -1/+1
| | | | | | | | | | | | | | FreeBSD VM can't boot up on Hyper-V after the recent malloc change in r335068: Make UMA and malloc(9) return non-executable memory in most cases. The hypercall page here must be executable. Fix the boot-up issue by adding M_EXEC. PR: 229167 Sponsored by: Microsoft Notes: svn path=/head/; revision=336054
* if_hn: fix use of uninitialized variable | Eric van Gyzen | 2018-05-26 | 1 | -2/+1
| | | | | | | | | | | | | | omcast was used without being initialized in the non-multicast case. The only effect was that the interface's multicast output counter could be incorrect. Reported by: Coverity CID: 1379662 MFC after: 3 days Sponsored by: Dell EMC Notes: svn path=/head/; revision=334239
* ifnet: Replace if_addr_lock rwlock with epoch + mutex | Matt Macy | 2018-05-18 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Run on LLNW canaries and tested by pho@.
gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1, before the patch:
    InMpps OMpps InGbs OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
      4.98  0.00  4.42 0.00 4235592      33 83.80  4720653 2149771 1235 247.32
      4.73  0.00  4.20 0.00 4025260      33 82.99  4724900 2139833 1204 247.32
      4.72  0.00  4.20 0.00 4035252      33 82.14  4719162 2132023 1264 247.32
      4.71  0.00  4.21 0.00 4073206      33 83.68  4744973 2123317 1347 247.32
      4.72  0.00  4.21 0.00 4061118      33 80.82  4713615 2188091 1490 247.32
      4.72  0.00  4.21 0.00 4051675      33 85.29  4727399 2109011 1205 247.32
      4.73  0.00  4.21 0.00 4039056      33 84.65  4724735 2102603 1053 247.32
After the patch:
    InMpps OMpps InGbs OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
      5.43  0.00  4.20 0.00 3313143      33 84.96  5434214 1900162 2656 245.51
      5.43  0.00  4.20 0.00 3308527      33 85.24  5439695 1809382 2521 245.51
      5.42  0.00  4.19 0.00 3316778      33 87.54  5416028 1805835 2256 245.51
      5.42  0.00  4.19 0.00 3317673      33 90.44  5426044 1763056 2332 245.51
      5.42  0.00  4.19 0.00 3314839      33 88.11  5435732 1792218 2499 245.52
      5.44  0.00  4.19 0.00 3293228      33 91.84  5426301 1668597 2121 245.52
Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch. Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366 Notes: svn path=/head/; revision=333813
* i386 4/4G split. | Konstantin Belousov | 2018-04-13 | 1 | -2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The change makes the user and kernel address spaces on i386 independent, giving each almost the full 4G of usable virtual addresses except for one PDE at top used for trampoline and per-CPU trampoline stacks, and system structures that must be always mapped, namely IDT, GDT, common TSS and LDT, and process-private TSS and LDT if allocated. By using 1:1 mapping for the kernel text and data, it appeared possible to eliminate the assembler part of locore.S which bootstraps the initial page table and KPTmap. The code is rewritten in C and moved into pmap_cold(). The comment in vmparam.h explains the KVA layout. There is no PCID mechanism available in protected mode, so each kernel/user switch forth and back completely flushes the TLB, except for the trampoline PTD region. The TLB invalidations for userspace become trivial, because IPI handlers switch page tables. On the other hand, context switches no longer need to reload %cr3. copyout(9) was rewritten to use vm_fault_quick_hold(). An issue for the new copyout(9) is compatibility with wiring user buffers around sysctl handlers. This explains the two kinds of locks for copyout ptes and the accounting of the vslock() calls. The vm_fault_quick_hold() path, AKA the slow path, is only tried after the 'fast path' has failed; the fast path temporarily changes the mapping to the userspace and copies the data to/from a small per-cpu buffer in the trampoline. If a page fault occurs during the copy, it is short-circuited by exception.s so that it does not even reach C code. The change was motivated by the need to implement the Meltdown mitigation, but instead of KPTI the full split is done. The i386 architecture already shows the sizing problems, in particular, it is impossible to link clang and lld with debugging. I expect that the issues due to the virtual address space limits will only get worse, and the split gives more life to the platform. 
Tested by: pho Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D14633 Notes: svn path=/head/; revision=332489
* hyperv/storvsc: storvsc_io_done(): do not use CAM_SEL_TIMEOUT | Dexuan Cui | 2018-04-10 | 1 | -14/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CAM_SEL_TIMEOUT was introduced in https://reviews.freebsd.org/D7521 (r304251), which claimed: "VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan for LUN number higher than 7." But it turns out this is not correct: I think what really filters the invalid LUNs in r304251 is that: before r304251, we could set the CAM_REQ_CMP without checking vm_srb->srb_status at all: ccb->ccb_h.status |= CAM_REQ_CMP. r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly, so the invalid LUNs are filtered. I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be filtered, and I successfully hot-added and hot-removed 8 disks to/from the VM without any issue. CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error(): For a selection timeout, we consider all of the LUNs on the target to be gone. If the status is CAM_DEV_NOT_THERE, then we only get rid of the device(s) specified by the path in the original CCB. This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires 3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns CAM_SEL_TIMEOUT to the CAM layer, CAM will detach 3:0:0:0 as well: this is the bug I reported recently: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583 PR: 226583 Reviewed by: mav MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D14690 Notes: svn path=/head/; revision=332385
* Correct comment typo in Hyper-V | Ed Maste | 2018-03-30 | 1 | -1/+1
| | | | | | | | | PR: 226665 Submitted by: Ryo ONODERA MFC after: 3 days Notes: svn path=/head/; revision=331757
* Rename assym.s to assym.inc | Ed Maste | 2018-03-20 | 2 | -2/+2
| | | | | | | | | | | | assym is only to be included by other .s files, and should never actually be assembled by itself. Reviewed by: imp, bdrewery (earlier) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D14180 Notes: svn path=/head/; revision=331254
* PTI for amd64. | Konstantin Belousov | 2018-01-17 | 3 | -6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implementation of the Kernel Page Table Isolation (KPTI) for amd64, first version. It provides a workaround for the 'meltdown' vulnerability. PTI is turned off by default for now; enable it with the loader tunable vm.pmap.pti=1. The pmap page table is split into kernel-mode table and user-mode table. Kernel-mode table is identical to the non-PTI table, while usermode table is obtained from kernel table by leaving userspace mappings intact, but only leaving the following parts of the kernel mapped: kernel text (but not modules text), PCPU, GDT/IDT/user LDT/task structures, and IST stacks for NMI and doublefault handlers. Kernel switches to user page table before returning to usermode, and restores full kernel page table on the entry. Initial kernel-mode stack for PTI trampoline is allocated in PCPU, it is only 16 qwords. Kernel entry trampoline switches page tables, then the hardware trap frame is copied to the normal kstack, and execution continues. IST stacks are kept mapped and no trampoline is needed for NMI/doublefault, but of course page table switch is performed. On return to usermode, the trampoline is used again, iret frame is copied to the trampoline stack, page tables are switched and iretq is executed. The case of iretq faulting due to the invalid usermode context is tricky, since the frame for fault is appended to the trampoline frame. Besides copying the fault frame and original (corrupted) frame to kstack, the fault frame must be patched to make it look as if the fault occurred on the kstack, see the comment in doret_iret detection code in trap(). Currently kernel pages which are mapped during trampoline operation are identical for all pmaps. They are registered using pmap_pti_add_kva(). Besides initial registrations done during boot, LDT and non-common TSS segments are registered if user requested their use. 
In principle, they can be installed into kernel page table per pmap with some work. Similarly, PCPU can be hidden from userspace mapping using trampoline PCPU page, but again I do not see much benefit, only complexity. PDPE pages for the kernel half of the user page tables are pre-allocated during boot because we need to know pml4 entries which are copied to the top-level paging structure page, in advance on a new pmap creation. I enforce this to avoid iterating over all the existing pmaps if a new PDPE page is needed for PTI kernel mappings. The iteration is a known problematic operation on i386. The need to flush hidden kernel translations on the switch to user mode makes global tables (PG_G) meaningless and even harmful, so PG_G use is disabled for PTI case. Our existing use of PCID is incompatible with PTI and is automatically disabled if PTI is enabled. PCID can be forced on only for developer's benefit. MCE is known to be broken, it requires IST stack to operate completely correctly even for non-PTI case, and absolutely needs dedicated IST stack because MCE delivery while the trampoline has not yet switched from the PTI stack is fatal. The fix is pending. Reviewed by: markj (partially) Tested by: pho (previous version) Discussed with: jeff, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Notes: svn path=/head/; revision=328083
* Define xpt_path_inq. | Warner Losh | 2017-12-06 | 1 | -4/+1
| | | | | | | | | | | This provides a nice wrapper around the XPT_PATH_INQ ccb creation and calling. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D13387 Notes: svn path=/head/; revision=326645
* sys/dev: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 4 | -0/+8
| | | | | | | | | | | | | | | Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Notes: svn path=/head/; revision=326255
* hyperv/hn: Enable transparent VF by default. | Sepherosa Ziehau | 2017-10-11 | 1 | -1/+1
| | | | | | | | MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=324517
* hyperv/hn: Workaround erroneous hash type observed on WS2016 for VF. | Sepherosa Ziehau | 2017-10-11 | 3 | -8/+36
| | | | | | | | | | The background was described in r324489. MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=324516
* hyperv/hn: Workaround erroneous hash type observed on WS2016. | Sepherosa Ziehau | 2017-10-10 | 3 | -30/+90
| | | | | | | | | | | | | | | | | | | | Background: - UDP 4-tuple hash type is unconditionally enabled in Hyper-V on WS2016, which is _not_ affected by NDIS_OBJTYPE_RSS_PARAMS. - Non-fragment UDP/IPv4 datagrams' hash type is delivered to VM as TCP_IPV4. Currently this erroneous behavior only applies to WS2016/Windows10. Force l3/l4 protocol check, if the RXed packet's hash type is TCP_IPV4, and the Hyper-V is running on WS2016/Windows10. If the RXed packet is UDP datagram, adjust mbuf hash type to UDP_IPV4. MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=324489
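The adjustment described above can be sketched as a small pure function. This is a minimal sketch under stated assumptions: the enum, the function name, and the boolean host flag are hypothetical, and the real driver works on mbufs and its own hash-type constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative hash types; the driver uses its own constants. */
enum hash_type_sketch {
	SK_HASH_NONE,
	SK_HASH_TCP_IPV4,
	SK_HASH_UDP_IPV4
};

#define	SK_IPPROTO_UDP	17	/* IANA protocol number for UDP */

/*
 * Hypothetical helper: if the host is WS2016/Windows10 and reports
 * TCP_IPV4 for what is actually a UDP datagram, correct the hash type.
 */
static enum hash_type_sketch
fixup_hash_type(enum hash_type_sketch reported, bool host_is_ws2016,
    uint8_t ip_proto)
{
	if (host_is_ws2016 && reported == SK_HASH_TCP_IPV4 &&
	    ip_proto == SK_IPPROTO_UDP)
		return (SK_HASH_UDP_IPV4);
	return (reported);
}
```

The extra protocol check is only paid when the reported type is TCP_IPV4 on an affected host, so other hosts and other hash types pass through unchanged.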
* hyperv/vmbus: Expose Hyper-V major version. | Sepherosa Ziehau | 2017-10-10 | 2 | -1/+5
| | | | | | | | MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=324488
* hyperv/vmbus: Add tunable to pin/unpin event tasks. | Sepherosa Ziehau | 2017-10-10 | 1 | -4/+17
| | | | | | | | | | | | | | Event tasks are pinned to their respective CPU by default, in the same fashion as they were. Unpin the event tasks by setting hw.vmbus.pin_evttask to 0, if certain CPUs serve special purpose. MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=324487
* hyperv/hn: Fix options RSS building | Sepherosa Ziehau | 2017-10-05 | 1 | -4/+0
| | | | | | | | | Reported by: np MFC after: 1 week Sponsored by: Microsoft Notes: svn path=/head/; revision=324316
* hyperv/hn: Unbreak i386 building. | Sepherosa Ziehau | 2017-09-28 | 1 | -1/+1
| | | | | | | | | Reported by: cy MFC after: 1 week Sponsored by: Microsoft Notes: svn path=/head/; revision=324077
* hyperv/hn: Fix UDP checksum offload issue in Azure. | Sepherosa Ziehau | 2017-09-27 | 1 | -2/+57
| | | | | | | | | | | | | | | | | | | UDP checksum offload does not work in Azure if following conditions are met: - sizeof(IP hdr + UDP hdr + payload) > 1420. - IP_DF is not set in IP hdr Use software checksum for UDP datagrams falling into this category. Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in case something unexpected happened. MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12429 Notes: svn path=/head/; revision=324049
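The two conditions listed above amount to a simple predicate. The sketch below is only an illustration: the function and macro names are hypothetical, the flags/offset field is assumed to be in host byte order, and 0x4000 is the standard IPv4 "don't fragment" bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define	SK_IP_DF	0x4000	/* IPv4 "don't fragment" bit */
#define	SK_UDPCS_THRESH	1420	/* size limit quoted in the commit message */

/*
 * Hypothetical predicate: true when a UDP datagram should be checksummed
 * in software rather than offloaded. ip_total_len covers the IP header,
 * the UDP header and the payload; ip_off is the flags/offset field in
 * host byte order.
 */
static bool
udp_needs_sw_csum(uint16_t ip_total_len, uint16_t ip_off)
{
	return (ip_total_len > SK_UDPCS_THRESH && (ip_off & SK_IP_DF) == 0);
}
```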
* hyperv/hn: Set tcp header offset for CSUM/LSO offloading. | Sepherosa Ziehau | 2017-09-27 | 2 | -27/+70
| | | | | | | | | | | No observable effect; better safe than sorry. MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12417 Notes: svn path=/head/; revision=324048
* hyperv/hn: Increase max supported MTU | Sepherosa Ziehau | 2017-09-19 | 1 | -2/+1
| | | | | | | | | MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12365 Notes: svn path=/head/; revision=323729
* hyperv/hn: Fix MTU setting | Sepherosa Ziehau | 2017-09-19 | 4 | -2/+46
| | | | | | | | | | | | | | | - Add size of an ethernet header to the value configured to NVS. This does not seem to have any effects if MTU is 1500, but fix hypervisor side's setting if MTU > 1500. - Override the MTU setting according to the view from the hypervisor side. MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12352 Notes: svn path=/head/; revision=323728
* hyperv/hn: Apply VF's RSS setting | Sepherosa Ziehau | 2017-09-19 | 4 | -52/+363
| | | | | | | | | | | | | | Since in Azure SYN and SYN|ACK go through the synthetic parts while the rest of the same TCP flow goes through the VF, apply VF's RSS settings to synthetic parts to have a consistent hash value/type for the same TCP flow. MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12333 Notes: svn path=/head/; revision=323727
* hyperv/hn: Log RSS capabilities mask. | Sepherosa Ziehau | 2017-09-05 | 1 | -0/+2
| | | | | | | | | | | This helps to detect when UDP hash types can be supported. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12177 Notes: svn path=/head/; revision=323176
* hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}. | Sepherosa Ziehau | 2017-09-05 | 1 | -0/+52
| | | | | | | | | | | | | The conditional compiling in the review request is removed, since these IOCTLs will be available in stable/10 and stable/11. Reviewed by: gallatin MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12175 Notes: svn path=/head/; revision=323175
* hyperv: Update copyright for the files changed in 2017 | Sepherosa Ziehau | 2017-08-14 | 17 | -17/+17
| | | | | | | | | MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11982 Notes: svn path=/head/; revision=322488
* hyperv/hn: Re-set datapath after synthetic parts reattached. | Sepherosa Ziehau | 2017-08-14 | 1 | -1/+2
| | | | | | | | | | | Do this even for non-transparent mode VF. Better safe than sorry. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11981 Notes: svn path=/head/; revision=322487
* hyperv/hn: Minor cleanup | Sepherosa Ziehau | 2017-08-14 | 1 | -8/+7
| | | | | | | | | MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11979 Notes: svn path=/head/; revision=322486
* hyperv/hn: Fix/enhance receiving path when VF is activated. | Sepherosa Ziehau | 2017-08-14 | 2 | -31/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | - Update hn(4)'s stats properly for non-transparent mode VF. - Allow BPF tapping to hn(4) for non-transparent mode VF. - Don't setup mbuf hash, if 'options RSS' is set. In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4) while the rest of segments and ACKs belonging to the same TCP 4-tuple go through the VF. So don't setup mbuf hash, if a VF is activated and 'options RSS' is not enabled. hn(4) and the VF may use neither the same RSS hash key nor the same RSS hash function, so the hash value for packets belonging to the same flow could be different! - Disable LRO. hn(4) will only receive broadcast packets, multicast packets, TCP SYN and SYN|ACK (in Azure), LRO is useless for these packet types. For non-transparent, we definitely _cannot_ enable LRO at all, since the LRO flush will use hn(4) as the receiving interface; i.e. hn_ifp->if_input(hn_ifp, m). While I'm here, remove unapplied comment and minor style change. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11978 Notes: svn path=/head/; revision=322485
* hyperv/hn: Update VF's ibytes properly under transparent VF mode. | Sepherosa Ziehau | 2017-08-14 | 1 | -5/+26
| | | | | | | | | | | | While I'm here, add a comment about why updating VF's imcast stat is not necessary. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11948 Notes: svn path=/head/; revision=322483
* hyperv/hn: Implement transparent mode network VF. | Sepherosa Ziehau | 2017-08-09 | 3 | -41/+811
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | How network VF works with hn(4) on Hyper-V in transparent mode: - Each network VF has a corresponding hn(4). - The network VF and its corresponding hn(4) have the same hardware address. - Once the network VF is attached, the corresponding hn(4) waits several seconds to make sure that the network VF attach routine completes, then: o Set the intersection of the network VF's if_capabilities and the corresponding hn(4)'s if_capabilities to the corresponding hn(4)'s if_capabilities. And adjust the corresponding hn(4)'s if_capenable and if_hwassist accordingly. (*) o Make sure that the corresponding hn(4)'s TSO parameters meet the constraints posed by both the network VF and the corresponding hn(4). (*) o The network VF's if_input is overridden. The overriding if_input changes the input packet's rcvif to the corresponding hn(4). The network layers are tricked into thinking that all packets are received by the corresponding hn(4). o If the corresponding hn(4) was brought up, bring up the network VF. The transmissions dispatched to the corresponding hn(4) are redispatched to the network VF. o Bringing down the corresponding hn(4) also brings down the network VF. o All IOCTLs issued to the corresponding hn(4) are passed through to the network VF; the corresponding hn(4) changes its internal state if necessary. o The media status of the corresponding hn(4) solely relies on the network VF. o If there are multicast filters on the corresponding hn(4), allmulti will be enabled on the network VF. (**) - Once the network VF is detached, undo all the changes made to the corresponding hn(4) in the above item. NOTE: No operation should be issued directly to the network VF, if the network VF transparent mode is enabled. The network VF transparent mode can be enabled by setting tunable hw.hn.vf_transparent to 1. 
The network VF transparent mode is _not_ enabled by default, as of this commit. The benefit of the network VF transparent mode is that the network VF attachment and detachment are transparent to all network layers; e.g. live migration detaches and reattaches the network VF. The major drawbacks of the network VF transparent mode: - The netmap(4) support is lost, even if the VF supports it. - ALTQ does not work, since if_start method cannot be properly supported. (*) These decisions were made so that things will not be messed up too much during the transition period. (**) This does _not_ need to go through the fancy multicast filter management like what vlan(4) has, at least currently: - As of this writing, multicast does not work in Azure. - As of this writing, multicast packets go through the corresponding hn(4). MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11803 Notes: svn path=/head/; revision=322299
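The two items marked (*) above, intersecting the capabilities and constraining the TSO parameters, can be sketched as follows. This is a userland illustration only: the struct, the capability bits, the single tso_max field, and the function name are hypothetical simplifications of what a real network interface carries.

```c
#include <stdint.h>

/* Illustrative capability bits; not the kernel's IFCAP_* values. */
#define	SK_CAP_TXCSUM	0x01
#define	SK_CAP_TSO4	0x02
#define	SK_CAP_LRO	0x04

/* Hypothetical, cut-down view of an interface. */
struct ifnet_sketch {
	uint32_t capabilities;	/* what the interface can do */
	uint32_t capenable;	/* what is currently enabled */
	uint32_t tso_max;	/* largest TSO burst, in bytes */
};

/*
 * Restrict hn to what both hn and the VF support, drop any enabled
 * capability that is no longer supported, and take the tighter of the
 * two TSO limits.
 */
static void
hn_apply_vf_limits_sketch(struct ifnet_sketch *hn,
    const struct ifnet_sketch *vf)
{
	hn->capabilities &= vf->capabilities;
	hn->capenable &= hn->capabilities;
	if (vf->tso_max < hn->tso_max)
		hn->tso_max = vf->tso_max;
}
```

Taking the intersection means the upper layers never hand hn a request that the VF underneath cannot honor, which is presumably what keeps the switch between the synthetic path and the VF invisible to them.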
* hyperv/kvp: Use proper size macro for adapter id. | Sepherosa Ziehau | 2017-08-03 | 1 | -1/+1
| | | | | | | | | Submitted by: Christopher Ertl <Christopher.Ertl microsoft com> MFC after: 3 days Sponsored by: Microsoft Notes: svn path=/head/; revision=321965
* hyperv/hn: Add comment about ether_ifattach event subscription. | Sepherosa Ziehau | 2017-08-01 | 1 | -0/+6
| | | | | | | | | MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11710 Notes: svn path=/head/; revision=321837