path: root/sys/amd64
Commit message [Author, Age, Files, Lines]
* Initialize pcids array for the proc0 pmap. [Konstantin Belousov, 2015-05-10, 1 file, -0/+5]
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=282711
* Tweak assert to also print the thread address. [Konstantin Belousov, 2015-05-10, 1 file, -2/+2]
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=282710
* On exec, single-threading must be enforced before arguments space is [Konstantin Belousov, 2015-05-10, 1 file, -1/+9]
  allocated from exec_map. If many threads try to perform execve(2) in
  parallel, the exec map is exhausted and some threads sleep
  uninterruptibly waiting for the map space. Then the thread which won
  the race for the space allocation cannot single-thread the process,
  causing a deadlock.

  Reported and tested by: pho (previous version)
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282708
* Correct the assertion. We should compare the pmap's curcpu pcid value [Konstantin Belousov, 2015-05-09, 1 file, -1/+2]
  against 0, not the pmap.

  Noted by: Oliver Pinter <oliver.pinter@hardenedbsd.org>
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=282696
* Rewrite amd64 PCID implementation to follow an algorithm described in [Konstantin Belousov, 2015-05-09, 11 files, -478/+258]
  Vahalia's "Unix Internals", section 15.12 "Other TLB Consistency
  Algorithms". The same algorithm is already utilized by the MIPS pmap
  to handle ASIDs.

  The PCID for the address space is now allocated per-cpu during context
  switch to the thread using the pmap, when no PCID on the cpu was ever
  allocated, or the current PCID is invalidated. If the PCID is reused,
  bit 63 of %cr3 can be set to avoid a TLB flush.

  Each cpu has a PCID algorithm generation count, which is saved in the
  pmap pcpu block when the pcpu PCID is allocated. On invalidation, the
  pmap generation count is zeroed, which signals the context switch code
  that the already allocated PCID is no longer valid. The implication is
  a TLB shootdown for the given cpu/address space, due to the allocation
  of a new PCID.

  The pm_save mask no longer has to be tracked, which (significantly)
  reduces the targets of the TLB shootdown IPIs. Previously, pm_save was
  reset only on pmap_invalidate_all(), which made it accumulate the
  cpuids of all processors on which the thread was scheduled between
  full TLB shootdowns.

  Besides reducing the number of TLB shootdowns and removing atomics to
  update pm_save in the context switch code, the algorithm is much
  simpler than the maintenance of pm_save and selection of the right
  address space in the shootdown IPI handler.

  Reviewed by: alc
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=282684
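The per-cpu generation scheme described in this commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: `cpu_state`, `pcid_slot`, `pmap_sketch`, `pcid_activate()`, `pcid_invalidate()`, `NCPU` and the tiny `PCID_MAX` are all hypothetical names and values, and the generation rollover on PCID exhaustion is an assumption about how such an allocator typically behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU      2	/* hypothetical CPU count for the sketch */
#define PCID_MAX  4	/* hypothetical, deliberately tiny PCID space */

/* Per-cpu allocator state: next free PCID and the current generation. */
struct cpu_state {
	uint32_t next_pcid;
	uint32_t gen;
};

/* Per-address-space, per-cpu record of the PCID and its generation. */
struct pcid_slot {
	uint32_t pcid;
	uint32_t gen;	/* 0 means "no valid PCID on this cpu" */
};

struct pmap_sketch {
	struct pcid_slot slot[NCPU];
};

static void
cpu_init(struct cpu_state *cs)
{
	cs->next_pcid = 1;	/* PCID 0 is treated as reserved here */
	cs->gen = 1;		/* generation 0 is the "invalid" marker */
}

/*
 * Called on context switch.  Returns true when the saved PCID is still
 * valid (the TLB flush could be skipped), false when a new PCID had to
 * be allocated and stale entries must be flushed.
 */
static bool
pcid_activate(struct cpu_state *cs, struct pmap_sketch *pm, int cpu)
{
	struct pcid_slot *s;

	assert(cpu >= 0 && cpu < NCPU);
	s = &pm->slot[cpu];
	if (s->gen == cs->gen)
		return (true);
	if (cs->next_pcid == PCID_MAX) {
		/* Assumed behaviour: exhausted space starts a new generation. */
		cs->next_pcid = 1;
		if (++cs->gen == 0)
			cs->gen = 1;
	}
	s->pcid = cs->next_pcid++;
	s->gen = cs->gen;
	return (false);
}

/* Invalidation only zeroes the saved generation for the given cpu. */
static void
pcid_invalidate(struct pmap_sketch *pm, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	pm->slot[cpu].gen = 0;
}
```

In this sketch a zeroed generation never matches the cpu's generation, so the next activation on that cpu allocates a fresh PCID without any cross-cpu bookkeeping.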
* Remove unused define. [Konstantin Belousov, 2015-05-09, 1 file, -2/+0]
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Notes: svn path=/head/; revision=282680
* If x86 CPU implementation of the MWAIT instruction reasonably [Konstantin Belousov, 2015-05-09, 2 files, -7/+1]
  interacts with interrupts, query ACPI and use MWAIT for entrance into
  Cx sleep states. Support C1 "I/O then halt" mode. See Intel's document
  302223-007 "Intel® Processor Vendor-Specific ACPI Interface
  Specification" for description.

  Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use it
  instead of inlining the "sti; hlt" sequence in several places.

  In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods
  sysctl, correct the names for the
  dev.cpu.N.{cx_usage,cx_lowest,cx_supported} sysctls.

  Both jkim and avg have some other patches implementing the mwait
  functionality; this work is unrelated. Linux does not rely on ACPI to
  provide correct tables describing Cx modes. Instead, the driver has
  pre-defined knowledge of the CPU models, which was supplied by Intel.

  Tested by: pho (previous versions)
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=282678
* Check 'td_owepreempt' and yield the vcpu thread if it is set. [Neel Natu, 2015-05-06, 1 file, -1/+7]
  This is done explicitly because a vcpu thread can be in a critical
  section for the entire time slice allotted to it. This in turn can
  delay the handling of 'td_owepreempt'.

  Reviewed by: jhb
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D2430
  Notes: svn path=/head/; revision=282571
* Deprecate the 3-way return values from vm_gla2gpa() and vm_copy_setup(). [Neel Natu, 2015-05-06, 5 files, -93/+88]
  Prior to this change both functions returned 0 for success, -1 for
  failure and +1 to indicate that an exception was injected into the
  guest.

  The numerical value of ERESTART also happens to be -1 so when these
  functions returned -1 it had to be translated to a positive errno
  value to prevent the VM_RUN ioctl from being inadvertently restarted.
  This made it easy to introduce bugs when writing emulation code.

  Fix this by adding an 'int *guest_fault' parameter and setting it to
  '1' if an exception was delivered to the guest. The return value is 0
  or EFAULT so no additional translation is needed.

  Reviewed by: tychon
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D2428
  Notes: svn path=/head/; revision=282558
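The calling convention described in this commit message (errno-style return plus a separate 'guest_fault' out-parameter) might look roughly like the following. This is a hedged sketch only: `gla2gpa_sketch()`, `GUEST_LIMIT_SKETCH` and `GUEST_BASE_SKETCH` are hypothetical stand-ins, and the "translation" is a fixed offset rather than a real guest page-table walk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_LIMIT_SKETCH 0x1000u	/* hypothetical size of the mapped range */
#define GUEST_BASE_SKETCH  0x100000u	/* hypothetical guest-physical offset */

/*
 * Returns 0 or EFAULT.  A fault delivered to the guest is not an error
 * for the caller: it is reported through *guest_fault with a 0 return.
 */
static int
gla2gpa_sketch(uint64_t gla, uint64_t *gpa, int *guest_fault)
{
	assert(guest_fault != NULL);
	*guest_fault = 0;
	if (gpa == NULL)
		return (EFAULT);	/* host-side failure */
	if (gla >= GUEST_LIMIT_SKETCH) {
		*guest_fault = 1;	/* exception "injected" into the guest */
		return (0);
	}
	*gpa = gla + GUEST_BASE_SKETCH;
	return (0);
}
```

A caller checks the return value first and only then looks at the fault flag, so no value ever needs to be translated away from -1.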
* Do a proper emulation of guest writes to MSR_EFER. [Neel Natu, 2015-05-06, 3 files, -14/+128]
  - Must-Be-Zero bits cannot be set.
  - EFER_LME and EFER_LMA should respect the long mode consistency checks.
  - EFER_NXE, EFER_FFXSR, EFER_TCE can be set if allowed by CPUID
    capabilities.
  - Flag an error if guest tries to set EFER_LMSLE since bhyve doesn't
    enforce segment limits in 64-bit mode.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282520
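A simplified version of the checks listed above could be sketched as follows. The bit positions follow the architectural EFER layout, but `efer_write_check()` and its `cpuid_allowed` parameter are hypothetical, and the long mode consistency checks for EFER_LME/EFER_LMA are deliberately left out of this sketch.

```c
#include <stdint.h>

#define EFER_SCE	(1ULL << 0)
#define EFER_LME	(1ULL << 8)
#define EFER_LMA	(1ULL << 10)
#define EFER_NXE	(1ULL << 11)
#define EFER_SVM	(1ULL << 12)
#define EFER_LMSLE	(1ULL << 13)
#define EFER_FFXSR	(1ULL << 14)
#define EFER_TCE	(1ULL << 15)

/* Every bit outside this mask is treated as Must-Be-Zero. */
#define EFER_KNOWN	(EFER_SCE | EFER_LME | EFER_LMA | EFER_NXE | \
			 EFER_SVM | EFER_LMSLE | EFER_FFXSR | EFER_TCE)

/*
 * Returns 0 if the write would be accepted, -1 if it should be refused
 * (a real emulation would inject a fault into the guest instead).
 * 'cpuid_allowed' is a hypothetical mask of the optional bits that the
 * virtual CPUID advertises.
 */
static int
efer_write_check(uint64_t newval, uint64_t cpuid_allowed)
{
	const uint64_t optional = EFER_NXE | EFER_FFXSR | EFER_TCE;

	if ((newval & ~EFER_KNOWN) != 0)
		return (-1);	/* Must-Be-Zero bit set */
	if ((newval & EFER_LMSLE) != 0)
		return (-1);	/* segment limits are not enforced */
	if ((newval & optional & ~cpuid_allowed) != 0)
		return (-1);	/* feature not advertised via CPUID */
	return (0);
}
```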
* Emulate the 'CMP r/m8, imm8' instruction encountered when booting a Windows [Neel Natu, 2015-05-04, 1 file, -2/+14]
  Vista guest.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 1 week
  Notes: svn path=/head/; revision=282407
* Don't advertise the Intel SMX capability to the guest. [Neel Natu, 2015-05-02, 1 file, -1/+2]
  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 1 week
  Notes: svn path=/head/; revision=282351
* Emulate machine check related MSRs to allow guest OSes like Windows to boot. [Neel Natu, 2015-05-02, 3 files, -7/+24]
  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282336
* r281630 relaxed the limits on the vectors that can be asserted in the IRRs. [Neel Natu, 2015-05-01, 1 file, -11/+9]
  Do the same when transitioning a vector from the IRR to the ISR and
  also when extinguishing it from the ISR in response to an EOI.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282301
* Emulate MSR_SYSCFG which is accessed by Linux on AMD cpus when MTRRs are [Neel Natu, 2015-05-01, 1 file, -0/+2]
  enabled.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282296
* Don't require <sys/cpuset.h> to be always included before <machine/vmm.h>. [Neel Natu, 2015-04-30, 14 files, -20/+4]
  Only a subset of source files that include <machine/vmm.h> need to use
  the APIs that require the inclusion of <sys/cpuset.h>.

  MFC after: 1 week
  Notes: svn path=/head/; revision=282287
* When an instruction cannot be decoded just return to userspace so bhyve(8) [Neel Natu, 2015-04-30, 1 file, -2/+6]
  can dump the instruction bytes.

  Requested by: grehan
  MFC after: 1 week
  Notes: svn path=/head/; revision=282284
* Advertise the MTRR feature via CPUID and emulate the minimal set of MTRR MSRs. [Neel Natu, 2015-04-30, 3 files, -3/+38]
  This is required for booting Windows guests.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282281
* Remove support for Xen PV domU kernels. Support for HVM domU kernels [John Baldwin, 2015-04-30, 3 files, -297/+0]
  remains. Xen is planning to phase out support for PV upstream since it
  is harder to maintain and has more overhead. Modern x86 CPUs include
  virtualization extensions that support HVM guests instead of PV
  guests. In addition, the PV code was i386 only and not as well
  maintained recently as the HVM code.

  - Remove the i386-only NATIVE option that was used to disable certain
    components for PV kernels. These components are now standard as they
    are on amd64.
  - Remove !XENHVM bits from PV drivers.
  - Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH,
    LOAD_CR3, etc.)
  - Remove duplicate copy of <xen/features.h>.
  - Remove unused, i386-only xenstored.h.

  Differential Revision: https://reviews.freebsd.org/D2362
  Reviewed by: royger
  Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
  Relnotes: yes
  Notes: svn path=/head/; revision=282274
* Re-implement RTC current time calculation to eliminate the possibility of [Neel Natu, 2015-04-29, 1 file, -21/+32]
  losing time.

  The problem with the earlier implementation was that the uptime value
  used by 'vrtc_curtime()' could be different than the uptime value when
  'vrtc_time_update()' actually updated 'base_uptime'.

  Fix this by calculating and updating the (rtctime, uptime) tuple
  together.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282259
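The idea of computing the (rtctime, uptime) pair together can be illustrated with a small sketch. The names `rtc_sketch`, `rtc_curtime()`, `rtc_time_update()` and the millisecond `TICKS_PER_SEC` unit are assumptions made for this example and are not the bhyve functions themselves.

```c
#include <stdint.h>

#define TICKS_PER_SEC 1000	/* hypothetical uptime unit: milliseconds */

struct rtc_sketch {
	int64_t base_rtctime;	/* RTC seconds at the last update */
	int64_t base_uptime;	/* uptime ticks matching base_rtctime */
};

/*
 * Return the current RTC time in seconds and, through *basetime, the
 * uptime value that corresponds exactly to the returned second.  The
 * fractional part of a second stays accounted for in the next call.
 */
static int64_t
rtc_curtime(const struct rtc_sketch *rtc, int64_t now, int64_t *basetime)
{
	int64_t secs;

	secs = (now - rtc->base_uptime) / TICKS_PER_SEC;
	*basetime = rtc->base_uptime + secs * TICKS_PER_SEC;
	return (rtc->base_rtctime + secs);
}

/* Store both halves of the tuple in one step. */
static void
rtc_time_update(struct rtc_sketch *rtc, int64_t newtime, int64_t newbase)
{
	rtc->base_rtctime = newtime;
	rtc->base_uptime = newbase;
}
```

Because the base uptime is derived from the same reading as the RTC value, an update at 2.5 s followed by a read at 3.1 s still reports three elapsed seconds in this sketch; re-reading the clock for the base would drop the half second.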
* Microsoft vmbus, storage and other related driver enhancements for HyperV. [Wei Hu, 2015-04-29, 3 files, -1/+21]
  - Vmbus multi channel support.
  - Vector interrupt support.
  - Signal optimization.
  - Storvsc driver performance improvement.
  - Scatter and gather support for storvsc driver.
  - Minor bug fix for KVP driver.

  Thanks royger, jhb and delphij from FreeBSD community for the reviews
  and comments. Also thanks Hovy Xu from NetApp for the contributions to
  the storvsc driver.

  PR: 195238
  Submitted by: whu
  Reviewed by: royger, jhb, delphij
  Approved by: royger
  MFC after: 2 weeks
  Relnotes: yes
  Sponsored by: Microsoft OSTC
  Notes: svn path=/head/; revision=282212
* Emulate the 'bit test' instruction. Windows 7 uses 'bit test' to check the [Neel Natu, 2015-04-29, 1 file, -0/+52]
  'Delivery Status' bit in APIC ICR register.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282209
* Implement the century byte in the RTC. Some guests require this field to be [Neel Natu, 2015-04-28, 1 file, -22/+44]
  properly set.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=282206
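As an illustration of what a century byte holds, the sketch below splits a four-digit year into a year register and a century register. It assumes the RTC is operating in BCD mode; `to_bcd()` and `rtc_encode_year()` are hypothetical helpers, not functions from the device model.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a value in the range 0..99 as packed BCD. */
static uint8_t
to_bcd(unsigned v)
{
	assert(v < 100);
	return ((uint8_t)(((v / 10) << 4) | (v % 10)));
}

/*
 * Split a calendar year into the two-digit year register and the
 * century register, both in BCD (assumption: BCD mode is selected).
 */
static void
rtc_encode_year(unsigned year, uint8_t *year_reg, uint8_t *century_reg)
{
	assert(year < 10000);
	*year_reg = to_bcd(year % 100);
	*century_reg = to_bcd(year / 100);
}
```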
* STOS/STOSB/STOSW/STOSD/STOSQ instruction emulation. [Tycho Nightingale, 2015-04-25, 1 file, -0/+77]
  Reviewed by: neel
  Notes: svn path=/head/; revision=281987
* Move common code from sys/i386/i386/mp_machdep.c and [Konstantin Belousov, 2015-04-24, 2 files, -1022/+51]
  sys/amd64/amd64/mp_machdep.c, to the new common x86 source
  sys/x86/x86/mp_x86.c.

  Proposed and reviewed by: jhb
  Review: https://reviews.freebsd.org/D2347
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=281940
* Reassign copyright statements on several files from Advanced [John Baldwin, 2015-04-23, 1 file, -1/+1]
  Computing Technologies LLC to Hudson River Trading LLC.

  Approved by: Hudson River Trading LLC (who owns ACT LLC)
  MFC after: 1 week
  Notes: svn path=/head/; revision=281887
* Missing break in switch case. [Marcelo Araujo, 2015-04-23, 1 file, -0/+1]
  Differential Revision: D2342
  Reviewed by: neel
  Notes: svn path=/head/; revision=281879
* Move some common code from sys/amd64/amd64/machdep.c and [Konstantin Belousov, 2015-04-22, 2 files, -369/+1]
  sys/i386/i386/machdep.c to new file sys/x86/x86/cpu_machdep.c. Most
  of the code is related to the idle handling.

  Discussed with: pluknet
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=281851
* Remove duplicate definitions of MWAIT_CX hints. Identical defines in [Konstantin Belousov, 2015-04-20, 1 file, -9/+0]
  specialreg.h are enough.

  Discussed with: mav
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=281762
* Remove lazy pmap switch code from i386. Naive benchmark with md(4) [Konstantin Belousov, 2015-04-18, 1 file, -0/+2]
  shows no difference with the code removed.

  On both amd64 and i386, assert that a released pmap is not active.

  Proposed and reviewed by: alc
  Discussed with: Svatopluk Kraus <onwahe@gmail.com>, peter
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=281707
* Relax the check on which vectors can be delivered through the APIC. According [Neel Natu, 2015-04-16, 1 file, -1/+5]
  to the Intel SDM vectors 16 through 255 are allowed to be delivered
  via the local APIC.

  Reported by: Leon Dang (ldang@nahannisys.com)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=281630
* Prefer 'vcpu_should_yield()' over checking 'curthread->td_flags' directly. [Neel Natu, 2015-04-16, 1 file, -1/+1]
  MFC after: 1 week
  Notes: svn path=/head/; revision=281612
* Use explicitly sized types in EFI module metadata [Ed Maste, 2015-04-10, 1 file, -5/+5]
  This will allow the same metadata struct to be used on all platforms.

  Differential Revision: https://reviews.freebsd.org/D2275
  Reviewed by: jhb
  Notes: svn path=/head/; revision=281381
* Enhance the support for Group 1 Extended opcodes: [Tycho Nightingale, 2015-04-06, 1 file, -38/+84]
  * Implement the 0x81 and 0x83 CMP instructions.
  * Implement the 0x83 AND instruction.
  * Implement the 0x81 OR instruction.

  Reviewed by: neel
  Notes: svn path=/head/; revision=281145
* adrian asked me to revert and get more testing [Eitan Adler, 2015-04-05, 1 file, -5/+1]
  Notes: svn path=/head/; revision=281103
* head/sys/amd64/amd64/support.S: unroll loop [Eitan Adler, 2015-04-05, 1 file, -1/+5]
  Unroll the loop in ENTRY(pagezero).

  According to the submitter, this results in a reproducible 1%
  performance improvement under a buildworld-like workload.

  I validated correctness and run-testing, but not the performance
  impact.

  Submitted by: lidl@pix.net
  Reviewed by: adrian
  PR: 199151
  MFC After: 1 month
  Notes: svn path=/head/; revision=281100
* Fix integer truncation bug in malloc(9) [Ryan Stone, 2015-04-01, 1 file, -2/+2]
  A couple of internal functions used by malloc(9) and uma truncated a
  size_t down to an int. This could cause any number of issues (e.g.
  indefinite sleeps, memory corruption) if any kernel subsystem tried to
  allocate 2GB or more through malloc. zfs would attempt such an
  allocation when run on a system with 2TB or more of RAM.

  Note to self: When this is MFCed, sparc64 needs the same fix.

  Differential revision: https://reviews.freebsd.org/D2106
  Reviewed by: kib
  Reported by: Michael Fuckner <michael@fuckner.net>
  Tested by: Michael Fuckner <michael@fuckner.net>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=280957
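The class of bug described here (a size_t narrowed to an int) can be shown with a short sketch. `truncates_as_int()`, `bytes_to_pages()` and `PAGE_SIZE_SKETCH` are hypothetical names chosen for illustration; they are not the functions that the commit changed.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u	/* hypothetical page size */

/* True when storing 'size' in an int would change its value. */
static bool
truncates_as_int(size_t size)
{
	return (size > (size_t)INT_MAX);
}

/*
 * Round a byte count up to whole pages while keeping the full size_t
 * width, so requests of 2GB and more are not narrowed on the way.
 */
static size_t
bytes_to_pages(size_t size)
{
	return (size / PAGE_SIZE_SKETCH + (size % PAGE_SIZE_SKETCH != 0));
}
```

With a 32-bit int, a 2GB request is the first size that no longer fits, which matches the threshold mentioned in the commit message.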
* Fix "MOVS" instruction memory to MMIO emulation. Currently updates to [Tycho Nightingale, 2015-04-01, 4 files, -35/+54]
  %rdi, %rsi, etc are inadvertently bypassed along with the check to
  see if the instruction needs to be repeated per the 'rep' prefix.

  Add "MOVS" instruction support for the 'MMIO to MMIO' case.

  Reviewed by: neel
  Notes: svn path=/head/; revision=280929
* Provide workaround for a performance issue with the popcnt instruction [Konstantin Belousov, 2015-03-31, 1 file, -17/+24]
  on Intel processors. Clear the spurious dependency by explicitly
  xoring the destination register of popcnt.

  Use bitcount64() instead of re-implementing SWAR locally, for
  processors without the popcnt instruction.

  Reviewed by: jhb
  Discussed with: jilles (previous version)
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=280880
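For readers unfamiliar with the SWAR ("SIMD within a register") fallback mentioned above, a portable population count can be sketched as below. `bitcount64_sketch()` is a hypothetical name for this example; it shows the general technique and is not a copy of the kernel's bitcount64().

```c
#include <stdint.h>

/* Count the set bits of a 64-bit word without a popcnt instruction. */
static unsigned
bitcount64_sketch(uint64_t x)
{
	/* Sum adjacent bits into 2-bit fields. */
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	/* Sum adjacent 2-bit fields into 4-bit fields. */
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	/* Sum adjacent 4-bit fields into bytes. */
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	/* Add all bytes together; the total ends up in the top byte. */
	return ((unsigned)((x * 0x0101010101010101ULL) >> 56));
}
```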
* Wait 100 microseconds for a local APIC to dispatch each startup-related IPI [John Baldwin, 2015-03-30, 1 file, -3/+3]
  rather than 20. The MP 1.4 specification states in Appendix B.2:

  "A period of 20 microseconds should be sufficient for IPI dispatch to
  complete under normal operating conditions".

  (Note that this appears to be separate from the 10 millisecond (INIT)
  and 200 microsecond (STARTUP) waits after the IPIs are dispatched.)
  The Intel SDM is silent on this issue as far as I can tell.

  At least some hardware requires 60 microseconds as noted in the PR, so
  bump this to 100 to be on the safe side.

  PR: 197756
  Reported by: zaphod@berentweb.com
  MFC after: 1 week
  Notes: svn path=/head/; revision=280866
* Make it possible for the signal handler to act on #ss. Load the [Konstantin Belousov, 2015-03-28, 1 file, -0/+1]
  canonical user data segment's selector into %ss when calling the
  handler.

  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=280781
* The #ss fault handler erroneously does not check for the fault [Konstantin Belousov, 2015-03-28, 1 file, -2/+0]
  originating from the return to usermode. #ss must be handled the same
  as #np.

  Reported by: Andrew Lutomirski through secteam
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Notes: svn path=/head/; revision=280780
* Fix the RTC device model to operate correctly in 12-hour mode. The following [Neel Natu, 2015-03-28, 1 file, -6/+41]
  table documents the values in the RTC 'hour' field in the two modes:

  Hour-of-the-day    12-hour mode       24-hour mode
  12 AM              12                 0
  [1-11] AM          [1-11]             [1-11]
  12 PM              0x80 | 12          12
  [1-11] PM          0x80 | [1-11]      [13-23]

  Reported by: Julian Hsiao (madoka@nyanisore.net)
  MFC after: 1 week
  Notes: svn path=/head/; revision=280775
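The mapping in the table above can be expressed as a small conversion routine. This sketch assumes binary (non-BCD) register values, and `rtc_hour_field()` is a hypothetical helper name rather than a function from the device model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTC_HOUR_PM 0x80	/* PM flag used in 12-hour mode */

/*
 * Convert an hour of the day (0..23) into the value of the RTC 'hour'
 * field for the selected mode, following the table in the commit
 * message.  Values are binary here; BCD mode is not handled.
 */
static uint8_t
rtc_hour_field(int hour, bool mode_24h)
{
	int h12;

	assert(hour >= 0 && hour <= 23);
	if (mode_24h)
		return ((uint8_t)hour);
	h12 = hour % 12;
	if (h12 == 0)
		h12 = 12;	/* both 12 AM and 12 PM are stored as 12 */
	return ((uint8_t)((hour >= 12 ? RTC_HOUR_PM : 0) | h12));
}
```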
* When fetching an instruction in non-64bit mode, consider the value of the [Tycho Nightingale, 2015-03-24, 5 files, -6/+20]
  code segment base address.

  Also if an instruction doesn't support a mod R/M (modRM) byte, don't
  be concerned if the CPU is in real mode.

  Reviewed by: neel
  Notes: svn path=/head/; revision=280447
* Use VT-d interrupt remapping block (IR) to perform FSB messages [Konstantin Belousov, 2015-03-19, 1 file, -0/+2]
  translation. In particular, although IO-APICs only take an 8-bit APIC
  ID, IR translation structures accept a 32-bit APIC ID, which allows
  x2APIC mode to function properly. Extend msi_cpu of struct
  msi_intrsrc and io_cpu of ioapic_intsrc to full int from one byte.

  The KPI of IR is isolated into x86/iommu/iommu_intrmap.h, to avoid
  bringing all dmar headers into interrupt code. The non-PCI(e) devices
  which generate message interrupts on FSB require special handling.
  The HPET FSB interrupts are remapped, while DMAR interrupts are not.

  For each msi and ioapic interrupt source, the iommu cookie is added,
  which is in fact the index of the IRE (interrupt remap entry) in the
  IR table. The cookie is made at source allocation time, and then used
  at map time to fill both the IRE and device registers. The MSI
  address/data registers and IO-APIC redirection registers are
  programmed with the special values which are recognized by IR and
  used to restore the IRE index, to find the proper delivery mode and
  target. Map all MSI interrupts in the block when msi_map() is called.

  Since interrupt source setup and dismantle code run in a
  non-sleepable context, flushing the interrupt entries cache in the IR
  hardware, which is done async and ideally waits for the interrupt,
  requires a busy-wait for the queue to drain. dmar_qi_wait_for_seq()
  is modified to take a boolean argument requesting busy-wait for the
  written sequence number instead of waiting for the interrupt.

  Some interrupts are configured before IR is initialized, e.g. ACPI
  SCI. Add the intr_reprogram() function to reprogram all already
  configured interrupts, and call it immediately before an IR unit is
  enabled. There is still a small window after the IO-APIC redirection
  entry is reprogrammed with the cookie but before the unit is enabled,
  but to fix this properly, IR must be started much earlier.

  Add workarounds for 5500 and X58 northbridges, some revisions of
  which have severe flaws in handling IR. Use the same identification
  methods as employed by Linux.

  Review: https://reviews.freebsd.org/D1892
  Reviewed by: neel
  Discussed with: jhb
  Tested by: glebius, pho (previous versions)
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=280260
* Update to the Intel ixgbe driver: [Jack F Vogel, 2015-03-17, 1 file, -1/+2]
  - Split the driver into independent pf and vf loadables. This is in
    preparation for SRIOV support which will be following shortly. This
    also allows us to keep separate revision control over the two
    parts, making for easier sustaining.
  - Make the TX/RX code a shared, separate file. In the old code base
    the ixv code would miss fixes that went into ixgbe; this model will
    eliminate that problem.
  - The driver loadables will now match the device names, something
    that has been requested for some time.
  - Rather than a modules/ixgbe there is now modules/ix and modules/ixv
  - It will also be possible to make your static kernel with only one
    or the other for streamlined installs, or both.

  Enjoy!

  Submitted by: jfv and erj
  Notes: svn path=/head/; revision=280182
* Report ARAT (APIC-Timer-always-running) feature for virtual CPU. [Alexander Motin, 2015-03-16, 1 file, -0/+6]
  This keeps a FreeBSD guest from avoiding the LAPIC timer in favor of
  the HPET out of concern about deep sleep states, which do not exist
  for virtual CPUs.

  Benchmarks of usleep(1) on guest and host show the following extra
  latencies:
  - 51us for virtual HPET,
  - 22us for virtual LAPIC timer,
  - 22us for host HPET and
  - 3us for host LAPIC timer.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=280134
* Use lapic_ipi_alloc() to dynamically allocate IPI slots needed by bhyve when [Neel Natu, 2015-03-14, 10 files, -184/+40]
  vmm.ko is loaded.

  Also relocate the 'justreturn' IPI handler to be alongside all other
  handlers.

  Requested by: kib
  Notes: svn path=/head/; revision=279971
* Only schedule interrupts on a single hyperthread of a modern Intel CPU core [John Baldwin, 2015-03-06, 1 file, -2/+2]
  by default. Previously we used a single hyperthread on Pentium4-era
  cores but used both hyperthreads on more recent CPUs.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=279699
* When ICW1 is issued the edge sense circuit is reset which means that [Tycho Nightingale, 2015-03-06, 1 file, -0/+1]
  following an initialization a low-to-high transition is necessary to
  generate an interrupt.

  Reviewed by: neel
  Notes: svn path=/head/; revision=279683