src - FreeBSD source tree

	Commit message	Author	Age	Files	Lines
*	nvme: Introduce longer timeouts for admin queue	Alexander Motin	2023-11-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KIOXIA CD8 SSDs routinely take ~25 seconds to delete non-empty namespace. In some cases like hot-plug it takes longer, triggering timeout and controller resets after just 30 seconds. Linux for many years has separate 60 seconds timeout for admin queue. This patch does the same. And it is good to be consistent. Sponsored by: iXsystems, Inc. Reviewed by: imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D42454 (cherry picked from commit 8d6c0743e36e3cff9279c40468711a82db98df23)
*	sys: Remove $FreeBSD$: one-line .c pattern	Warner Losh	2023-08-23	1	-2/+0
\| \| \| \| \| \| \|	Remove /^[\s]__FBSDID$"\$FreeBSD\$"$;?\s*\n/ Similar commit in current: (cherry picked from commit 685dc743dc3b)
*	spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD	Warner Losh	2023-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix (cherry picked from commit 4d846d260e2b9a3d4d0a701462568268cbfe7a5b)
*	Add NAMESPACE MANAGEMENT into admin_opcode[].	Alexander Motin	2023-03-13	1	-0/+1
\| \| \| \| \| \|	MFC after: 1 week (cherry picked from commit 49ebbdb264fe185a685dce846985b95f28320e3f)
*	nvme: Print CRD, M and DNR status bits on errors.	Alexander Motin	2022-08-15	1	-5/+9
\| \| \| \| \| \| \| \|	It may help with some issues debugging. MFC after: 1 week (cherry picked from commit a69c0964625f4e20a7d5f22d51e036e13eedbeeb)
*	nvme(4): Fix a typo in a source code comment	Gordon Bergling	2022-04-14	1	-1/+1
\| \| \| \| \| \|	- s/is is/is/ (cherry picked from commit dfa01f4f98a1343bf375c54a5cd44718c4211bec)
*	nvme: Do not rearm timeout for commands without one.	Alexander Motin	2022-01-21	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Admin queues almost always have several ASYNC_EVENT_REQUEST outstanding. They have no timeouts, but their presence in qpair->outstanding_tr caused useless timeout callout rearming twice a second. While there, relax timeout callout period from 0.5s to 0.5-1s to improve aggregation. Command timeouts are measured in seconds, so we don't need to be precise here. Reviewed by: imp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D33781 (cherry picked from commit b3c9b6060f9a3525196867d8e812b24fc0bc61e1)
*	nvme: Reduce traffic to the doorbell register	Warner Losh	2022-01-21	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Reduce traffic to doorbell register when processing multiple completion events at once. Only write it at the end of the loop after we've processed everything (assuming we found at least one completion, even if that completion wasn't valid). Sponsored by: Netflix Reviewed by: mav Differential Revision: https://reviews.freebsd.org/D32470 (cherry picked from commit 2ec165e3f065217ae8d54a2a8235fe1f219805ea)
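The batching described above can be sketched in plain C as follows. This is a minimal user-space model, not the driver's code: `struct cq_entry`, `struct cq_sim`, and `cq_sim_process` are hypothetical names, and the `doorbell` field merely stands in for the device register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified completion entry: bit 0 of status is the phase bit. */
struct cq_entry {
	uint16_t cid;
	uint16_t status;
};

/* Hypothetical completion queue model; `doorbell` stands in for the register. */
struct cq_sim {
	struct cq_entry *ring;
	uint32_t size;
	uint32_t head;
	uint16_t phase;
	uint32_t doorbell;
	uint32_t doorbell_writes;
};

/* Consume every entry in the current phase, then write the doorbell once. */
static uint32_t
cq_sim_process(struct cq_sim *q)
{
	uint32_t done = 0;

	assert(q->size > 0 && q->head < q->size);
	while ((q->ring[q->head].status & 1) == q->phase) {
		done++;		/* a real driver would complete the command here */
		if (++q->head == q->size) {
			q->head = 0;
			q->phase ^= 1;	/* phase flips on every wrap */
		}
	}
	if (done > 0) {
		/* Single register write after the loop, not one per entry. */
		q->doorbell = q->head;
		q->doorbell_writes++;
	}
	return (done);
}
```

With three completed entries in a four-slot ring, the model performs one doorbell write instead of three, and none at all when nothing was found.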
*	nvme: Restore hotplug warning	Warner Losh	2022-01-21	1	-1/+2
\| \| \| \| \| \| \| \| \|	Restore hotplug warning in recovery state machine. No functional change other than what message gets printed. Sponsored by: Netflix (cherry picked from commit 18dc12bfd2e23ad2ea97db54cb8ee499f6f014da)
*	nvme: Sanity check completion id	Warner Losh	2022-01-21	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure the completion ID is in the range of [0..num_trackers) since the values past the end of the act_tr array are never going to be valid trackers and will lead to pain and suffering if we try to dereference them to get the tracker or to set the tracker back to NULL as we complete the I/O. Sponsored by: Netflix Reviewed by: mav, chs, chuck Differential Revision: https://reviews.freebsd.org/D32088 (cherry picked from commit 36a87d0c6fe9d65de23f177ef84000b205f87e39)
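A range check of this kind might look like the sketch below. Only `num_trackers` and `act_tr` come from the commit message; the struct layouts and the function names `qpair_lookup_tracker` and `qpair_complete` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's tracker and qpair structures. */
struct tracker_sketch {
	uint16_t cid;
};

struct qpair_sketch {
	uint32_t num_trackers;
	struct tracker_sketch **act_tr;	/* indexed by completion ID */
};

/* Return the active tracker for cid, or NULL if cid is outside [0..num_trackers). */
static struct tracker_sketch *
qpair_lookup_tracker(const struct qpair_sketch *q, uint16_t cid)
{
	assert(q != NULL && q->act_tr != NULL);
	if (cid >= q->num_trackers)
		return (NULL);	/* never index past the end of act_tr */
	return (q->act_tr[cid]);
}

/* Clear the slot only when the ID was valid and a tracker was present. */
static int
qpair_complete(struct qpair_sketch *q, uint16_t cid)
{
	struct tracker_sketch *tr = qpair_lookup_tracker(q, cid);

	if (tr == NULL)
		return (0);
	q->act_tr[cid] = NULL;
	return (1);
}
```

Both the lookup and the write back to NULL go through the same check, so a bogus ID from the device can neither be dereferenced nor used to clobber memory beyond the array.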
*	nvme: count number of ignored interrupts	Warner Losh	2022-01-21	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	Count the number of times we're asked to process completions, but that we ignore because the state of the qpair isn't in RECOVERY_NONE. Sponsored by: Netflix Reviewed by: mav, chuck Differential Revision: https://reviews.freebsd.org/D32212 (cherry picked from commit 587aa25525e54ea775298c402acd7a647f9838fb)
*	nvme: Add sanity check for phase on startup.	Warner Losh	2022-01-21	1	-3/+18
	The proper phase for the qpair right after reset in the first interrupt is 1. For it, make sure that we're not still in phase 0. This is an illegal state to be processing interrupts and indicates that we've failed to properly protect against a race between initializing our state and processing interrupts. Modify stat resetting code so it resets the number of interrupts to 1 instead of 0 so we don't trigger a false positive panic. Sponsored by: Netflix Reviewed by: cperciva, mav (prior version) Differential Revision: https://reviews.freebsd.org/D32211 (cherry picked from commit 7d5eebe0f4a0f2aa5c8c7dfdd1a9ce1513849da8)
*	nvme: start qpair in state RECOVERY_WAITING	Warner Losh	2022-01-21	1	-1/+1
	An interrupt happens on the admin queue right away after the reset, so as soon as we enable interrupts, we'll get a call to our interrupt handler. It is safe to ignore this interrupt if we're not yet initialized, or to process it if we are. If we are initialized, we'll see there's no completion records and return. If we're not, we'll process no completion records and return. Either way, nothing is processed and nothing is lost. Until we've completely set up the qpair, we need to avoid processing completion records. Start the qpair in the waiting recovery state so we return immediately when we try to process completions. The code already sets it to 'NONE' when our initialization is complete. It's safe to defer completion processing here because we don't send any commands before the initialization of the software state of the qpair is complete. And even if we were to somehow send a command prior to that completing, the completion record for that command would be processed when we send commands to the admin qpair after we've set up the software state. There's no good central point to add an assert for this last condition. This fixes a KASSERT "received completion for unknown cmd" panic on boot. Fixes: 502dc84a8b6703e7c0626739179a3cdffdd22d81 Sponsored by: Netflix Reviewed by: mav, cperciva, gallatin Differential Revision: https://reviews.freebsd.org/D32210 (cherry picked from commit fa81f3731d1a2984a28ae44e60d12a0659b8fd2f)
*	nvme: Use shared timeout rather than timeout per transaction	Warner Losh	2022-01-21	1	-61/+112
	Keep track of the approximate time commands are 'due' and the next deadline for a command. Twice a second, wake up to see if any commands have entered timeout. If so, quiesce and then enter a recovery mode half the timeout further in the future to allow the ISR to complete. Once we exit recovery mode, we go back to operations as normal. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D28583 (cherry picked from commit 502dc84a8b6703e7c0626739179a3cdffdd22d81)
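One way to picture a single shared timer is the periodic scan below. It is a sketch under stated assumptions: `struct tr_deadline`, `NO_DEADLINE`, and `timeout_tick` are hypothetical names, time is an abstract integer tick, and skipping commands that carry no deadline is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_DEADLINE 0	/* hypothetical sentinel: command has no timeout */

/* Hypothetical per-command bookkeeping. */
struct tr_deadline {
	bool active;
	int64_t deadline;	/* abstract ticks; NO_DEADLINE means never */
};

enum tick_result { TICK_IDLE, TICK_REARM, TICK_TIMEOUT };

/* One periodic wakeup: scan all outstanding commands against `now`. */
static enum tick_result
timeout_tick(const struct tr_deadline *trs, size_t n, int64_t now)
{
	bool rearm = false;

	assert(trs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!trs[i].active || trs[i].deadline == NO_DEADLINE)
			continue;
		if (now > trs[i].deadline)
			return (TICK_TIMEOUT);	/* caller enters recovery */
		rearm = true;	/* something is still waiting on a deadline */
	}
	return (rearm ? TICK_REARM : TICK_IDLE);
}
```

A caller would reschedule the timer only on `TICK_REARM`, start recovery on `TICK_TIMEOUT`, and let the timer lapse on `TICK_IDLE`.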
*	nvme/nda: Fail all nvme I/Os after controller fails	Warner Losh	2022-01-21	1	-5/+3
	Once the controller has failed, fail all I/O w/o sending it to the device. The rest of the nvme driver won't schedule any I/O to the failed device, and the controller is in an indeterminate state and can't accept I/O. Fail both at the top end of the sim and the bottom end. Don't bother queueing up the I/O for failure in a different task. Reviewed by: chuck Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31341 (cherry picked from commit 4b977e6dda92fe093ea300f1a91dbcf877b64fa0)
*	nvme(4): Add MSI and single MSI-X support.	Alexander Motin	2021-09-07	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we can't allocate more MSI-X vectors, accept using single shared. If we can't allocate any MSI-X, try to allocate 2 MSI vectors, but accept single shared. If still no luck, fall back to shared INTx. This provides maximal flexibility in some limited scenarios. For example, vmd(4) does not support INTx and can handle only limited number of MSI/MSI-X vectors without sharing. MFC after: 1 week (cherry picked from commit e3bdf3da769a55f0944d9c337bb4d91b6435f02c)
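The fallback order in this message can be written as a small decision function. This is a simplified sketch: it takes the number of vectors the bus could provide as plain inputs, and `choose_intr`, `struct intr_choice`, and the enum names are hypothetical rather than the driver's own.

```c
#include <assert.h>

enum intr_mode { INTR_MSIX, INTR_MSI, INTR_INTX };

struct intr_choice {
	enum intr_mode mode;
	unsigned vectors;
};

/*
 * msix_avail / msi_avail: how many vectors of each kind could be allocated.
 * wanted: how many vectors the driver would like (admin + I/O queues).
 */
static struct intr_choice
choose_intr(unsigned msix_avail, unsigned msi_avail, unsigned wanted)
{
	assert(wanted >= 1);
	if (msix_avail >= wanted)
		return ((struct intr_choice){ INTR_MSIX, wanted });
	if (msix_avail >= 1)
		return ((struct intr_choice){ INTR_MSIX, 1 });	/* single shared */
	if (msi_avail >= 2)
		return ((struct intr_choice){ INTR_MSI, 2 });
	if (msi_avail >= 1)
		return ((struct intr_choice){ INTR_MSI, 1 });	/* single shared */
	return ((struct intr_choice){ INTR_INTX, 1 });	/* last resort */
}
```

For a device behind something like vmd(4), where INTx is unavailable and vectors are scarce, the middle branches are the ones that matter.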
*	nvme: Enable interrupts after qpair fully constructed	Warner Losh	2021-07-21	1	-24/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To guard against the ill effects of a spurious interrupt during construction (or one that was bogusly pending), enable interrupts after the qpair is completely constructed. Otherwise, we can die with null pointer dereferences in nvme_qpair_process_completions. This has been observed in at least one pre-release NVMe drive where the MSIX interrupt fired while the queue was being created, before we'd started the NVMe controller card. The alternative of only turning on the interrupts after the rest was tried, but was insufficient to work around this bug and made the code more complicated w/o benefit. Reviewed by: mav, chuck Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31182 (cherry picked from commit fc9a0840231770bc7e7dcfe4616babdc6d4389a6)
*	nvme: coherently read status of completion records	Warner Losh	2021-07-12	1	-4/+21
	Coherently read the phase bit of the status completion record. We loop over the completion record array, looking for all the transactions in the same phase that have been completed. In doing that, we have to be careful to read the status field first, and if it indicates a complete record, we need to read and process that record. Otherwise, the host might be overtaken by the device when reading this completion record, leading to a mistaken belief that the record is in phase. This leads to the code using old values and looking at an already completed entry, which has no current tracker. To work around this problem, we read the status and make sure it is in phase, we then re-read the entire completion record, guaranteeing it's complete, valid, and consistent. In addition we resync the dmatag to reflect changes since the prior loop for the bouncing dma case. Reviewed by: jrtc27@, chuck@ Found by: jrtc27 (this fix is based in part on her D30995 fix) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31002 (cherry picked from commit aa0ab681ae755e01cd69435fab50f6852f248c42)
*	nvme: use NVME_GONE rather than hard-coded 0xffffffff	Warner Losh	2021-02-24	1	-1/+1
	Make it clearer that the value 0xffffffff is being used to detect the device is gone. We use it other places in the driver for other meanings. (cherry picked from commit 9600aa31aa633bbb9e8a56d91a781d5a7ce2bff6)
*	nvme: Remove a wmb() that's not necessary.	Warner Losh	2020-12-04	1	-8/+0
	bus_dmamap_sync() ensures that memory that's prepared for PREWRITE can be DMA'd immediately after it returns. The details differ, but this mirrors atomic thread release semantics, at least for the buffers synced. For non-x86 platforms, bus_dmamap_sync() has the right syncing and fences. So in the past, wmb() had been omitted for them. For x86 platforms, the memory ordering is already strong enough to ensure DMA to the device sees the current contents. As such, we don't need the wmb() here. It translates to an sfence which is only needed for writes to regions that have the write combining attribute set or when some exotic opcodes are used. The nvme driver does neither of these. Since bus_dmamap_sync() includes atomic_thread_fence_rel, we can be assured any optimizer won't reorder the bus_dmamap_sync and the bus_space_write operations. The wmb() was a vestige of the pre-busdma version initially committed to the tree. Reviewed by: kib@, gallatin@, chuck@, mav@ Differential Revision: https://reviews.freebsd.org/D27448 Notes: svn path=/head/; revision=368352
*	NVME: Multiple busdma related fixes.	Michal Meloun	2020-12-02	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- in nvme_qpair_process_completions() do dma sync before completion buffer is used. - in nvme_qpair_submit_tracker(), don't do explicit wmb() also for arm and arm64. Bus_dmamap_sync() on these architectures is sufficient to ensure that all CPU stores are visible to external (including DMA) observers. - Allocate completion buffer as BUS_DMA_COHERENT. On not-DMA coherent systems, buffers continuously owned (and accessed) by DMA must be allocated with this flag. Note that BUS_DMA_COHERENT flag is no-op on DMA coherent systems (or coherent buses in mixed systems). MFC after: 4 weeks Reviewed by: mav, imp Differential Revision: https://reviews.freebsd.org/D27446 Notes: svn path=/head/; revision=368279
*	nvme: Fix typo in definition	Chuck Tuffli	2020-12-02	1	-1/+1
	Change occurrences of "selt test" to "self test" in the NVMe header file. Reviewed by: imp, mav MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27439 Notes: svn path=/head/; revision=368275
*	Increase nvme(4) maximum transfer size from 1MB to 2MB.	Alexander Motin	2020-11-29	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With 4KB page size the 2MB is the maximum we can address with one page PRP. Going further would require chaining, that would add some more complexity. On the other side, to reduce memory consumption, allocate the PRP memory respecting maximum transfer size reported in the controller identify data. Many of NVMe devices support much smaller values, starting from 128KB. To do that we have to change the initialization sequence to pull the data earlier, before setting up the I/O queue pairs. The admin queue pair is still allocated for full MIN(maxphys, 2MB) size, but it is not a big deal, since there is only one such queue with only 16 trackers. Reviewed by: imp MFC after: 2 weeks Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=368132
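The 2MB figure follows from simple arithmetic: a 4KB PRP list page holds 512 eight-byte entries, each pointing at one 4KB page. The sketch below works that out and also clamps it by the controller's reported limit. The function names are hypothetical; the MDTS interpretation (a power of two in units of the minimum page size, with 0 meaning no limit) follows the NVMe specification as I understand it.

```c
#include <assert.h>
#include <stdint.h>

#define PRP_ENTRY_SIZE 8u	/* each PRP entry is a 64-bit physical address */

/* Largest transfer addressable through a single PRP list page. */
static uint64_t
prp_one_page_max_xfer(uint32_t page_size)
{
	assert(page_size >= PRP_ENTRY_SIZE);
	return ((uint64_t)(page_size / PRP_ENTRY_SIZE) * page_size);
}

/*
 * Effective limit: the smaller of the driver's own cap and the size the
 * controller reports. mdts == 0 means the controller states no limit.
 */
static uint64_t
ctrlr_max_xfer(uint8_t mdts, uint32_t min_page_size, uint64_t driver_cap)
{
	uint64_t dev_limit;

	assert(mdts < 32);
	if (mdts == 0)
		return (driver_cap);
	dev_limit = (uint64_t)min_page_size << mdts;
	return (dev_limit < driver_cap ? dev_limit : driver_cap);
}
```

For example, `prp_one_page_max_xfer(4096)` gives 2097152 bytes, and a controller reporting MDTS of 5 with 4KB pages is limited to 128KB, so its I/O queue pairs need far less PRP memory.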
*	nvme: clean up empty lines in .c and .h files	Mateusz Guzik	2020-09-01	1	-5/+0
\| \| \| \|	Notes: svn path=/head/; revision=365189
*	Remove free_domain() and uma_zfree_domain().	Mark Johnston	2020-08-04	1	-2/+2
	These functions were introduced before UMA started ensuring that freed memory gets placed in domain-local caches. They no longer serve any purpose since UMA now provides their functionality by default. Remove them to simplify the kernel memory allocator interfaces a bit. Reviewed by: cem, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25937 Notes: svn path=/head/; revision=363834
*	Make polled request timeout less invasive.	Alexander Motin	2020-06-18	1	-3/+9
	Instead of panicking after one second of polling, let the normal timeout handler activate, reset the controller and abort the outstanding requests. If all of that does not happen within 10 seconds, then something in the driver is likely badly stuck and panic is the only way out. In particular, this fixes device hot unplug during execution of those polled commands, allowing clean device detach instead of panic. MFC after: 1 week Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=362337
*	Fix admin qpair leak if detached during initial reset.	Alexander Motin	2020-06-17	1	-16/+29
\| \| \| \| \| \| \| \|	MFC after: 1 week Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=362282
*	Fix various Coverity-detected errors in nvme driver	David Bright	2020-05-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes several Coverity-detected errors in the nvme driver. CIDs addressed: 1008344, 1009377, 1009380, 1193740, 1305470, 1403975, 1403980 Reviewed by: imp@, vangyzen@ MFC after: 5 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D24532 Notes: svn path=/head/; revision=360568
*	remove extraneous double ;s in sys/	Ed Maste	2020-03-30	1	-1/+1
\| \| \| \|	Notes: svn path=/head/; revision=359441
*	Properly synchronize completion DMA buffers.	Michal Meloun	2019-12-15	1	-5/+9
	Within command completion processing the callback function may access the DMAed data buffer. Synchronize it before use, not after. This allows NVMe disks to be used on non-DMA-coherent arm64 systems. MFC after: 3 weeks Notes: svn path=/head/; revision=355774
*	Move to using bool instead of boolean_t	Warner Losh	2019-12-13	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	While there are subtle semantic differences between bool and boolean_t, none of them matter in these cases. Prefer true/false when dealing with bool type. Preserve a couple of TRUEs since they are passed into int args into CAM. Preserve a couple of FALSEs when used for status.done, an int. Differential Revision: https://reviews.freebsd.org/D20999 Notes: svn path=/head/; revision=355721
*	trackers always know what qpair they are on	Warner Losh	2019-12-06	1	-11/+14
\| \| \| \| \| \| \| \| \| \| \|	Don't needlessly pass around qpair pointers when the tracker knows what qpair it's on. This will simplify code and make it easier to split submission and completion queues in the future. Signed-off-by: John Meneghini <johnm@netapp.com> Notes: svn path=/head/; revision=355465
*	Make nvme(4) driver some more NUMA aware.	Alexander Motin	2019-09-23	1	-16/+19
	- For each queue pair precalculate the CPU and domain it is bound to. If queue pairs are not per-CPU, then use the domain of the device. - Allocate most of queue pair memory from the domain it is bound to. - Bind callouts to the same CPUs as the queue pair to avoid migrations. - Do not assign queue pairs to each SMT thread. It just wasted resources and increased lock contention. - Remove the fixed multiplier of CPUs per queue pair, spread them evenly. This allows more queue pairs to be used in some hardware configurations. - If a queue pair serves multiple CPUs, bind different NVMe devices to different CPUs. MFC after: 1 month Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=352630
*	Support doorbell strides != 0.	Warner Losh	2019-09-04	1	-7/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NVMe standard (1.4) states >>> 8.6 Doorbell Stride for Software Emulation >>> The doorbell stride,...is useful in software emulation of an NVM >>> Express controller. ... For hardware implementations of the NVM >>> Express interface, the expected doorbell stride value is 0h. However, hardware in the wild exists with a doorbell stride of 1 (meaning 8 byte separation). This change supports that hardware, as well as software emulators as envisioned in Section 8.6. Since this is the fast path, care has been taken to make this computation efficient. The bit of math to compute an offset for each is replaced by a memory load from cache of a pre-computed value. MFC After: 3 days Reviewed by: scottl@ Differential Revision: https://reviews.freebsd.org/D21514 Notes: svn path=/head/; revision=351828
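The offset computation and its precomputed form can be sketched like this. The formula (doorbells start at 0x1000, spaced `4 << DSTRD` bytes apart, submission tail before completion head) follows the NVMe register layout; `struct doorbell_offs` and `doorbell_precompute` are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define DOORBELL_BASE 0x1000u	/* first doorbell register offset */

/* Offsets computed once at queue setup and merely loaded on the fast path. */
struct doorbell_offs {
	uint32_t sq_tdbl_off;	/* submission queue tail doorbell */
	uint32_t cq_hdbl_off;	/* completion queue head doorbell */
};

static struct doorbell_offs
doorbell_precompute(uint32_t qid, uint32_t dstrd)
{
	struct doorbell_offs d;
	uint32_t stride;

	assert(dstrd <= 15);	/* CAP.DSTRD is a 4-bit field */
	stride = 4u << dstrd;	/* dstrd 0 -> 4 bytes, dstrd 1 -> 8 bytes */
	d.sq_tdbl_off = DOORBELL_BASE + (2 * qid) * stride;
	d.cq_hdbl_off = DOORBELL_BASE + (2 * qid + 1) * stride;
	return (d);
}
```

With a stride of 0, queue 1 uses offsets 0x1008 and 0x100c; with a stride of 1, the same queue moves to 0x1010 and 0x1018, matching the 8-byte separation mentioned above.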
*	Improve NVMe hot unplug handling.	Alexander Motin	2019-08-21	1	-1/+2
	If the device is unplugged from the system (CSTS register reads return 0xffffffff), it makes no sense to send any more recovery requests or expect any responses back. If there is a detach call in such a state, just stop all activity and free resources. If there is no detach call (hot-plug is not supported), rely on normal timeout handling, but when it triggers a controller reset, do not wait for the impossible and quickly report failure. MFC after: 2 weeks Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=351352
*	Add more random bits from NVMe 1.4.	Alexander Motin	2019-08-03	1	-0/+25
\| \| \| \| \| \| \|	MFC after: 2 weeks Notes: svn path=/head/; revision=350553
*	Add more new fields and values from NVMe 1.4.	Alexander Motin	2019-08-02	1	-0/+9
\| \| \| \| \| \| \|	MFC after: 2 weeks Notes: svn path=/head/; revision=350529
*	Keep track of the number of commands that exhaust their retry limit.	Warner Losh	2019-07-19	1	-3/+6
\| \| \| \| \| \| \| \| \|	While we print failure messages on the console, sometimes logs are lost or overwhelmed. Keeping a count of how many times we've failed retriable commands helps get a magnitude of the problem. Notes: svn path=/head/; revision=350147
*	Keep track of the number of retried commands.	Warner Losh	2019-07-19	1	-0/+3
	Retried commands can indicate a performance degradation of an nvme drive. Keep track of the number of retries and report it out via sysctl, just like number of commands and interrupts. Notes: svn path=/head/; revision=350146
*	Provide new tunable hw.nvme.verbose_cmd_dump	Warner Losh	2019-07-18	1	-0/+10
	The nvme driver dumps only the most relevant details about a command when it fails. However, there are times this is not sufficient (such as debugging weird issues for a new drive with a vendor). Setting hw.nvme.verbose_cmd_dump=1 in loader.conf will enable more complete debugging information about each command that fails. Reviewed by: rpokala Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20988 Notes: svn path=/head/; revision=350118
*	Since a fatal trap can happen at arbitrary times, don't panic when the	Warner Losh	2019-06-01	1	-13/+66
\| \| \| \| \| \| \| \| \| \| \| \|	completions are not in a consistent state. Cope with the different places the normal I/O completion polling thread can be interrupted and then re-entered during a kernel panic + dump. Reviewed by: jhb and markj (both prior versions) Differential Revision: https://reviews.freebsd.org/D20478 Notes: svn path=/head/; revision=348495
*	Don't print all the I/O we abort on a reset, unless we're out of	Warner Losh	2019-03-09	1	-15/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	retries. When resetting the controller, we abort I/O. Prior to this fix, we printed a ton of abort messages for I/O that we're going to retry. This imparts no useful information. Stop printing them unless our retry count is exhausted. Clarify code for when we don't retry, and remove useless arg to a routine that's always called with it as 'true'. All the other debug is still printed (including multiple reset messages if we have multiple timeouts before the taskqueue runs the actual reset) so that we know when we reset. Reviewed by: jimharris@, chuck@ Differential Revision: https://reviews.freebsd.org/D19431 Notes: svn path=/head/; revision=344955
*	Add ABORTED_BY_REQUEST to the list of things we look at DNR bit and tell why	Warner Losh	2019-03-03	1	-1/+2
\| \| \| \| \| \| \|	to comment (code already does this) Notes: svn path=/head/; revision=344736
*	Unconditionally support unmapped BIOs. This was another shim for	Warner Losh	2019-02-27	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	supporting older kernels. However, all supported versions of FreeBSD have unmapped I/Os (as do several that have gone EOL), remove it. It's unlikely the driver would work on the older kernels anyway at this point. Notes: svn path=/head/; revision=344642
*	Remove #ifdef code to support FreeBSD versions that haven't been	Warner Losh	2019-02-27	1	-5/+0
\| \| \| \| \| \| \| \| \|	supported in years. A number of changes have been made to the driver that likely wouldn't work on those older versions that aren't properly ifdef'd and it's project policy to GC such code once it is stale. Notes: svn path=/head/; revision=344640
*	Add descriptions to NVMe interrupts.	Alexander Motin	2018-12-26	1	-0/+7
\| \| \| \| \| \| \|	MFC after: 1 month Notes: svn path=/head/; revision=342546
*	Make NVMe compatible with the original API	Chuck Tuffli	2018-08-22	1	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original NVMe API used bit-fields to represent fields in data structures defined by the specification (e.g. the op-code in the command data structure). The implementation targeted x86_64 processors and defined the bit fields for little endian dwords (i.e. 32 bits). This approach does not work as-is for big endian architectures and was changed to use a combination of bit shifts and masks to support PowerPC. Unfortunately, this changed the NVMe API and forces #ifdef's based on the OS revision level in user space code. This change reverts to something that looks like the original API, but it uses bytes instead of bit-fields inside the packed command structure. As a bonus, this works as-is for both big and little endian CPU architectures. Bump __FreeBSD_version to 1200081 due to API change Reviewed by: imp, kbowling, smh, mav Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D16404 Notes: svn path=/head/; revision=338182
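The shift-and-mask style mentioned in this message avoids compiler-dependent bit-field layout by assembling the dword from bytes and extracting fields arithmetically. The sketch below shows the idea for command dword 0 (opcode in bits 7:0, fused operation in bits 9:8, command ID in bits 31:16). The `SK_` macro names and `sk_le32_load` are hypothetical and are not the names used in the NVMe header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shift/mask pairs for fields of command dword 0. */
#define SK_CDW0_OPC_SHIFT	0
#define SK_CDW0_OPC_MASK	0xffu
#define SK_CDW0_FUSE_SHIFT	8
#define SK_CDW0_FUSE_MASK	0x3u
#define SK_CDW0_CID_SHIFT	16
#define SK_CDW0_CID_MASK	0xffffu

/* Extract a named field from a host-order dword. */
#define SK_FIELD(dw, name) \
	(((dw) >> SK_CDW0_##name##_SHIFT) & SK_CDW0_##name##_MASK)

/* Assemble a little-endian dword byte by byte; correct on any host. */
static uint32_t
sk_le32_load(const uint8_t b[4])
{
	assert(b != NULL);
	return ((uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	    ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24));
}
```

Because the value is built from individual bytes, the same source gives the same answer on little-endian and big-endian CPUs, which is the property a bit-field layout defined for x86_64 could not provide.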
*	nvme(4): Add bus_dmamap_sync() at the end of the request path	Justin Hibbits	2018-08-03	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some architectures, in this case powerpc64, need explicit synchronization barriers vs device accesses. Prior to this change, when running 'make buildworld -j72' on a 18-core (72-thread) POWER9, I would see controller resets often. With this change, I don't see these resets messages, though another tester still does, for yet to be determined reasons, so this may not be a complete fix. Additionally, I see a ~5-10% speed up in buildworld times, likely due to not needing to reset the controller. Reviewed By: jimharris Differential Revision: https://reviews.freebsd.org/D16570 Notes: svn path=/head/; revision=337273
*	Fix use-after-free in nvme_qpair_destroy().	Alexander Motin	2018-04-30	1	-7/+8
\| \| \| \| \| \| \| \| \| \|	dma_tag_payload should not be destroyed before payload_dma_map, and seems it should be used there instead of dma_tag to match creation. Sponsored by: iXsystems, Inc. Notes: svn path=/head/; revision=333127
*	Try polling the qpairs on timeout.	Warner Losh	2018-03-16	1	-9/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some systems, we're getting timeouts when we use multiple queues on drives that work perfectly well on other systems. On a hunch, Jim Harris suggested I poll the completion queue when we get a timeout. This patch polls the completion queue if no fatal status was indicated. If it had pending I/O, we complete that request and return. Otherwise, if aborts are enabled and no fatal status, we abort the command and return. Otherwise we reset the card. This may clear up the problem, or we may see it result in lots of timeouts and a performance problem. Either way, we'll know the next step. We may also need to pay attention to the fatal status bit of the controller. PR: 211713 Suggested by: Jim Harris Sponsored by: Netflix Notes: svn path=/head/; revision=331046