path: root/docs/HardwareAssistedAddressSanitizerDesign.rst
Diffstat (limited to 'docs/HardwareAssistedAddressSanitizerDesign.rst')
-rw-r--r-- docs/HardwareAssistedAddressSanitizerDesign.rst | 139
1 files changed, 139 insertions, 0 deletions
diff --git a/docs/HardwareAssistedAddressSanitizerDesign.rst b/docs/HardwareAssistedAddressSanitizerDesign.rst
new file mode 100644
index 000000000000..00777ce88280
--- /dev/null
+++ b/docs/HardwareAssistedAddressSanitizerDesign.rst
@@ -0,0 +1,139 @@
+=======================================================
+Hardware-assisted AddressSanitizer Design Documentation
+=======================================================
+
+This page is a design document for
+**hardware-assisted AddressSanitizer** (or **HWASAN**),
+a tool similar to :doc:`AddressSanitizer`,
+but based on partial hardware assistance.
+
+This document is a draft; suggestions are welcome.
+
+
+Introduction
+============
+
+:doc:`AddressSanitizer`
+tags every 8 bytes of the application memory with a 1-byte tag (using *shadow memory*),
+uses *redzones* to find buffer overflows and
+*quarantine* to find use-after-free.
+The redzones, the quarantine, and, to a lesser extent, the shadow are the
+sources of AddressSanitizer's memory overhead.
+See the `AddressSanitizer paper`_ for details.
+
+AArch64 has `Address Tagging`_, a hardware feature that allows
+software to use the 8 most significant bits of a 64-bit pointer as
+a tag. HWASAN uses `Address Tagging`_
+to implement a memory safety tool, similar to :doc:`AddressSanitizer`,
+but with smaller memory overhead and slightly different (mostly better)
+accuracy guarantees.
+
+Algorithm
+=========
+* Every heap, stack, and global memory object is forcibly aligned to `N` bytes
+  (`N` is e.g. 16 or 64).
+* For every such object, a random `K`-bit tag `T` is chosen (`K` is e.g. 4 or 8).
+* The pointer to the object is tagged with `T`.
+* The memory for the object is also tagged with `T`
+  (using an `N=>1` shadow memory).
+* Every load and store is instrumented to read the memory tag and compare it
+  with the pointer tag; an exception is raised on a tag mismatch.
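+
+The steps above can be illustrated with a small, portable C sketch. It is a
+simulation under stated assumptions, not the HWASAN runtime: `N` is 16, `K` is 8,
+a "pointer" is an offset into a 1 KiB arena with the tag in its top 8 bits, and
+the caller supplies the tag instead of drawing it at random so that the result
+is reproducible. The names `tag_object` and `check_access` are hypothetical.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters for this sketch: N = 16, K = 8. */
enum { GRANULE = 16, TAG_SHIFT = 56, ARENA_SIZE = 1024 };

/* N=>1 shadow memory: one tag byte per GRANULE bytes of the arena. */
static uint8_t shadow[ARENA_SIZE / GRANULE];

/* A simulated pointer: arena offset in the low bits, tag in the top 8 bits. */
typedef uint64_t tagged_ptr;

static uint8_t pointer_tag(tagged_ptr p) { return (uint8_t)(p >> TAG_SHIFT); }

static uint64_t untagged(tagged_ptr p) { return p & ~(0xFFULL << TAG_SHIFT); }

/* Tag the memory of an N-aligned object and return a pointer carrying the
 * same tag. A real tool would pick the tag at random. */
static tagged_ptr tag_object(uint64_t offset, size_t size, uint8_t tag) {
    assert(offset % GRANULE == 0);
    assert(offset + size <= ARENA_SIZE);
    size_t first = offset / GRANULE;
    size_t end = (offset + size + GRANULE - 1) / GRANULE;
    for (size_t g = first; g < end; g++)
        shadow[g] = tag;
    return offset | ((uint64_t)tag << TAG_SHIFT);
}

/* Compare the pointer tag with the memory tag of every granule the access
 * touches. Returns 1 on a match and 0 on a mismatch, where an instrumented
 * program would trap instead. */
static int check_access(tagged_ptr p, size_t size) {
    uint64_t addr = untagged(p);
    if (size == 0 || addr + size > ARENA_SIZE)
        return 0;
    uint8_t tag = pointer_tag(p);
    for (size_t g = addr / GRANULE; g <= (addr + size - 1) / GRANULE; g++)
        if (shadow[g] != tag)
            return 0;
    return 1;
}
```

+
+For example, after `tag_object(0, 32, 0x2A)` and `tag_object(32, 16, 0x17)`,
+a 4-byte access through the first pointer at offset 32 reaches a granule tagged
+`0x17`, so `check_access` returns 0. Retagging a region when it is freed makes
+stale pointers mismatch in the same way.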
+
+Instrumentation
+===============
+
+Memory Accesses
+---------------
+All memory accesses are prefixed with a call to a run-time function.
+The function encodes the type and the size of access in its name;
+it receives the address as a parameter, e.g. `__hwasan_load4(void *ptr)`;
+it loads the memory tag, compares it with the
+pointer tag, and executes `__builtin_trap` (or calls `__hwasan_error_load4(void *ptr)`) on mismatch.
+
+It's possible to inline this callback too.
+
+Heap
+----
+
+Tagging the heap memory/pointers is done by `malloc`.
+This can be based on any malloc that forces all objects to be `N`-aligned.
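+
+As a rough illustration of the allocator side, the sketch below wraps C11
+`aligned_alloc` so that every object is `N`-aligned and occupies a whole number
+of `N`-byte granules. The names `sketch_malloc` and `sketch_allocation`, the
+choice `N = 16`, and the use of `rand` for the tag are assumptions made for
+this example. The tag is returned next to the untagged address rather than
+stored in the pointer, because dereferencing a tagged pointer only works where
+the hardware ignores the top byte. Updating the shadow memory is left out.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { GRANULE = 16 }; /* hypothetical N for this sketch */

/* Round a request up to a whole number of granules (at least one). */
static size_t round_up_to_granule(size_t size) {
    if (size == 0)
        size = 1;
    return (size + GRANULE - 1) / GRANULE * GRANULE;
}

typedef struct {
    void *memory; /* N-aligned, untagged address (NULL on failure) */
    size_t size;  /* rounded-up size, a multiple of N */
    uint8_t tag;  /* tag chosen for this object */
} sketch_allocation;

/* Allocate an N-aligned object and choose an 8-bit tag for it. */
static sketch_allocation sketch_malloc(size_t size) {
    sketch_allocation a;
    a.size = round_up_to_granule(size);
    /* C11 aligned_alloc: the size must be a multiple of the alignment. */
    a.memory = aligned_alloc(GRANULE, a.size);
    a.tag = (uint8_t)(rand() & 0xFF);
    assert(a.memory == NULL || (uintptr_t)a.memory % GRANULE == 0);
    return a;
}
```

+
+Rounding the size up matters as much as the alignment: if two objects shared a
+granule, they would have to share a memory tag.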
+
+Stack
+-----
+
+Special compiler instrumentation is required to align the local variables
+to `N` bytes and to tag the memory and the pointers.
+Stack instrumentation is expected to be a major source of overhead,
+but could be optional.
+TODO: details.
+
+Globals
+-------
+
+TODO: details.
+
+Error reporting
+---------------
+
+Errors are generated by `__builtin_trap` and are handled by a signal handler.
+
+Attribute
+---------
+
+HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching
+C function attribute. An alternative would be to re-use ASAN's attribute
+`sanitize_address`. The reasons to use a separate attribute are:
+
+  * Users may need to disable ASAN but not HWASAN, or vice versa,
+    because the tools have different trade-offs and compatibility issues.
+  * LLVM (ideally) does not use flags to decide which pass to run;
+    ASAN or HWASAN is applied based on the function attributes.
+
+This does mean that users of HWASAN may need to add the new attribute
+to the code that already uses the old attribute.
+
+
+Comparison with AddressSanitizer
+================================
+
+HWASAN:
+  * Is less portable than :doc:`AddressSanitizer`
+    as it relies on hardware `Address Tagging`_ (AArch64).
+    Address Tagging can be emulated with compiler instrumentation,
+    but it will require the instrumentation to remove the tags before
+    any load or store, which is infeasible in any realistic environment
+    that contains non-instrumented code.
+  * May have compatibility problems if the target code uses the upper
+    pointer bits for other purposes.
+  * May require changes in the OS kernels (e.g. Linux seems to dislike
+    tagged pointers passed from user space:
+    https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
+  * **Does not require redzones to detect buffer overflows**,
+    but the buffer overflow detection is probabilistic, with roughly
+    `(2**K-1)/(2**K)` probability of catching a bug.
+  * **Does not require quarantine to detect heap-use-after-free,
+    or stack-use-after-return**.
+    The detection is similarly probabilistic.
+
+The memory overhead of HWASAN is expected to be much smaller
+than that of AddressSanitizer:
+`1/N` extra memory for the shadow
+and some overhead due to `N`-aligning all objects.
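+
+The two formulas in this section, the `(2**K-1)/(2**K)` detection probability
+and the `1/N` shadow overhead, can be evaluated with a short C sketch. The
+function names are hypothetical, and the shadow estimate ignores the extra
+memory lost to `N`-aligning objects.
+

```c
#include <assert.h>
#include <stdint.h>

/* Probability that two independently chosen K-bit tags differ:
 * (2**K - 1) / (2**K). */
static double detection_probability(unsigned k) {
    assert(k >= 1 && k <= 32);
    double tags = (double)(1ULL << k);
    return (tags - 1.0) / tags;
}

/* Shadow memory needed for app_bytes of application memory with an
 * N=>1 mapping, i.e. 1/N of the application memory. */
static uint64_t shadow_bytes(uint64_t app_bytes, uint64_t n) {
    assert(n > 0);
    return app_bytes / n;
}
```

+
+With `K = 4` the probability is 0.9375, and with `K = 8` it is 0.99609375.
+For 1 GiB of application memory, `N = 16` needs 64 MiB of shadow, compared
+with 128 MiB for the 8-to-1 mapping that AddressSanitizer uses.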
+
+
+Related Work
+============
+* `SPARC ADI`_ implements a similar tool mostly in hardware.
+* `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses
+  similar approaches ("lock & key").
+* `Watchdog`_ discusses a heavier, but still somewhat similar
+  "lock & key" approach.
+* *TODO: add more "related work" links. Suggestions are welcome.*
+
+
+.. _Watchdog: http://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf
+.. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
+.. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html
+.. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
+.. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html
+