Diffstat (limited to 'contrib/jemalloc/ChangeLog')
-rw-r--r-- contrib/jemalloc/ChangeLog | 191
1 file changed, 189 insertions, 2 deletions
diff --git a/contrib/jemalloc/ChangeLog b/contrib/jemalloc/ChangeLog
index a9406853e1bf..98c12f2048e5 100644
--- a/contrib/jemalloc/ChangeLog
+++ b/contrib/jemalloc/ChangeLog
@@ -4,6 +4,193 @@ brevity. Much more detail can be found in the git revision history:
https://github.com/jemalloc/jemalloc
+* 5.0.0 (June 13, 2017)
+
+ Unlike all previous jemalloc releases, this release does not use naturally
+ aligned "chunks" for virtual memory management, and instead uses page-aligned
+ "extents". This change has few externally visible effects, but the internal
+ impacts are... extensive. Many other internal changes combine to make this
+ the most cohesively designed version of jemalloc so far, with ample
+ opportunity for further enhancements.
+
+ Continuous integration is now an integral aspect of development thanks to the
+ efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
+ stable on the tested platforms (Linux, FreeBSD, macOS, and Windows). As a
+ side effect the official release frequency may decrease over time.
+
+ New features:
+ - Implement optional per-CPU arena support; threads choose which arena to use
+ based on current CPU rather than on fixed thread-->arena associations.
+ (@interwq)
+ - Implement two-phase decay of unused dirty pages. Pages transition from
+ dirty-->muzzy-->clean, where the first phase transition relies on
+ madvise(... MADV_FREE) semantics, and the second phase transition discards
+ pages such that they are replaced with demand-zeroed pages on next access.
+ (@jasone)
+ - Increase decay time resolution from seconds to milliseconds. (@jasone)
+ - Implement opt-in per CPU background threads, and use them for asynchronous
+ decay-driven unused dirty page purging. (@interwq)
+ - Add mutex profiling, which collects a variety of statistics useful for
+ diagnosing overhead/contention issues. (@interwq)
+ - Add C++ new/delete operator bindings. (@djwatson)
+ - Support manually created arena destruction, such that all data and metadata
+ are discarded. Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
+ associated with destroyed arenas. (@jasone)
+ - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
+ merged/destroyed arena statistics via mallctl. (@jasone)
+ - Add opt.abort_conf to optionally abort if invalid configuration options are
+ detected during initialization. (@interwq)
+ - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
+ stats dumped during exit if opt.stats_print is true. (@jasone)
+ - Add --with-version=VERSION for use when embedding jemalloc into another
+ project's git repository. (@jasone)
+ - Add --disable-thp to support cross compiling. (@jasone)
+ - Add --with-lg-hugepage to support cross compiling. (@jasone)
+ - Add mallctl interfaces (various authors):
+ + background_thread
+ + opt.abort_conf
+ + opt.retain
+ + opt.percpu_arena
+ + opt.background_thread
+ + opt.{dirty,muzzy}_decay_ms
+ + opt.stats_print_opts
+ + arena.<i>.initialized
+ + arena.<i>.destroy
+ + arena.<i>.{dirty,muzzy}_decay_ms
+ + arena.<i>.extent_hooks
+ + arenas.{dirty,muzzy}_decay_ms
+ + arenas.bin.<i>.slab_size
+ + arenas.nlextents
+ + arenas.lextent.<i>.size
+ + arenas.create
+ + stats.background_thread.{num_threads,num_runs,run_interval}
+ + stats.mutexes.{ctl,background_thread,prof,reset}.
+ {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+ num_owner_switch}
+ + stats.arenas.<i>.{dirty,muzzy}_decay_ms
+ + stats.arenas.<i>.uptime
+ + stats.arenas.<i>.{pmuzzy,base,internal,resident}
+ + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
+ + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
+ + stats.arenas.<i>.bins.<j>.mutex.
+ {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+ num_owner_switch}
+ + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
+ + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
+ extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
+ {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
+ num_owner_switch}
+
+ Portability improvements:
+ - Improve reentrant allocation support, such that deadlock is less likely if
+ e.g. a system library call in turn allocates memory. (@davidtgoldblatt,
+ @interwq)
+ - Support static linking of jemalloc with glibc. (@djwatson)
+
+ Optimizations and refactors:
+ - Organize virtual memory as "extents" of virtual memory pages, rather than as
+ naturally aligned "chunks", and store all metadata in arbitrarily distant
+ locations. This reduces virtual memory external fragmentation, and will
+ interact better with huge pages (not yet explicitly supported). (@jasone)
+ - Fold large and huge size classes together; only small and large size classes
+ remain. (@jasone)
+ - Unify the allocation paths, and merge most fast-path branching decisions.
+ (@davidtgoldblatt, @interwq)
+ - Embed per thread automatic tcache into thread-specific data, which reduces
+ conditional branches and dereferences. Also reorganize tcache to increase
+ fast-path data locality. (@interwq)
+ - Rewrite atomics to closely model the C11 API, convert various
+ synchronization from mutex-based to atomic, and use the explicit memory
+ ordering control to resolve various hypothetical races without increasing
+ synchronization overhead. (@davidtgoldblatt)
+ - Extensively optimize rtree via various methods:
+ + Add multiple layers of rtree lookup caching, since rtree lookups are now
+ part of fast-path deallocation. (@interwq)
+ + Determine rtree layout at compile time. (@jasone)
+ + Make the tree shallower for common configurations. (@jasone)
+ + Embed the root node in the top-level rtree data structure, thus avoiding
+ one level of indirection. (@jasone)
+ + Further specialize leaf elements as compared to internal node elements,
+ and directly embed extent metadata needed for fast-path deallocation.
+ (@jasone)
+ + Ignore leading always-zero address bits (architecture-specific).
+ (@jasone)
+ - Reorganize headers (ongoing work) to make them hermetic, and disentangle
+ various module dependencies. (@davidtgoldblatt)
+ - Convert various internal data structures such as size class metadata from
+ boot-time-initialized to compile-time-initialized. Propagate resulting data
+ structure simplifications, such as making arena metadata fixed-size.
+ (@jasone)
+ - Simplify size class lookups when constrained to size classes that are
+ multiples of the page size. This speeds lookups, but the primary benefit is
+ complexity reduction in code that was the source of numerous regressions.
+ (@jasone)
+ - Lock individual extents when possible for localized extent operations,
+ rather than relying on a top-level arena lock. (@davidtgoldblatt, @jasone)
+ - Use first fit layout policy instead of best fit, in order to improve
+ packing. (@jasone)
+ - If munmap(2) is not in use, use an exponential series to grow each arena's
+ virtual memory, so that the number of disjoint virtual memory mappings
+ remains low. (@jasone)
+ - Implement per arena base allocators, so that arenas never share any virtual
+ memory pages. (@jasone)
+ - Automatically generate private symbol name mangling macros. (@jasone)
+
+ Incompatible changes:
+ - Replace chunk hooks with an expanded/normalized set of extent hooks.
+ (@jasone)
+ - Remove ratio-based purging. (@jasone)
+ - Remove --disable-tcache. (@jasone)
+ - Remove --disable-tls. (@jasone)
+ - Remove --enable-ivsalloc. (@jasone)
+ - Remove --with-lg-size-class-group. (@jasone)
+ - Remove --with-lg-tiny-min. (@jasone)
+ - Remove --disable-cc-silence. (@jasone)
+ - Remove --enable-code-coverage. (@jasone)
+ - Remove --disable-munmap (replaced by opt.retain). (@jasone)
+ - Remove Valgrind support. (@jasone)
+ - Remove quarantine support. (@jasone)
+ - Remove redzone support. (@jasone)
+ - Remove mallctl interfaces (various authors):
+ + config.munmap
+ + config.tcache
+ + config.tls
+ + config.valgrind
+ + opt.lg_chunk
+ + opt.purge
+ + opt.lg_dirty_mult
+ + opt.decay_time
+ + opt.quarantine
+ + opt.redzone
+ + opt.thp
+ + arena.<i>.lg_dirty_mult
+ + arena.<i>.decay_time
+ + arena.<i>.chunk_hooks
+ + arenas.initialized
+ + arenas.lg_dirty_mult
+ + arenas.decay_time
+ + arenas.bin.<i>.run_size
+ + arenas.nlruns
+ + arenas.lrun.<i>.size
+ + arenas.nhchunks
+ + arenas.hchunk.<i>.size
+ + arenas.extend
+ + stats.cactive
+ + stats.arenas.<i>.lg_dirty_mult
+ + stats.arenas.<i>.decay_time
+ + stats.arenas.<i>.metadata.{mapped,allocated}
+ + stats.arenas.<i>.{npurge,nmadvise,purged}
+ + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
+ + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
+ + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
+ + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}
+
+ Bug fixes:
+ - Improve interval-based profile dump triggering to dump only one profile when
+ a single allocation's size exceeds the interval. (@jasone)
+ - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
+ pruning backtrace frames in jeprof. (@jasone)
+
* 4.5.0 (February 28, 2017)
This is the first release to benefit from much broader continuous integration
@@ -12,7 +199,7 @@ brevity. Much more detail can be found in the git revision history:
regressions fixed by this release.
New features:
- - Add --disable-thp and the opt.thp to provide opt-out mechanisms for
+ - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
transparent huge page integration. (@jasone)
- Update zone allocator integration to work with macOS 10.12. (@glandium)
- Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
@@ -25,7 +212,7 @@ brevity. Much more detail can be found in the git revision history:
- Handle race in per size class utilization computation. This functionality
was first released in 4.0.0. (@interwq)
- Fix lock order reversal during gdump. (@jasone)
- - Fix-refactor tcache synchronization. This regression was first released in
+ - Fix/refactor tcache synchronization. This regression was first released in
4.0.0. (@jasone)
- Fix various JSON-formatted malloc_stats_print() bugs. This functionality
was first released in 4.3.0. (@jasone)