MALLOC(3) BSD Library Functions Manual MALLOC(3)
NAME
malloc, calloc, realloc, free, reallocf, malloc_usable_size — general
purpose memory allocation functions
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <stdlib.h>
void *
malloc(size_t size);
void *
calloc(size_t number, size_t size);
void *
realloc(void *ptr, size_t size);
void *
reallocf(void *ptr, size_t size);
void
free(void *ptr);
const char * _malloc_options;
void
(*_malloc_message)(const char *p1, const char *p2, const char *p3,
const char *p4);
#include <malloc_np.h>
size_t
malloc_usable_size(const void *ptr);
DESCRIPTION
The malloc() function allocates size bytes of uninitialized memory. The
allocated space is suitably aligned (after possible pointer coercion) for
storage of any type of object.
The calloc() function allocates space for number objects, each size bytes
in length. The result is identical to calling malloc() with an argument
of “number * size”, with the exception that the allocated memory is
explicitly initialized to zero bytes.
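The equivalence described above can be sketched as follows. The helper
zeroed_alloc() is a hypothetical name used only for illustration; it is
not part of libc. The overflow check on “number * size” is an assumption
of this sketch and is not stated in the text above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: behaves like calloc(number, size) by calling
 * malloc(number * size) and then zeroing the result.
 * The overflow check is an assumption of this sketch.
 */
void *
zeroed_alloc(size_t number, size_t size)
{
	void *p;

	if (size != 0 && number > SIZE_MAX / size)
		return (NULL);		/* number * size would overflow */
	p = malloc(number * size);
	if (p != NULL)
		memset(p, 0, number * size);
	return (p);
}
```

In ordinary code, calling calloc() directly is preferable; the sketch
only spells out what the description implies.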
The realloc() function changes the size of the previously allocated mem‐
ory referenced by ptr to size bytes. The contents of the memory are
unchanged up to the lesser of the new and old sizes. If the new size is
larger, the contents of the newly allocated portion of the memory are
undefined. Upon success, the memory referenced by ptr is freed and a
pointer to the newly allocated memory is returned. Note that realloc()
and reallocf() may move the memory allocation, resulting in a different
return value than ptr. If ptr is NULL, the realloc() function behaves
identically to malloc() for the specified size.
The reallocf() function is identical to the realloc() function, except
that it will free the passed pointer when the requested memory cannot be
allocated. This is a FreeBSD specific API designed to ease the problems
with traditional coding styles for realloc causing memory leaks in
libraries.
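The difference from realloc() can be sketched portably. The function
reallocf_sketch() below is a hypothetical stand-in and not the libc
source; it shows only the documented behavior of freeing the passed
pointer when the requested memory cannot be allocated.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocf(), built on realloc().
 * On failure the original allocation is freed, so the common idiom
 *	p = reallocf_sketch(p, n);
 * does not leak the old buffer when NULL is returned.
 */
void *
reallocf_sketch(void *ptr, size_t size)
{
	void *nptr;

	nptr = realloc(ptr, size);
	/* A zero-size request is skipped: realloc() may already have freed ptr. */
	if (nptr == NULL && size != 0)
		free(ptr);
	return (nptr);
}
```

With plain realloc(), the same one-line idiom would overwrite the only
copy of the pointer with NULL and leak the original buffer.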
The free() function causes the allocated memory referenced by ptr to be
made available for future allocations. If ptr is NULL, no action occurs.
The malloc_usable_size() function returns the usable size of the alloca‐
tion pointed to by ptr. The return value may be larger than the size
that was requested during allocation. The malloc_usable_size() function
is not a mechanism for in-place realloc(); rather it is provided solely
as a tool for introspection purposes. Any discrepancy between the
requested allocation size and the size reported by malloc_usable_size()
should not be depended on, since such behavior is entirely implementa‐
tion-dependent.
TUNING
When the first call is made to one of these memory allocation routines,
various flags are set or reset, once only, which affects the workings of
this allocator implementation.
The “name” of the file referenced by the symbolic link named
/etc/malloc.conf, the value of the environment variable MALLOC_OPTIONS,
and the string pointed to by the global variable _malloc_options will be
interpreted, in that order, from left to right as flags.
Each flag is a single letter, optionally prefixed by a non-negative base
10 integer repetition count. For example, “3N” is equivalent to “NNN”.
Some flags control parameter magnitudes, where uppercase increases the
magnitude, and lowercase decreases the magnitude. Other flags control
boolean parameters, where uppercase indicates that a behavior is set, or
on, and lowercase means that a behavior is not set, or off.
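The repetition-count syntax can be illustrated with a small sketch. The
function expand_flags() is a hypothetical name; this is not the
allocator's own parser, and the handling of a trailing count with no
flag letter (it is ignored here) is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Expand repetition counts in a flag string, e.g. "3N" -> "NNN".
 * Returns 0 on success, or -1 if the result does not fit in out.
 * Illustrative only; this is not the allocator's own parser.
 */
int
expand_flags(const char *opts, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return (-1);
	while (*opts != '\0') {
		size_t count = 1;

		if (isdigit((unsigned char)*opts)) {
			count = 0;
			while (isdigit((unsigned char)*opts)) {
				count = count * 10 + (size_t)(*opts - '0');
				if (count > outsz)
					count = outsz;	/* cannot fit anyway */
				opts++;
			}
			if (*opts == '\0')
				break;		/* trailing count, no flag */
		}
		if (count >= outsz - n)
			return (-1);	/* no room for flags plus NUL */
		while (count-- > 0)
			out[n++] = *opts;
		opts++;
	}
	out[n] = '\0';
	return (0);
}
```

Given the input “A2fJ”, the sketch writes “AffJ” to the output buffer.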
A All warnings (except for the warning about unknown flags being
set) become fatal. The process will call abort(3) in these
cases.
B Double/halve the per-arena lock contention threshold at which a
thread is randomly re-assigned to an arena. This dynamic load
balancing tends to push threads away from highly contended are‐
nas, which avoids worst case contention scenarios in which
threads disproportionately utilize arenas. However, due to the
highly dynamic load that applications may place on the allocator,
it is impossible for the allocator to know in advance how sensi‐
tive it should be to contention over arenas. Therefore, some
applications may benefit from increasing or decreasing this
threshold parameter. This option is not available for some con‐
figurations (non-PIC).
C Double/halve the size of the maximum size class that is a multi‐
ple of the cacheline size (64). Above this size, subpage spacing
(256 bytes) is used for size classes. The default value is 512
bytes.
D Use sbrk(2) to acquire memory in the data storage segment (DSS).
This option is enabled by default. See the “M” option for
related information and interactions.
F Double/halve the per-arena maximum number of dirty unused pages
that are allowed to accumulate before informing the kernel about
at least half of those pages via madvise(2). This provides the
kernel with sufficient information to recycle dirty pages if
physical memory becomes scarce and the pages remain unused. The
default is 512 pages per arena; MALLOC_OPTIONS=10f will prevent
any dirty unused pages from accumulating.
G When there are multiple threads, use thread-specific caching for
objects that are smaller than one page. This option is enabled
by default. Thread-specific caching allows many allocations to
be satisfied without performing any thread synchronization, at
the cost of increased memory use. See the “R” option for related
tuning information. This option is not available for some con‐
figurations (non-PIC).
J Each byte of new memory allocated by malloc(), realloc() or
reallocf() will be initialized to 0xa5. All memory returned by
free(), realloc() or reallocf() will be initialized to 0x5a.
This is intended for debugging and will impact performance nega‐
tively.
K Double/halve the virtual memory chunk size. The default chunk
size is the maximum of 1 MB and the largest page size that is
less than or equal to 4 MB.
M Use mmap(2) to acquire anonymously mapped memory. This option is
enabled by default. If both the “D” and “M” options are enabled,
the allocator prefers anonymous mappings over the DSS, but allo‐
cation only fails if memory cannot be acquired via either method.
If neither option is enabled, then the “M” option is implicitly
enabled in order to assure that there is a method for acquiring
memory.
N Double/halve the number of arenas. The default number of arenas
is two times the number of CPUs, or one if there is a single CPU.
P Various statistics are printed at program exit via an atexit(3)
function. This has the potential to cause deadlock for a multi-
threaded process that exits while one or more threads are execut‐
ing in the memory allocation functions. Therefore, this option
should only be used with care; it is primarily intended as a per‐
formance tuning aid during application development.
Q Double/halve the size of the maximum size class that is a multi‐
ple of the quantum (8 or 16 bytes, depending on architecture).
Above this size, cacheline spacing is used for size classes. The
default value is 128 bytes.
R Double/halve magazine size, which approximately doubles/halves
the number of rounds in each magazine. Magazines are used by the
thread-specific caching machinery to acquire and release objects
in bulk. Increasing the magazine size decreases locking over‐
head, at the expense of increased memory usage. This option is
not available for some configurations (non-PIC).
U Generate “utrace” entries for ktrace(1), for all operations.
Consult the source for details on this option.
V Attempting to allocate zero bytes will return a NULL pointer
instead of a valid pointer. (The default behavior is to make a
minimal allocation and return a pointer to it.) This option is
provided for System V compatibility. This option is incompatible
with the “X” option.
X Rather than return failure for any allocation function, display a
diagnostic message on stderr and cause the program to drop core
(using abort(3)). This option should be set at compile time by
including the following in the source code:
_malloc_options = "X";
Z Each byte of new memory allocated by malloc(), realloc() or
reallocf() will be initialized to 0. Note that this initializa‐
tion only happens once for each byte, so realloc() and reallocf()
calls do not zero memory that was previously allocated. This is
intended for debugging and will impact performance negatively.
The “J” and “Z” options are intended for testing and debugging. An
application which changes its behavior when these options are used is
flawed.
IMPLEMENTATION NOTES
Traditionally, allocators have used sbrk(2) to obtain memory, which is
suboptimal for several reasons, including race conditions, increased
fragmentation, and artificial limitations on maximum usable memory. This
allocator uses both sbrk(2) and mmap(2) by default, but it can be config‐
ured at run time to use only one or the other. If resource limits are
not a primary concern, the preferred configuration is MALLOC_OPTIONS=dM
or MALLOC_OPTIONS=DM. When so configured, the datasize resource limit
has little practical effect for typical applications; use
MALLOC_OPTIONS=Dm if that is a concern. Regardless of allocator configu‐
ration, the vmemoryuse resource limit can be used to bound the total vir‐
tual memory used by a process, as described in limits(1).
This allocator uses multiple arenas in order to reduce lock contention
for threaded programs on multi-processor systems. This works well with
regard to threading scalability, but incurs some costs. There is a small
fixed per-arena overhead, and additionally, arenas manage memory
completely independently of each other, which causes a small fixed
increase in overall memory fragmentation. These overheads are not generally an
issue, given the number of arenas normally used. Note that using sub‐
stantially more arenas than the default is not likely to improve perfor‐
mance, mainly due to reduced cache performance. However, it may make
sense to reduce the number of arenas if an application does not make much
use of the allocation functions.
In addition to multiple arenas, this allocator supports thread-specific
caching for small objects (smaller than one page), in order to make it
possible to completely avoid synchronization for most small allocation
requests. Such caching allows very fast allocation in the common case,
but it increases memory usage and fragmentation, since a bounded number
of objects can remain allocated in each thread cache.
Memory is conceptually broken into equal-sized chunks, where the chunk
size is a power of two that is greater than the page size. Chunks are
always aligned to multiples of the chunk size. This alignment makes it
possible to find metadata for user objects very quickly.
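Because chunks are aligned to multiples of a power-of-two chunk size, the
chunk containing any address can be found by masking off the low bits.
The sketch below shows only that arithmetic; chunk_base() is a
hypothetical name, and where the allocator keeps its metadata within a
chunk is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the base address of the chunk containing addr.
 * chunk_size must be a power of two; because chunks are aligned to
 * multiples of their size, clearing the low bits yields the chunk base.
 */
uintptr_t
chunk_base(uintptr_t addr, size_t chunk_size)
{
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
	return (addr & ~((uintptr_t)chunk_size - 1));
}
```

With a 1 MB chunk size, the address 0x12345678 falls in the chunk that
starts at 0x12300000.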
User objects are broken into three categories according to size: small,
large, and huge. Small objects are smaller than one page. Large objects
are smaller than the chunk size. Huge objects are a multiple of the
chunk size. Small and large objects are managed by arenas; huge objects
are managed separately in a single data structure that is shared by all
threads. Huge objects are used by applications infrequently enough that
this single data structure is not a scalability issue.
Each chunk that is managed by an arena tracks its contents as runs of
contiguous pages (unused, backing a set of small objects, or backing one
large object). The combination of chunk alignment and chunk page maps
makes it possible to determine all metadata regarding small and large
allocations in constant time.
Small objects are managed in groups by page runs. Each run maintains a
bitmap that tracks which regions are in use. Allocation requests that
are no more than half the quantum (8 or 16, depending on architecture)
are rounded up to the nearest power of two. Allocation requests that are
more than half the quantum, but no more than the minimum cacheline-multiple
size class (see the “Q” option) are rounded up to the nearest multiple
of the quantum. Allocation requests that are more than the minimum
cacheline-multiple size class, but no more than the minimum subpage-mul‐
tiple size class (see the “C” option) are rounded up to the nearest mul‐
tiple of the cacheline size (64). Allocation requests that are more than
the minimum subpage-multiple size class are rounded up to the nearest
multiple of the subpage size (256). Allocation requests that are more
than one page, but small enough to fit in an arena-managed chunk (see the
“K” option), are rounded up to the nearest run size. Allocation requests
that are too large to fit in an arena-managed chunk are rounded up to the
nearest multiple of the chunk size.
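The rounding rules for small requests can be worked through in code. The
sketch below assumes a 16 byte quantum, a 4096 byte page, and the default
“Q” (128) and “C” (512) values; all names are hypothetical, and
zero-byte, large, and huge requests are not modeled.

```c
#include <stddef.h>

/* Assumed defaults for this sketch; real values depend on the build. */
#define QUANTUM		16	/* 8 on some architectures */
#define QSPACE_MAX	128	/* default "Q" value */
#define CACHELINE	64
#define CSPACE_MAX	512	/* default "C" value */
#define SUBPAGE		256
#define PAGE_SZ		4096	/* assumed page size */

/* Round n up to a multiple of m (m is a power of two). */
static size_t
round_up(size_t n, size_t m)
{
	return ((n + m - 1) & ~(m - 1));
}

/*
 * Return the size class for a small request, or 0 if the request is
 * outside the small range handled by this sketch.
 */
size_t
small_size_class(size_t req)
{
	size_t p;

	if (req == 0)
		return (0);
	if (req <= QUANTUM / 2) {
		for (p = 1; p < req; p <<= 1)
			;
		return (p);		/* nearest power of two */
	}
	if (req <= QSPACE_MAX)
		return (round_up(req, QUANTUM));
	if (req <= CSPACE_MAX)
		return (round_up(req, CACHELINE));
	if (req <= PAGE_SZ - SUBPAGE)
		return (round_up(req, SUBPAGE));
	return (0);
}
```

Under these assumptions a 5 byte request rounds to 8, a 17 byte request
to 32, a 129 byte request to 192, and a 513 byte request to 768.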
Allocations are packed tightly together, which can be an issue for multi-
threaded applications. If you need to assure that allocations do not
suffer from cacheline sharing, round your allocation requests up to the
nearest multiple of the cacheline size.
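A minimal sketch of that rounding follows, assuming the 64 byte cacheline
size quoted above. The names cacheline_ceiling() and malloc_padded() are
hypothetical and not part of libc.

```c
#include <stddef.h>
#include <stdlib.h>

#define CACHELINE_SIZE	64	/* assumed; matches the size quoted above */

/* Round size up to the nearest multiple of the cacheline size. */
size_t
cacheline_ceiling(size_t size)
{
	return ((size + CACHELINE_SIZE - 1) & ~((size_t)CACHELINE_SIZE - 1));
}

/* Hypothetical wrapper: allocate size bytes padded to whole cachelines. */
void *
malloc_padded(size_t size)
{
	/* Refuse sizes for which the rounding above would wrap around. */
	if (size > (size_t)-1 - (CACHELINE_SIZE - 1))
		return (NULL);
	return (malloc(cacheline_ceiling(size)));
}
```

A 100 byte request is padded to 128 bytes under this assumption.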
DEBUGGING MALLOC PROBLEMS
The first thing to do is to set the “A” option. This option forces a
coredump (if possible) at the first sign of trouble, rather than the nor‐
mal policy of trying to continue if at all possible.
It is probably also a good idea to recompile the program with suitable
options and symbols for debugger support.
If the program starts to give unusual results, coredump or generally
behave differently without emitting any of the messages mentioned in the
next section, it is likely because it depends on the storage being filled
with zero bytes. Try running it with the “Z” option set; if that
improves the situation, this diagnosis has been confirmed. If the pro‐
gram still misbehaves, the likely problem is accessing memory outside the
allocated area.
Alternatively, if the symptoms are not easy to reproduce, setting the “J”
option may help provoke the problem.
In truly difficult cases, the “U” option, if supported by the kernel, can
provide a detailed trace of all calls made to these functions.
Unfortunately this implementation does not provide much detail about the
problems it detects; the performance impact for storing such information
would be prohibitive. There are a number of allocator implementations
available on the Internet which focus on detecting and pinpointing prob‐
lems by trading performance for extra sanity checks and detailed diagnos‐
tics.
DIAGNOSTIC MESSAGES
If any of the memory allocation/deallocation functions detect an error or
warning condition, a message will be printed to file descriptor
STDERR_FILENO. Errors will result in the process dumping core. If the
“A” option is set, all warnings are treated as errors.
The _malloc_message variable allows the programmer to override the func‐
tion which emits the text strings forming the errors and warnings if for
some reason the stderr file descriptor is not suitable for this. Please
note that doing anything which tries to allocate memory in this function
is likely to result in a crash or deadlock.
All messages are prefixed by “⟨progname⟩: (malloc)”.
RETURN VALUES
The malloc() and calloc() functions return a pointer to the allocated
memory if successful; otherwise a NULL pointer is returned and errno is
set to ENOMEM.
The realloc() and reallocf() functions return a pointer, possibly identi‐
cal to ptr, to the allocated memory if successful; otherwise a NULL
pointer is returned, and errno is set to ENOMEM if the error was the
result of an allocation failure. The realloc() function always leaves
the original buffer intact when an error occurs, whereas reallocf() deal‐
locates it in this case.
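Because realloc() leaves the original buffer intact on failure, callers
normally keep the old pointer until the call is known to have succeeded.
The helper grow_buffer() below is a hypothetical illustration of that
pattern, not a libc function.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: grow *bufp to newsize bytes (newsize != 0).
 * Returns 0 on success.  On failure returns -1 and leaves *bufp
 * pointing at the original, still valid, allocation.
 */
int
grow_buffer(char **bufp, size_t newsize)
{
	char *tmp;

	if (newsize == 0)
		return (-1);	/* zero-size requests are not handled here */
	tmp = realloc(*bufp, newsize);
	if (tmp == NULL)
		return (-1);	/* *bufp is untouched and must still be freed */
	*bufp = tmp;
	return (0);
}
```

Assigning the result of realloc() directly to the original pointer would
lose the only reference to the buffer when NULL is returned.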
The free() function returns no value.
The malloc_usable_size() function returns the usable size of the alloca‐
tion pointed to by ptr.
ENVIRONMENT
The following environment variables affect the execution of the alloca‐
tion functions:
MALLOC_OPTIONS If the environment variable MALLOC_OPTIONS is set, the
characters it contains will be interpreted as flags to
the allocation functions.
EXAMPLES
To dump core whenever a problem occurs:
ln -s 'A' /etc/malloc.conf
To specify in the source that a program does no return value checking on
calls to these functions:
_malloc_options = "X";
SEE ALSO
limits(1), madvise(2), mmap(2), sbrk(2), alloca(3), atexit(3),
getpagesize(3), getpagesizes(3), memory(3), posix_memalign(3)
STANDARDS
The malloc(), calloc(), realloc() and free() functions conform to ISO/IEC
9899:1990 (“ISO C90”).
HISTORY
The reallocf() function first appeared in FreeBSD 3.0.
The malloc_usable_size() function first appeared in FreeBSD 7.0.
BSD September 26, 2009 BSD