Follow
Publications: 0 | Followers: 1

Dynamic Memory Allocation - people.cs.clemson.edu

Publish on Category: Birds 268

DynamicMemory Allocation
CPSC/ECE 3220Mark Smotherman
Explicit DynamicMemory Allocation Characteristics
Objects are explicitly created anddestroyedDynamically-allocated object lifetimes are not linked to normal scope rulesLonger than lifetime of astack-allocated object (which is durationofafunction call)Shorter than lifetime of aglobal or static object (which isduration ofprocess)It’s not practical to globally allocate all the dynamic objects that may be neededAllocated in special segment of memory called the heapObject is born on call tomalloc(), dies on call to free()Does not use garbage collectionAllocatorknowsthe objectsize but usuallynot thedata typeAllocatedblockstypically cannot berelocated by the allocatorMultipleuser pointerscould exist related toan allocatedblock
Allocator Goals
Space efficiencyKeep track of releases to reuse released memory for new requestsMinimizefragmentation (wasted space)Time efficiencyO(1) complexity where possibleAs usual, these goals can conflict
Allocation Patterns
Block lifetimes are not randomRamp– allocations throughout program lifetime without releasesPlateau – allocations, then lengthy usage, then releasesPeaks –burstybehavior and short object lifetimesBlock sizes are not randomZorn andGrunwald1992 study of six allocation-intensive C programs (e.g., gawk, PCB channel router,perl) found that the top two sizes accounted for 53-93%ofrequestsSeehttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.5186So, the allocator can attemptto exploit patternsAllocate related blocks contiguouslyCan infer relationshipsSimilar allocation timesSame objectsizes
Allocator Actions
AllocateIdentifya free blockCoalesce free blocks during thesearch? (e.g., UnixSystem V, but makessearchslower)Splitthefree block?Exact size ofrequestNearest allowable size (e.g., splitting blocks in half in a binary buddy system)No splittingReleaseInsert into free listCoalesce freeblocks?Immediately onreleaseDeferredSmallblocks may be immediatelyreallocated, so avoid immediate coalescing that will have to be undone by a splitNo coalescing
DataStructure Choices for the Set of Free Blocks1
Linked list of variable-size free blocksLocationExternallist (but where to get the space for nodesinthe free list as it gets longer?)Integrated list (use the space within the free blocks to hold the links!)OrderingSorted byaddressSorted by block sizeLIFOFIFOArray oflinkedlists (“segregated free lists”)Multiple free lists, each witha fixedblock size or a specified range of block sizesTreeBitmap forfixed-size freeblocksSimple free pointer for a region______________1Some early memory allocation algorithms kept both allocated and free blocks in a single list
Identifying a Variable-Length FreeBlock
Sequential fitLimited searchFirst fit – but small fragmentswill typically buildup at front of listNext fit– start next search where last search stoppedExhaustive searchBest fit – try to leave the smallest fragmentsWorst fit – try to leave the largest fragmentsNote thatthe array-of-listsapproachcan be similar to best-fit but without exhaustive searches
Some methods use a header or tag located above an allocated block ofBbytes (invisible to requestor)
Header for an Allocated Block
Freeblock
Freeblock
Allocatedblock
size of N, status = freefreelist pointers
size of B, status = allocated
size of (N-H-B), status = freefreelist pointers
B bytes
H bytes
Before
After
Header for an Allocated Block (2)
Using aheader meanssplitting (H+B)bytesout of a large free block,notjust B bytesBoundary tag method uses bothheader and trailerTags may include the status of the adjacent block in memory as well as the size and status of the currentblock
Splitting a Variable-Length Block for Exact Size
From top or bottom of free block?Allocating fromtop means linksOriginalin the current block have to befree blockmovedand links intwo otherfree blockshave to bechangedAllocating from bottom maymean only the size fieldfor thecurrent blockneeds tochange
Use the spaceinside a freeblock to holdfree list links
fwdlinkbacklink
fwdlinkbacklink
fwdlinkbacklink
Coalescing Variable-Length Free Blocks
Four cases based on status of adjacent blocksNeitheris free Above is free Below is free Both are freeMustinsert Must merge twoa new free block \__ update an existing free block __/ existing free blocksin the free list in the free list
Block beingreturned
Block beingreturned
Block beingreturned
Block beingreturned
Boundary Tag (Knuth, 1962)
Designed for variable-sized allocations,burstypatterns,and ease ofcoalescingSingle free list, doubly-linked, LIFO or FIFO orderHeader and trailer words in eachblock with size and statusAllows O(1) access to adjacent blocks to inspect their status and guide coalescingvoid *allocate(size_tsize )Search free list (e.g., first-fit)If a split is needed, set up two new tags within the identified free blockvoid release( void*ptr)Compare tags of returned block with tags in adjacent blocks in memoryIf no coalescing possible, add new free block to free list (front or back) – O(1)
Binary BuddySystem (Markowitz, 1963)
Designed forvariable-sized allocations,burstypatterns, and ease of coalescingTable of free list head pointers, with each list containing freeblocksof 2ninsizethereare limits on the high and low values fornvoid *allocate(size_tsize )Round up a request size to 2nsizeIf free block of that size is not available, identify a larger block and recursively split it in half until a suitably-sized block is available to satisfy the requestrelease( … )Design choice 1: pass backboth address and sizeNo header/tag needed for allocated blockFlipbitfor rounded size in the returned block addressto constructthe address of the “buddy”blockSearch in free list of that size for the buddy; if found, coalesce and recursively repeatDesign choice 2: pass back only addressUse a header word with status and size to provide information for coalescingAlternatively, use a hierarchical set of bitmapsLater extensions are Fibonacci buddy,weighted buddy,and double buddy
BSD 4.2malloc() (Kingsley, 1982)
Designed to be extremely fastTradeoffisfragmentation (internal as well as the unused, “stranded” blocks of wrong size)Array of 30 free lists, each of specified size 2(i+3)(i.e., 8 bytes to 4GiB)void*malloc(size_tsize )Round up the requested size, plusa 4-byteheader, to a power-of-2No splitting of blocks from free lists with larger block sizeInstead, if the appropriate free list is empty, attempt to get the block viammap()For block sizes less than a page, request a page frommmap() and then divide the page into multiple blocksvoidfree( void *ptr)Indexfield in header determinesthe free list on which the block is placed in LIFO orderNo coalescingSeeftp://web.mit.edu/freebsd/head/libexec/rtld-elf/malloc.cBSD 4.3 kernel memory allocator paired Kingsley’s power-of-2 approach for 16 bytes to 2 KiB with another approach using multiples of page size for requests larger than 2KiB
dlmalloc() (Doug Lea, 1987)
Designed for balance across a number of goals; various versions since 1987Includes heuristics,e.g, tryingto allocate a “designated victim”, cachingCombination of three techniquesTable of free lists for fixed-size blocks with payload sizes 8, 16, 24, …, 248 bytes(“small bins”)MRU order for localitySplitting when a particular free list is emptyTable of free lists supporting a bitwisetriefor variable-sized blocks with payloads of 256bytes to 256KiB (“large bins”)LRU order to reduce fragmentationBlocks arranged in descending size orderUsemmap() forrequests greaterthan 256 KiBHeader contains sizeand status of current block as well as status of adjacent blockTrailer is written intofreeblocks, containing size of current blockSeeftp://gee.cs.oswego.edu/pub/misc/malloc.c
Region / Arena (Hanson, 1988)
Designed for variable-sized allocations where related groups of blocks are released all at onceA region consists of a linked listoflarge “arenas”No need to keep a free listwithin an arena orto use headers inthe allocated/free blocksInstead, rely on asimple freepointer in eacharenavoid *allocate(region_tregion,size_tsize )Ifthere is enoughfreespace in the current arena, makea copy ofthe freepointer value toreturn and increment the freepointer bythe size amountOtherwise, add a new arenavoid release(region_tregion)Release all but first arena from the region and reset the freepointerin the first arenaCannot release an individualblockSeeftp://ftp.cs.princeton.edu/techreports/1988/191.pdf
Slab Allocation (Bonwick, 1994)
Designed forfixed-size, pre-initializedallocations withburstypatterns in an operating systemkernelE.g., a control block containing a lock can be initialized once and later used and reused without incurring create/destroy lock overhead on each reuse“Cache”offixed-sized objectsMultiple“slabs” in each cache, each capable of containing multiple objectsDifferent caches for different objectsTwo-level hierarchy of bitmaps to speed searchRoot-level: empty, partial, fullbitmaps (each with onebit perslab)Second-level: bitmap of free blocks withinslab (onebit perblock)void*kmem_cache_alloc(structkmem_cache*cp,intflags )voidkmem_cache_free(structkmem_cache*cp, void *ptr)Seehttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4759
jemalloc() (Jason Evans, 2005)
Designed to avoid fragmentation and to scale on multiprocessors (i.e., locking and caching issues)Various versions since 2005; used in the more recent BSD releases, in Firefox, and by Facebook on serversFour arenas per processor with round-robin assignment of threads to arenas2MiB“chunks”, which can be divided into “runs” that are at least one virtual page in sizeComposed of three allocators (block sizes are from 2006 paper; have since changed)Small – allocations come from a set of runs and are managed using a bitmap at the start of a runThree subclasses: tiny (2, 4, 8 bytes),quantum-spaced (16, 32, 48, …, 512 bytes), and sub-page (1, 2 KiB)Large (4, 8, 12, …, 1020 KiB) – each allocation is a dedicated run and is managed by a binarybuddy system with metadata stored in a chunk headerHuge (1, 2, 3, …MiB) – each allocation has a set of chunks; managed by a single red-black treeSeehttp://jemalloc.net/
GNUglibcmalloc()
Derivedfromptalloc(), which was derived fromdlmalloc()Uses arenas to support multithreaded applicationsSwitch to a separate arena if themalloc() call encounters an arena that is locked byanotherthread (usingpthread_mutex_trylock())Total arenas typically limited to eight times the number of processor coresSeehttps://sourceware.org/glibc/wiki/MallocInternals
GNUglibcObstacks– ObjectStacks
Can define separateobstacksEachobstackis a linked list of “chunks”Defaultchunk size is 4KiBvoid*obstack_alloc(structobstack*obstack_ptr,intsize)You must specify how memory is obtained, e.g., usingmalloc()voidobstack_free(structobstack*obstack_ptr, void *object )Youmust specifyhow memory isfreed,e.g., usingfree()Nested behavior - withina givenobstack, freeing one object automatically frees all other objects allocated morerecentlyPassing a NULL object pointer frees all objects in thatobstack
General References
Paul Wilson,et al., “Dynamic Storage Allocation: A Survey and Critical Review,” Proc. Intl. Workshop on Memory Management, Scotland, Sept.1995ValtteriHeikkilä, “AStudy on Dynamic MemoryAllocation Mechanismsfor Small Block Sizes inReal-Time Embedded Systems,” UniversityofOulu, Departmentof InformationProcessing, Science Master's Thesis, Dec. 2012

0

Embed

Share

Upload

Make amazing presentation for free
Dynamic Memory Allocation - people.cs.clemson.edu