[Exceptional C++ Style] Item 21: Containers in Memory, Part 2: How Big Is It Really?
Balog Pal
pasa at lib.hu
Thu Jan 13 07:53:27 EST 2005
In this item we go really deep into the details of memory managers to
discover why it may cost us 20 megs or more of memory to store 1 meg of
actual data.
1. JG Question: When you ask for _n_ bytes of memory using *new* or *malloc*,
do you actually use up _n_ bytes of memory? Explain why or why not.
2. Guru Question: How much memory do the various standard containers use to
store the same number of elements of the same type T?
Solution for the first question: obviously, the memory consumption can't be
less than n. And from the previous item we learned that the memory manager
itself must use some memory to keep track of things, which adds to the
amount used.
If our block is served from a fixed-size arena, the manager must use at
least one bit for a busy flag; and if there's a difference between the
requested size and the block size, that adds to the waste too.
If a variable-length arena is used, the manager must store the block size or
some equivalent chaining info to find the rest of the managed memory, and
the actual block size may be bigger than requested too.
We see a detailed discussion of alignment issues. On many processors,
accessing a 32-bit item at an address that is not aligned to a 32-bit
boundary is illegal; on others it is legal but slow. Similar restrictions
may apply to other sizes. The C++ standard requires malloc and operator new
to return a block of memory that is aligned suitably for any object that
fits in it.
The alignment requirements call for padding inside structs and at the end of
structs, even if we just deal with old C-style arrays of those structs. And
the memory manager is influenced by them too, as shown in the figures.
The presence of padding is reflected right in sizeof(struct): an example
structure laid out as char-long-char grows to size 12 at 4-byte alignment,
while a rearranged one laid out as long-char-char uses only 8. And whenever
we hear 'contiguous', that means 'certainly including all those gaps for
alignment'.
To reduce the gaps, the programmer should either think of alignment and put
the struct members in optimal order, or at least allow the compiler to do
the rearrangement. But the compiler must keep members in declaration order
between occurrences of the public/protected/private keywords. To grant full
licence for rearrangement, every data member must be preceded by one of
those keywords.
[Huh, do you know many junior gurus having all that knowledge? ;-]
Solution for the senior gurus:
2. How much memory do the various standard containers use to store the same
number of elements of the same type T?
First we see a recap of the Knuth Vol. 1 material about data structures and
the theory.
- vector<T> uses a single block of contiguous memory and nothing else.
- list<T> is a doubly-linked list of individual elements, so it stores 2
pointers beside each element.
- deque<T> is like a list of vectors -- elements live on 'pages',
(supposedly) several at a time, and those pages are linked. The amount of
overhead depends on the item count per page, and can be as small as a
fraction of a pointer per element (one page link amortized over every
element on the page).
- set/multiset<T> is represented as a binary tree, which requires a minimum
of 2 pointers per item, but most implementations also use a third "up"
(parent) pointer for convenience, so we can expect 3 pointers of overhead
per item.
- map/multimap is similar to set/multiset, except its items are not T but
pair<const Key, T>.
And we arrive at the real-life implications, where we put together the
alignment issues, the performance-oriented trade-offs, and the theory for
the minimum stored info.
For vector, 'contiguous' means what was discussed in the first question.
For the nodes and chunks in the other containers we must consider the
additional overhead of the memory manager. [What is not mentioned: to deal
with that overhead the implementation may allocate nodes several at a time.
Or a deque page may have a minimum size at allocation. So the wasted memory
shrinks as the container is filled, but until then the memory sits as
unused reserve.]
A table gives node structures equivalent to the storage units of the
list/set/etc. node-based collections, and the sizeof of those nodes is
calculated for the usual alignment, with T being char or int.
And we see another interesting implication: we can arrive at the same node
size from very different collections, where list<char> (having 1 byte of
data plus 2 pointers) ends up using the same 16 bytes per allocation as a
node of set<int> (4 bytes of data plus 3 pointers).
Summary:
Generally the programmer chooses a container by its behaviour, such as
figures on linear/random access or insertion time, item stability, and so
on. Having a certain property may cost quite an overhead in memory usage.
The programmer should be aware of that. And if memory appears to be an
issue, it should be investigated with knowledge of the platform and
implementation; pure abstract theory is not enough, and may turn out
misleading if alternatives are picked by that alone. A related discussion
is found in Item 26 of Exceptional C++.