[Exceptional C++ Style] Item 21: Containers in Memory, Part 2: How Big Is It Really?
Balog Pal
pasa at lib.hu
Thu Jan 13 07:53:27 EST 2005
In this item we go really deep into the details of memory managers to
discover why it may cost us 20 megs or more of memory to store 1 meg of
actual data.
1. JG Question: When you ask for _n_ bytes of memory using *new* or *malloc*,
do you actually use up _n_ bytes of memory? Explain why or why not.
2. Guru Question: How much memory do the various standard containers use to
store the same number of elements of the same type T?
Solution for the first question: obviously, the memory consumption can't be
less than n. And from the previous item we learned that the memory manager
itself must use some memory to keep track of things, which adds to the
amount used.
If our block is served from a fixed-size arena, the manager must use at
least one bit for a busy flag; and if there's a difference between the
requested size and the block size, that adds to the waste too.
If a variable-length arena is used, the manager must store the block size or
some equivalent chaining info to find the rest of the managed memory, and
the actual block size may be bigger than requested too.
We see a detailed discussion of alignment issues. On many processors,
accessing a 32-bit item at an address that is not aligned to a 32-bit
boundary is illegal; on others it is legal but slow. Similar restrictions
may apply to other sizes. The C++ standard requires malloc and operator new
to return a block of memory that is aligned suitably for any object that
fits in it.
The alignment requirements call for padding inside structs and at the end of
structs, even if we just deal with old C-style arrays of those structs. And
the memory manager is influenced by them too, as shown in the figures.
The presence of padding is reflected right in sizeof(struct): an example
structure laid out as char-long-char grows to size 12 at 4-byte alignment,
while a rearranged one laid out as long-char-char uses only 8. And whenever
we hear 'contiguous', that means 'certainly including all those gaps for
alignment'.
To reduce the gaps, the programmer should either think of alignment and put
the struct members in optimal order, or at least allow the compiler to do
the rearrangement. But the compiler must keep members in declaration order
between occurrences of the public/protected/private keywords. To grant full
licence for rearrangement, every data member must be preceded by one of
those keywords.
[Huh, do you know many junior gurus having all that knowledge? ;-]
Solution for the senior gurus:
2. How much memory do the various standard containers use to store the same
number of elements of the same type T?
First we see a recap of the Knuth Vol. 1 material about data structures and
the theory.
- vector<T> uses a single block of contiguous memory and nothing else.
- list<T> is a doubly-linked list of individual elements, so it stores 2
pointers beside each element.
- deque<T> is like a list of vectors -- elements live on 'pages',
(supposedly) several at a time, and those pages are linked. The amount of
overhead depends on the item count per page, and can be as small as a
fraction of a pointer per element (one page link amortized over every
element on the page).
- set/multiset<T> is represented as a binary tree, which requires a minimum
of 2 pointers per item, but most implementations also use a third "up"
(parent) pointer for convenience, so we can expect 3 pointers of overhead
per item.
- map/multimap is similar to set/multiset, except its items are not T but
pair<const Key, T>.
And we arrive at the real-life implications, where we put together the
alignment issues, the performance-oriented trade-offs, and the theory for
the minimum stored info.
For vector, 'contiguous' means what was discussed in the first question.
For the nodes and chunks in the other containers we must consider the
additional overhead of the memory manager. [What is not mentioned: to deal
with that overhead the implementation may allocate nodes several at a time.
Or a deque page may have a minimum size at allocation. So the wasted memory
shrinks as the container is filled, but until then the memory sits as
unused reserve.]
A table gives node structures equivalent to the storage units of the
list/set/etc. node-based collections, and the sizeof of those nodes is
calculated for the usual alignment, with T being char or int.
And we see another interesting implication: we can arrive at the same node
size from very different collections, where list<char> (having 1 byte of
data plus 2 pointers) ends up using the same 16 bytes per allocation as a
node of set<int> (4 bytes of data plus 3 pointers).
Summary:
Generally the programmer chooses a container by its behaviour, such as
figures on linear/random access or insertion time, item stability, and so
on. Having a certain property may cost quite an overhead in memory usage.
The programmer should be aware of that. And if memory appears to be an
issue, it should be investigated with knowledge of the platform and
implementation; pure abstract theory is not enough, and may turn out
misleading if alternatives are picked by that alone. A related discussion
is found in Item 26 of Exceptional C++.