[Exceptional C++ Style] Item 26 -- Data Formats and Efficiency

Paul Grenyer paul at paulgrenyer.co.uk
Mon Jan 31 08:53:06 EST 2005


Hi Lois

Thanks for another superb, and very prompt, summary!

I'm very busy today, so it won't go up on the site until tomorrow.

Somebody must have some comments? I felt a bit lost and didn't follow
exactly how the moves work in Chess. I haven't played for years and never
understood the notation.

Out of interest, I know I'm one of the younger ones here, but is there
anyone who hasn't played chess?

Regards
Paul

Paul Grenyer
email: paul at paulgrenyer.co.uk
web: http://www.paulgrenyer.co.uk
articles: http://www.paulgrenyer.dyndns.org/articles/

"We're all living in Amerika, Amerika ist wunderbar..."



----- Original Message ----- 
From: "Lois Goldthwaite" <lois at loisgoldthwaite.com>
To: <effective-cpp at accu.org>
Sent: Monday, January 31, 2005 12:05 AM
Subject: [Exceptional C++ Style] Item 26 -- Data Formats and Efficiency


> Summary of Sutter Item 26:
>
> Data Formats and Efficiency, Part 1:
> When Compression Is the Name of the Game
>
> A basic understanding of chess is assumed.
>
> JG Question:
> 1. Which of these standard containers uses the least memory to store the
same
> number of objects of the same type T: deque, list, set, or vector?
Explain.
>
> Guru Question
> 2. You are creating a worldwide chess server to store all games ever
played.
> Demonstrate a format for representing a chess game that requires the
> indicated amount of data to store each half-move (one move played by one
> side). Assume each byte has 8 bits.
>
> a) 128 bytes per half-move
> b) 32 bytes per half-move
> c) minimum 4 bytes and maximum 8 bytes per half-move
> d) minimum 2 bytes and maximum 4 bytes per half-move
> e) 12 bits per half-move
>
> Solutions
> 1. Each kind of container in the standard C++ library chooses a different
> trade-off of space vs performance.
>
> - A vector<T> stores T objects in a contiguous array and so has no
per-element
> overhead at all.
>
> - A deque<T> stores T objects in chunks or "pages", each chunk containing
a
> portion of a (notionally) contiguous array. This typically requires one
extra
> pointer of management information per chunk, but that works out to only a
> fraction of a bit in per-element overhead.
>
> - A list<T> is a doubly-linked list of nodes that hold T elements. This
means
> that each T object added to list also brings along two pointers in
overhead.
>
> - A set<T> (and also multiset<T>, map<T>, and multimap<T>) is usually
> implemented as a tree structure with three pointers for each element
added.
>
> Part of choosing an efficient in-memory representation of data is choosing
the
> most space- and time-efficient container which supports the functionality
you
> need. Another part of the task is determining how to represent the data
that
> will be put into those containers, hence the second part of this Item.
>
> 2. What data format will represent a half-move in:
>
> a) 128 bytes: Store the whole board position after each half-move, listing
the
> contents of each of the 64 squares. In this scheme, rank one at the
beginning
> of the game would be rendered as "WRWNWBWQWKWBWNWR". Black pieces are
> prefixed with "B"; unoccupied squares are denoted by '.'.
>
> b) 32 bytes: Keeping the basic strategy of storing the whole board
position,
> we can represent each square in only four bits: three bits to indicate the
> piece (0=K, 1=Q, 2=R, 3=B, 4=N, 5=P, 6=none) plus one bit to store the
> colour.
>
> c) 4 to 8 bytes: We can achieve this by using the old-style "descriptive"
> chess notation: P-K4, RKN1-KB1, P-KN8(Q). No extra trailing null or other
> delimiter is needed, because the move format can be decoded unambiguously.
>
> d) 2 to 4 bytes: Each half-move can be recorded as text in modern
"algebraic"
> chess notation, which requires a minimum of two characters (d4) and a
maximum
> of 4 (Rgf1, gh=Q). Again, no special move delimiter is needed.
>
> e) 12 bits: Since the starting position of each piece is fixed, we could
> encapsulate a half move by simply recording the origin and destination
square
> every time a piece moves. Since there are 64 squares, it takes only six
bits
> to represent one, and therefore 12 bits to represent both. However,
recording
> a pawn promotion would require some additional space beyond the 12 bits.
>
> Lois
> _______________________________________________
> Effective-cpp mailing list
> Effective-cpp at accu.org
> List: http://www.accu.org/mailman/listinfo/effective-cpp
> Project: http://www.paulgrenyer.dyndns.org/cppstyle/
>




More information about the Effective-cpp mailing list