[Effective-cpp] Item 1: Uses and Abuses of vector

Tue Oct 26 17:48:58 EDT 2004

Gregory Haley wrote:
[SNIP]
> I would like to add to the summary two bits that have been overlooked
> up to now. 
> 
> 4. (p. 6) avoid unnecessary recalculations.  This is pertinent
> in a for loop.  If the for loop has this structure:
> 
> for (vector<int>::const_iterator it = vint.begin(); it != vint.end(); ++it)
> 
> the value of vint.end() has to be calculated each time
> through the loop.  On small vectors this would have no
> appreciable effect, but for very large vectors, and I would
> guess for vectors of fairly complicated data structures, the
> overhead of this recalculation could be quite significant.

Now let's stop here for a moment! :-)  Could you please tell us *which*
vector implementation you are talking about?  I only ask because in _all_
implementations I know, vector::end() is an O(1) operation.  Furthermore,
some vector implementations store end() (a random access iterator, a.k.a.
pointer) instead of size(), so end() is a simple getter of a member variable.
Copying it (as an optimization) is, as far as I know, normally a
pessimization.
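
To make the "simple getter" point concrete, here is a minimal sketch of the
layout I mean.  TinyVec is a made-up, fixed-capacity class, not code from any
real standard library; real implementations differ in detail, but those that
store an end pointer look roughly like this:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container, NOT taken from any real library.
// It only illustrates the "store the end pointer" layout.
class TinyVec {
public:
    typedef const int* const_iterator;

    TinyVec() : last_(storage_) {}

    void push_back(int value) {
        assert(size() < kCapacity);  // sketch only: no reallocation
        *last_++ = value;
    }

    const_iterator begin() const { return storage_; }
    const_iterator end()   const { return last_; }  // plain getter, O(1)

    std::size_t size() const {
        return static_cast<std::size_t>(last_ - storage_);
    }

private:
    // Copying is disabled: a copied last_ would point into the other object.
    TinyVec(const TinyVec&);
    TinyVec& operator=(const TinyVec&);

    static const std::size_t kCapacity = 16;
    int  storage_[kCapacity];
    int* last_;  // one past the last element in use
};

// The idiomatic loop: end() is re-read on every pass, which here is
// nothing more than reading one member.
int sum(const TinyVec& v) {
    int total = 0;
    for (TinyVec::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}
```

With a layout like this it is size(), not end(), that needs the arithmetic.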

> I like the advice to calculate the value of vint.end() once
> outside the body of the loop, and use that constant value for
> the terminal point of the loop.

Please let us know what makes you think that end() is calculated.  I ask
this because I have *many* times seen people prematurely optimize (away
nothing) by moving end() out of the loop, and then start to modify the
vector inside it.  You can imagine the horror of running such code.

I have to say that so far I am convinced that the above advice is a red
herring.  It actually makes the code slower (on most platforms, by copying
the value of end() into a variable unnecessarily), and it introduces a
sleeper bug into the code.  Should the vector be modified inside the loop
body (later, think of maintenance), you are looking at a few nights of heavy
debugging, since it will only crash occasionally, under high load...
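
A sketch of the kind of loop where the difference shows, assuming the body
erases elements.  remove_evens and its example function are hypothetical
helpers of mine; the point is only that the loop condition re-reads end()
after every erase():

```cpp
#include <cassert>
#include <vector>

// Removes every even value from v in place.
void remove_evens(std::vector<int>& v) {
    // A saved end would go stale at the first erase():
    //   std::vector<int>::iterator stop = v.end();   // do NOT do this
    // erase() invalidates every iterator at or after the erased element,
    // and that includes `stop`.
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ) {
        if (*it % 2 == 0)
            it = v.erase(it);  // returns the position after the erased one
        else
            ++it;
    }
}

// Usage sketch.
void remove_evens_example() {
    int raw[] = {1, 2, 3, 4, 6};
    std::vector<int> v(raw, raw + 5);
    remove_evens(v);
    assert(v.size() == 2 && v[0] == 1 && v[1] == 3);
}
```

Note that the loop also has to take its new position from the return value
of erase(); re-reading end() alone does not rescue an invalidated `it`.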

I say the above based on experience with 4 standard library implementations
on 3 platforms.  Please tell us why you think the premature optimisation
justifies the possible maintenance nightmare in this case.  Did you profile
some specific code?  What makes you think that calling end() is expensive?

> (though, physician 'heal thyself' would be appropriate here,
> since I always turn to the less efficient manner in my own
> code -- since my fingers almost automatically key the code in :) .

And that is the idiomatic way to write the code, and IMHO it is not a
coincidence. :-)

> 5. (p. 7) prefer '\n' to endl.  On some systems, the overhead
> of calling endl can be significant. In addition, in web
> applications, where probably 75% of the code involves writing
> output to the web browser, using '\n' can dramatically
> improve response times.

This part I absolutely agree with.  But a tiny little point to add: you also
want to make sure you do have an endl or a '\n' at the end of your output,
since on some Unices funny things happen if you don't. :-)
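
For what it is worth, std::endl does two things: it writes '\n' and then
flushes the stream.  A small sketch of the alternative follows; write_lines
and its example function are made-up helpers, and the single flush at the
end is my choice, not something taken from the book:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes one item per line using '\n', then flushes once at the end.
// std::endl would instead write '\n' AND flush after every single line.
void write_lines(std::ostream& out, const std::vector<std::string>& lines) {
    for (std::vector<std::string>::const_iterator it = lines.begin();
         it != lines.end(); ++it)
        out << *it << '\n';
    out.flush();
}

// Usage sketch: both spellings put the same characters on the stream,
// and the output ends with a newline either way.
void write_lines_example() {
    std::vector<std::string> lines;
    lines.push_back("alpha");
    lines.push_back("beta");

    std::ostringstream with_newline, with_endl;
    write_lines(with_newline, lines);
    with_endl << "alpha" << std::endl << "beta" << std::endl;

    assert(with_newline.str() == with_endl.str());
    assert(with_newline.str() == "alpha\nbeta\n");
}
```

The only difference between the two is how often the buffer gets flushed,
which is where the cost comes from.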

Attila