[Effective-cpp] Item 1: Uses and Abuses of vector

Balog Pal pasa at lib.hu
Wed Oct 27 14:02:33 EDT 2004


> > OTOH, loop optimization is a special area in the optimizer,
> > and a good one will go lengths to find invariants, removing
> > them from loops, on such one we will see identical binary.
> > Locals are pretty well optimized, especially ones with
> > trivial data flow -- so I can't see any chance to observe
> > pessimized code.
>
> I can.  If accessing the local is the same cost as accesing end(), your
code
> is longer with that very assignment to the local.  And in addition more
> fragile.

That would require a completely useless optimizer. As I mentioned, access to
locals is the easiest optimisation.

> > As I mentioned in my first post on this issue, I write loop
> > headers a certain way to express the intent, not for speed
> > reasons.  Precalculated end means the the container will be
> > stable, call end() in the test means it will
> > change in some cases.   My personal meta-info. :)
>
> My problem with this approach is that it only works in the ideal case.
The
> expression "My personal" does not exist in large teams or large
> organizations.

At large organisations 'my personal' will be substituded for some company
policy. Backed up by a review system enforcing it.  (Hopefully ;)

If I were to write one such policy I'd simply shotrcut your problem-set:
forbidding that for() iteration on a vector that adds or removes elements in
the body.

Such operation shall be rare by nature, and for most using a vector is an
abuse in its own right (wow, we're on topic ;).   And even f we really want
the vector, the operation must be carried out with extreme care.  Looking
every step, checking how the ins/del influences out current cursor.  Even if
you just push_back() elements, and have enpugh reserve, it is still unclear
whether you want the iteration include the newly added elements, or stop at
the former end.   All that stuff shall be there commented.

> >> But I have seen broken code too
> >> many times, where end() was pre-evaluated and the vector (later)
> >> changed inside the loop.
> >
> > Well, that happens when the tail swings the dog.
>
> Wag the dog.  I don't think so.  It happens, when unexperienced people
write
> C++ code, or touch existing C++ code.  Which means it happens every day.

In "in large teams or large organizations"?
Unexperienced or unaware people will always wreak havoc -- you can suggest
them anything you like.  IMHO books like the one we discuss, and this
discussion is aimed at those already having the basic experience, and look
for ideas to improve.

The primary purpose to write some code is to do some behavior. Do what is
intended.  Speed and other optimisation issues are secondary by nature.
In a professional environment it is even more the case, there I wouldn't
even think certain shape of code suggest optimisation, it shall reflect the
intent instead.

So seeing end() set aside in a const variable to me suggests 'end() will not
change' and not something else.  Then if I see vector getting modified in
the loop, it shall trigger an error.

> > No one would likely
> > trade speed for incorrectness. If iterators get invalidated
> > inside the loop, you shall not
> > precalculate them.
>
> When?  Where I work a code-base is usually used for 10+ years.
>
> > But in those case just having the end() test is
> > not enough either -- as you'll likely invalidate the running iterator
> > as well.
>
> Not necessarily.

Not necessarily, but if you want to play safe, the baseline assumptetin is
'iterators are invalid'.  If they are not, there must be a proof, why.  And
it shall be there with the code.  Otherwise the resulting code will be very
fragile.

Suppose the example mentioned -- reserve at beginning of function, then add
element in the iteration, expecting you stay within capacity. Thus just use
the iterators. Someone may change the code later removing that reserve()
thinking it just another optimisation step.  Or comes up with some other
optimistic guess for the item count -- thinking it covers most cases, and
the rest may go with an extra reallocation.

For such cases you better annonce the parts necessary to work in ensemble.
And in the process I see little chance to introduce the bug with the loop
end condition.

> > Unless you expect the code to be literary -- and suggest the intent.
> > In that case set-aside end mean stable vector, and not a means to
> > gain some cycles.
>
> That may be true for you, but I do not think it is a convention yet.

Conventons are tied to the environment. :)  Certainly if the codebase you
work on is not shaped this way you can not use that convention -- even if it
wolud be a worldwide accepted and used idiom.

> >> Can anyone show any measurements or any fact/proof that this
> >> optimization  is actually worth the risk of broken code?

> > IMHO that is a wrong approach to the problem, whatever the data would
> > show. Suppose two cases: A) showing big speed difference and B)
> > showing no difference.  Would that change your opinion on the
> > issue?
>
> Yes.
>
> > If in your environment people tend to screw up, will
> > that be better to know in the rest of cases they gain speed?
>
> If doing so gains speed than it is worth to teach people when to do so and
> when not.  So that they will not screw up.  If it does not or rarely does,
> then it is much-much easier to just teach the idiomatic form with no
> pre-fetched end().

That covers only half of the issue.

> > Or if it turns out equal, will sticking to direct
> > end() calls remove *all* the related problems?
>
> I cannot see how that question is relevant.

It is tied to the complex problem of 'iterating a vector and reshaping it
en-route'. Messing up the end-condiotion is only one of the dangers.  If
people think it is the only one, they will fall to pray to the other issues.
(some may get pretty subtle, depending on random actual capacity of the
vector).

IMHO if you cover this whole prolem, messing up the 'cacheing end' issue
just goes away by itself.  Then there's no point to force either to cache it
or not.





More information about the Effective-cpp mailing list