Friday, September 26, 2008

size_t

What a pain in the ass.. size_t is the type one should use for loop counters that index arrays in C/C++.
I've been doing it. I get it: it's sort of like unsigned int, but sized for addressing memory, so it scales to 64-bit (addressing) architectures, and being unsigned it also lets you use the full 32-bit range (no more oversights where one is limited to accessing 2GB of memory instead of 4).

Still, it's a pain in the ass!
For example, size_t becomes ugly with backward for-loops that use the counter to index a 0-based array (unsafe):

for (size_t i=vec.size()-1; i >=0; --i)
{
    // ...
}
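..if vec.size() is 0, the start value wraps around (unsigned "underflow") to the largest size_t, about 4 billion on 32 bit, instead of -1. Worse, i >= 0 is always true for an unsigned type, so the loop never ends on its own even when the vector is not empty: once i reaches 0, --i just wraps around again 8)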
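
The wrap-around is easy to see in isolation. A minimal sketch (the names step_back and kMaxSize are made up for illustration):

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: what --i does to a size_t loop counter.
std::size_t step_back(std::size_t i)
{
    // Unsigned arithmetic is modulo 2^N, so 0 - 1 wraps to the maximum value.
    return i - 1;
}

// The largest size_t: about 4 billion with 32 bits, far more with 64.
const std::size_t kMaxSize = std::numeric_limits<std::size_t>::max();
```

step_back(0) gives kMaxSize, which still satisfies i >= 0, so that loop condition can never turn false.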

So, one has to typecast like this (safe as long as the size fits in a signed int, i.e. up to about 2 billion elements with a 32-bit int):

for (int i=(int)vec.size()-1; i >=0; --i)
{
    // ...
}

..but in most loops one doesn't really have to address 4 billion things, or even 2 billion. So, what a pain in the ass to have to worry about underflow !!
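
For what it's worth, there is a way to keep size_t and skip the cast: start at size() and decrement inside the condition. This is only a sketch of that idiom, and visit_backwards is a made-up helper that records the elements in the order the loop sees them:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: walks vec back to front and returns what it visited.
std::vector<int> visit_backwards(const std::vector<int>& vec)
{
    std::vector<int> visited;
    visited.reserve(vec.size());

    // The test "i > 0" runs before the decrement, so the body sees
    // size()-1 down to 0, and an empty vector never enters the loop.
    for (std::size_t i = vec.size(); i-- > 0; )
    {
        visited.push_back(vec[i]);
    }
    return visited;
}
```

The counter does wrap once, after the last test fails, but by then the loop has already exited and i is never used to index anything.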

bha

3 comments:

  1. Hmm, I don't like backwards loops as a rule anyway.

    I'd rather have:
    for (size_t i = 0; i < vec.size(); i++)
    {
        size_t index = vec.size() - i;
        // ......
    }

    And like you say, when do you ever want to have 2 billion iterations?

    I still use unsigned int for my loop counters.

  2. index = vec.size() - i ..will address your items in the 1..n range (from n down to 1), like a Pascal array, not like a 0-based C array, which I think we all use nowadays.
    But I guess you could do:
    index = vec.size() - i - 1 and still be safe inside the for-loop.. however you lose the handy "compare to 0".
    I used to write 99% of my loops backwards because I feared a full-fledged CMP.. I guess I was influenced by assembly programming thinking, where comparing to 0 is usually cheaper, and in general I wanted to be sure that the loop compares against an immediate value and not against some variable that may have to be fetched from memory at every iteration.

    We don't need 2 billion iterations currently, but we could be streaming files bigger than 2GB.

    The problem with programming, especially with C++ and STL is that it's getting into the paranoia domain, where one truly has to develop at a higher level but also at a deeper logical level.
    I think we can consider that an unproductive disease for programmers.. when programmers start worrying so much about details that there's no room left for creativity outside that clumsy framework (I guess programmers always worried about details, but as we evolve that never seems to get any better).

    size_t is an example, STL iterators come next.. give me a break ! Too many nested levels of mental masturbation !!

    Fack the C++ experts.. write something useful, stop being a language-dork !!

  3. Yeah, of course, I forgot the -1.

    Speaking of the asm way of thinking:

    I was surprised to find that inline asm can't be used on 64 bit platforms (with Visual C++, at least)...

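
For reference, the forward loop with a mirrored index that comments 1 to 3 converge on (with the -1 in place) could be sketched like this. The helper name visit_mirrored is made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts forward but reads vec back to front.
std::vector<int> visit_mirrored(const std::vector<int>& vec)
{
    std::vector<int> visited;
    visited.reserve(vec.size());

    for (std::size_t i = 0; i < vec.size(); ++i)
    {
        // i < size() holds here, so size() - i - 1 stays in 0..size()-1
        // and cannot wrap around.
        const std::size_t index = vec.size() - i - 1;
        visited.push_back(vec[index]);
    }
    return visited;
}
```

The price is the one named in comment 2: the loop compares against vec.size() on every iteration instead of against 0.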