Re: [dev] [RFC] Design of a vim like text editor

From: Roberto E. Vargas Caballero <k0ga_AT_shike2.com>
Date: Wed, 17 Sep 2014 09:04:21 +0200

> > Can you explain me why in C you have to represent them as int or long as
> > not with a structure like you did in c++?.
>
> Because I want the convenience of adding two byte counts using +, which I can
> do in C++, and have that compile down to exactly the same assembly as if I used ints,
> but still get a compilation error if I try to add a byte count and a line count.
>

Ains, poor mighty super compiler. Why do you want different count types
for bytes and lines?????, they are size_t. If you want to use the type
system to ensure the correctness of your code, instead of writing
good code then you have an understanding problem about programming.

> Why would I overload the addition on lines ? As you say, there is no canonical
> operation that makes sense for that. I use operator overloading only on types
> where they match their semantic.

This is the thing I said, overloading is only useful in a few cases, and
in almost of these cases a good language could add them to the language
itself. Almost of the c++ code style forbid overloading; Google C++
Style Guide [1]:

Cons:
        - While operator overloading can make code more intuitive, it
          has several drawbacks:
        - It can fool our intuition into thinking that expensive
          operations are cheap, built-in operations.
        - It is much harder to find the call sites for overloaded operators.
          Searching for Equals() is much easier than searching for relevant
          invocations of ==.
        - Some operators work on pointers too, making it easy to introduce
          bugs. Foo + 4 may do one thing, while &Foo + 4 does something
          totally different. The compiler does not complain for either of
          these, making this very hard to debug.
        - User-defined literals allow creating new syntactic forms that
          are unfamiliar even to experienced C++ programmers.
        - Overloading also has surprising ramifications. For instance,
          if a class overloads unary operator&, it cannot safely be
          forward-declared.

I can see you are a novice only with this defense you are doing about
overloading, that is one of the worst aspect of c++, even expert c++
programmers accept this point.

> I work on video games technologies, we have a multi milion lines code base,
> written by 100+ people, we need to be able to organize it and to provide
> abstraction because nobody can fit the whole thing in his mind.

I have worked also in video games, and with a code base of the size
you are mentioning, and I can say you that at least 70% of the code is
due to uglyness of C++. Take a look of st or dwm, that thanks they are
writing in good C they only has 4000/2000 lines of code. If you try
to do the same in C++ you will have near of 50000 for sure.

If you still want to have so big number of lines, then you could take a
look to the source of linux kernel and git, and you will see how you
can structure big programs with C.


> > Try to don't need such kind of things. For example Go hasn't generic and
> > you don't need them ever. You like them because your are so used to them
> > that you don't see the kind of code you are forced to write with them.
>
> That is the complaint I mostly hear about Go, the lack of Generic. That and
> garbage collection.

You should read [2]:

        C++ programmers don't come to Go because they have fought
        hard to gain exquisite control of their programming domain,
        and don't want to surrender any of it. To them, software
        isn't just about getting the job done, it's about doing it
        a certain way.

Learn to think.

> > No. Because the realloc that you have to do maybe has to move in the
> > worst case n-1 elements in a new memory arena. So you have n (moving
> > to the position) * n-1 (memcpy to new position in memory) = n^2. You
> > can decrease the possibility of reallocation allocating
> > more memory of needed, but when you have the reallocation you get
> > the O(n^2). In a list you don't have ever a reallocation.
>
> You already answered that.

No, you don't answered it. You are talking about locality and you forgive
that each time you reallocate you destroy your cache, because you have
to move the full array to a new position of memory.

>
> > If you read what I have said, iy you insert in the HEAD, and always take
> > the HEAD (FIFO fashion), you don't get any problem of locality, because
> > you directly access (or remove) the element you need. And now that you
> > are talking about deletion in dinamyc arrays, can you explain me how do you
> > delete in the middle of them?. AFAIK there are only two ways:
> >
> > [...]
>
> Ok, so on a modern processor, data access pattern matters, with your linked list,
> you basically have a cache miss at each access to the next element when you iterate,
> when you use an array, you get linear access in memory, which means you please the
> prefetcher and avoid most cache misses. That effect on performance is huge, and

I begin to think you don't read.

First you are mixing the point about data structures and memory
layout. You can build a list with a code like this (simplified version,
without free support):

        List *
        allocitem(List *lp)
        {
                static List *buffer;
                static size_t nr;
                List *bp;

                if (nr == 0) {
                        buffer = xmalloc(sizeof(*lp) * NR_BUFF);
                        nr = NR_BUFF;
                }
                bp = &buffer[--nr];
                bp->next = lp;
                return bp;
        }

This code allocates chunks of NR_BUFF and links them in consecutive
order. Run over this list will not generate a miss in each element,
and you don't get ever a reallocation. I thought you could understand
this point with the static version. I hope you understand it now.

Second, I said that lists are good when you only access to the first
element of the list, because then you don't have problems of locality,
you only access to the first element. If you want to run over the
list then is better use another structure.

It's funny people compare list and dynamic arrays, but only optimize
arrays...

> Additionally, the storage overhead of the linked list is interleaved with the
> payload, which means you stress even more your memory accesses.

And you don't stress the cache when you move the full array in
place, or reallocate it to a new position, nooooo. Dynamic arrays
are not an option when the size of the array is a few of thousand.
Another problem, what happens when you store pointers to the objects
of the array in another structure? You cannot do that, so you store
pointers to the objects in the dynamic array, and in each element
you have a miss. There are cases where dynamic arrays are good,
and cases where lists are good, and another cases where fixed size
arrays are the solution, but saying that C uses lists due to problems
of abstractions only means you don't understand anything.


Regards,

[1] http://google-styleguide.googlecode.com/svn/trunk/cppguide.html#Operator_Overloading
[2] http://commandcenter.blogspot.com.es/2012/06/less-is-exponentially-more.html

-- 
Roberto E. Vargas Caballero
Received on Wed Sep 17 2014 - 09:04:21 CEST

This archive was generated by hypermail 2.3.0 : Wed Sep 17 2014 - 09:12:07 CEST