On 11 June 2016 at 07:34, <k0ga_AT_shike2.com> wrote:
> Strings are not idempotent. In C strings, any pointer into
> the string is a new string. Splitting a string is only a matter
> of writing a 0. Splitting a Pascal string requires allocating
> a new chunk of memory and copying all the characters.
Slices (as in Go) fix this, though, since a slice is not
    struct string { size_t size; char ar[]; };
but
    struct string { size_t size; char *ptr; };
so splitting one only means making a second (size, ptr) pair that
points into the same bytes.
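To make the splitting point concrete, here is a minimal sketch in C.
The names `struct slice`, `slice_from_cstr` and `slice_split` are made
up for illustration and do not come from any library; the sketch also
assumes the underlying bytes outlive both halves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slice type: a length plus a pointer into storage owned elsewhere. */
struct slice {
    size_t size;
    char *ptr;
};

/* Wrap a NUL-terminated C string as a slice (no copy is made). */
static struct slice slice_from_cstr(char *s)
{
    struct slice r = { strlen(s), s };
    return r;
}

/* Split s at index n: *head gets the first n bytes, *tail gets the rest.
 * Nothing is copied or written to; both halves alias the original bytes. */
static void slice_split(struct slice s, size_t n,
                        struct slice *head, struct slice *tail)
{
    assert(n <= s.size);
    head->size = n;
    head->ptr = s.ptr;
    tail->size = s.size - n;
    tail->ptr = s.ptr + n;
}
```

Unlike the "write a 0" trick, this leaves the original buffer untouched,
so the unsplit string stays usable afterwards.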
> Fixed maximum size. Pascal strings used a byte for the size,
> which meant that you could not have strings longer than 255
> characters. Of course you can increase the size field to whatever
> you want, but then you waste a lot of space.
The obvious cost of the slice layout above is a word of overhead per
string instead of only a byte. Note, though, that since you use
`struct string s` in place of `char *s`, the pointer itself adds no
additional overhead.
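A rough way to check that claim, as a sketch only: `struct slice` and
`slice_overhead` are hypothetical names, and the exact number depends on
the platform's padding rules. It is typically one word, for example 8
bytes on common 64-bit systems.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical slice type, same layout as the second struct above. */
struct slice {
    size_t size;
    char *ptr;
};

/* The slice takes the place of the pointer variable, so it must be
 * at least as large as both of its fields together. */
static_assert(sizeof(struct slice) >= sizeof(size_t) + sizeof(char *),
              "a slice holds a size and a pointer");

/* Extra bytes a slice variable costs over a bare char * variable. */
static size_t slice_overhead(void)
{
    return sizeof(struct slice) - sizeof(char *);
}
```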
> In both strings you can mess everything up if you access out of bounds,
> so they have the same problem.
A size field does make a bounds check much cheaper, though, which
makes it easier to be absolutely sure (albeit with a performance hit
compared to not checking at all); e.g.
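    #include <stdbool.h>
    #include <stddef.h>

    /* One comparison, independent of the string's length. */
    bool inbounds(struct string *s, size_t n) {
        return n < s->size;
    }
vs.
    /* Has to scan for the terminator: up to n + 1 comparisons. */
    bool inbounds(char *s, size_t n) {
        for (size_t i = 0; i <= n; i++)
            if (s[i] == '\0')
                return false;
        return true;
    }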
So, as long as you're willing to take the performance and space hit,
slices probably are safer. But since everything I've said above is
also true for arrays, and C arrays don't carry an explicit length either,
not very much is really gained. Ultimately, if you want that kind of
assurance of safety at the cost of performance, you'd probably be
better off using a memory-safe language instead of C.
One unfortunate side effect of C strings, though, is that as soon as
you have to deal with the possibility of embedded NUL bytes, everything
gets a bit awkward.
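A small sketch of that awkwardness, assuming a hypothetical `struct slice`
with an explicit length (`slice_equal` and `nul_demo` are illustrative
names too): the standard `strlen` and `strcmp` stop at the first NUL,
whereas a length-carrying comparison via `memcmp` sees every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only slice: explicit length, no terminator needed. */
struct slice {
    size_t size;
    const char *ptr;
};

/* Compare two slices byte for byte, including any embedded NUL bytes. */
static int slice_equal(struct slice a, struct slice b)
{
    return a.size == b.size && memcmp(a.ptr, b.ptr, a.size) == 0;
}

static void nul_demo(void)
{
    static const char data[] = "ab\0cd";    /* 5 bytes plus the terminator */
    struct slice whole = { sizeof data - 1, data };
    struct slice other = { 5, "ab\0xy" };

    assert(strlen(data) == 2);              /* the C string view ends at the NUL */
    assert(strcmp(data, other.ptr) == 0);   /* so these two look equal */
    assert(whole.size == 5);                /* the slice keeps all 5 bytes */
    assert(!slice_equal(whole, other));     /* and sees the difference */
}
```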
cls
Received on Sat Jun 11 2016 - 10:12:48 CEST