Re: [dev] [PATCH][RFC] Add a basic version of tr

From: Szabolcs Nagy <nsz_AT_port70.net>
Date: Wed, 15 Jan 2014 21:36:07 +0100

* Silvan Jegen <s.jegen_AT_gmail.com> [2014-01-15 20:43:54 +0100]:
> Note, though, that GNU's tr does not seem to handle Unicode at all[1]
> while this version of tr, according to "perf record/report", seems to
> spend most of its running time in the Unicode handling functions of glibc.

multi-byte string decoding is known to be slow in glibc

e.g. see the UTF-8 decoding benchmark in
http://www.etalabs.net/compare_libcs.html
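
for reference, a minimal sketch of the kind of per-character decoding
loop such a benchmark exercises. count_chars is a hypothetical helper
(it is not from the patch or from the linked page); it walks a buffer
with the standard mbrtowc() and decodes according to the current locale:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the characters in the first n bytes of s
 * by decoding them one at a time with mbrtowc(), using the current
 * locale's encoding. Returns (size_t)-1 on an invalid or truncated
 * sequence.
 */
size_t
count_chars(const char *s, size_t n)
{
	mbstate_t st;
	wchar_t wc;
	size_t len, count = 0;

	assert(s != NULL);
	memset(&st, 0, sizeof st);	/* start in the initial shift state */
	while (n > 0) {
		len = mbrtowc(&wc, s, n, &st);
		if (len == (size_t)-1 || len == (size_t)-2)
			return (size_t)-1;
		if (len == 0)
			len = 1;	/* a NUL byte still occupies one byte */
		s += len;
		n -= len;
		count++;
	}
	return count;
}
```

every character costs one library call here, so the per-call overhead
of the libc's decoder dominates the loop.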

> By no means was this any serious benchmarking but eliminating the function
> pointer did not seem to make an obvious difference.

note that recent gcc (4.7?) can inline calls made through a function
pointer if it can infer that the target is in the same translation unit
(and with lto it can probably inline across translation units too)
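
a minimal sketch of the situation, with hypothetical names (none of
these functions are from the patch): the callee is static and the only
call site passes a pointer known at compile time, so an optimizing
compiler may be able to propagate the pointer and inline the call.
whether a given gcc version actually does so depends on the version
and the optimization flags:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mapping function; static, so visible only in this TU. */
static int
to_upper_ascii(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Applies f to every byte of s in place, through a function pointer. */
static void
map_bytes(char *s, int (*f)(int))
{
	assert(s != NULL && f != NULL);
	for (; *s; s++)
		*s = (char)f((unsigned char)*s);
}

/*
 * The only caller passes a known function from the same TU, so the
 * indirect call in map_bytes() is a candidate for inlining.
 */
void
upcase(char *s)
{
	map_bytes(s, to_upper_ascii);
}
```

comparing the generated assembly (gcc -O2 -S) with and without the
indirection is one way to check what the compiler did.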

> +void
> +handleescapes(char *s)
> +{
> + switch(*s) {
> + case 'n':
> + *s = '\x0A';
> + break;
> + case 't':
> + *s = '\x09';
> + break;
> + case '\\':
> + *s = '\x5c';

what's wrong with '\n' etc here?
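
i.e. something along these lines. this is only a sketch: it covers the
three cases visible in the quoted hunk, and leaving any other character
untouched is an assumption, since the rest of the function is not
quoted here:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Same switch as the quoted hunk, but with character escapes instead
 * of hex codes, so the intent is readable without an ASCII table.
 */
void
handleescapes(char *s)
{
	assert(s != NULL);
	switch (*s) {
	case 'n':
		*s = '\n';
		break;
	case 't':
		*s = '\t';
		break;
	case '\\':
		*s = '\\';
		break;
	default:
		/* assumption: unknown escapes are left as they are */
		break;
	}
}
```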

btw a fully posix conformant tr implementation is available here:
http://git.musl-libc.org/cgit/noxcuse/tree/src/tr.c

(but this is probably outside of the scope of sbase)