Re: [dev] [sbase][RFC] Add a simplistic version of tr

From: sin <>
Date: Thu, 28 Nov 2013 12:45:40 +0200

On Tue, Nov 26, 2013 at 12:01:01PM -0800, Silvan Jegen wrote:
> Hi
> This is a braindead and incomplete implementation of tr that only
> works for one-byte encodings. Do you think it makes sense to use this
> implementation as some kind of stopgap-measure until we have a more
> robust version of tr?

This particular version of the patch does not introduce a manpage,
which would be necessary to document the limited behaviour of the
current program.

I am starting to wonder: do you guys think it would make sense to
have a staging branch that we can use for incomplete tools? Currently
some of the tools implement only a subset of the total behaviour, but I'd
like to believe that they implement that subset correctly. As long as
we document that, they can go in master, with possible
eprintf("not implemented") calls for the options that we care about.
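
As a rough sketch of what "recognized but not implemented" could look
like: the helper name notimplemented() and the option list are
assumptions made up for illustration, and fprintf() stands in for
sbase's eprintf() (which also exits) so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Options this sketch treats as recognized but unimplemented (assumed set). */
static const char *todo[] = { "-c", "-d", "-s" };

/*
 * Returns 1 and prints a diagnostic if opt is recognized but not yet
 * implemented, 0 otherwise. fprintf() is a stand-in for eprintf().
 */
int
notimplemented(const char *prog, const char *opt)
{
	size_t i;

	assert(prog != NULL && opt != NULL);
	for (i = 0; i < sizeof(todo) / sizeof(todo[0]); i++) {
		if (strcmp(opt, todo[i]) == 0) {
			fprintf(stderr, "%s: %s: not implemented\n", prog, opt);
			return 1;
		}
	}
	return 0;
}
```

A tool would call this from its option parsing and exit with failure
when it returns 1, so the documented subset stays honest about what is
missing.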

Programs that are obviously buggy can go in the staging branch.

> If you would rather not take this version, what approach would
> you take for the character set mapping when using UTF-8? A hashmap-,
> or B-tree-based solution or something else entirely?

I am not knowledgeable enough about UTF-8, so I can't answer this.
I think a B-tree is overkill for sbase. We do not have a nice
implementation of a hash table in sbase because we have not needed one, but
if we go down that path it makes sense to put it in util/ so other
programs can benefit. Currently we don't have a reusable implementation of
a singly linked list either, but that is trivial enough that
we've re-implemented it wherever needed (with the minimum set of
operations needed for each tool). I can send an implementation of
a hash table that I've used for my own programs; it is MIT/X licensed and
simple enough.
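
To make the hash-table option concrete, here is a minimal sketch of a
code-point-to-code-point map using open addressing with linear probing.
It is not the implementation mentioned above; the names (struct runemap,
runemap_set, runemap_get), the fixed table size, and the hash constant
are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBITS  10
#define NSLOTS (1u << NBITS)	/* fixed capacity, assumed large enough */
#define EMPTY  UINT32_MAX	/* not a valid Unicode code point */

struct runemap {
	uint32_t key[NSLOTS];
	uint32_t val[NSLOTS];
};

void
runemap_init(struct runemap *m)
{
	size_t i;

	for (i = 0; i < NSLOTS; i++)
		m->key[i] = EMPTY;
}

/* Multiplicative hash; keeps the top NBITS bits as the slot index. */
static size_t
slot(uint32_t k)
{
	return (size_t)((uint32_t)(k * 2654435761u) >> (32 - NBITS));
}

/* Insert or overwrite; returns 0 on success, -1 if the table is full. */
int
runemap_set(struct runemap *m, uint32_t k, uint32_t v)
{
	size_t i, n;

	assert(k != EMPTY);
	for (i = slot(k), n = 0; n < NSLOTS; i = (i + 1) & (NSLOTS - 1), n++) {
		if (m->key[i] == EMPTY || m->key[i] == k) {
			m->key[i] = k;
			m->val[i] = v;
			return 0;
		}
	}
	return -1;
}

/* Returns the mapped code point, or k itself when there is no mapping. */
uint32_t
runemap_get(const struct runemap *m, uint32_t k)
{
	size_t i, n;

	for (i = slot(k), n = 0; n < NSLOTS; i = (i + 1) & (NSLOTS - 1), n++) {
		if (m->key[i] == EMPTY)
			break;
		if (m->key[i] == k)
			return m->val[i];
	}
	return k;
}
```

For tr, set1 code points would be inserted with their set2 counterparts,
and every decoded input code point would go through runemap_get(), which
passes unmapped characters through unchanged.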

Regarding UTF-8, some other programs in sbase also lack proper handling
of UTF-8. Do you think we could embed libutf8 from and
use it?

> +usage(void)
> +{
> + eprintf("usage: tr set1 [set2]\n");
> +}

Use %s and argv0.
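
In the patch itself that would amount to passing argv0 as the argument
for a %s in the usage string. The sketch below shows the same idea in a
form that compiles on its own: the argv0 definition and the usage_line()
helper are stand-ins invented here, and fputs()/exit() replace eprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the program-name global; assumed to be set during startup. */
char *argv0 = "tr";

/* Formats the usage line into buf; returns the length snprintf reports. */
int
usage_line(char *buf, size_t len)
{
	assert(argv0 != NULL);
	return snprintf(buf, len, "usage: %s set1 [set2]\n", argv0);
}

/* Prints the usage line to stderr and exits with failure. */
void
usage(void)
{
	char buf[128];

	usage_line(buf, sizeof(buf));
	fputs(buf, stderr);
	exit(EXIT_FAILURE);
}
```

This way the message follows whatever name the binary was invoked under
instead of hard-coding "tr".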

> +void
> +handle_escapes(char *s)
> +{
> + switch(*s) {
> + case 'n':
> + *s = '\x0A';
> + break;
> + case 't':
> + *s = '\x09';
> + break;
> + case '\\':
> + *s = '\x5c';
> + break;
> + }
> +}

I have not yet applied this patch, but I suspect you have
mixed spaces and tabs here. Use tabs only.

> + if (ferror(stdin)) {
> + eprintf("<stdin>: read error:");
> + return EXIT_FAILURE;
> + }

Indentation issues.

I'll have a look at the rest of the code once I have
some time today.

Received on Thu Nov 28 2013 - 11:45:40 CET
