Re: [dev] [st] Erasing UTF-8 characters in ed

From: Roberto E. Vargas Caballero <>
Date: Fri, 25 Jul 2014 09:34:58 +0200

> OK, I think I understand, thanks. So basically, there is no "real
> delete character" in VT10X emulators? I couldn't find it in man stty,

There is no Backspace key, but there is a Delete key (in a real VT100).
The Delete character is 7FH (^?), a character that the terminal has
to ignore. If you can see it, it is because the kernel transforms it
into the character sequence '^' '?', which is printable.
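
To make the '^' '?' transformation concrete, here is a small Python sketch
of the caret notation the tty uses when it echoes control characters
(echoctl in stty). It is only my simplified model, not the kernel code: the
name caret_notation is made up for the example, and the real driver also
exempts the start and stop characters.

```python
def caret_notation(byte: int) -> str:
    """Render one ASCII byte roughly the way a tty with echoctl echoes it."""
    # Tab and newline are echoed as themselves, not as ^I / ^J.
    if byte in (0x09, 0x0A):
        return chr(byte)
    # Control characters (and DEL) are shown as '^' plus the byte with
    # bit 0x40 flipped: 0x03 -> 'C', 0x7F -> '?'.
    if byte < 0x20 or byte == 0x7F:
        return "^" + chr(byte ^ 0x40)
    # Everything else in ASCII is printable and echoed unchanged.
    return chr(byte)


print(caret_notation(0x7F))  # the Delete character
print(caret_notation(0x03))  # the default intr character
```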

The kernel has a line driver, which operates in different modes. When
you are working in ed, the kernel reads all the characters, allows
line editing, and only sends the characters to the program after
a newline. This is the reason why you can configure the erase key or the
word erase key with stty. The list of keys that the kernel accepts is:

cchars: discard = ^O; dsusp = ^Y; eof = ^D; eol = ^@; eol2 = ^@;
        erase = ^?; intr = ^C; kill = ^U; lnext = ^V; min = 1; quit = ^\;
        reprint = ^R; start = ^Q; status = <undef>; stop = ^S; susp = ^Z;
        time = 0; werase = ^W;

If you want, you can change intr from ^C to c, and then you will interrupt
your programs by typing a plain c (it is stupid, but you can).
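
As a rough illustration of the line editing described above, the following
Python sketch applies erase, werase and kill to a stream of input bytes and
returns the line that a program like ed would finally read. It is a toy
model, not the kernel's implementation: the function cook_line and its utf8
parameter are invented for the example, and the utf8 switch only imitates
what I understand the Linux iutf8 flag to do (erase a whole multibyte
character instead of a single byte).

```python
# Key bindings taken from the cchars list: erase = ^?, werase = ^W, kill = ^U.
ERASE, WERASE, KILL = 0x7F, 0x17, 0x15
NEWLINE, SPACE = 0x0A, 0x20


def cook_line(data: bytes, utf8: bool = True) -> bytes:
    """Apply line editing to typed bytes; return the line handed to the program."""
    buf = bytearray()
    for b in data:
        if b == ERASE:
            if utf8:
                # Drop UTF-8 continuation bytes (10xxxxxx) before the lead byte.
                while buf and (buf[-1] & 0xC0) == 0x80:
                    buf.pop()
            if buf:
                buf.pop()
        elif b == WERASE:
            # Remove trailing spaces, then the word before them.
            while buf and buf[-1] == SPACE:
                buf.pop()
            while buf and buf[-1] != SPACE:
                buf.pop()
        elif b == KILL:
            buf.clear()
        elif b == NEWLINE:
            # The line is only delivered once the newline arrives.
            buf.append(b)
            break
        else:
            buf.append(b)
    return bytes(buf)


typed = "é".encode("utf-8") + bytes([ERASE]) + b"\n"
print(cook_line(typed, utf8=True))
print(cook_line(typed, utf8=False))
```

With utf8=False, erasing 'é' leaves the stray byte 0xC3 in the line, which
is the kind of damage you get when erase works on bytes rather than on
characters.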

When you are working with bash, the line driver works in another mode,
where the kernel passes all the characters directly to the program,
so there is no erase key. In this mode the program itself has to give
a meaning to every key that the user presses. Terminfo gives
a definition of the characters generated by each key:

        - kbs: Backspace key
        - kdch1: Delete key

And now it is the programmer's decision what the function of each
key is. In the case of readline (which is used today by almost all
shells), I think it deletes the previous character with ^H, ^?, and the
kbs sequence, and it deletes the current character with kdch1. It is
important to note that readline does all the work needed to show this
'delete action' to the user (for example, to delete the current character
on some terminals it has to rewrite the line up to the end).
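
A minimal sketch of what "the program gives a meaning to each key" can look
like, in Python. This is not readline's code, only a hypothetical editor
function (edit_line) that I made up. The byte sequences are assumptions too:
I assume Backspace sends DEL (terminfo kbs), Delete sends ESC [ 3 ~
(terminfo kdch1) and the left arrow sends ESC [ D, which is what st sends as
far as I know. It handles ASCII input only.

```python
KBS = b"\x7f"       # assumed sequence of the Backspace key
KDCH1 = b"\x1b[3~"  # assumed sequence of the Delete key
LEFT = b"\x1b[D"    # assumed sequence of the left arrow key


def edit_line(keys: bytes) -> str:
    """Interpret raw key bytes and return the resulting line (ASCII only)."""
    line = []  # characters of the line being edited
    cur = 0    # cursor position inside the line
    i = 0
    while i < len(keys):
        rest = keys[i:]
        if rest.startswith(KDCH1):
            # Delete key: remove the character under the cursor.
            if cur < len(line):
                del line[cur]
            i += len(KDCH1)
        elif rest.startswith(LEFT):
            cur = max(0, cur - 1)
            i += len(LEFT)
        elif rest[:1] in (b"\x08", KBS):
            # ^H or ^?: remove the character before the cursor.
            if cur > 0:
                cur -= 1
                del line[cur]
            i += 1
        else:
            line.insert(cur, chr(keys[i]))
            cur += 1
            i += 1
    return "".join(line)


print(edit_line(b"Hello" + KBS))           # Backspace at end of line
print(edit_line(b"Hello" + LEFT + KDCH1))  # Delete on the last character
```

A real program also has to redraw the line after each of these operations;
the sketch only tracks the contents of the buffer.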

A different point is what the terminal understands. I mean, if
you write in a shell 'printf "Hello\033[D\033[P\n"' then you will
see 'Hell' on the next line, but if you redirect the printf to a file
you will get the original string. This happens because the terminal
understands the sequences you are passing to it, but in this case
the kernel doesn't understand them, so you see
one string on the screen while the kernel sees a different one.
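
The difference between the stream and the screen can be sketched with a
tiny one-line "terminal" in Python. It is only a model that I wrote for this
example (the function render is hypothetical) and it understands just two
sequences: ESC [ n D (move the cursor left) and ESC [ n P (delete the
character under the cursor).

```python
import re

CSI = re.compile(r"\x1b\[(\d*)([DP])")


def render(data: bytes) -> str:
    """Return what a one-line screen shows after processing data (ASCII only)."""
    cells = []  # characters currently on the screen line
    col = 0     # cursor column
    text = data.decode("ascii")
    i = 0
    while i < len(text):
        m = CSI.match(text, i)
        if m:
            # A missing or zero parameter counts as 1.
            n = max(1, int(m.group(1) or 1))
            if m.group(2) == "D":
                col = max(0, col - n)    # cursor left
            else:
                del cells[col:col + n]   # delete characters at the cursor
            i = m.end()
        elif text[i] == "\n":
            break                        # the model has only one line
        else:
            if col < len(cells):
                cells[col] = text[i]     # overwrite an existing cell
            else:
                cells.append(text[i])
            col += 1
            i += 1
    return "".join(cells)


stream = b"Hello\x1b[D\x1b[P\n"
print(render(stream))  # what the screen shows
print(stream)          # what a file or a pipe receives
```

The bytes in stream are never modified; only the screen contents differ,
which is why redirecting to a file gives back the original string.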

> and it's logical given the explanations in the link you provided [0].
> It seems the only way to delete the character under the cursor is to
> use an escape sequence ("\033[P", or maybe also "\033[1K"). That's
> surprising to me!

This is the only way to delete a character graphically, but it
doesn't mean that the character is deleted from the stream. See
my previous example.

> fr_FR.UTF-8 (more details in the enclosed file)
> > Could you send a file with a small session where I could see the problem?
> > (session files can be generated using the -o option)
> I've attached a session file. It's st from tip, running rc, without
> any "stty erase" — I used my delete key as the erase character. I also
> tried LC_ALL=C.UTF-8, to no avail.

I hope I will have some time this weekend to take a look at your session.


Roberto E. Vargas Caballero
Received on Fri Jul 25 2014 - 09:34:58 CEST