On Sun, Mar 15, 2026 at 01:22:55PM +0000, amrit44404 wrote:
> From 39dd8d1a573f76d969a8c55e80358ec33a1c6c76 Mon Sep 17 00:00:00 2001
> From: amritxyz <amrit44404_AT_proton.me[1]>
> Date: Sun, 15 Mar 2026 18:43:11 +0545
> Subject: [PATCH] st: fix C1 bytes (0x80-0x9F) shown as garbage in UTF-8
> mode
>
> Raw C1 bytes are not valid UTF-8. utf8decode() returns U+FFFD for
> them which gets drawn on screen as a replacement character.
>
> Fix this by skipping C1 bytes in twrite() before utf8decode() sees
> them. The ESC_STR guard lets them through when inside a STR sequence
> so they can still act as sequence terminators.
>
> Also add an early return in tputc() as a safety net for any direct
> callers, and call strhandle() when a C1 byte terminates a STR
> sequence so OSC sequences are not silently lost.
>
> Tested: printf '\x8f' now produces no output.
> ---
> st.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/st.c b/st.c
> index 6f40e35..d0bf933 100644
> --- a/st.c
> +++ b/st.c
> @@ -2396,6 +2396,9 @@ tputc(Rune u)
> Glyph *gp;
>
> control = ISCONTROL(u);
> + /* in UTF-8 mode, ignore C1 control characters early */
> + if (IS_SET(MODE_UTF8) && ISCONTROLC1(u) && !(term.esc & ESC_STR))
> + return;
> if (u < 127 || !IS_SET(MODE_UTF8)) {
> c[0] = u;
> width = len = 1;
> @@ -2455,8 +2458,11 @@ check_control_code:
> */
> if (control) {
> /* in UTF-8 mode ignore handling C1 control characters */
> - if (IS_SET(MODE_UTF8) && ISCONTROLC1(u))
> + if (IS_SET(MODE_UTF8) && ISCONTROLC1(u)) {
> + if (term.esc & ESC_STR_END)
> + strhandle();
> return;
> + }
> tcontrolcode(u);
> /*
> * control codes are not shown ever
> @@ -2546,6 +2552,11 @@ twrite(const char *buf, int buflen, int show_ctrl)
>
> for (n = 0; n < buflen; n += charsize) {
> if (IS_SET(MODE_UTF8)) {
> + /* skip C1 bytes before utf8decode() mangles them */
> + if (ISCONTROLC1(buf[n] & 0xFF) && !(term.esc & ESC_STR)) {
> + charsize = 1;
> + continue;
> + }
> /* process a complete utf8 char */
> charsize = utf8decode(buf + n, &u, buflen - n);
> if (charsize == 0)
> --
> 2.53.0
>
> References
>
> 1. mailto:amrit44404_AT_proton.me
Hi,
The patch looks garbled (no TAB indent). Can you fix it and resend?
Also, are there particular applications where you noticed this?
I hope this doesn't break anything...
--
Kind regards,
Hiltjo
Received on Sun Mar 15 2026 - 16:35:00 CET