Re: [hackers] st and combining characters

From: Roberto E. Vargas Caballero <k0ga_AT_shike2.com>
Date: Mon, 15 Jun 2015 10:25:19 +0200

Hi,

        I like the idea, but I think the patch needs some evolution.
A patch of 500 lines is usually hard of reading, and in in this case
the change is not trivial.

On Wed, Jun 10, 2015 at 11:39:25PM +0200, Joakim Sindholt wrote:
> glyph now holds a union of two combining characters and a pointer to a zero
> terminated array. This is purely to make the most of the space. No length is
> kept and every character added past 2 causes a realloc.

And why 2 characters?. I think it is more logical to have only one, which is
the most common case, isn't it?. Maybe I'm missing something.

> There's currently a problem with rendering where Xft doesn't draw the
> combining characters as a part of the character they combine to, which causes
> things like Hangul Jamo to be completely unusable stacks of characters instead
> of the proper combined form.

Uhmm, I don't understand this point, can you explain it a bit better?,
or better, put an example.

> _AT_@ -98,6 +98,7 @@ enum glyph_attribute {
> ATTR_WRAP = 1 << 8,
> ATTR_WIDE = 1 << 9,
> ATTR_WDUMMY = 1 << 10,
> + ATTR_DECOMPPTR = 1 << 11,
> ATTR_BOLD_FAINT = ATTR_BOLD | ATTR_FAINT,
> };
>
> _AT_@ -190,6 +191,10 @@ typedef XftColor Color;
>
> typedef struct {
> Rune u; /* character code */
> + union {
> + Rune c[2];
> + Rune *pc;
> + } dc;

We already have the field u, so why is not the union made with u and pc?

> Line *alt; /* alternate screen */
> bool *dirty; /* dirtyness of lines */
> XftGlyphFontSpec *specbuf; /* font spec buffer used for rendering */
> + int specbuflen;


> _AT_@ -419,8 +426,9 @@ static void ttywrite(const char *, size_t);
> static void tstrsequence(uchar);
>
> static inline ushort sixd_to_16bit(int);
> -static int xmakeglyphfontspecs(XftGlyphFontSpec *, const Glyph *, int, int, int);
> -static void xdrawglyphfontspecs(const XftGlyphFontSpec *, Glyph, int, int, int);
> +static int xmakeglyphfontspecs(XftGlyphFontSpec *, int, const Glyph *, int, int, int);
> +static int xmakeglyphfontspecsintermbuf(const Glyph *, int, int, int);
> +static void xdrawglyphfontspecs(const XftGlyphFontSpec *, Glyph, int, int, int, int);

Uhmmm, these names begin to seem like java names. We must find a better way to name them,
because with names so long the only lower case style doesn't work.

> /* append every set & selected glyph to the selection */
> _AT_@ -984,6 +993,16 @@ getsel(void) {
> continue;
>
> ptr += utf8encode(gp->u, ptr);
> + if(gp->mode & ATTR_DECOMPPTR) {
> + pc = gp->dc.pc;
> + for(pc = gp->dc.pc; *pc; pc++)
> + ptr += utf8encode(*pc, ptr);
> + } else {
> + if(gp->dc.c[0])
> + ptr += utf8encode(gp->dc.c[0], ptr);
> + if(gp->dc.c[1])
> + ptr += utf8encode(gp->dc.c[1], ptr);
> + }
> }

If we define the union with u and pc we can remove the else in this code, no?

> + term.dirty[y] = 1;
> + if(term.line[y][x].mode & ATTR_DECOMPPTR) {
> + p = term.line[y][x].dc.pc;
> + while(*p)
> + p++;
> + sz = (p - term.line[y][x].dc.pc) + 2;
> + term.line[y][x].dc.pc = xrealloc(term.line[y][x].dc.pc, sz * sizeof(Rune));
> + term.line[y][x].dc.pc[sz-2] = u;
> + term.line[y][x].dc.pc[sz-1] = 0;
> + } else {
> + if(!term.line[y][x].dc.c[0]) {
> + term.line[y][x].dc.c[0] = u;
> + } else if(!term.line[y][x].dc.c[1]) {
> + term.line[y][x].dc.c[1] = u;
> + } else {
> + p = xmalloc(4 * sizeof(Rune));
> + p[0] = term.line[y][x].dc.c[0];
> + p[1] = term.line[y][x].dc.c[1];
> + p[2] = u;
> + p[3] = 0;
> + term.line[y][x].dc.pc = p;
> + term.line[y][x].mode |= ATTR_DECOMPPTR;
> + }
> + }
> +}

The same. If the union is made with u almost all off this code disappear.

> _AT_@ -1711,7 +1786,7 @@ tclearregion(int x1, int y1, int x2, int y2) {
>
> void
> tdeletechar(int n) {
> - int dst, src, size;
> + int dst, src, size, i;
> Glyph *line;
>
> LIMIT(n, 0, term.col - term.c.x);
> _AT_@ -1721,13 +1796,17 @@ tdeletechar(int n) {
> size = term.col - src;
> line = term.line[term.c.y];
>
> + for(i = dst; i < src; i++) {
> + if(line[i].mode & ATTR_DECOMPPTR)
> + line[i].dc.pc = NULL;
> + }

...

> _AT_@ -1737,6 +1816,10 @@ tinsertblank(int n) {
> size = term.col - dst;
> line = term.line[term.c.y];
>
> + for(i = src; i < dst; i++) {
> + if(line[i].mode & ATTR_DECOMPPTR)
> + line[i].dc.pc = NULL;
> + }
> memmove(&line[dst], &line[src], size * sizeof(Glyph));
> tclearregion(src, term.c.y, dst - 1, term.c.y);
> }


Maybe we should write a blank in u here, because if we have NULL and blank
everything becomes harder.


> _AT_@ -2774,6 +2857,19 @@ tputc(Rune u) {
> if(sel.ob.x != -1 && BETWEEN(term.c.y, sel.ob.y, sel.oe.y))
> selclear(NULL);
>
> + /* combining chars are added to the previous cell */
> + if(width == 0) {
> + if(term.c.x == 0) {
> + if(term.c.y == 0)
> + taddcchar(u, 0, 0);
> + else
> + taddcchar(u, term.col-1, term.c.y-1);
> + } else {
> + taddcchar(u, term.c.x-1, term.c.y);
> + }
> + return;
> + }
> +

You repeat this logic of term.c.x and term.c.y in several places of the code,
maybe is better to add some function whose name is clear enough to make obvious
this logic.

> _AT_@ -2834,12 +2936,19 @@ tresize(int col, int row) {
> memmove(term.alt, term.alt + i, row * sizeof(Line));
> }
> for(i += row; i < term.row; i++) {
> + for(j = 0; j < term.col; j++) {
> + if(term.line[i][j].mode & ATTR_DECOMPPTR)
> + free(term.line[i][j].dc.pc);
> + if(term.alt[i][j].mode & ATTR_DECOMPPTR)
> + free(term.alt[i][j].dc.pc);
> + }

...

> _AT_@ -2849,14 +2958,26 @@ tresize(int col, int row) {
>
> /* resize each row to new width, zero-pad if needed */
> for(i = 0; i < minrow; i++) {
> + for(j = col; j < term.col; j++) {
> + if(term.line[i][j].mode & ATTR_DECOMPPTR)
> + free(term.line[i][j].dc.pc);
> + if(term.alt[i][j].mode & ATTR_DECOMPPTR)
> + free(term.alt[i][j].dc.pc);
> + }

The same with this logic.

> int
> -xmakeglyphfontspecs(XftGlyphFontSpec *specs, const Glyph *glyphs, int len, int x, int y)
> +xmakeglyphfontspecs(XftGlyphFontSpec *specs, int nspecs, const Glyph *glyphs, int len, int x, int y)
> {
> - float winx = borderpx + x * xw.cw, winy = borderpx + y * xw.ch, xp, yp;
> + float winx = borderpx + x * xw.cw, winy = borderpx + y * xw.ch;

I think the very long name of this function and the number of parameters are very good
indicators we are doing something wrong here, and we should do some rewriting instead
of patching the code.

> _AT_@ -3326,11 +3464,12 @@ xmakeglyphfontspecs(XftGlyphFontSpec *specs, const Glyph *glyphs, int len, int x
> /* Lookup character index with default font. */
> glyphidx = XftCharIndex(xw.dpy, font->match, rune);
> if(glyphidx) {
> - specs[numspecs].font = font->match;
> - specs[numspecs].glyph = glyphidx;
> - specs[numspecs].x = (short)xp;
> - specs[numspecs].y = (short)yp;
> - xp += runewidth;
> + if(numspecs < nspecs) {
> + specs[numspecs].font = font->match;
> + specs[numspecs].glyph = glyphidx;
> + specs[numspecs].x = (short)xp;
> + specs[numspecs].y = (short)yp;
> + }
> numspecs++;
> continue;
> }
> _AT_@ -3401,20 +3540,43 @@ xmakeglyphfontspecs(XftGlyphFontSpec *specs, const Glyph *glyphs, int len, int x
> FcCharSetDestroy(fccharset);
> }
>
> - specs[numspecs].font = frc[f].font;
> - specs[numspecs].glyph = glyphidx;
> - specs[numspecs].x = (short)xp;
> - specs[numspecs].y = (short)(winy + frc[f].font->ascent);
> - xp += runewidth;
> + if(numspecs < nspecs) {
> + specs[numspecs].font = frc[f].font;
> + specs[numspecs].glyph = glyphidx;
> + specs[numspecs].x = (short)xp;
> + specs[numspecs].y = (short)(winy + frc[f].font->ascent);
> + }
> numspecs++;
> }
>
> return numspecs;
> }

Again, I think we could try some way to merge these two sections of code that are
identical (and they are in the same function).


I would like continue the discussion of this patch, but please, if you can find some
way to split it in small patches it will make it easier my life.

Regards,
Received on Mon Jun 15 2015 - 10:25:19 CEST

This archive was generated by hypermail 2.3.0 : Thu Jun 18 2015 - 17:37:10 CEST