Re: [dev] rebooting the web (it was: surf rewrite for WebKit2GTK)

From: Anthony J. Bentley <>
Date: Fri, 31 Oct 2014 14:17:32 -0600

FRIGN writes:
> On Fri, 31 Oct 2014 20:04:58 +0100
> Alexandre Niveau <> wrote:
> > <!DOCTYPE html>
> > <meta charset=UTF-8>
> > <title>Page title</title>
> > <p>Hello world</p>
> This is not valid XHTML.

Then that’s XHTML’s deficiency. This has been valid HTML (except the
doctype) for over 15 years. There’s no ambiguity either; all XML would
do is add redundant elements.

> > I'd say it's hard to suck less than that as far as HTML goes...
> Well, look at what XHTML 2.0 tried to achieve (it was a step in the
> right direction). I'll never use HTML5 for the simple reason that
> it's a bloated hell. So you better not insult the suckless-philosophy
> with some HTML5-smartness.

HTML is hardly what I would call a good language. But XHTML is no better.
Talk about bloat: XSLT? Seriously? HTML, in practice, is simpler and saner
than XHTML.

> > Also it's worth noting that while it's still recommended to keep the
> > meta charset tag in there, using any encoding other than UTF-8 is
> > invalid HTML5 [3].
> No, read your link again. It was talking about XML-documents, which
> actually declare the charset in a sane place (at the bloody beginning).

The sane place is the HTTP header. Well, saner would be to assume UTF-8
by default, but this is the next best option.
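To make the header approach concrete, here is a minimal Python sketch of reading the charset from a Content-Type header value and falling back to UTF-8 when none is given. The function name charset_from_content_type and the sample header values are made up for illustration; only the standard library's email.message.Message is used.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` when the header carries no charset
    parameter (the "assume UTF-8" behaviour, as an assumption here).
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the parameter lowercased, or None.
    return msg.get_content_charset() or default


if __name__ == "__main__":
    print(charset_from_content_type("text/html; charset=ISO-8859-1"))
    print(charset_from_content_type("text/html"))
```

The point is that the encoding is known before a single byte of the document body has to be interpreted.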

> > I believe that all these simplifications do not break backwards
> > compatibility too much (that's the whole point), but I'm not certain.
> > Maybe that's the reason why you still have to use XHTML?
> Have to? XHTML is my weapon of choice, because it is not parsed with
> a stupid and bloated SGML-parser but with an XML-parser.

Come on. HTML hasn’t been parsed as SGML since the early 90s.

HTML5 has been some steps forward and some steps back. But one of the
unambiguously good things they did was drop any pretense of SGML
compatibility, and introduce well‐defined error handling rules (instead
of the XML practice of dropping everything on the floor as soon as the
parser sees a missing angle bracket).
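A rough illustration of the difference, using only Python's standard library. Note the assumptions: html.parser is a lenient tokenizer, not an implementation of the HTML5 parsing algorithm, so this only shows the general "keep going" behaviour versus XML's fatal errors. The helper names (TextCollector, html_text, xml_parses) and the sample markup are invented for this sketch.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Collect non-blank text nodes, ignoring tag structure."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_text(markup):
    """Feed markup to the lenient HTML tokenizer and return its text."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


def xml_parses(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    broken = "<p>Hello <b>world</p>"  # <b> is never closed
    print(html_text(broken))   # the text content is still recovered
    print(xml_parses(broken))  # the XML parser rejects the whole input
```

With the unclosed <b>, the HTML side still yields the text, while the XML side yields nothing at all.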

Talk about error handling. What should happen when you hit a malformed
byte sequence in text that is supposed to be UTF-8? Obviously you should
replace it with U+FFFD and continue
as normal, like everything else does. But XML will terminate processing
right there… if you’re lucky. More likely, you’ll see a pretty yellow
screen of death and zero meaningful content. What a robust technology to
base the suckless world on.
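The two behaviours are easy to compare in a few lines of Python. This is a sketch under stated assumptions: the sample bytes and the helper names decode_lenient and xml_accepts are hypothetical, and the XML side uses the standard library's expat-based ElementTree, which stands in for "an XML parser" in general.

```python
import xml.etree.ElementTree as ET

# 0xE9 is "e with acute" in Latin-1; on its own it is not valid UTF-8.
RAW = b"<p>caf\xe9</p>"


def decode_lenient(data):
    """Decode as UTF-8, substituting U+FFFD for malformed sequences."""
    return data.decode("utf-8", errors="replace")


def xml_accepts(data):
    """Return True if the XML parser accepts the bytes, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(decode_lenient(RAW))  # one replacement character, rest intact
    print(xml_accepts(RAW))     # the whole document is rejected
```

One bad byte costs the lenient decoder a single character; it costs the XML parser the entire document.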

The real disgusting parts of HTML5 are CSS and Javascript, and the XML
bits that are seeping in, like MathML. An XHTML world would embrace
those, or replace them with alternatives that are even worse.

XML is just SGML with a little air freshener sprayed over it.

Anthony J. Bentley
Received on Fri Oct 31 2014 - 21:17:32 CET
