Re: [dev] suckless html to markdown (text)

From: Alexander Krotov <ilabdsf_AT_gmail.com>
Date: Sun, 6 Jan 2019 12:44:11 +0300

> Ideally, with sed/awk, or better in C.

"Parsing" HTML with sed is simply wrong.

You need to use a decent HTML parsing library, as parsing HTML is complex.
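
To illustrate what handing the tokenizing to a library looks like, here is a
minimal sketch using Python's standard html.parser module. It is not the
method used by any tool mentioned in this thread. The class name
MarkdownSketch and the function html_to_markdown are hypothetical names
chosen for this example, and only a handful of tags (h1-h3, p, em/i,
strong/b, a) are handled; everything else passes through as plain text.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Tiny HTML-to-Markdown converter covering only a few tags (illustrative)."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as &amp;
        super().__init__(convert_charrefs=True)
        self.out = []      # output fragments, joined at the end
        self.href = None   # target of the link currently being processed

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", and so on
            self.out.append("#" * int(tag[1]) + " ")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            # block-level elements end with a blank line
            self.out.append("\n\n")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.out.append("](%s)" % (self.href or ""))
            self.href = None

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html):
    """Convert a small HTML fragment to Markdown (hypothetical helper)."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out).strip()
```

Calling html_to_markdown('<p>Some <em>text</em></p>') yields 'Some *text*'.
Lists, tables, nested block elements and escaping of Markdown
metacharacters are all left out here, which is a good part of why a real
converter ends up much bigger than this.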

There is https://github.com/yujiahaol68/downmark, which uses the Go html
library, but I have not tried it.

Seriously though, if you are not going to convert HTML to markdown every
day and you are not building a long-term solution, just use pandoc.
Received on Sun Jan 06 2019 - 10:44:11 CET