Re: [dev] xml parser

From: Hiltjo Posthuma <hiltjo_AT_codemadness.org>
Date: Sun, 3 Feb 2019 01:36:33 +0100

On Sat, Feb 02, 2019 at 06:15:26PM +0000, sylvain.bertrand_AT_gmail.com wrote:
> Hi,
>
> I am looking at xml parsers.
>
> I am about to go expat, but I am wondering if there are some interesting
> alternatives I did miss?
>
> --
> Sylvain
>

Hi,

I suck at writing parsers, but I wrote one:
        https://git.codemadness.org/xmlparser/file/README.html

It is a non-compliant non-strict parser that also parses a subset of HTML.
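
To give a rough idea of what "non-strict" can mean in practice, here is a
minimal sketch of a lenient <title> scanner. It is not code from xmlparser
and does not use its API: grab_title() is a hypothetical helper made up for
this illustration. It assumes the whole input is a NUL-terminated string in
memory, matches the tag name case-insensitively, skips attributes without
parsing them, and never checks well-formedness.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of xmlparser: copy the text of the first
 * <title> element in `html` into `buf`, truncating if needed.
 * Non-strict: no well-formedness checks, attributes are skipped unparsed.
 * Returns 0 on success, -1 if no title is found. */
int
grab_title(const char *html, char *buf, size_t bufsiz)
{
	const char *p, *end;
	size_t i, len;

	if (bufsiz == 0)
		return -1;
	for (p = html; (p = strchr(p, '<')); p++) {
		/* case-insensitive compare of the tag name; stops at the
		 * first mismatch, so it never reads past the NUL byte */
		for (i = 0; i < 5 && tolower((unsigned char)p[1 + i]) == "title"[i]; i++)
			;
		if (i != 5)
			continue;
		/* the name must end here, otherwise it is e.g. <titles> */
		if (p[6] != '>' && !isspace((unsigned char)p[6]))
			continue;
		if (!(p = strchr(p, '>')))
			return -1; /* unterminated tag */
		p++;
		/* take everything up to the next '<', or the end of input */
		if (!(end = strchr(p, '<')))
			end = p + strlen(p);
		len = (size_t)(end - p);
		if (len >= bufsiz)
			len = bufsiz - 1; /* truncate to fit the buffer */
		memcpy(buf, p, len);
		buf[len] = '\0';
		return 0;
	}
	return -1;
}
```

A caller would pass a buffer, e.g. grab_title(page, buf, sizeof(buf)), and
check for a 0 return. Entities, CDATA and comments are ignored entirely in
this sketch, which is the kind of corner a real parser has to deal with.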

Some examples/projects I use it for:

RSS/Atom parser:
        https://git.codemadness.org/sfeed/file/README.html

Twitter to text scraper:
        https://git.codemadness.org/tscrape/file/README.html

OpenStreetMap parser (20GB+ XML file of the Netherlands):
        https://git.codemadness.org/osm-zipcodes/file/README.html

HTML <title> grabber:
        https://git.codemadness.org/grabtitle/file/README.html


It doesn't have to be said, but of course XML sucks :)

-- 
Kind regards,
Hiltjo
Received on Sun Feb 03 2019 - 01:36:33 CET