Re: [dev] sfeed: a simple RSS and Atom parser and reader

From: pancake <pancake_AT_youterm.com>
Date: Mon, 6 Aug 2012 00:59:53 +0200

On Aug 5, 2012, at 16:35, Hiltjo Posthuma <hiltjo_AT_codemadness.org> wrote:

> On Sun, Aug 5, 2012 at 4:11 PM, pancake <pancake_AT_youterm.com> wrote:
>> I wrote rss2html with my own xml parser and http protocol (0deps) so many years ago to read my feeds.
> In a previous version I had my own hacky XML parser, but it was too
> hard to manage a lot of corner cases imho (CDATA, HTML in XML
> specifically (eeew)). Expat also handles state while parsing a buffer. I
> might rewrite the XML parsing though if I find a good alternative.
>

Did you try parsifal? Anyway, my parser was simpler than all that strict-XML foo, so it also worked with corrupted and partially downloaded RSS files.

http://hg.youterm.com/mksend/file/14984ebd1529/parsifal

> I like to use curl because it handles https, http redirection and also
> allows me to pass the date of the latest update so HTTP caching will
> work too. But curl can easily be replaced by wget or fetch though.

I ended up using wget and processing the local files with rss2html. Depending on a library for this is imho not suckless.
>
>> Actually, the only useful feature was the 'planet' option which sorts/merges all your feeds in a single timeline.
> You can specify multiple feeds in a config file and run sfeed_update
> with this config file as a parameter. Then pipe it through sfeed_html
> .

Config file for what? A list of feeds should not live in a config file; maybe in a wrapper script or so.

A suckless way would be to export a TSV where the first field of each line is the unix timestamp, so sort -n can be used; that's more unix friendly.

In the end, a feed reader should just convert the various crappy Atom/RSS formats into a unified TSV output. The rest, even the HTML output, can be done with grep, sort and awk.
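A minimal sketch of that pipeline, assuming a hypothetical three-field layout (unix timestamp, title, url per line; sfeed's real format may differ):

```shell
#!/bin/sh
# Two tiny sample feeds in the hypothetical TSV layout:
# unix_timestamp <TAB> title <TAB> url
printf '1344200000\tfirst post\thttp://a.example/1\n'  > feed1.tsv
printf '1344210000\tsecond post\thttp://b.example/2\n' > feed2.tsv

# Merge into a single "planet" timeline, newest first: the timestamp is
# the first field, so a reverse numeric sort on the whole line is enough.
cat feed1.tsv feed2.tsv | sort -rn > timeline.tsv

# Render a minimal HTML item list with awk.
awk -F '\t' '{ printf "<li><a href=\"%s\">%s</a></li>\n", $3, $2 }' \
    timeline.tsv > items.html
```

Since everything is line-oriented, each step can be swapped or extended (head -n for limiting, date conversion, etc.) without touching the reader itself.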

>
>> The html output of my tool supports templates so i use it to create a planet.foo website to read news.
> I don't support templates, it's just hard-coded in sfeed_html.c atm.

I would suggest exporting JSON too. That moves templating to the client side, so no templating system is needed. Static HTML is good for lynx... Another option I would suggest is putting the template design in config.h.
>
>> I end up using twitter. RSS is so retro.
> I actually follow some people via twitter with RSS. I don't use
> twitter though. You can for example use the url:
> https://api.twitter.com/1/statuses/user_timeline.rss?include_rts=true&screen_name=barackobama&count=25

Yeah, I know that, but that's only useful if you use twitter in read-only mode.

Can you specify filters for words? Would grep work here?
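With the timestamp-first TSV idea above, word filtering is indeed just grep; a sketch, again assuming a hypothetical timestamp/title/url layout:

```shell
#!/bin/sh
# Sample timeline in the hypothetical TSV layout:
# unix_timestamp <TAB> title <TAB> url
printf '1344200000\tsuckless hackathon\thttp://a.example/1\n'  > timeline.tsv
printf '1344210000\tenterprise webinar\thttp://b.example/2\n' >> timeline.tsv

# Keep only items whose line mentions "suckless" (case-insensitive).
# grep -v would instead drop them, covering the "uninteresting" case.
grep -i 'suckless' timeline.tsv > filtered.tsv
```

To filter on the title field only (and not match inside URLs), awk with -F '\t' on $2 would do the same job.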

>
>> I also wanted to have a way to keep synced my already read links. But that was a boring task.
>
> Atm I just mark all items a day old or newer as new in sfeed_html and
> sfeed_plain. In your browser visited links will of course be coloured
> differently.
>

The workflow i would like to have with feeds is:

Fetch the list of new stuff
Mark items as:
 - uninteresting (strike it out, possibly add new filtering rules)
 - read later (keep a separate list of urls to read when I have time)
 - read/unread
 - favorite (flag as an important thing)
 - show/hide all news from a single feed

I understand that this workflow shouldn't be handled by sfeed, because that's a frontend issue. But the HTML output does not allow me to do any of that.

With JSON it would be easy to write such a frontend in javascript (blame me, but it's fast and it's everywhere). There's also a minimalist JSON parser named js0n that could do that from the command line too.

But people on this list would probably expect an awk-friendly format instead of JSON. (TSV can easily be converted to JSON.)
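The TSV-to-JSON step is itself an awk one-liner; a sketch assuming the same hypothetical timestamp/title/url layout, and assuming titles contain no quotes or backslashes (real data would need escaping):

```shell
#!/bin/sh
# Sample timeline in the hypothetical TSV layout:
# unix_timestamp <TAB> title <TAB> url
printf '1344210000\tsecond post\thttp://b.example/2\n'  > timeline.tsv
printf '1344200000\tfirst post\thttp://a.example/1\n'  >> timeline.tsv

# Emit one JSON array of objects for a client-side frontend to render.
awk -F '\t' '
BEGIN { printf "[" }
{
    if (NR > 1) printf ","
    printf "{\"time\":%s,\"title\":\"%s\",\"url\":\"%s\"}", $1, $2, $3
}
END { printf "]\n" }' timeline.tsv > timeline.json
```

So the reader can stay TSV-only and the JSON view becomes just another filter on the pipeline.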

--pancake
Received on Mon Aug 06 2012 - 00:59:53 CEST

This archive was generated by hypermail 2.3.0 : Mon Aug 06 2012 - 01:12:02 CEST