Re: [dev] Lexers and parsers from Maurício on 2009-08-20 (dev mail list archive)

From: Maurício <mauricio.antunes_AT_gmail.com>
Date: Thu, 20 Aug 2009 12:31:50 -0300

(2/2 - I believe these messages didn't go to
the list. Sorry if they actually did.)

>> (...) If you had such "shell yacc", how would you like it to be
>> or behave?

> (...) So the important thing is being able to whip something up
> quickly; this isn't parser "specs" that's going to be carefully
> developed and then used for a very long time.

Sure. I want something that helps with testing and can deal with
complex input, or even input of unknown structure against which
you want to check whether a grammar works, even if only
temporarily. Example: someone gives you some unorganized data and
you just want to transform it into something you can deal with.

> A general point: one of the most important things to think
> about, particularly with parsers, is what would be most
> effective in tracking down the inevitable problems when there's
> a bug in the user input and/or mismatched input, particularly if
> it happens in the middle of a pipe process: how are you going to
> report which part of the input stream was wrong, particularly if
> it doesn't exist on its own, in a way which is effective for a
> human to track down the problem? (...)

The exact answer will probably depend on the chosen grammar type
and parsing algorithm. By allowing the user to specify limits on
match size or depth of analysis, we could keep error logs readable.
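
As a rough illustration of a bounded error report, here is a
minimal Python sketch. It reads "limits on match size" loosely, as
a cap on how much of the offending input an error record quotes,
which is only one possible interpretation. The token set, the
function name first_error and the max_context parameter are all
hypothetical. The record carries its own excerpt because, in the
middle of a pipe, the input may not exist anywhere else.

```python
import re

# Hypothetical toy token set: numbers, identifiers, and a few operators.
TOKEN = re.compile(r"\d+|[A-Za-z_]+|[=+;]")
SPACE = re.compile(r"\s*")

def first_error(text, max_context=20):
    """Return None if all of text tokenizes, else a bounded error record."""
    pos = SPACE.match(text, 0).end()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            # 1-based line and column of the first character that fits no token.
            line = text.count("\n", 0, pos) + 1
            column = pos - text.rfind("\n", 0, pos)
            return {
                "offset": pos,
                "line": line,
                "column": column,
                # Quote at most max_context characters so logs stay readable.
                "excerpt": text[pos:pos + max_context],
            }
        # Skip whitespace so the next failure points at the bad character.
        pos = SPACE.match(text, m.end()).end()
    return None
```

Calling first_error(sys.stdin.read()) inside a pipeline would then
give a record that a human can read and a later stage can parse.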

However, "errors" in these tools should not be errors in a strict
sense. I do want to write a tool that you can use to check grammar
hypotheses on text, and that means that even if you don't get
fatal errors you still want to know how well your grammar did with
some input, and get a meaningful report on, say, how long a match
had to be to resolve ambiguity, how deep an analysis had to be to
find a match, which false matches were most common etc., and you
want this report to be suitable for automated analysis.
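
A minimal Python sketch of such a report, under stated assumptions:
each "grammar hypothesis" is reduced to a regular expression that
must match a whole line, and only a subset of the statistics
mentioned above is collected (hits per rule, ambiguous lines,
unmatched lines). The rule names and the function grammar_report
are hypothetical, chosen for illustration only.

```python
import json
import re
from collections import Counter

# Hypothetical rule set: each entry is one guess about a line's structure.
RULES = {
    "key_value": re.compile(r"\w+\s*=\s*\S+"),
    "csv_row": re.compile(r"[^,]+(?:,[^,]+)+"),
    "number": re.compile(r"-?\d+(?:\.\d+)?"),
}

def grammar_report(lines, rules=RULES):
    """Count how each rule fares on the non-blank lines of the input."""
    hits = Counter()
    total = unmatched = ambiguous = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        total += 1
        # A rule "hits" only if it accounts for the whole line.
        matched = [name for name, rx in rules.items() if rx.fullmatch(line)]
        hits.update(matched)
        if not matched:
            unmatched += 1
        elif len(matched) > 1:
            ambiguous += 1
    return {
        "lines": total,
        "hits": dict(hits),
        "unmatched": unmatched,
        "ambiguous": ambiguous,
    }

def report_json(lines, rules=RULES):
    """Serialize the report so another program can consume it."""
    return json.dumps(grammar_report(lines, rules), sort_keys=True)
```

Nothing here is treated as a fatal error: a line that fits no rule,
or more than one, is just another number in the report.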

Best,
Maurício
Received on Thu Aug 20 2009 - 15:31:50 UTC
