Re: [dev] lex and yacc

From: Louis Santillan <lpsantil_AT_gmail.com>
Date: Sun, 31 Mar 2019 22:45:04 -0700

On Sun, Mar 10, 2019 at 6:48 AM <sylvain.bertrand_AT_gmail.com> wrote:
>
> On Sun, Mar 10, 2019 at 06:17:16AM +0100, Markus Wichmann wrote:
> > Well, other people have made that point before: Why use a regex to
> > identify a token when a simple loop will do?
> >
> > So for lexing, usually a simple token parser in C will do the job
> > better. And for parsing, you get the problem that yacc will create an
> > LALR parser, which is a bottom-up parser. Which may be faster but
> > doesn't allow for good error messages on faulty input ("error: expected
> > this, that, or the other token before this one"). That's why top-down
> > recursive-descent parsers (or LL(1) parsers) are superior. Maybe
> > supplemented with a shunting-yard algorithm to get the binary
> > expressions right without having to call layer after layer of functions.
>
> This is exactly what I am experiencing while coding this little/simple custom
> language parser.
> Yep, I guess lex/yacc (and then GNU flex/GNU bison) are inappropriate. I would
> even generalize and say they do not belong in "suckless".
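
For reference, the "simple loop" style of lexer mentioned above might look
roughly like the sketch below. It is not taken from any project in this
thread. The names (struct token, next_token, the TOK_* kinds) are made up
for illustration, and it assumes a NUL-terminated ASCII input with only
integers, identifiers and single-character punctuation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny expression language. */
enum tok_kind { TOK_EOF, TOK_NUM, TOK_IDENT, TOK_PUNCT, TOK_ERR };

struct token {
    enum tok_kind kind;
    const char *start;  /* points into the source buffer, not copied */
    size_t len;
};

/* Scan one token starting at *p and advance *p past it. */
static struct token next_token(const char **p)
{
    const char *s;
    struct token t;

    assert(p != NULL && *p != NULL);
    s = *p;
    while (isspace((unsigned char)*s))
        s++;

    t.start = s;
    if (*s == '\0') {
        t.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*s)) {
        t.kind = TOK_NUM;
        while (isdigit((unsigned char)*s))
            s++;
    } else if (isalpha((unsigned char)*s) || *s == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*s) || *s == '_')
            s++;
    } else if (ispunct((unsigned char)*s)) {
        t.kind = TOK_PUNCT;   /* single-character operators only */
        s++;
    } else {
        t.kind = TOK_ERR;     /* e.g. a stray control character */
        s++;
    }
    t.len = (size_t)(s - t.start);
    *p = s;
    return t;
}
```

Calling next_token(&src) repeatedly until it returns TOK_EOF walks the
whole input. Multi-character operators, string literals and comments
would each need another branch in the same if/else chain.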


There are options. Have you tried the Lemon parser generator [0] or
miniyacc + qbe [1][2]? ucpp [3] lexes/parses C-like languages, C
preprocessing included. re2c [4] is a great lexer generator. Crockford
prefers Pratt's Top-Down Operator Precedence [5][6], and the source repo
for his page even includes a nifty lexer that is easy to translate from
JS to C [7].
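
To give a feel for the top-down operator precedence idea, here is a
minimal sketch in C. It is my own illustration, not Crockford's code from
[5]. It assumes integer-only input with + - * /, unary minus and
parentheses, and it evaluates while parsing instead of building a tree.
The names (struct parser, lbp, parse_prefix, parse_expr) and the binding
power values are arbitrary choices for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser state: just a cursor into the input string. */
struct parser { const char *s; };

static void skip_ws(struct parser *p)
{
    while (isspace((unsigned char)*p->s))
        p->s++;
}

/* Left binding power of an infix operator; 0 means "not an operator". */
static int lbp(char op)
{
    switch (op) {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    default:            return 0;
    }
}

static long parse_expr(struct parser *p, int rbp);

/* Handles whatever may start an expression: '(', unary '-', or a number. */
static long parse_prefix(struct parser *p)
{
    long v = 0;

    skip_ws(p);
    if (*p->s == '(') {
        p->s++;
        v = parse_expr(p, 0);
        skip_ws(p);
        if (*p->s == ')')
            p->s++;   /* a real parser would report a missing ')' here */
        return v;
    }
    if (*p->s == '-') {
        p->s++;
        return -parse_expr(p, 30);  /* unary minus binds tighter than '*' */
    }
    while (isdigit((unsigned char)*p->s))
        v = v * 10 + (*p->s++ - '0');
    return v;
}

/* Core loop: consume infix operators that bind tighter than rbp. */
static long parse_expr(struct parser *p, int rbp)
{
    long left, right;
    char op;

    assert(p != NULL && p->s != NULL);
    left = parse_prefix(p);
    for (;;) {
        skip_ws(p);
        op = *p->s;
        if (lbp(op) <= rbp)
            break;
        p->s++;
        right = parse_expr(p, lbp(op));  /* same power: left-associative */
        switch (op) {
        case '+': left += right; break;
        case '-': left -= right; break;
        case '*': left *= right; break;
        case '/': left = right ? left / right : 0; break;  /* no error reporting in this sketch */
        }
    }
    return left;
}
```

With struct parser p = { "1 + 2 * 3" }, parse_expr(&p, 0) evaluates the
whole string. Adding an operator is one more case in lbp() and one more
case in the switch, which is most of the appeal compared to one function
per precedence level.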

HTH,

[0] https://www.hwaci.com/sw/lemon/
[1] http://c9x.me/yacc/
[2] http://c9x.me/compile/
[3] https://github.com/lpsantil/ucpp
[4] http://re2c.org/
[5] http://crockford.com/javascript/tdop/tdop.html
[6] https://www.oilshell.org/blog/2016/11/02.html
[7] https://github.com/douglascrockford/TDOP/blob/master/tokens.js
Received on Mon Apr 01 2019 - 07:45:04 CEST
