instaparse

If you're not trampolining your parser, why bother getting up in the morning?
mlimotte 2018-08-02T18:10:30.000427Z

Can I get some help with a (hopefully) simple grammar? I haven’t done much with CFGs, so I could be totally off base. I want to find variable expressions in a string. For example: hello, {{name}}. This is similar to the Mustache variety of interpolation, but I need to pre-parse it to do something slightly different. I can recognize the pattern above pretty easily. My problem is getting it to ignore single brackets. For example: hello, {{name}}. Please choose {Yes, No}. The last part is not a double-bracket expression and should just be treated like the other uninteresting text. So, my grammar looks like this (I’ve tried a bunch of other variations; this is the closest I’ve come):

(def p 
  (insta/parser
    "<S> = (block | TXT)*
     block = <'{{'> TXT <'}}'>
     <TXT> = (OPEN | CLOSE | A | block)*
     <OPEN> = !'{' '{'
     <CLOSE> = !'}' '}'
     <A> = #'[^{}]*'"))

mlimotte 2018-08-02T18:12:03.000393Z

A call (p "x{a}") yields:

=> Parse error at line 1, column 2:
x{a}
 ^
Expected one of:
"{{"
"}"
NOT "{"

mlimotte 2018-08-02T18:12:54.000338Z

Seems like the x got picked up by <A>. I would have liked it to match !'{', so that the next char could match in <OPEN>

aengelberg 2018-08-02T18:13:07.000172Z

try changing the OPEN and CLOSE rules to

<OPEN> = '{' !'{'
<CLOSE> = '}' !'}'

mlimotte 2018-08-02T18:14:34.000506Z

😄

mlimotte 2018-08-02T18:14:38.000449Z

That seems to work.

aengelberg 2018-08-02T18:15:10.000293Z

The problem in the original grammar was that the negative lookahead was conflicting with the token itself. It was basically saying "If there isn't an open bracket, please parse an open bracket"

aengelberg 2018-08-02T18:15:26.000439Z

Whereas what you really want is "Please parse an open bracket but only if there isn't another open bracket right after"
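
A minimal sketch of that difference, assuming instaparse is on the classpath and required as `insta`. The names `original` and `fixed` are made up here for illustration; the two grammars are the ones from this thread and differ only in the OPEN and CLOSE rules.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical name: the grammar as first posted.
;; OPEN = !'{' '{' asks for "no open bracket here, then an open bracket",
;; so it can never match anything.
(def original
  (insta/parser
    "<S> = (block | TXT)*
     block = <'{{'> TXT <'}}'>
     <TXT> = (OPEN | CLOSE | A | block)*
     <OPEN> = !'{' '{'
     <CLOSE> = !'}' '}'
     <A> = #'[^{}]*'"))

;; Hypothetical name: the same grammar with the lookahead moved after the token.
;; OPEN = '{' !'{' consumes one bracket, then checks that another does not follow.
(def fixed
  (insta/parser
    "<S> = (block | TXT)*
     block = <'{{'> TXT <'}}'>
     <TXT> = (OPEN | CLOSE | A | block)*
     <OPEN> = '{' !'{'
     <CLOSE> = '}' !'}'
     <A> = #'[^{}]*'"))

;; insta/failure? reports whether a parse result is a failure object.
(println (insta/failure? (original "x{a}"))) ; the parse error shown earlier
(println (insta/failure? (fixed "x{a}")))    ; the single brackets are now plain text
```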

mlimotte 2018-08-02T18:15:53.000503Z

hmm.. ok, i think that makes sense to me.

mlimotte 2018-08-02T18:16:06.000513Z

Very cool. Thanks for the quick help!

aengelberg 2018-08-02T18:16:12.000082Z

no problem

mlimotte 2018-08-02T18:18:58.000506Z

Here’s an edge case that still fails. But it’s a bit contrived, so if it’s not a trivial fix, I don’t need to worry about it. (p "{{y}")

aengelberg 2018-08-02T18:19:11.000332Z

do you want that to parse as normal text?

mlimotte 2018-08-02T18:19:21.000488Z

yep

mlimotte 2018-08-02T18:19:43.000336Z

not a block

aengelberg 2018-08-02T18:21:38.000386Z

maybe something like

<S> = TXT*
block = <'{{'> TXT <'}}'>
<TXT> = (block / A)*
<A> = #'[^{}]*' | '{' | '}'

aengelberg 2018-08-02T18:22:27.000444Z

here I'm changing the A rule to match any text (including brackets and double brackets) but then using the ordered choice (`/`) to prefer parsing complete blocks when possible.
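
A rough sketch of that grammar wired into a parser, assuming instaparse is required as `insta`. The name `p2` is hypothetical, chosen only to keep it apart from the earlier `p`. The results are printed rather than spelled out, since the exact shape of the returned tree is not shown in this thread.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical name p2. A accepts any text, including lone brackets;
;; the ordered choice (/) in TXT prefers a complete block over A.
(def p2
  (insta/parser
    "<S> = TXT*
     block = <'{{'> TXT <'}}'>
     <TXT> = (block / A)*
     <A> = #'[^{}]*' | '{' | '}'"))

;; The edge case: an unclosed double bracket should fall back to plain text.
(println (insta/failure? (p2 "{{y}")))

;; The motivating input: one real block plus a single-bracket phrase.
(println (p2 "hello, {{name}}. Please choose {Yes, No}."))
```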

mlimotte 2018-08-02T18:24:41.000041Z

oh.. that’s great. I had tried an approach like that previously, but didn’t know how to prefer one parse over another … that / operator is new to me.

aengelberg 2018-08-02T18:24:57.000363Z

👍

mlimotte 2018-08-02T18:25:28.000254Z

thanks for your help, again

aengelberg 2018-08-02T18:25:33.000177Z

np