instaparse

If you're not trampolining your parser, why bother getting up in the morning?
borkdude 2017-12-16T17:42:57.000041Z

Hello. Why is my InstaParser so slow? 850ms vs 8ms hand-written Clojure: https://github.com/borkdude/aoc2017/blob/master/src/day16.clj#L119

borkdude 2017-12-16T17:43:09.000064Z

Maybe it’s my grammar, but I don’t see it.

aengelberg 2017-12-16T17:43:12.000109Z

How big is the input?

borkdude 2017-12-16T17:43:27.000011Z

The input is this: https://github.com/borkdude/aoc2017/blob/master/resources/day16.txt

borkdude 2017-12-16T17:44:35.000004Z

Here’s the grammar: https://github.com/borkdude/aoc2017/blob/master/src/day16.clj#L17

aengelberg 2017-12-16T17:48:34.000080Z

I mean, you're basically generating one string per character in a 40kb file. I'm not surprised it's slow.

aengelberg 2017-12-16T17:48:51.000067Z

And wrapping with vectors, etc

borkdude 2017-12-16T17:50:31.000086Z

That’s fair, but even without wrapping the arguments I get a similar time

borkdude 2017-12-16T17:50:48.000041Z

I had that before, but I wanted to transform the arguments to ints, that’s why I wrapped them later

aengelberg 2017-12-16T17:51:33.000068Z

Hmm yeah

borkdude 2017-12-16T17:53:21.000093Z

I mean, it’s not really a problem, but just curious why or if I made a mistake in my grammar

aengelberg 2017-12-16T17:53:55.000100Z

I don't think you made a mistake like an ambiguity issue or anything

aengelberg 2017-12-16T17:54:23.000039Z

It's just really exercising the parser's dataflow overhead

aengelberg 2017-12-16T17:55:15.000048Z

I saw similar issues when trying to perf-tune @dave's Alda parser a while ago

aengelberg 2017-12-16T17:56:09.000061Z

Because his rules were like "if you see this single character, parse this other single character"

aengelberg 2017-12-16T17:57:41.000023Z

Instaparse does a lot of bookkeeping during a parse to make sure it magically works with weirdly recursive grammars, so for super low level parsers like this it doesn't exactly shine

borkdude 2017-12-16T17:57:57.000037Z

(def parse2
  (insta/parser
   "<INPUT>       = (INSTRUCTION <','>)+ INSTRUCTION 
    <INSTRUCTION> = SPIN | EXCHANGE | PARTNER
    SPIN          = <'s'> POSITION
    EXCHANGE      = <'x'> POSITION <'/'> POSITION
    PARTNER       = <'p'> PROGRAM  <'/'> PROGRAM
    <POSITION>    = #'\\d\\d?'
    <PROGRAM>     = #'[a-p]'"))
No nesting of position and program, 800ms

borkdude 2017-12-16T17:58:15.000060Z

ok

aengelberg 2017-12-16T17:59:53.000114Z

Just curious, does anything improve if you change the first rule to INSTRUCTION (<','> INSTRUCTION)*?

borkdude 2017-12-16T18:04:45.000106Z

I wondered about that rule as well. Quickbenching…

borkdude 2017-12-16T18:05:13.000060Z

Yup, 582ms!

borkdude 2017-12-16T18:05:50.000002Z

Does the order of rules matter for performance?

borkdude 2017-12-16T18:06:21.000007Z

I mean, when it’s more likely to encounter EXCHANGE, does it help putting that first?