instaparse

If you're not trampolining your parser, why bother getting up in the morning?
bwstearns 2016-04-21T03:00:48.000003Z

Does anyone have any quick guidance on this question at SO: http://stackoverflow.com/questions/36706854/instaparse-series-of-numbers-or-letters-as-one-leaf

bwstearns 2016-04-21T03:01:30.000005Z

I think it might be an instance of lacking the right words to google for the answer effectively.

aengelberg 2016-04-21T03:21:44.000006Z

@bwstearns: You could concatenate all the strings as a transform step.

aengelberg 2016-04-21T03:23:01.000007Z

i.e. unhide the letter and number tags, but add :letter str, :number str into your transformer map.
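
A minimal sketch of that suggestion, assuming a hypothetical grammar (the real one is in the linked question). `letter` and `number` stay visible so each run shows up as its own node, and the transform map joins the one-character children with `str`. The rule names `S` and `item` and the space-separated input are made up for illustration.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical grammar, not the one from the SO question.
(def parser
  (insta/parser
    "S = item (<' '> item)*
     <item> = letter | number
     letter = #'[a-z]'+
     number = #'[0-9]'+"))

;; Each matched character is a separate child, e.g. [:letter "a" "b" "c"].
(def tree (parser "abc 123"))

;; str concatenates the children of every :letter and :number node.
(def joined (insta/transform {:letter str, :number str} tree))
;; => [:S "abc" "123"]
```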

bwstearns 2016-04-21T03:24:18.000008Z

@aengelberg: that's what I'm doing now. Because I'm doing it for a bunch of tags I was wondering if there was something built in for handling that as a common case or not.

aengelberg 2016-04-21T03:24:27.000009Z

Other than regexes, there's no way to concatenate strings in a way specified entirely by the instaparse grammar.

aengelberg 2016-04-21T03:25:34.000010Z

The transform approach is the easiest way I could think of out of all the "do something to the tree, fresh out of the parser" possible approaches.

bwstearns 2016-04-21T03:26:57.000011Z

That makes sense. I think what I'll do is put the preprocessor transforms into another hash to keep the more meaningful transform actions less cluttered and then merge them right before usage.

aengelberg 2016-04-21T03:27:48.000012Z

That can work. Or just call insta/transform twice, if you don't mind the performance impact of traversing the tree twice.
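
Roughly, the two options look like this. It is only a sketch: the tree is hand-written to stand in for parser output, and the `cleanup` and `semantics` maps are hypothetical names, not anything from the actual project.

```clojure
(require '[instaparse.core :as insta])

;; Hand-written hiccup tree standing in for real parser output.
(def tree [:S [:letter "a" "b"] [:number "1" "2"]])

;; Hypothetical maps: cleanup joins the leaves, semantics builds the result.
(def cleanup {:letter str, :number str})
(def semantics {:S (fn [& items] {:items (vec items)})})

;; Option 1: merge the maps and traverse the tree once.
(def one-pass (insta/transform (merge cleanup semantics) tree))

;; Option 2: traverse the tree twice, cleanup first.
(def two-pass (->> tree
                   (insta/transform cleanup)
                   (insta/transform semantics)))

;; Both give {:items ["ab" "12"]}
```

The merged version relies on the two maps not sharing keys, since `merge` lets the later map win on a conflict.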

bwstearns 2016-04-21T03:28:27.000013Z

that works too. I don't think I have any performance issues on the horizon with this project.

bwstearns 2016-04-21T03:32:38.000014Z

@aengelberg: thanks a ton for taking a look. The question got some foot traffic but no feedback. If you're looking for internet points feel free to drop what you said in there and I'll accept it. Otherwise I'll copy it in as an own-answer for the next person.

aengelberg 2016-04-21T03:50:41.000015Z

@bwstearns: Any time! I've added an answer to your post

bwstearns 2016-04-21T03:51:30.000016Z

Awesome, thanks. I hadn't thought about the performance part. Is that primarily due to the extra step of having to transform it, or is it because regexes are inherently faster than using parser rules?

aengelberg 2016-04-21T03:58:30.000017Z

#'a+' is faster than 'a'+, as letting regexes do the work of searching for all possible "a"s is faster than having instaparse do that work
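
For comparison, here is a small sketch of the two spellings side by side, using a made-up one-rule grammar. Besides the speed difference, the regex version already returns the run as a single string, so no transform step is needed.

```clojure
(require '[instaparse.core :as insta])

;; Regex does the repetition: one leaf for the whole run.
(def regex-parser (insta/parser "S = #'a+'"))

;; instaparse does the repetition: one leaf per character.
(def rule-parser (insta/parser "S = 'a'+"))

(def regex-tree (regex-parser "aaaa"))
;; => [:S "aaaa"]

(def rule-tree (rule-parser "aaaa"))
;; => [:S "a" "a" "a" "a"]
```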

aengelberg 2016-04-21T04:03:12.000019Z

@bwstearns: ^

bwstearns 2016-04-21T04:16:53.000020Z

Right, that makes sense because of the greediness. Thanks a ton for taking the time on this.

aengelberg 2016-04-21T04:20:03.000021Z

It's not exactly *because* of the greediness, it's just speedier when a Java program is doing this task than when Clojure is :simple_smile: