instaparse

If you're not trampolining your parser, why bother getting up in the morning?
micha 2015-08-17T14:30:46.000092Z

@aengelberg: made some progress this weekend with my instaparse project! https://github.com/adzerk-oss/zerkdown

micha 2015-08-17T14:31:28.000094Z

it's a work in progress of course

micha 2015-08-17T14:31:54.000095Z

it does a braindead parsing of clojure maps/vectors

micha 2015-08-17T14:32:39.000097Z

feedback appreciated :simple_smile:

aengelberg 2015-08-17T20:18:21.000098Z

@micha this project is very cool

aengelberg 2015-08-17T20:18:44.000099Z

Does your ebnf allow [{]?

micha 2015-08-17T20:19:15.000100Z

not in a :CLJ or :VEC block

micha 2015-08-17T20:19:31.000101Z

["{"] would be ok though

aengelberg 2015-08-17T20:20:05.000102Z

<VEC-CHAR> = !(LSB | RSB | DQ) ANY-CHAR looks like it would allow mismatched map delimiters inside it

micha 2015-08-17T20:20:32.000103Z

oh interesting

micha 2015-08-17T20:20:52.000104Z

yeah it's ambiguous

micha 2015-08-17T20:21:24.000105Z

!(LSB | RSB | STRING | MAP) ANY-CHAR would be nice there

aengelberg 2015-08-17T20:22:00.000106Z

then it would still allow mismatched quotes and delimiters, because a malformed string or map doesn't successfully parse, so the negative lookahead lets it through :simple_smile:

micha 2015-08-17T20:22:36.000107Z

hmm

aengelberg 2015-08-17T20:23:17.000108Z

maybe (!(LSB | RSB | LCB | RCB | DQ) ANY-CHAR) | STRING | CLJ

micha 2015-08-17T20:23:24.000109Z

yeah

micha 2015-08-17T20:23:42.000110Z

i will try that

micha 2015-08-17T20:23:54.000111Z

i am planning to do the recursion from clojure btw

micha 2015-08-17T20:24:24.000112Z

i will parse one level of indentation, then for each :BLOCK call insta again on the body

micha 2015-08-17T20:25:10.000113Z

it seems like it will be straightforward, i hope

aengelberg 2015-08-17T20:25:22.000114Z

cool

aengelberg 2015-08-17T20:26:03.000115Z

Just make sure instaparse Failures are returned / short-circuited properly :simple_smile:

micha 2015-08-17T20:26:28.000116Z

how do you mean?

aengelberg 2015-08-17T20:26:51.000117Z

if a "sub-parse" returns a failure, then what?

aengelberg 2015-08-17T20:29:43.000118Z

I imagine it will be most idiomatic to call insta/parse again within the transformer. But my point is, if a parse failure arises (malformed zerkdown) within that, you will need to propagate that error properly :simple_smile:
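
A minimal sketch of that failure propagation, assuming instaparse is on the classpath. The `outer` and `inner` grammars and the `parse-doc` helper are hypothetical stand-ins for the block-level and block-body parsers, not code from zerkdown:

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical outer grammar: blocks separated by semicolons.
(def outer
  (insta/parser
    "DOC = BLOCK (<';'> BLOCK)*
     BLOCK = #'[^;]+'"))

;; Hypothetical inner grammar: lowercase words separated by single spaces.
(def inner
  (insta/parser
    "WORDS = WORD (<' '> WORD)*
     WORD = #'[a-z]+'"))

(defn parse-doc
  "Parses s with outer, re-parses each block body with inner, and returns
  the first sub-parse failure instead of a tree that contains one."
  [s]
  (insta/transform
    {:BLOCK (fn [body] (insta/parse inner body))
     :DOC   (fn [& blocks]
              (or (first (filter insta/failure? blocks))
                  (into [:DOC] blocks)))}
    (insta/parse outer s)))

(insta/failure? (parse-doc "ab cd;ef")) ; => false, both blocks are well-formed
(insta/failure? (parse-doc "ab cd;E1")) ; => true, the second block fails in inner
```

`insta/transform` hands a failed outer parse back unchanged, so checking the result of `parse-doc` with `insta/failure?` should cover both levels in this sketch.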

micha 2015-08-17T20:32:00.000119Z

ah right

micha 2015-08-17T20:32:38.000120Z

what did you mean before about strings and maps not successfully parsing?

aengelberg 2015-08-17T20:33:43.000121Z

negative lookahead = make sure this thing does not successfully parse

aengelberg 2015-08-17T20:37:11.000122Z

!STRING x means no "complete well-formed strings" allowed, but you probably wanted "no double-quotes of any kind really"

aengelberg 2015-08-17T20:41:24.000123Z

and then you can add in | STRING to allow well-formed strings
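
The difference between `!STRING` and `!DQ` can be checked at the REPL. A small sketch, assuming instaparse is on the classpath; both toy grammars and the names `no-strings` and `no-quotes` are made up for illustration and are not part of the zerkdown grammar:

```clojure
(require '[instaparse.core :as insta])

;; Rejects a character only where a complete string would start.
(def no-strings
  (insta/parser
    "S = (!STRING ANY-CHAR)*
     STRING = #'\"[^\"]*\"'
     ANY-CHAR = #'.'"))

;; Rejects every double-quote character, matched or not.
(def no-quotes
  (insta/parser
    "S = (!DQ ANY-CHAR)*
     DQ = '\"'
     ANY-CHAR = #'.'"))

;; A lone quote is not a complete STRING, so the !STRING lookahead passes.
(insta/failure? (insta/parse no-strings "a\"b"))   ; => false
;; A complete string trips the lookahead, so the input is rejected.
(insta/failure? (insta/parse no-strings "a\"b\"")) ; => true
;; !DQ rejects the lone quote as well.
(insta/failure? (insta/parse no-quotes "a\"b"))    ; => true
```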

micha 2015-08-17T20:50:43.000124Z

oh i see

micha 2015-08-17T20:50:51.000125Z

i actually don't care about double quotes

micha 2015-08-17T20:51:10.000126Z

i just don't want well-formed strings, because those can legitimately contain {}[] etc

micha 2015-08-17T20:51:39.000127Z

i'm not trying to fully parse the clojure data, i just need to know where it ends

micha 2015-08-17T20:51:59.000128Z

i send it as a string and use clojure.core/read-string on it later

aengelberg 2015-08-17T20:52:20.000129Z

that's fair, but if [{]}] is allowed it's not exactly obvious where it ends :simple_smile:

micha 2015-08-17T20:52:33.000130Z

haha yes

micha 2015-08-17T20:53:21.000131Z

very interesting

aengelberg 2015-08-17T20:53:29.000132Z

anyway I don't think it's super hard to make the delimiters correct. (!(LSB | RSB | LCB | RCB | DQ) ANY-CHAR) | STRING | CLJ

aengelberg 2015-08-17T20:53:40.000133Z

That basically says "no double-quotes, UNLESS there is a well-formed string"
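
A rough sketch comparing the two rules, assuming instaparse is on the classpath. The rule names follow the chat, but the rule bodies, the shared `terminals` string, and the parser names `loose` and `strict` are assumptions for illustration, not the actual zerkdown grammar:

```clojure
(require '[instaparse.core :as insta])

;; Assumed terminal definitions shared by both toy grammars.
(def terminals
  "STRING = #'\"[^\"]*\"'
   ANY-CHAR = #'.'
   LSB = '['
   RSB = ']'
   LCB = '{'
   RCB = '}'
   DQ = '\"'")

;; Earlier rule: only square brackets and quotes are excluded,
;; so a lone curly brace counts as an ordinary character.
(def loose
  (insta/parser
    (str "VEC = LSB VEC-CHAR* RSB
          VEC-CHAR = !(LSB | RSB | DQ) ANY-CHAR\n"
         terminals)))

;; Suggested rule: all delimiters are excluded as plain characters,
;; and only re-admitted as part of a complete STRING or nested CLJ form.
(def strict
  (insta/parser
    (str "CLJ = VEC | MAP
          VEC = LSB CLJ-CHAR* RSB
          MAP = LCB CLJ-CHAR* RCB
          CLJ-CHAR = (!(LSB | RSB | LCB | RCB | DQ) ANY-CHAR) | STRING | CLJ\n"
         terminals)))

(insta/failure? (insta/parse loose  "[{]"))     ; => false, mismatched brace accepted
(insta/failure? (insta/parse strict "[{]"))     ; => true
(insta/failure? (insta/parse strict "[{]}]"))   ; => true
(insta/failure? (insta/parse strict "[\"{\"]")) ; => false, brace inside a string is fine
```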

micha 2015-08-17T20:54:08.000134Z

yes that's awesome

micha 2015-08-17T20:54:36.000135Z

testing was pretty easy to do, by configuring with different start rules

aengelberg 2015-08-17T20:54:55.000136Z

yeah. just don't forget negative testing :simple_smile:
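
A sketch of per-rule tests that include negative cases, assuming instaparse and clojure.test are available. The grammar is a hypothetical cut-down one, and `parses?` is a made-up helper; `:start` is instaparse's option for choosing the start rule of a single parse:

```clojure
(require '[clojure.test :refer [deftest is run-tests]]
         '[instaparse.core :as insta])

;; Hypothetical grammar for illustration only.
(def parser
  (insta/parser
    "CLJ = VEC | MAP
     VEC = LSB CLJ-CHAR* RSB
     MAP = LCB CLJ-CHAR* RCB
     CLJ-CHAR = (!(LSB | RSB | LCB | RCB | DQ) ANY-CHAR) | STRING | CLJ
     STRING = #'\"[^\"]*\"'
     ANY-CHAR = #'.'
     LSB = '['
     RSB = ']'
     LCB = '{'
     RCB = '}'
     DQ = '\"'"))

(defn parses?
  "True when s parses completely, starting from the given rule."
  [rule s]
  (not (insta/failure? (insta/parse parser s :start rule))))

(deftest string-rule
  (is (parses? :STRING "\"a [ b\""))
  ;; negative case: no closing quote
  (is (not (parses? :STRING "\"unterminated"))))

(deftest vec-rule
  (is (parses? :VEC "[1 {:a 2}]"))
  ;; negative cases: mismatched and unclosed delimiters
  (is (not (parses? :VEC "[{]")))
  (is (not (parses? :VEC "[1 2"))))

(run-tests)
```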

micha 2015-08-17T20:55:10.000137Z

ah right

micha 2015-08-17T20:55:19.000138Z

yeah i didn't think of that

aengelberg 2015-08-17T20:57:13.000139Z

really cool idea. what is the intended use case for zerkdown?

micha 2015-08-17T21:16:56.000140Z

well i want to use it for just about everything!

micha 2015-08-17T21:17:02.000141Z

mostly for websites

micha 2015-08-17T21:17:14.000142Z

but i can imagine using it for literate programming and things like that

micha 2015-08-17T21:17:52.000143Z

but for making webapps it's really nice to have a "prose" syntax you can customize for your use case

micha 2015-08-17T21:18:37.000144Z

like normally you have something like

# My Title

that compiles down to

<h1>My Title</h1>

micha 2015-08-17T21:20:21.000145Z

but what if you need something like

<h1>My Title <small>The Best Thing Ever</small></h1>

i want to be able to just define a new inline tag for that, like

# My Title <<The Best Thing Ever>>

micha 2015-08-17T21:20:38.000146Z

or even more complex things with behavior and everything, like forms and buttons