instaparse

If you're not trampolining your parser, why bother getting up in the morning?
aengelberg 2015-06-17T00:06:34.000039Z

That part of the abnf namespace doesn't seem to be properly designed for supplementary characters at all...

aengelberg 2015-06-17T00:07:49.000040Z

Unlike the clj version, it isn't using any utility to turn an oversized integer (a code point above 0xFFFF) into a pair of characters.

aengelberg 2015-06-17T02:09:38.000041Z

The get-char-combinator function needs a rework, so ABNF terminals like %x5D-10FFFF can work.

aengelberg 2015-06-17T02:10:43.000042Z

JavaScript, unlike Java, does not seem to support regular expressions with \x{10FFFF}.
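A quick way to see the gap, as a minimal sketch rather than anything taken from instaparse-cljs: without the ES6 `u` flag, a JavaScript regex built from the Java-style `\x{10FFFF}` source does not match the code point U+10FFFF, while a pattern spelling out the two surrogate code units does. The variable names here are made up for illustration.

```javascript
// U+10FFFF written as its two UTF-16 code units (a surrogate pair).
const astral = "\uDBFF\uDFFF";

// Java-style code point escape, compiled as a plain (non-"u") JS regex.
// JS has no \x{...} code point escape, so this does not mean U+10FFFF here.
const javaStyle = new RegExp("\\x{10FFFF}");
const javaStyleMatches = javaStyle.test(astral);

// Spelling out the surrogate pair explicitly does work.
const pairPattern = /^\uDBFF\uDFFF$/;
const pairMatches = pairPattern.test(astral);

console.log(javaStyleMatches, pairMatches);
```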

aengelberg 2015-06-17T02:13:24.000043Z

In instaparse for Clojure, a single character is represented as a string combinator holding the surrogate pair (two 16-bit chars side by side), and a character range uses the regex \x{10FFFF} syntax. Neither ClojureScript nor JavaScript appears to have much support for either of these. It may be impossible to support Unicode character ranges in ABNF without introducing third-party JS libraries.
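For reference, the surrogate-pair arithmetic itself is small. The following is a hedged sketch in plain JavaScript; `codePointToString` is a hypothetical name, not a function from instaparse or the Closure Library, and it only illustrates the standard UTF-16 encoding.

```javascript
// Hypothetical helper: encode one Unicode code point as a JS string.
// Code points up to 0xFFFF fit in one 16-bit unit; anything above
// needs a high/low surrogate pair.
function codePointToString(cp) {
  if (cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("code point out of range: " + cp);
  }
  if (cp <= 0xFFFF) {
    return String.fromCharCode(cp);
  }
  const offset = cp - 0x10000;             // 20 bits remain
  const high = 0xD800 + (offset >> 10);    // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return String.fromCharCode(high, low);
}

console.log(codePointToString(0x10FFFF).length);
```

So a terminal such as %x10FFFF would become a two-character string, which a plain string combinator can match.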

aengelberg 2015-06-17T02:40:12.000044Z

OK, the former is doable via goog.i18n.uChar/fromCharCode.

lucasbradstreet 2015-06-17T04:03:46.000045Z

Nice! Yeah, this character support code is probably the weakest part of the port.

lucasbradstreet 2015-06-17T04:04:38.000046Z

I'm glad that you're finding these issues. I had a feeling there were some lurking issues there.

aengelberg 2015-06-17T04:56:43.000047Z

goog has some utils to work with surrogate strings, but the regex (char range) seems impossible without pulling in an external dependency like Regenerate. https://github.com/mathiasbynens/regenerate
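For what it's worth, a char range can be rewritten by hand into surrogate-pair alternatives. Below is a rough sketch of that idea in plain JavaScript. It is not Regenerate's algorithm and not code from either instaparse version; every function name is hypothetical. As a simplification it assumes lo <= hi and lets the BMP part of the range include lone surrogates.

```javascript
// Format one 16-bit code unit as a \uXXXX regex escape.
function hex4(unit) {
  return "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
}

// A single unit, or a character class spanning units a..b.
function unitRange(a, b) {
  return a === b ? hex4(a) : "[" + hex4(a) + "-" + hex4(b) + "]";
}

function highSurrogate(cp) { return 0xD800 + ((cp - 0x10000) >> 10); }
function lowSurrogate(cp) { return 0xDC00 + ((cp - 0x10000) & 0x3FF); }

// Alternatives covering lo..hi where both are >= 0x10000.
function astralParts(lo, hi) {
  const hLo = highSurrogate(lo), lLo = lowSurrogate(lo);
  const hHi = highSurrogate(hi), lHi = lowSurrogate(hi);
  if (hLo === hHi) return [hex4(hLo) + unitRange(lLo, lHi)];
  const parts = [];
  let firstFull = hLo, lastFull = hHi;
  if (lLo !== 0xDC00) {           // partial block at the bottom
    parts.push(hex4(hLo) + unitRange(lLo, 0xDFFF));
    firstFull += 1;
  }
  if (lHi !== 0xDFFF) lastFull -= 1;
  if (firstFull <= lastFull) {    // blocks whose low unit is unrestricted
    parts.push(unitRange(firstFull, lastFull) + "[\\uDC00-\\uDFFF]");
  }
  if (lHi !== 0xDFFF) {           // partial block at the top
    parts.push(hex4(hHi) + unitRange(0xDC00, lHi));
  }
  return parts;
}

// Regex source matching exactly one code point in lo..hi (no "u" flag needed).
function codePointRangePattern(lo, hi) {
  const parts = [];
  // Pairs go first so a full surrogate pair is preferred over a lone unit.
  if (hi >= 0x10000) parts.push(...astralParts(Math.max(lo, 0x10000), hi));
  if (lo <= 0xFFFF) parts.push(unitRange(lo, Math.min(hi, 0xFFFF)));
  return "(?:" + parts.join("|") + ")";
}

// The ABNF terminal %x5D-10FFFF from earlier in the thread:
const rangeRe = new RegExp("^" + codePointRangePattern(0x5D, 0x10FFFF) + "$");
console.log(rangeRe.test("\uDBFF\uDFFF"));
```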

lucasbradstreet 2015-06-17T05:45:18.000049Z

Ah, yeah, I think I’d rather recreate the functionality internally than pull in extra deps. Definitely a bit of a pain though.

lucasbradstreet 2015-06-17T05:47:24.000050Z

Interesting https://mathiasbynens.be/notes/javascript-unicode

lucasbradstreet 2015-06-17T05:47:44.000051Z

I wonder if this issue holds in all browsers

lucasbradstreet 2015-06-17T05:47:45.000052Z

https://mathiasbynens.be/notes/es6-unicode-regex

lucasbradstreet 2015-06-17T05:48:49.000053Z

Actually, if you could create a PR with a failing cljs test case that would be a good place to start

aengelberg 2015-06-17T17:00:41.000054Z

https://github.com/lbradstreet/instaparse-cljs/pull/9

aengelberg 2015-06-17T17:41:37.000056Z

Hmm, now I'm mildly concerned because circleci is passing... ;)

aengelberg 2015-06-17T17:52:06.000057Z

Hmm, I think that's because there isn't really a notion of the cljs tests "passing" or "failing" (the test run doesn't set an exit code)
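One possible shape for a fix, sketched under assumptions: if the test run is driven from a Node script, CI can only see a failure when the process exits non-zero. The `summary` object and `exitCodeFor` name below are hypothetical, not part of the instaparse-cljs build.

```javascript
// Hypothetical: map a test summary with failure and error counts
// to a process exit code that a CI service can act on.
function exitCodeFor(summary) {
  return (summary.fail > 0 || summary.error > 0) ? 1 : 0;
}

// In a Node-based runner one could then do (assumed setup):
//   process.exitCode = exitCodeFor(summary);
console.log(exitCodeFor({ pass: 10, fail: 1, error: 0 }));
```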