Clojurians Log v2

Clojure programming

instaparse

If you're not trampolining your parser, why bother getting up in the morning?

2016-07-04T01:53:46.000012Z

What's the status of cljs support ?

2016-07-04T01:54:49.000013Z

Is it still living as a fork?

aengelberg 2016-07-04T01:57:56.000014Z

The only cljs support still lives in lbradstreet/instaparse-cljs

aengelberg 2016-07-04T01:59:57.000015Z

But I'm currently in the process of rewriting instaparse-cljs into a form that we'd be willing to accept back into upstream, now that cljsee exists

aengelberg 2016-07-04T07:46:31.000016Z

@seylerius: Here's a grammar that parses exponents like you were asking:

boot.user=> (def p (insta/parser "
<S> = ows (exponent ows)+
<exponent> = token <'^'> super
super = token | <'{'> token <'}'>
<token> = #'[^\\s\\^{}]+'
<ows> = <#'\\s*'>
"))
#'boot.user/p
boot.user=> (p "foo^2 x^{x+1}")
("foo" [:super "2"] "x" [:super "x+1"])

This parser is pretty naive about the range of possible inputs, since I'm not totally sure myself what that range of inputs is in your use case.

seylerius 2016-07-04T16:43:30.000018Z

Thanks!

seylerius 2016-07-04T16:47:13.000020Z

Another question: * / + = & ~ can appear in singles without being tokens. How would you represent that? Current parser: http://sprunge.us/GNDe

seylerius 2016-07-04T16:54:58.000021Z

@aengelberg: What I have will do for the moment, but it's a part of the spec I'd like to meet eventually.

2016-07-04T17:03:26.000022Z

Hi. We recently switched from parsing user input with plain regexes to instaparse, and the code looks much better. However, there are two corner cases where I'm not sure what the idiomatic approach would be. 1) Parsing a certain domain of inputs should result in a no-op. Our current solution is:

"sentence = define / explain / help / catchall
<<skipped definitions>>
 catchall = #'(.|[\n\r])*'"

with the intention of just ignoring the last part during transformation: :catchall (fn [_] nil). Now I wonder if there is another way to catch this case and ignore it without using exceptions. 2) `'(.|[\n\r])*'` contains |, which on the JVM leads to recursion and might result in a stack overflow. In fact, that happened to us once. Is there a better way to write a catchall that accounts for anything, including \n and \r?

aengelberg 2016-07-04T17:10:05.000023Z

@happy.lisper for catchall you could do #'[\s\S]*'
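
A minimal sketch of why the single character class works, using plain Clojure regex literals rather than instaparse (the `input` name is only illustrative). Inside a grammar written as a Clojure string, the backslashes would need doubling, i.e. #'[\\s\\S]*'.

```clojure
;; Plain java.util.regex behaviour; no instaparse involved.
;; `input` is an illustrative multi-line string.
(def input "first line\nsecond line\r\nthird")

;; `.` does not match line terminators by default,
;; so `.*` cannot cover the whole input.
(re-matches #".*" input)      ; => nil

;; [\s\S] means "whitespace or non-whitespace", i.e. any character,
;; expressed as one character class with no `|` alternation.
(= input (re-matches #"[\s\S]*" input))  ; => true
```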

2016-07-04T17:10:24.000024Z

aengelberg 2016-07-04T17:11:16.000025Z

So your use case is: "Parse the entire string as a define, an explain, or a help, but if that doesn't work then return nil"?

aengelberg 2016-07-04T17:11:43.000026Z

Because you could just run the parse and a transform, then check (insta/failure? result)

2016-07-04T17:11:52.000027Z

yes, where nil is just a signal to ignore the input.

aengelberg 2016-07-04T17:13:54.000028Z

(def p (insta/parser ...))
(let [result (p input-string)
      transformed (insta/transform {...} result)]
  (when-not (insta/failure? transformed)
    transformed))

aengelberg 2016-07-04T17:14:12.000029Z

Note that insta/transform is specifically designed to pass through failures

2016-07-04T17:15:13.000030Z

Let me consider that 🙂.

aengelberg 2016-07-04T17:19:50.000031Z

@seylerius: Given an input ~a ~b, how do you know the a and b are to be parsed as individual ~'s, as opposed to a code string of "a " followed by "b"?

seylerius 2016-07-04T17:24:06.000034Z

@aengelberg: If I'm reading this correctly, the characters touching the inside of the tokens need to be alphanumeric, or at least non-whitespace.

aengelberg 2016-07-04T17:27:43.000035Z

so *a b c* shouldn't be allowed?

aengelberg 2016-07-04T17:28:24.000036Z

the current grammar that I suggested would allow that. Just trying to get a sense of the range of inputs so I can help design a parser accordingly

seylerius 2016-07-04T17:29:24.000037Z

*foo* *bar* ➡️ [:b "foo" "bar"]
foo* bar* ➡️ "foo* bar*"

seylerius 2016-07-04T17:33:36.000038Z

@aengelberg: that make sense?

aengelberg 2016-07-04T17:34:50.000039Z

for the first example do you mean [:b "foo"] [:b "bar"]?

aengelberg 2016-07-04T17:37:16.000040Z

is there a guarantee that *a**b* won't happen?

seylerius 2016-07-04T17:38:46.000041Z

@aengelberg: Yes. And guarantee? No. Ambiguity in the spec we can lock to an interpretation? Yes.

seylerius 2016-07-04T17:45:17.000042Z

We basically get to decide if that's a pair of bold characters or a flat string we'll leave be.

seylerius 2016-07-04T17:45:28.000043Z

It would only likely happen as a typo.

seylerius 2016-07-04T17:45:41.000044Z

(Or a stupid user)

seylerius 2016-07-04T17:48:07.000045Z

@aengelberg: I'm basically upgrading organum. Sample org file: http://sprunge.us/KBbL

aengelberg 2016-07-04T17:51:01.000046Z

hmm, thinking through how to enforce alphanumeric chars on the insides of tokens.

aengelberg 2016-07-04T17:52:22.000047Z

doing a "lookbehind" on the last * is nontrivial.

seylerius 2016-07-04T18:01:16.000048Z

What if I stripped leading and trailing whitespace before parsing, and modified the base string rule to start and end alphanumeric? Would that be easier?

seylerius 2016-07-04T18:05:37.000049Z

But, no, that wouldn't quite work.

seylerius 2016-07-04T18:11:29.000050Z

@aengelberg: Will the parser ignore escaped tokens, like \*?

seylerius 2016-07-04T18:12:48.000051Z

Ach. Clojure doesn't like \* in a string

seylerius 2016-07-04T18:30:43.000052Z

@aengelberg: Is there any way to mark tokens to not be parsed?

2016-07-04T18:33:35.000053Z

would angle brackets <> to hide parsed elements work?

aengelberg 2016-07-04T18:35:29.000054Z

@seylerius you'd have to do \\* if inside a Clojure string

aengelberg 2016-07-04T18:36:54.000055Z

the goal is to avoid parsing *a * as [:b "a "]

seylerius 2016-07-04T18:37:34.000056Z

@aengelberg: Anything special I have to do to mark that? I just tried parsing \\*foo\\* and got ("\\" [:b "foo\\"])

aengelberg 2016-07-04T18:38:22.000057Z

instaparse doesn't automatically handle backslashes in any special way besides what has been defined in your grammar.

seylerius 2016-07-04T18:41:42.000059Z

Okay. How do you define a simple backslash replacement in this type of grammar, then?

aengelberg 2016-07-04T18:45:59.000060Z

Maybe replace <string> with:

<string> = '\\\\*' | #'[^*/_+=~^_\\\\]+'

user> (inline-markup "a\\* b")
("a" "\\*" " b")

aengelberg 2016-07-04T18:46:17.000061Z

Pretty messy, I know. (four backslashes 🙄)
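
A small sketch of where the four backslashes come from, counting characters after the Clojure reader has processed each literal. The var names are illustrative only.

```clojure
;; What the Clojure reader produces for each string literal.
(def escaped-star "\\*")          ; input text: backslash, star
(def grammar-literal "'\\\\*'")   ; grammar text handed to instaparse

(count escaped-star)      ; => 2
(vec escaped-star)        ; => [\\ \*]

;; The reader halves the backslashes once, so the grammar text still
;; contains two of them inside the quoted literal.
(count grammar-literal)   ; => 5
(println grammar-literal) ; prints '\\*'
```

The grammar's own quoted literal is then unescaped a second time, which is consistent with the session above, where that rule matches a single backslash followed by a star.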

aengelberg 2016-07-04T18:48:03.000062Z

I don't know if this solves your problem though; you don't want to escape *'s in every ** My Subsection text, do you?

aengelberg 2016-07-04T18:49:13.000063Z

sorry if I'm a bit unhelpful; phasing in and out of AFK

seylerius 2016-07-04T18:50:38.000064Z

I'm thinking I'm just going to tell users that if they want a plain * they have to escape it.

seylerius 2016-07-04T18:51:23.000065Z

Headlines are already handled by the time this stage of parsing is invoked, so those won't be an issue.

seylerius 2016-07-04T18:53:21.000066Z

And your special case of *a**b* is apparently already readily converted to ([:b "a"] [:b "b"])

seylerius 2016-07-04T20:11:06.000067Z

@aengelberg: Separate (earlier stage) parser: Is it possible (other than by having respective rules for #'^* ', #'^** ', #'^*** ', etc) to easily produce h1, h2, h3, etc?

seylerius 2016-07-04T20:20:25.000068Z

Actually, yeah. Just don't hide the token, and I can put that through a counter after the fact.
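
A rough sketch of that "counter after the fact" idea in plain Clojure. `heading-tag` is a hypothetical helper, and it assumes the unhidden token arrives as a string such as "** ".

```clojure
;; Hypothetical helper: map an unhidden star token to a heading keyword
;; by counting its `*` characters.
(defn heading-tag [stars-token]
  (keyword (str "h" (count (filter #{\*} stars-token)))))

(heading-tag "* ")    ; => :h1
(heading-tag "** ")   ; => :h2
(heading-tag "*** ")  ; => :h3
```

A function like this could presumably be called from a transform step once the parse tree exposes the star token, so a single grammar rule covers every heading depth.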