Clojurians Log v2

Clojure programming

instaparse

If you're not trampolining your parser, why bother getting up in the morning?

mbjarland 2017-12-14T16:31:28.000597Z

I'm playing around with instaparse and for kicks and giggles I wrote a parser to parse some log files I have lying around

mbjarland 2017-12-14T16:31:50.000334Z

is there a way to define a fixed width "anything goes" string in instaparse

mbjarland 2017-12-14T16:32:33.000731Z

i.e. if I just want to gobble up a few characters into a tree node and don't care about the content there, is that possible?

aengelberg 2017-12-14T16:32:36.000605Z

Fixed width? Maybe #'.{N}'?

mbjarland 2017-12-14T16:33:09.000338Z

right, yes regex does the job but is probably not very performant for just "take substring of 10 from where you are"

mbjarland 2017-12-14T16:34:30.000018Z

ok, so regex is the way to go for this in instaparse?

aengelberg 2017-12-14T16:35:01.000532Z

I think regex is the most performant way to grab a not-static set of characters
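A minimal sketch of the fixed-width regex suggestion above, assuming instaparse is on the classpath. The rule names (`line`, `stamp`, `tail`), the width of 10, and the sample input are made up for illustration and do not come from the conversation.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical grammar: the first 10 characters go into a :stamp node
;; regardless of content, and whatever remains on the line goes into :tail.
(def fixed-width
  (insta/parser
    "line  = stamp tail
     stamp = #'.{10}'
     tail  = #'.*'"))

;; Parse a made-up log line; the result is a hiccup-style tree.
(def parsed (fixed-width "2017-12-14 some log text"))
```

The `{10}` quantifier is what makes the node fixed width. Note that `.` does not match line terminators by default, so this only works on single lines.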

mbjarland 2017-12-14T16:37:08.000382Z

: ) well I should probably mention that I think instaparse is excellent and by far the best parser lib I've run across....so my intent was not to come here and critique it

aengelberg 2017-12-14T16:38:00.000011Z

Thanks! And no worries, I was just answering your question from the perspective of what instaparse actually supports

mbjarland 2017-12-14T16:38:23.000194Z

that being said...if I parse 2G of log files (without instaparse) and compare the simplest regex match with (subs line 10 20), regex performance doesn't exactly shine
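A rough sketch of the comparison described above, using only core Clojure. The sample line, the offsets, and the `elapsed-ms` helper are illustrative assumptions. No timings are shown, because they depend heavily on the JVM, warm-up, and input.

```clojure
;; A made-up log line; the time field sits at indices 11 (inclusive) to 19 (exclusive).
(def line "2017-12-14 16:38:23 INFO  request served in 12ms")

;; Positional slice: no pattern matching involved.
(defn by-subs [s] (subs s 11 19))

;; Regex equivalent: skip 11 characters, then capture the next 8.
(defn by-regex [s] (second (re-find #"^.{11}(.{8})" s)))

;; Hypothetical helper: wall-clock milliseconds for n calls of a zero-arg fn.
;; This is a crude measurement, not a proper benchmark.
(defn elapsed-ms [f n]
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (/ (- (System/nanoTime) start) 1e6)))

(comment
  (elapsed-ms #(by-subs line) 1000000)
  (elapsed-ms #(by-regex line) 1000000))
```

Both functions return the same substring, so any difference measured is purely the cost of the matching machinery.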

aengelberg 2017-12-14T16:38:32.000419Z

But I see your point that if it theoretically supported a dedicated "substring" combinator, that would be faster

mbjarland 2017-12-14T16:39:59.000897Z

anyway, figured I would ask, but regex does indeed do the job and perhaps what I'm doing with this parser is a bit of an edge case

aengelberg 2017-12-14T16:40:25.000280Z

Maybe we should support "custom combinators" so people like you with special use cases can write their own more performant specialized versions

mbjarland 2017-12-14T16:40:42.000280Z

that would be awesome

mbjarland 2017-12-14T16:42:47.000926Z

you would have to add some kind of extension point to the instaparse bnf syntax I guess

aengelberg 2017-12-14T16:47:05.000180Z

Maybe, or we don't allow extensions to the EBNF syntax and just let people make custom combinators for the combinator syntax

mbjarland 2017-12-14T16:50:49.000614Z

ah, ok, hadn't grokked the combinators syntax until now
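For reference, a small sketch of what the existing combinator syntax looks like, assuming instaparse is on the classpath. It builds a fixed-width grammar as a Clojure map instead of an EBNF string. The rule names and widths are illustrative, and this uses only the built-in combinators, not the custom-combinator extension being discussed.

```clojure
(require '[instaparse.core :as insta]
         '[instaparse.combinators :as c])

;; Hypothetical grammar map: keys are rule names, values are combinators.
(def grammar
  {:line  (c/cat (c/nt :stamp) (c/nt :tail)) ; a stamp followed by the tail
   :stamp (c/regexp #".{10}")                ; any 10 characters
   :tail  (c/regexp #".*")})                 ; the remainder of the line

;; A grammar map has no inherent rule order, so :start must be given.
(def fixed-width-c (insta/parser grammar :start :line))

(def parsed-c (fixed-width-c "2017-12-14 some log text"))
```

Because the regexes are ordinary Clojure regex literals here, they need no extra escaping layer for the grammar string.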

mbjarland 2017-12-14T16:56:23.000245Z

right now I'm considering writing my own mini language for this log parsing, I could use instaparse to parse that language and then do custom, optimized parsing based on the format specification tree coming out from instaparse...so still useful

mbjarland 2017-12-14T17:21:35.000419Z

hmm, how come I need to double escape the not-inclusive rule in the following grammar:

(def my-p 
  (instaparse.core/parser 
    "spec = (field-spec <' '?>)+
     field-spec = <'['>name ' '* <':'> ' '* (width | not-inclusive | not-exclusive | rest)<']'>
     name = #'[^:]+'
     width = &lt;'{'&gt; #'\\d+' &lt;'}'&gt;
     not-inclusive = &lt;'\\\\'&gt; #'.'
     not-exclusive = &lt;'/'&gt; #'.'
     rest = '*'    
    "))

aengelberg 2017-12-14T17:22:25.000423Z

you mean the '\\\\'?

mbjarland 2017-12-14T17:22:27.000648Z

yeah

mbjarland 2017-12-14T17:22:41.000042Z

shouldn't two have been enough?

aengelberg 2017-12-14T17:23:09.000479Z

because 1) you need to tell Clojure that you aren't escaping a character within a string 2) you need to tell Instaparse that you aren't escaping a character within a string combinator

mbjarland 2017-12-14T17:23:34.000533Z

ok, missed point 2 there
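The two escaping layers can be seen with core Clojure alone by looking at what the reader hands on to instaparse. A small sketch, with hypothetical var names:

```clojure
;; Two backslashes in source: the Clojure reader collapses them to one,
;; so the grammar text contains the three characters ' \ '
(def two-in-source "'\\'")

;; Four backslashes in source: the reader collapses them to two,
;; so the grammar text contains the four characters ' \ \ '
(def four-in-source "'\\\\'")

(def two-count  (count two-in-source))
(def four-count (count four-in-source))

;; Count the backslashes that survive the reader in each case.
(def two-backslashes  (count (filter #{\\} two-in-source)))
(def four-backslashes (count (filter #{\\} four-in-source)))
```

With four in source, instaparse receives a quoted string holding two backslashes, which its own string syntax reads as a single literal backslash. With only two in source, instaparse would receive a lone backslash directly before the closing quote, which, as far as I understand its string syntax, reads as an escaped quote and leaves the literal unterminated.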