@aengelberg: Any particular instaparse way to go multi-line, or just specify it in the regexps?
@seylerius not sure what exactly you're confused about, but here are some examples: imagine you're parsing the following input:
aaaa
bbb
cccccc
the grammar could look like
S = A '\n' B '\n' C
A = 'a'+
B = 'b'+
C = 'c'+
or
S = #'a+\n' #'b+\n' #'c+'
either \n or \\n would work if you are inside a Clojure string.
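A minimal sketch of the two grammars above written as Clojure strings, assuming instaparse is on the classpath and required under the alias insta. The var names (newline-literal, newline-escaped, input) are made up for illustration.

```clojure
(require '[instaparse.core :as insta])

;; Variant 1: "\n" inside the Clojure string puts a real newline character
;; between the single quotes of the instaparse string literal.
(def newline-literal
  (insta/parser
    "S = A '\n' B '\n' C
     A = 'a'+
     B = 'b'+
     C = 'c'+"))

;; Variant 2: "\\n" reaches instaparse as backslash + n, and the regex
;; engine then interprets that as a newline.
(def newline-escaped
  (insta/parser
    "S = #'a+\\n' #'b+\\n' #'c+'"))

;; The three-line input from the example (illustrative name).
(def input "aaaa\nbbb\ncccccc")

(newline-literal input)
;; => [:S [:A "a" "a" "a" "a"] "\n" [:B "b" "b" "b"] "\n" [:C "c" "c" "c" "c" "c" "c"]]

(newline-escaped input)
;; => [:S "aaaa\n" "bbb\n" "cccccc"]
```

The regex version returns each line as a single string, while the 'a'+ version returns one string per character.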
or, to allow any whitespace between the parts:
S = A ows B ows C
A = 'a'+
B = 'b'+
C = 'c'+
(* optional whitespace *)
<ows> = <#'\s*'>
For that example, inside a Clojure string you would need to change \s to \\s.
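As a sketch, here is the whitespace grammar embedded in a Clojure string with the backslash doubled. It assumes instaparse is required as insta, and ws-parser is just an illustrative name.

```clojure
(require '[instaparse.core :as insta])

(def ws-parser
  (insta/parser
    "S = A ows B ows C
     A = 'a'+
     B = 'b'+
     C = 'c'+
     (* optional whitespace *)
     <ows> = <#'\\s*'>"))

;; The angle brackets hide both the ows tag and the matched whitespace,
;; so the newlines do not show up in the tree.
(ws-parser "aaaa\nbbb\ncccccc")
;; => [:S [:A "a" "a" "a" "a"] [:B "b" "b" "b"] [:C "c" "c" "c" "c" "c" "c"]]
```

Since ows may match the empty string, an input with no separators at all, such as "aaaabbbcccccc", should parse to the same tree.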
@aengelberg: Also, how would you modify what would normally be a .* to not eat an optional :[a-zA-Z0-9_@]+:([a-zA-Z0-9_@]+:)* that follows it? Or would you just post-process that out after?
so you're trying to parse #"shown-part hidden-part" but only return "shown-part" in the parse result?
you could use the regex lookahead to omit it from the result, but then actually parse it (with instaparse's <> hiding feature) in order to properly advance the parser.
S = #'shown-part(?=hidden-part)' <#'hidden-part'>
or unhide the second #'hidden-part' if you actually do want it in the parse tree, but separate from #'shown-part'.
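A rough sketch of both variants, hidden and unhidden, assuming instaparse is required as insta. The var names are illustrative, and the input has no space between the two parts because the grammar as written does not allow one.

```clojure
(require '[instaparse.core :as insta])

;; The lookahead (?=hidden-part) checks for the tail without consuming it;
;; the second, hidden regex then consumes it so the whole input is parsed.
(def hidden-tail
  (insta/parser
    "S = #'shown-part(?=hidden-part)' <#'hidden-part'>"))

;; Same grammar without the angle brackets: the tail stays in the tree
;; as its own element.
(def separate-tail
  (insta/parser
    "S = #'shown-part(?=hidden-part)' #'hidden-part'"))

(hidden-tail "shown-parthidden-part")
;; => [:S "shown-part"]

(separate-tail "shown-parthidden-part")
;; => [:S "shown-part" "hidden-part"]
```

If the tail is missing, the lookahead fails and the parser returns a failure object, which can be detected with insta/failure?.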
@aengelberg: More like org-mode headlines allow tags at the end in that style. Not hidden so much as separate.