I found a strange behavior with #'\\Z', and I wonder whether it is a bug or expected behavior.
((insta/parser
"Paragraph = NonBlankLine+ BlankLine+
BlankLine = #'[ \\t]'* EOL
NonBlankLine = #'\\S'+ EOL
EOL = (#'\\n' | EOF)
EOF = #'\\Z'")
"abc\ndef\n")
=>
[:Paragraph
[:NonBlankLine "a" "b" "c" [:EOL "\n"]]
[:NonBlankLine "d" "e" "f" [:EOL [:EOF ""]]]
[:BlankLine [:EOL "\n"]]]
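EOF appears before "\n" in the parsed result: the second NonBlankLine ends with [:EOF ""], and the final "\n" is consumed by the BlankLine instead.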
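This may come from Java's regex semantics rather than from instaparse itself. In java.util.regex, \Z matches at the end of the input or just before a final line terminator, while \z matches only at the very end. Here is a minimal check with plain Clojure regexes. It assumes instaparse tries each regex against the remaining input, which I have not confirmed; matches-at-front? is a hypothetical helper of mine, not part of instaparse.

```clojure
;; Plain java.util.regex check, no instaparse involved.
;; Assumption: the parser tries the regex at the front of the remaining
;; input, which is just the final "\n" once "abc\ndef" has been consumed.
(defn matches-at-front?
  "True if `re` matches at the very start of `s` (hypothetical helper)."
  [re s]
  (.lookingAt (re-matcher re s)))

(def results
  {;; \Z: end of input, or just before a final line terminator
   :upper-Z-before-newline (matches-at-front? #"\Z" "\n")
   ;; \z: only the very end of input
   :lower-z-before-newline (matches-at-front? #"\z" "\n")
   :lower-z-at-end         (matches-at-front? #"\z" "")})

(println results)
```

If that is the cause, writing EOF = #'\\z' in the grammar might give the expected tree, but I have not verified this against instaparse 1.4.9.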
This other approach, which uses negative lookahead, does put the "\n" in the right place in the result, but there is another problem: the BlankLine is missing from the result. That may be a bug in instaparse. I am using version 1.4.9.
((insta/parser
"Paragraph = NonBlankLine+ BlankLine+
BlankLine = #'[ \\t]'* EOL
NonBlankLine = #'\\S'+ EOL
EOL = (#'\\n' | EOF)
EOF = !#'.'")
"abc\ndef\n")
=>
[:Paragraph [:NonBlankLine "a" "b" "c" [:EOL "\n"]]
[:NonBlankLine "d" "e" "f" [:EOL "\n"]]]
I am going to use this workaround for now: append "EOF" to the end of the input before parsing it. It works very well 🙂
((insta/parser
"Paragraph = NonBlankLine+ BlankLine+
BlankLine = #'[ \\t]'* EOL
NonBlankLine = #'\\S'+ EOL
EOL = (#'\\n' | EOF)
EOF = 'EOF' #'\\Z'") ; works as well with !#'.'
"abc\ndef\nEOF")
=>
[:Paragraph
[:NonBlankLine "a" "b" "c" [:EOL "\n"]]
[:NonBlankLine "d" "e" "f" [:EOL "\n"]]
[:BlankLine [:EOL [:EOF "EOF" ""]]]]
I love instaparse.