Hi! I’m interested in implementing a SLOC counter for Clojure (if it is not too difficult).
It needs to count blank lines and comments, including (comment ...) forms and #_ forms.
I will need the parser + node api right?
correct
you can inspect the node/tag to see if the top level node is a comment or not
#_ is called :uneval I think
thanks for the pointers!
@dimitar.ouzounoff please let us know how it goes!
will do, I have to wrap up one task first and then I will get on this; my first impression is that I will most likely need to parse this line by line
@dimitar.ouzounoff No, you can just use parser/parse-string-all and this will give you a :forms node. Then you can go over its :children and skip all :comment and :uneval nodes, empty lines, etc. And then just call str on the remaining nodes and count the number of lines from those.
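A minimal sketch of the approach described above, assuming rewrite-clj is on the classpath. The helper names `comment-form?`, `skippable?` and `sloc` are hypothetical and not part of rewrite-clj. It sums the line counts of the top-level nodes that remain after skipping comments, `#_` forms, `(comment ...)` forms and whitespace.

```clojure
(require '[clojure.string :as str]
         '[rewrite-clj.node :as node]
         '[rewrite-clj.parser :as p])

(defn comment-form?
  "True for a top-level (comment ...) list node."
  [n]
  (and (= :list (node/tag n))
       (= 'comment (some-> n node/children first :value))))

(defn skippable?
  "True for top-level nodes that should not count as code."
  [n]
  (or (contains? #{:comment :uneval :whitespace :newline :comma} (node/tag n))
      (comment-form? n)))

(defn sloc
  "Hypothetical helper: sums the lines spanned by the remaining top-level nodes."
  [source]
  (->> (p/parse-string-all source)
       node/children
       (remove skippable?)
       (map (comp count str/split-lines node/string))
       (reduce +)))

(sloc "(+ 1 2 3)\n;;hello\n#_(ignored)\n(comment 1 2 3)\n(defn f []\n  1)")
;; => 3
```

Only `(+ 1 2 3)` (one line) and the two-line `defn` are counted here. Two forms that share a line would each contribute a line under this scheme, so it is an approximation rather than an exact count.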
well, I saw parse-string-all but it doesn’t return a seq; I guess I need to read up on the nodes to understand them better
@dimitar.ouzounoff It returns one node, a :forms node, and that has :children, which are all of the top level nodes
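For illustration, a small sketch of inspecting that `:forms` node (the name `forms` is just a local var for this example). `node/children` returns the child nodes as a sequence, so `first`, `map` and friends work on it.

```clojure
(require '[rewrite-clj.node :as node]
         '[rewrite-clj.parser :as p])

;; parse-string-all wraps every top-level node in a single :forms node
(def forms (p/parse-string-all "(+ 1 2 3)\n;;hello\n#_(skip me)"))

(node/tag forms)
;; => :forms

;; the children are the top-level nodes, whitespace and comments included
(map node/tag (node/children forms))
;; => (:list :newline :comment :uneval)
```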
hmm this is what a comment looks like in cider-inspect:
Class: rewrite_clj.node.seq.SeqNode
Meta Information:
:row = 87
:col = 1
:end-row = 91
:end-col = 5
Contents:
:tag = :list
:format-string = "(%s)"
:wrap-length = 2
:seq-fn = rewrite_clj.node.seq$list_node$fn__7201@66935f35
:children = ( { :value comment, :string-value "comment", :map-qualifier nil } { :newlines "\n" } { :whitespace " " } { :tag :list, :format-string "(%s)", :wrap-length 2, :seq-fn rewrite_clj.node.seq$list_node$fn__7201@fab763f, :children ( { :value do,
I’m not sure how to access it, I can’t seem to use nth in order to use a comment? predicate
ah right
so the first child of this form is a symbol node with :value
'token
@dimitar.ouzounoff you can achieve your goal with the node API, but the zip API is a bit higher level.
something like:
(defn comment-node? [node]
  (and (= :list (node/tag node))
       (some-> node :children first :value (= 'comment))))
Personally I don't see a need for the zipper API here, since it's pretty straightforward to only iterate over the top level nodes, there isn't a need to visit any deeper nodes and/or rewrite/remove them
No need, just an alternative that might offer easier nav through tree. Not sure exactly what @dimitar.ouzounoff wants to count here yet.
He wants to count lines of code, but exclude comment forms
yes, I guess this is why something like rewrite-clj is necessary as it needs to exclude whole multiline comment forms from the line count
Oh I thought he also wanted to count comments and (comment and #_.
that doesn't change the problem very much, those are just different predicates
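As a rough sketch of the "different predicates" idea, the function below tags each top-level node with a category, assuming rewrite-clj is available. The name `classify` and the category keywords are made up for this example. Note that it counts nodes, not lines.

```clojure
(require '[rewrite-clj.node :as node]
         '[rewrite-clj.parser :as p])

(defn classify
  "Hypothetical helper: assigns a category keyword to a top-level node."
  [n]
  (let [tag (node/tag n)]
    (cond
      (contains? #{:whitespace :newline :comma} tag) :blank
      (= :comment tag)                               :line-comment
      (= :uneval tag)                                :uneval
      (and (= :list tag)
           (= 'comment (some-> n node/children first :value))) :comment-form
      :else                                          :code)))

(->> (p/parse-string-all "(+ 1 2 3)\n;;hello\n#_(skip)\n(comment 1 2 3)")
     node/children
     (map classify)
     frequencies)
;; => {:code 1, :blank 2, :line-comment 1, :uneval 1, :comment-form 1}
```

Whether each category is counted or excluded then becomes a decision made after classification.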
Yup
and you're also able to count whitespace lines, so rewrite-clj is a great tool for implementing clocl (count lines of Clojure)
which will soon be available as a GraalVM binary? with proper command line interface please? :P
Feel free to carry on with node API, just offering up an alternative.
Ha! Let’s not get you started! 🙂
I’m still at a beginner level, but it sounds like fun 🙂
well I’m not sure how to go over the nodes
rewrite-clj.zip/next might be something I can use
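For comparison, a sketch of the zipper route, assuming rewrite-clj is available (`top-level-tags` is a hypothetical helper name). `z/next` walks depth-first into child nodes, while `z/right` stays among siblings, which is enough for visiting only top-level forms. Both skip whitespace and `;` comment nodes.

```clojure
(require '[rewrite-clj.zip :as z])

(defn top-level-tags
  "Hypothetical helper: tags of the top-level forms, visited with z/right."
  [source]
  (->> (z/of-string source)
       (iterate z/right)      ; move sibling to sibling
       (take-while some?)     ; z/right returns nil past the last sibling
       (map z/tag)))

(top-level-tags "(+ 1 2 3)\n;;hello\n(comment 1 2 3)")
;; => (:list :list)
```

The `;;hello` comment never shows up, so counting line comments or blank lines this way would need the whitespace-aware variants or the node API.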
another thing that I’m not sure about is what happens if I take this line: (set! *warn-on-reflection* true) ;; avoid reflection so you can use graalvm
it is two children at the top level
so it would count as a line of code + a line of comment
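One possible way around that double count, sketched under the assumption that parsed nodes carry the `:row` and `:end-row` metadata shown in the inspector output earlier: collect the set of rows that code nodes touch instead of summing per-node line counts. `code-rows` is a hypothetical helper name.

```clojure
(require '[rewrite-clj.node :as node]
         '[rewrite-clj.parser :as p])

(defn code-rows
  "Hypothetical helper: set of source rows touched by top-level code nodes.
   Assumes each parsed node has :row and :end-row in its metadata."
  [source]
  (->> (p/parse-string-all source)
       node/children
       (remove (comp #{:comment :uneval :whitespace :newline :comma} node/tag))
       (mapcat (fn [n]
                 (let [{:keys [row end-row]} (meta n)]
                   (range row (inc end-row)))))
       set))

(count (code-rows "(set! *warn-on-reflection* true) ;; avoid reflection\n(+ 1\n   2)"))
;; => 3
```

A row is counted once however many nodes share it, and the trailing comment adds nothing because comment nodes are filtered out before rows are collected.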
It is! Borkdude and I curate some tips over at https://github.com/lread/clj-graal-docs. I was more joking with borkdude about proper cmd lines, which is something he has been recently irked by.
user=> (require '[clojure.string :as str] '[rewrite-clj.parser :as p])
nil
user=> (:children (p/parse-string-all "(+ 1 2 3)\n;;hello\n(comment 1 2 3)"))
(<list: (+ 1 2 3)> <newline: "\n"> <comment: ";;hello\n"> <list: (comment 1 2 3)>)
user=> (map str (:children (p/parse-string-all "(+ 1 2 3)\n;;hello\n(comment 1 2 3)")))
("(+ 1 2 3)" "\n" ";;hello\n" "(comment 1 2 3)")
user=> (map (comp count str/split-lines str) (:children (p/parse-string-all "(+ 1 2 3)\n;;hello\n(comment 1 2 3)")))
(1 0 1 1)
user=> (apply + (map (comp count str/split-lines str) (:children (p/parse-string-all "(+ 1 2 3)\n;;hello\n(comment 1 2 3)"))))
3
user=> (require '[rewrite-clj.node :as node])
nil
user=> (defn comment-node? [node]
         (and (= :list (node/tag node))
              (some-> node :children first :value (= 'comment))))
#'user/comment-node?
user=> (apply + (map (comp count str/split-lines str) (remove comment-node? (:children (p/parse-string-all "(+ 1 2 3)\n;;hello\n(comment 1 2 3)")))))
2
@lee I'm pretty tempted to just add rewrite-clj to babashka so you can write little scripts like this ;)
I think when bb was only a few weeks old @sogaiu proposed it already, haha
hey, I’d use it!
ok, I think I get it now, thanks @borkdude!
Any chance of rewrite-clj supporting the :babashka: interpreter line? 🧵
oh, where did you try that out? atm i don't think tree-sitter gives a grammar author the ability to tune how the correction works. consequently, i haven't found it to be so flexible for coping with broken code. the work that you all are doing on rewrite-clj, clj-kondo, clojure-lsp, etc. is much better for editor users from this perspective.
Oh @sogaiu, I am a total tree-sitter noob and have only read/watched introductory stuff. I thought that https://github.com/tree-sitter/tree-sitter/issues/224, but again have not dug deep at all.
yeah what you mentioned has a nice explanation (which i confess i do not understand that well 🙂 ) here is a more recent discussion that may be relevant for lisp-likes: https://github.com/tree-sitter/tree-sitter/issues/923 i've been working on trying to spell out specifically what one can do in the case of unbalanced delimiters. afaict, in general, when there is a missing closing delimiter, there are multiple possible places it might go. if one can trust existing indentation i think this can be narrowed down a bit, but still hammocking / researching. anyway, sorry to have drifted off topic.
All interesting to me @sogaiu! Thanks for sharing.
(z/of-string "!/usr/bin/env bb\n\n(ns foo)")
=> Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
Invalid symbol: !/usr/bin/env.
It's just that if a user opens a babashka file with that interpreter line and is using clojure-lsp, it will throw a lot of those exceptions 😕
ouch!
I think rewrite-clj should be able to handle this, but you need to start it with a #
so: #!/usr/bin/env bb
oh yeah, my bad, but it happens with the # as well
that surprises me, since I'm supporting this in clj-kondo too
but it could be that it's only supported in my fork. I'll check
yeah I can confirm it throws for rewrite-clj @ericdallo
cheer up buddy, we’ll deal with it.
Thank you! 😄
compare: https://github.com/clj-kondo/clj-kondo/blob/d90e2073fef67f4b0f6d9eabba9fa50c5c9f95dc/parser/clj_kondo/impl/rewrite_clj/parser/core.clj#L181 and https://github.com/clj-commons/rewrite-clj/blob/59668f5875b5933c84844f1a1e0e6a3c3b77ac59/src/rewrite_clj/parser/core.cljc#L126
oh, it makes sense!
I shall take a peek @borkdude, thanks.
So kondo just skips, yeah?
What would we like rewrite-clj to do?
yes, #! is read exactly the same as ;
Oh, just something rewrite-clj does not understand yet.
Maybe a new :shebang type node? Similar to ; I guess
yes, a new node looks useful
user=> (p/parse-string ";; foo")
<comment: ";; foo">
user=> (p/parse-string "#! foo") ;;=> <shebang "#! foo">
Sure, sounds good.
Note:
user=> (+ 1 2 3 #! foo
4)
10
so yeah, it's exactly like ;
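The same behaviour can be seen without evaluating anything by asking the Clojure reader directly. This sketch uses only `clojure.core/read-string`.

```clojure
;; #! starts a comment that runs to the end of the line, just like ;
(read-string "(+ 1 2 3 #! foo\n4)")
;; => (+ 1 2 3 4)

(read-string "(+ 1 2 3 ; foo\n4)")
;; => (+ 1 2 3 4)
```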
Huh, so where is #! documented?
it might not be documented, but this is how it works ;)
tis the way of things, I like shebang for a name, but I’ll see if it is called something specific in the reader
You are fast man! Tx! So technically the zipper should skip these guys too.
we could also just make it a :comment node perhaps
yes
That would fit in more easily.
Alrighty! A new rewrite-clj feature! Very exciting.
Or a bug fix.
New challenge: https://github.com/borkdude/deps.clj/blob/master/deps.bat#L1-L7
Still exciting
Yeah, exciting that rewrite-clj is finally catching up with clj-kondo's fork after 2 years :P
hehe, sorry, just kidding
I think that windows command line argument might have set your tone for the day. :simple_smile:
Here is the old issue for clj-kondo: https://github.com/clj-kondo/clj-kondo/issues/294
cool, I’ll write up an issue for rewrite-clj and fix.
yay a new feature!
now about that .bat file…
ya, thanks for raising @ericdallo!
are you using argument in two senses here?
I didn't feel like I was arguing with someone, because nobody really replied, except the guy who was the "victim" of string quoting
It was a bad pun!
No, you were empathizing not arguing.
haha ok.
I'm strangely excited about this Windows shebang though. It's always nice to support some archaic niche use case
Me having a Windows machine is really paying off here ;P
128gb at the ready… for any arcane issue…
@ericdallo, https://github.com/clj-commons/rewrite-clj/commit/187497a5d017ee21e81c674938f2427707807ca4, lemme know if it works for ya.
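A quick way to try this out, sketched under the assumption that the rewrite-clj version on the classpath includes the fix and that, as discussed above, the shebang line is parsed as a `:comment` node. On a version without the fix, the parse call throws as in the report above.

```clojure
(require '[rewrite-clj.node :as node]
         '[rewrite-clj.parser :as p])

(def script "#!/usr/bin/env bb\n\n(ns foo)")

;; Tags of the top-level nodes; the first one is expected to be :comment
;; on a version with the fix (an assumption based on this discussion).
(->> (p/parse-string-all script)
     node/children
     (map node/tag))

;; Parsing and printing back should reproduce the source unchanged.
(= script (node/string (p/parse-string-all script)))
```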
Nice, thank you very much! Just not throwing the exception will certainly work 😄 I'll make the change in clojure-lsp on the next rewrite-clj release
Coolio, I’ll likely cut a release tomorrow then.
@ericdallo btw, you mentioned that the integration tests were broken with the newest clj-kondo. could you follow up on this?
yep, for some reason the elements of the list were not in the expected order with the new version. it's not an issue, and the order now seems "more" correct, following the ascending order of the elements' positions
I didn't investigate that much, but I'll keep an eye if next clj-kondo releases impact that again
the version from earlier in march had a bug which reported unresolved symbols in the wrong order. maybe you captured this in an integration test
@borkdude, just reviewing https://github.com/clj-kondo/clj-kondo/commits/master/parser/clj_kondo/impl/rewrite_clj, I don’t think there is anything else immediately relevant that we are not already tracking for rewrite-clj v1, do you?
did you also catch this one? https://github.com/clj-kondo/clj-kondo/commit/e6f6bf097072c0ad0f6f5ffec13379d9c79db4e2#diff-72baeadefb655c4c65b933bbaef4c62cbf6397c2d460fff9010b09d824c9de15
yup, tx
@lee I already had some patches around namespaced maps in the first commit, unfortunately I didn't specify that explicitly
but I think that was the major one before this inlined fork
I think we might be ok in that area
yeah. so I think it's pretty safe to drop the alpha suffix
yeah you were probably working around the half-finished namespaced map support in v0
yeah, also the *ns* thing bugged me
yeah, it was annoying
I guess we could drop alpha. I would have liked more feedback around sexpr work around namespaced elements from real usage, but if it does not come naturally, then… can’t force it.
can also wait some more
I might do that… doesn’t hurt. Spec set a precedent. :simple_smile:
not sure if that's a good example to follow though ;)
:simple_smile:
fwiw tree-sitter-clojure recognizes #! too: https://github.com/sogaiu/tree-sitter-clojure/blob/master/grammar.js#L40
@sogaiu! Nice to hear from you! I finally took the time to introduce myself to tree-sitter the other day. Very interesting!
ah cool!
Yeah, I found the smart error detection pretty darn awesome. Not exactly sure how it works but the effect is nice.