So, I’m using the zip API to navigate through a file. I need to find a particular function call and edit it. I’ve found out how to do this, and I’m composing it into a context where I recursively do this to all files under a directory. Yay!
However, there are problems. I’m using find-next
to find the particular s-expression that I’m interested in, and in my predicate function, I’m using z/sexpr
to be able to inspect the s-expression as a Clojure data structure. Some of the files under the directory has discard blocks at the root level, and this causes z/sexpr
to throw.
I could (and I’m probably going to) iterate through the nodes using z/right
instead and interpret the type of node, or just catch
, to figure out whether I’m at a discard node (or some other un-sexpr-able node) that I need to skip, and skip it. I’m curious that z/find-next
doesn’t already have this option, though. I see that z/right
, for example, internally has a call to skip-whitespace
. Why not also have skip-unsexprable
as an option somewhere in the call chain?
I think z/sexpr
should not throw when a node has unevals/discards
I see the rationale, though (I think! 🙂 ); what would you return in place of a discard node?
it should be treated the same as whitespace: ignored
nil
is misleading. It’s literally discarded, not returned as nil
hm, but what would you return?
oh
don’t visit them at all
actually skip them 🙂
oh you are calling z/sexpr
on an uneval node itself? not something with an uneval node inside of it?
exactly
the latter works fine
yeah, don't do that :)
haha
I’m trying to work my way around it 🙂
you can just look at the :tag
of the node
and then skip it
yeh, that’s what I figured. Interpret the type of node.
It forces me into oldschool looping rather than the neat find-next
functionality 🙂
And I was wondering if that could maybe be an area that could improved. I don’t know the history of rewrite-clj
, though
you can also do a kind of postwalk using z/next
maybe
not sure if that helps over find-next
, I never used the latter
you should always be a bit careful with z/sexpr
this has always been the case
it can throw on invalid formed maps as well, e.g. {:a 1 :b}
, there's just nothing rewrite-clj can or should do about this I think
I get that.
So you might actually generally have to have the try/catch
safeguard
Depending on your scenario ofc
yes, and/or first check with tag
@reefersleep there is an open issue on skipping uneval nodes https://github.com/clj-commons/rewrite-clj/issues/70, we are discussing a decent way to solve.
I sometimes use zip API walk functions when I want to only visit certain nodes in a zipper.
@lee I’ll have a look! Meanwhile, I made my own find-next
.
(defn sturdy-find-next
([zloc pred]
(sturdy-find-next zloc z/right pred))
([zloc f pred]
(if (z/end? zloc)
nil
(let [next-zloc (f zloc)]
(cond (z/end? next-zloc)
nil
(= :uneval (n/tag (z/node next-zloc)))
(recur next-zloc f pred)
(pred next-zloc)
next-zloc
:else
(recur next-zloc f pred))))))
(Just a quick WIP)
I think the idea of further configurability of what is skipped is interesting, but of course needs some hammock time 🙂
(what @sogaiu mentions in the thread)
@lee Without looking into the issue, so forgive my ignorance, why is uneval not just treated as whitespace? Because one might want to rewrite it right?
That’s what I thought, and I
m
I think I’m in the same state of mind as you; maybe it should be up to the user what they want to skip, and what not. In an ideal world, anyway. But I’ve no idea if that’d pollute the interface, how it’d impact performance and so on.
I’m guessing the original authors just made a choice on what to skip. I see their point, it really is not technically whitespace.
But same could be true for commas?
What if I want to rewrite all commas into whitespace for example?
But commas are technically Clojure whitespace.
yes, I know, but you might want to get rid of those using a zipper in rewrite-clj
It is a fuzzy area, I don’t think the default choice was wrong, and am thinking something configurable probably makes sense. The fact that rewrite-clj skips comments by default is likely why cljfmt does not format them!
The difference from a Clojure perspective between ;; and # is that # actually goes through the reader. Anything invalid in # will cause the reader to throw. E.g. #{:a}. So technically it doesn't belong in the "whitespace" category of things that are plainly ignored by the reader.
Good observation, yeah
To finish my ramble on cljfmt... it does also use raw movement functions (effectively right* next* etc) to examine and effect changes to whitespace.
On a different note, I’m thinking I should probably start measuring how changes we make to rewrite-clj affect its performance. I know this can be a difficult thing to do in a meaningful way, so might not pursue. I’m particularly interested how rewrite-clj v1 compares to rewrite-cljs as I turfed some of the rewrite-cljs tunings in favour of a common code base. But I’m also generally interested. Not sure what a good strategy would be yet. Do folks know of any projects that do measure and compare performance? I do run 3rd party libs tests suits (ex. zprint, cljfmt, etc) against rewrite-clj v1. I wonder if somehow I could use the test suite runs to help me measure if rewrite-clj changes improve/degrade performance.
I am planning to upgrade cljfmt to rewrite-clj v1 (if weavejester agrees). Since cljfmt uses rewrite-cljs, I decided to take a https://github.com/weavejester/cljfmt/issues/211#issuecomment-799813626. Result are to be taken with a grain of salt, but under my little test I am seeing rewrite-clj v1 is 20% slower than rewrite-cljs for cljfmt. I guessed there might be a hit and this seems not too horrible. We’ll see what others think.
Tufte is great for manually working down towards bottlenecks. Auto-profiling sounds even better!
And, dipping into performance tuning for the easy wins feels awesome. You could timebox it and prevent wasting too much time :)
Yeah perf can be important. In a use case like clj-kondo, integrated with some sort of "squiggly lines" IDE functionality one wants a particularly tight feedback loop ...while for cljfmt it's less important IMO, as one is not reformatting files the whole time (especially for most IDEs, which basic formattings will match with cljfmt's)
Good news, weavejester does not consider potential cljs slowdowns with rewrite-clj v1 as blockers, so I’ll proceed without optimizations for now.
btw have you profiled execution looking for slow spots? btw I have a WIP lib that auto-profiles a whole codebase with https://github.com/ptaoussanis/tufte/ , so one can know the slow functions very quickly I could share a sample report if that helps