I’m hoping to get some time to work on Eastwood this fall, and I’d really like to see if there are ways to make it go faster, since linting our code base (around 50kloc) is starting to take quite a bit of time.
From my initial investigations, it seems like most time is spent in analyzing the source-code, and that the linters themselves are pretty fast.
So I was wondering if @bronsa or @andy.fingerhut had any thoughts on attack vectors here? As for me, I’ll probably start by running the linting under YourKit to see if there are any obvious improvements to be had. My fear, though, is that this is not about making the existing implementation faster, but rather about finding another way to solve the problem, if that makes sense.
And, I should probably make a github issue of it, since whatever is discussed here is lost after a bit of time.
are you applying for clojurists together maybe?
No. I’m asking my employer to give me some days
ah ok, because I saw eastwood (and clj-kondo) being mentioned as projects people would want to support through it
but time is more of an issue than money probably
isn't part of what makes it slow mostly the evaluation that tools.analyzer is trying to do?
Yeah, I saw that too, but yeah, time is more of an issue. And even if I would get time off from work to work on Eastwood for three months, I’m not sure I would be able to fill three months with eastwood work.
I found this a handy tool to do performance analysis: https://github.com/clojure-goes-fast/clj-async-profiler
Yeah, I think that’s what the slowness is caused by, but I’d need time to look into exactly how the analysis is performed, if it can be performed in other ways.
I have spent some time on trying to do it in parallel, but that also requires some thinking on my part. And I’m not good at thinking when I’m working from home after work, kids and everything else.
makes sense
I don’t know how much time @bronsa has put into making the analyzer blazingly fast either (not blaming him nor the analyzer, just saying I don’t know).
haven't put any time into it at all
it's highly likely that the overhead is entirely in the analyzer
If so, the project becomes even more interesting 🙂
@ambrosebs has done some work on a custom version of t.a.jvm that's a bit more performant
I think
don't think there's a clear win anywhere, it's death by thousand cuts
t.a.jvm was never designed to be particularly performant, but to be extensible/clear, so it is highly possible that there's slowness inherent in its design
@slipset fwiw, I'm using this very simple profiling macro for clj-kondo to see how much % of the time is spent where: https://github.com/borkdude/clj-kondo/blob/master/src/clj_kondo/impl/profiler.clj
e.g. 40% of linting time is spent in parsing
@bronsa that’s what I’m fearing too (that it’s many small things), but it’d be good to at least get that as a fact (which I’ll try to do by profiling)
@borkdude there is some simple timing stuff in eastwood as well.
The main thing is that I haven’t quite dug into how the analyzer works.
AFAIK eastwood feeds the analyzer one ns at a time, which may or may not be optimal
This approach is probably great if you want to use Eastwood from inside eg Cider where you lint one file at a time, but might not be optimal when you’re running Eastwood on a whole project, like you’d do when running in CI.
not that simple to parallelise
ns analysis is side-effectful
clj isn't as declarative as cljs
yeah, we talked about that previously. Eastwood uses tools.namespace(?) to figure out in which order the project ns’s need to be analyzed
If you could find islands in the graph that tools.ns creates, those islands could be done in parallel.
But again, this (for me) requires thinking, which is not something I do well late at night after work.
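Not sure this is how Eastwood would actually wire it up, but the "islands" idea could be sketched with tools.namespace's dependency graph API, something like this (`graph` and `analyzed?` are assumed to be the project's dependency graph and the set of already-analyzed namespaces):

```clojure
(require '[clojure.tools.namespace.dependency :as dep])

(defn ready-namespaces
  "Namespaces whose dependencies have all been analyzed; each such
  batch could, in theory, be analyzed in parallel (modulo side effects
  during ns loading)."
  [graph analyzed?]
  (->> (dep/nodes graph)
       (remove analyzed?)
       (filter #(every? analyzed? (dep/immediate-dependencies graph %)))))
```

Repeatedly taking a batch, analyzing it, and adding it to `analyzed?` would walk the graph in parallelisable waves.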
yes, in theory if everybody plays nice
in practice that may still not be enough
there's code in the wild that relies on namespace loading ordering, implicitly
It’d have to be behind a setting, so you would opt in for parallelisation (what a word to type)
but then it becomes a matter of compromising/parametrising
yeah sure
From now on, messages here should be copied into long-history-searchable Clojurians Zulipchat https://clojurians.zulipchat.com
I don't have any thoughts on ways to significantly speed up Eastwood. There is perf measuring code as a command line option in Eastwood today, and sounds like you have seen that with recent Eastwood versions most of the time is in analysis, and bronsa is the expert on that step.
I haven't used flycheck or similar editor add-ons that people have made to use Eastwood on their code as they type, but I suspect what happens there is that they just run Eastwood pretty much continually, starting another run soon after the last one completes, filtered by changes made to files? Something like that, probably. The reason I mention it is that in such a scenario, maybe one doesn't even notice how long it takes Eastwood to run, because you do not manually invoke it.
It's sort of the "install a mirror next to a slow elevator" kind of approach, to attack the psychology of the people involved rather than changing the performance of the system 🙂
yea, fwiw here's the main things my analyzer does differently from tools.analyzer https://github.com/clojure/core.typed.analyzer.jvm#differences-from-toolsanalyzerjvm
probably the biggest performance win is not using resolve-{ns,sym}
and allowing custom implementations for each platform
but if you don't need incremental analysis it might be a bit overkill to use
Just from poking around a bit, I see that eg analyzer.jvm/build-ns-map is called a lot with the same count of ns’s. Seems like for Eastwood, it could be called once (but I’m only poking around, not understanding it correctly).
It wouldn't be difficult to experiment with a change to build-ns-map that calls all-ns and looks up the return value in a memoization map, returning the cached value on a hit.
I’m just about to try that out 🙂
Such an experiment would at least tell you if it saves much time.
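Something like this, maybe (names made up; the cache is keyed on the set all-ns returns — note namespaces are mutable, so a cached value can go stale even when the key matches):

```clojure
(def ^:private ns-map-cache (atom {}))

(defn build-ns-map-cached
  "Like build-ns-map, but returns a cached result when all-ns
  returns the same set of namespaces as a previous call."
  []
  (let [k (set (all-ns))]
    (or (get @ns-map-cache k)
        (let [v (build-ns-map)]          ; the original, uncached fn
          (swap! ns-map-cache assoc k v)
          v))))
```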
Just not seeing where all-ns is defined…
It is in core.
clojure.core?
Also, just a warning about the memoization there -- I expect 99% of the time it will return the same value if you memoize like that, but note that namespace objects are mutable, and can have new aliases and Vars added to them over time, so the memoized version could return a subset of what the non-memoized one would.
I have no idea what fraction of Clojure code might be affected by that.
anyways, it’d be interesting to see what effect it has on timing
I also changed the impl a bit
agreed. I hope it saves a bunch of time, and rarely causes changes in behavior 🙂
(defn build-ns-map []
  (persistent!
   (reduce (fn [acc ns]
             (let [ns-name' (ns-name ns)]
               (assoc! acc ns-name'
                       {:mappings (merge (ns-map ns)
                                         {'in-ns #'clojure.core/in-ns
                                          'ns    #'clojure.core/ns})
                        :aliases  (reduce-kv (fn [a k v] (assoc a k (ns-name v)))
                                             {} (ns-aliases ns))
                        :ns       ns-name'})))
           (transient {})
           (all-ns))))
Haven’t measured if it gives any perf gains, but might be faster.
yes
it most certainly caused a change in behaviour 😕
if it is called a bunch of times, e.g. once for each top level form in a file with the code of one namespace, then the value of the :mappings key would differ significantly from the first form to the last in a file.
wait, I might be wrong about that ...
memoized:
src dev test
== Linting ardoq.logger ==
done in 678.707711 ms
== Linting ardoq.schemas ==
done in 598.816618 ms
== Linting ardoq.utils ==
done in 947.06261 ms
== Linting ardoq.core ==
done in 478.296453 ms
== Linting ardoq.service.metadata-service ==
done in 250.096475 ms
== Linting ardoq.client.client-utils ==
done in 290.824036 ms
== Linting ardoq.persistence.mongo.mongo ==
done in 237.54222 ms
== Linting ardoq.system ==
done in 29.137272 ms
== Linting ardoq.utils.sanitizer ==
done in 51.532309 ms
== Linting ardoq.utils.org-utils ==
done in 99.718969 ms
== Linting ardoq.utils.commons-utils ==
done in 54.200468 ms
== Linting ardoq.utils.service-utils ==
done in 40.833131 ms
== Linting ardoq.utils.encoder ==
done in 197.032283 ms
== Linting ardoq.specs ==
done in 689.955477 ms
== Linting ardoq.service.heartbeat-service ==
done in 345.340966 ms
== Linting ardoq.persistence.crud-store ==
done in 149.723064 ms
== Linting ardoq.pubsub-service ==
done in 341.467959 ms
== Linting ardoq.service.event-service ==
done in 252.765624 ms
== Linting ardoq.persistence.mongo.repo ==
done in 215.895817 ms
== Linting ardoq.persistence.organization-store ==
done in 546.132456 ms
== Linting ardoq.utils.system-utils ==
done in 98.541928 ms
== Linting ardoq.persistence.repo ==
done in 198.904394 ms
== Linting ardoq.persistence.model-store ==
done in 160.790918 ms
== Linting ardoq.utils.model-adapter ==
done in 120.944057 ms
unmemoized:
src dev test
== Linting ardoq.logger ==
done in 801.748673 ms
== Linting ardoq.schemas ==
done in 1197.414751 ms
== Linting ardoq.utils ==
done in 2149.84938 ms
== Linting ardoq.core ==
done in 1071.887829 ms
== Linting ardoq.service.metadata-service ==
done in 514.068053 ms
== Linting ardoq.client.client-utils ==
done in 597.463114 ms
== Linting ardoq.persistence.mongo.mongo ==
done in 586.476757 ms
== Linting ardoq.system ==
done in 51.211548 ms
== Linting ardoq.utils.sanitizer ==
done in 63.239537 ms
== Linting ardoq.utils.org-utils ==
done in 234.722717 ms
== Linting ardoq.utils.commons-utils ==
done in 118.071369 ms
== Linting ardoq.utils.service-utils ==
done in 86.937504 ms
== Linting ardoq.utils.encoder ==
done in 423.066819 ms
== Linting ardoq.specs ==
done in 1637.538032 ms
== Linting ardoq.service.heartbeat-service ==
done in 892.486891 ms
== Linting ardoq.persistence.crud-store ==
done in 396.092711 ms
== Linting ardoq.pubsub-service ==
done in 888.633896 ms
== Linting ardoq.service.event-service ==
done in 739.321349 ms
== Linting ardoq.persistence.mongo.repo ==
done in 467.753098 ms
== Linting ardoq.persistence.organization-store ==
done in 1557.947442 ms
== Linting ardoq.utils.system-utils ==
done in 277.459718 ms
== Linting ardoq.persistence.repo ==
OK, experimenting with ns-map calls, I think my statement above is correct. ns-map returns a mapping of symbols to Vars for all currently-def'd Vars in the namespace. That set of currently-def'd Vars in the namespace definitely grows as more forms are eval'd within a namespace, while analyzing it.
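A quick REPL check of that (hypothetical names, run in any scratch namespace):

```clojure
;; ns-map grows as Vars are def'd while a namespace's forms are
;; evaluated, so a snapshot taken per top-level form keeps changing.
(def n-before (count (ns-map *ns*)))
(defn some-new-fn [] :hi)
(def n-after (count (ns-map *ns*)))
;; n-after > n-before
```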
Such sadness, the memoized version takes basically half the time.
When you say it definitely changes the behavior, do you mean that it changes warnings found by Eastwood?
It makes the source unlintable:
== Linting ardoq.fake.fake-subscription-service ==
Exception thrown during phase :analyze+eval of linting namespace ardoq.fake.fake-subscription-service
Got exception with extra ex-data:
msg='Could not resolve var: defn'
(keys dat)=(:var :file :end-column :column :line :end-line)
ExceptionInfo Could not resolve var: defn
After linting a bunch of ns’s
at least my reduce/persistent! seems a couple of ms faster per ns.
Ah, but there seems to be a way to halve the calls to build-ns-map
(defn global-env []
  (atom {:namespaces (build-ns-map)
         :update-ns-map! (fn update-ns-map! []
                           (swap! *env* assoc-in [:namespaces] (build-ns-map)))}))
Perhaps the expression for calculating the value of the :aliases key could be memoized, based on the return value of ns-aliases? I don't know how much time is spent evaluating that expression vs. the rest of it, but if it is a large fraction, that might help reduce compute time, without changing the results.
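A rough sketch of that idea (hypothetical name; keyed on the map ns-aliases returns, whose keys/values compare by identity, so this shouldn't change results):

```clojure
(def aliases->names
  "Memoized: alias map from ns-aliases -> map of alias symbol to ns name."
  (memoize
   (fn [aliases]
     (reduce-kv (fn [a k v] (assoc a k (ns-name v))) {} aliases))))

;; then inside build-ns-map:
;;   :aliases (aliases->names (ns-aliases ns))
```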
no, I was reading too fast. Time to let the brain rest.
It seems like in analyzer.jvm/macroexpand-1 we call build-ns-map twice if we have a macro or an inline fn. I guess that’s needed since we can do most anything in a macro, but it does seem a bit crude.
I think there are some macros that expand to multiple def's on different vars
Run Eastwood on a collection of a few hundred different open source Clojure projects, and you find some weird stuff. You also find semi-weird stuff in Clojure's implementation itself, or at least it can seem weird the first time you encounter it.
Clojure/Java is a very dynamic language with mutable namespaces, and a few library authors have taken advantage of that.
Yeah, but in analyzer.clj there is special handling for def forms which updates the ns-map directly, for instance
I wouldn't be surprised if you are able to find some redundant work being done by tools.analyzer, by the way. Just trying to point out perhaps-surprising kinds of Clojure code that exists.
Much appreciated 🙂
And much learning to be done here as well 🙂
I should maybe make a list of whatever I remember coming across that surprised me while debugging Eastwood -- only a few examples stick out in my mind now, years later.
There are a few macros that have side effects while compiling code that invokes the macro. Clojure's gen-class, maybe? Not sure if I ever saw others.
Maybe one or two other macros in Clojure with names beginning with def.
I have a vague memory of some code redefining the meaning of catch, intended to override the behavior of a catch clause inside of a try. I can't find it right now, in a couple minutes of poking around.
The potemkin library takes advantage of some dynamicity in ways that most libs don't, but gets used by a fair number of other libs.