@brandon.ringe would you mind testing it on your windows VM later?
I sometimes notice that I get duplicate entries for find-references:
Heyo, I tried to upgrade to new version and noticed that I cannot jump to definition of the re-frame events anymore. Is this something known? Or maybe something wrong on my side?
@dmytro.bunin I think this will be re-introduced soon. LSP moved their analysis to clj-kondo and clj-kondo didn't expose this yet, but will soon
(hopefully)
oh okie, thought that was the reason 🙂
I sometimes copy / paste expressions and now that I have enabled lsp, it takes a few seconds before my editor becomes responsive again. What can I do?
also slurping and barfing using paredit takes long
Maybe this has to do with formatting? Can this be turned off?
lsp-toggle-on-type-formatting: I'll try this
doesn't help :/
Timeout while ... rangeFormatting, was in my mini-buffer
oh it was in the config:
lsp-enable-indentation nil ; uncomment to use cider indentation instead of lsp
You've probably mentioned this before, but are you doing this in emacs? Very recent versions of emacs have orders-of-magnitude improvements in parsing the (ginormous) json responses from LSP.
yes, emacs
oh then I should probably upgrade my emacs
I'm running 26.3
I'll do this next weekend or so
It made a big difference to me (I rely a lot on LSP at work for Javascript/typescript and Scala). I believe they added it in emacs 27.1: https://www.i-programmer.info/news/90-tools/13945-emacs-adds-native-json-parsing.html
I use the bleeding edge 28 on Ubuntu and 27.1 on Mac. Both are nice and speedy now using lsp.
@borkdude the emacs 27 should improve things indeed, but there is a performance bug on range formatting of clojure-lsp we need to fix. Also, the duplicated references happen if the client sends includeDeclaration too
Oh, I think this bug is new actually, this should happen for some symbols that come from a cljc file
oh right
maybe it's something in the clj-kondo analysis because .cljc code is linted twice, once for each language
the issue is that clj-kondo returns an analysis for each lang, clj and cljs
exactly
we need to filter somehow if the element is the same, like same line/col/name
({:fixed-arities #{0},
:name-end-col 6,
:name-end-row 12,
:name-row 12,
:name blow,
:lang :clj,
:filename "/home/greg/dev/clojure-sample/src/clojure_sample/d.cljc",
:from clojure-sample.d,
:col 2,
:name-col 2,
:arity 0,
:bucket :var-usages,
:row 12,
:to clojure-sample.d}
{:fixed-arities #{0},
:name-end-col 6,
:name-end-row 12,
:name-row 12,
:name blow,
:lang :cljs,
:filename "/home/greg/dev/clojure-sample/src/clojure_sample/d.cljc",
:from clojure-sample.d,
:col 2,
:name-col 2,
:arity 0,
:bucket :var-usages,
:row 12,
:to clojure-sample.d})
this is the return of a function referenced in a cljc file
probably distinct by name/row/col/bucket should work, not sure if it's a good approach though
yeah, probably
maybe it's good to err on the side of duplicates
what do you mean?
to just leave it
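For what it's worth, the "distinct by name/row/col/bucket" idea discussed above could be sketched like this (a rough sketch, not clojure-lsp code; the function name is mine, and it assumes analysis maps shaped like the var-usage maps pasted earlier):

```clojure
(defn distinct-by-position
  "Drops analysis elements that only differ by :lang, keeping the first
  occurrence (e.g. the :clj copy of a .cljc var-usage wins over :cljs)."
  [elements]
  (->> elements
       (group-by (juxt :filename :name :row :col :bucket))
       vals
       (map first)))
```

The :filename key is included in the grouping so identically-named vars in different files are never collapsed together.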
Is clojure-lsp supposed to give completions for e.g.:
(ns foo
(:require [clojure.set :as set]))
(set/...) ?
not sure, we already had this kind of bug in the past with Calva
I made a safe distinct that I think should work, testing it...
yes, unless it's set/
since it's not parseable code; this is still WIP
(set/u)
also doesn't
maybe this has to do with my project not being analyzed correctly
not sure
I made a /tmp/foo/deps.edn and /tmp/foo/src/foo.clj
yeah, we depend on that, you can check if set was analyzed correctly with
lsp-clojure-cursor-info
but it doesn't seem like it's being analyzed on startup
it'll show kondo analysis of that cursor
how can I re-configure the project root when visiting a clojure file?
hum, lsp-mode should prompt it
it doesn't :/
not sure if lsp-workspace-folders-remove should fix it
that did something, thanks
Fixed on master, thanks for the report 🙂 it was indeed a corner case that we didn't find during kondo integration
:thumbsup:
Do you folks think there's room to introduce more streaming in the initial cache building? And avoid the memory spike? If yes, should I look for it in kondo or lsp code? Or both?
probably kondo; clojure-lsp startup no longer has any logic regarding scans, so we rely on clj-kondo classpath analysis
c/c @borkdude
clj-kondo doesn't scan anything unless you tell it to
e.g. the clj-kondo VSCode plugin doesn't scan anything either. this just happens as a side effect of people visiting their files
it's very passive in that regard
yeah, I meant on the clojure-lsp startup, we tell clj-kondo to scan the whole classpath
and that is when clojure-lsp has the memory-spike
we already do that in batches which improves a lot
I think @claudius.nicolae wants to improve even more that
maybe the batch size can be made configurable
I think the memory spike happens because of stuff accumulating
and I don't think there's anything that should change in clj-kondo itself for that
but feel free to investigate of course, using VisualVM or some other tool
yeah, good point, ATM we use a batch size of 50 files, and keeping the first 48 analyses around while analyzing the 49th probably keeps the memory high
if we could stream analyze in clojure-lsp, not sure if that works, but probably it'd help a lot @claudius.nicolae
Why does the clojure-lsp crawler put kondo/run results in its mem db and disk db?
mem db for the process and every feature, disk db for next startups
with that, the next startups take 2-3s
unless the project deps changed
Calling (System/gc) as the first statement in clojure-lsp.crawler/run-kondo-on-paths! keeps the memory under 200MB (as opposed to reaching 2GB) with a performance penalty of ~100ms on a project with cljs & sci as deps.
No significant gains with the transducer.
I'ma look at instructing kondo to return less. I noticed something yesterday along these lines: kondo returns ~100mb of something and lsp only uses ~10mb of that.
I notice a concat in the deep-merge code. That might not be what you want
@claudius.nicolae we already have some options to exclude things in the analysis. We can add more if that’s useful for LSP
Calling gc after each kondo run if mem is past 500mb keeps it under 500mb, and time is 100s (vs 75s on big mem).
I'ma look into kondo a bit - on some (most?) calls it ups the mem by 200mb for a 10mb return; I hope there's room for using transients here and there to allocate less.
:thumbsup:
Rabbit hole is deep here. I don't have anything of substance other than the idea I started with: analysis/analyze incrementally adds to context slots that could benefit from being https://hypirion.com/musings/understanding-clojure-transients as they would allocate less. Then again, there's some parallelism there that would be affected. I'ma stop here.
Should I do an lsp PR with "gc after each kondo if mem past 500mb"?
What do you think @snoe?
Personally, I think we should leave it if we don't understand exactly what's happening.
:thumbsup:
what is happening, (on project init):
• kondo allocates 100-500mb memory per jar to compute a 1-20mb analysis data return
• the memory (used at once, not just allocated and not gc'ed) goes to ~1.5gb when analyzing multiple jars at once (batch 50)
• low spec pc folks report freezes, whine and uninstall Calva (some had Macbooks).
my (potential) pr does:
• batch size 1 (jar by jar) keeps it under 500mb (for the most part) - low spec pc folks can use it.
• not too significant time penalty as proven by:
• - calling System/gc after each kondo slows to 200s from 75s.
• - however, calling System/gc after kondo only when memory past 500mb slows to just 100s from 75.
ballpark numbers
"used at once" because setting -Xmx lower -> OOME
cc @snoe
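The "gc only when memory is past 500mb" guard from the proposal might be sketched like this (hypothetical helper, not actual clojure-lsp code; the threshold is the one from the experiments above):

```clojure
(def ^:private gc-threshold-mb 500)

(defn maybe-gc!
  "Calls System/gc only when the used heap exceeds the threshold, so most
  kondo batches pay no GC cost; only the memory-heavy ones do."
  []
  (let [rt      (Runtime/getRuntime)
        used-mb (quot (- (.totalMemory rt) (.freeMemory rt)) (* 1024 1024))]
    (when (> used-mb gc-threshold-mb)
      (System/gc))))
```

Calling this between batches would give the "100s instead of 200s" behavior described above, since the gc call is skipped while the heap stays small.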
Not sure batch size 1 is the way... increasing the time by 25s is not good, since big projects will take even more time
What about a batch size between 1 and 50? Is there any better batch size that could improve the time and reduce the mem?
or is there something in clj-kondo that could be optimized? what about running without analysis, just for debugging this? does it also spike memory?
{:output {:analysis false}}
Probably it will not spike, but it's a good idea for the test @claudius.nicolae
lsp makes use of kondo :analysis output and puts it in a mem db and disk db
yes, but for debugging if this is the cause
I'll try the flag, but the logic may break
yes, understood. just for cause analysis
(assoc-in [:config :output :analysis] false)
gives no improvement (still 1.5gb)
I think it's worth disabling the findings too
I would say that if you want to change batch size or force gc for low spec users @claudius.nicolae it could be sent in as a user-setting. Right now, our minimum spec is 2gb free mem, I wouldn't want to slow down processing for others to meet people who can't get there.
(-> (assoc-in [:config :output :analysis] false)
(assoc-in [:config :findings] false)
(assoc-in [:config :linters] ^:replace {}))
Is that the config shape? (couldn't spot it)
I think the second line is not necessary (will do nothing), could you confirm @borkdude?
correct, it won't do anything
still nothing significant
maybe you can make a repro with clj-kondo only?
as in, run clj-kondo/run!
a 100 times of on the same jar?
Let me see
@claudius.nicolae I debugged a previous issue like here: https://github.com/clj-kondo/clj-kondo/issues/1036
This runs kondo on a bunch of jars (like lsp batch 50 does) and ups the memory to 1.5gb+ @borkdude
Make sure to get the paths right, there.
Oh, so the mem spike is really on clj-kondo
I used this little fn a lot:
(defn used-memory-mb []
(let [mb (* 1024 1024)
rt (Runtime/getRuntime)
total (.totalMemory rt)
free (.freeMemory rt)
used (- total free)]
(Math/round (double (/ used mb)))))
and https://github.com/clojure-goes-fast/clj-memory-meter - with some caveats (I think it realizes lazy seqs or something, I saw memory spike when using it in some cases).
> Edit it: make :paths from :override-paths
I don't get this. Can you make a repro that I can just git clone and run?
see deps.edn there
I am staring at it
I don't see override-paths, only override-deps.
put that one
*override-deps my mistake
gist is: to have a bunch of deps in it
ok
Also:
(read-string (slurp (str project "paths.edn")))
instead in fn there
[(read-string (slurp (str project "paths.edn")))]
🤦
yeah, I already got that one. I can reproduce this with -Xmx1g:
$ clj -J-Xmx1g -X:clj-kondo/dev clj-kondo.core/runf
Exception in thread "pool-1-thread-3" Exception in thread "pool-1-thread-8" java.lang.OutOfMemoryError: Java heap space
But yeah, not sure if this is a bug or just a result of loading lots of data in one go.
It's building up one giant data structure of analysis data and findings
Why is it faster (75s) when I do one run on all the jars, and slower (100s) when I do many runs jar by jar? If the two would be as fast - we could go jar by jar with little memory.
(and no extra Calva config)
Not sure if it matters much, but does the stacktrace say at what point in the code the OOM occurred? @borkdude If not, maybe some logging could tell.

Though I do see what you mean: if all the data returned is needed, and of course we want it returned in a reasonable time span, then even if the returned result is fairly small, the intermediate aggregates of whatever operations built the data will all have to be in memory at once, unless somehow GC could run in the short time span in which the result is built in order to free up unused memory.

Having said the above, I'm not sure System/gc would be a good solution in any case, as I think it's not guaranteed to run when called, and results may differ depending on JDK version (speculating). So, if all efforts have been made to avoid intermediate aggregates in the building of the result in clj-kondo, and we know clojure-lsp needs all the info in the result, I'm not sure what else can be done to lower mem usage.

> if all efforts have been made to avoid intermediate aggregates in the building of the result in clj-kondo
I think this is worth looking more into, but maybe @borkdude has optimized the code in this respect as best as it can be (I haven't looked). If it is fully optimized, then the options lie in clojure-lsp: what can it avoid doing at startup, or what can it break into chunks better and do over a longer period of time? Or perhaps if there is some speed trade-off, then an option to enable this less-mem-intensive method would be good, as @snoe said. I somewhat refuse to believe though that we're stuck with the current startup mem usage, but I'm an optimist 😄.
bringe: yes, during linting when it tries to add another thing to the findings map, which makes sense, because that adds to the memory usage. What happens when you lint x libraries at once is that clj-kondo will hold on to a lot of data, so it can make conclusions about all the data as a whole. A solution is to lint in batches, which is already done in clojure-lsp using a magic number of 50 libraries / classpath segments, but maybe this number could be made configurable or calculated using a heuristic on the available JVM memory
There is room for some optimization. When you are linting only for analysis purposes, you must pass the :no-warnings true flag (which is also documented). When you pass this, together with :parallel true, clj-kondo will skip already seen jar files. Possibly we can also use this setting to put less data in the output, which I think is not yet done. I'll take a look
good point ☝️ @claudius.nicolae could you test it with that :no-warnings true flag? I didn't know about it (sorry 😅)
Is there any way to get the available memory on an OS via Clojure/Java? That way we could calculate a good batch size indeed, I think
I think clj-kondo doesn't fully exploit memory savings when you turn on :no-warnings true. It just lints as always, but it just skips the jars it's already seen
Yes, you can read this from some properties, I'm pretty sure
Note that even the GraalVM binaries honor -Xmx flags as command line args
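Reading heap numbers from the JVM is straightforward via Runtime; a sketch (helper names and the per-entry memory budget are mine, purely illustrative):

```clojure
(defn memory-stats-mb
  "Heap numbers in MB, as reported by the JVM."
  []
  (let [rt (Runtime/getRuntime)
        mb (* 1024 1024)]
    {:max   (quot (.maxMemory rt) mb)      ; the -Xmx ceiling
     :total (quot (.totalMemory rt) mb)    ; heap currently allocated
     :free  (quot (.freeMemory rt) mb)}))  ; free within the allocated heap

(defn batch-size-for-heap
  "A naive batch-size heuristic: more -Xmx headroom, bigger batches.
  The ~40mb-per-classpath-entry budget is an assumption, not a measured value."
  []
  (max 1 (quot (:max (memory-stats-mb)) 40)))
```

Note that .maxMemory reflects -Xmx (the JVM's ceiling), not free OS memory; reading actual OS-level free memory would need something like the com.sun.management OperatingSystemMXBean.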
Yeah, for example clojure.core jar that is present in multiple libs would be scanned only 1 time, right?
that's already what a build tool does when you calculate a classpath: clojure core is only in there once
the :no-warnings flag is about when you invoke clj-kondo multiple times with the same .jar files
Oh, got it, it makes sense
the example clyfe gave has 274 libraries, which might be a bit extreme, I don't think most projects have this amount of libs, but I could be wrong
btw, with :no-warnings true I can lint the 274 with Xmx2g
I'll also try with 1g...
ok, there I run into problems
I'll try some optimizations
ok, this works with 1g:
;; assumes clojure.edn and clj-kondo are required, e.g.:
;; (require '[clojure.edn :as edn] '[clj-kondo.core :refer [run!]])
(def project "/tmp/repro/")
(def paths (partition 30 (edn/read-string (slurp (str project "paths.edn")))))
(defn runf [_]
  (let [args (fn [paths]
               {:cache true
                :cache-dir (str project ".clj-kondo/.cache")
                :no-warnings true
                :parallel true
                :lint paths
                :config {:output {:analysis {:arglists true
                                             :locals false}
                                  :canonical-paths true}}})]
    ;; outer run! is clojure.core's; the inner run! is clj-kondo.core/run!
    (clojure.core/run! #(run! (args %)) paths)))
so I guess that should be sufficient
1) use :no-warnings true
2) batch size depending on memory heuristic
out of 274 some take very little and some a lot (like cljs jar); when several that take a lot find themselves in the same batch, that's the bottleneck.
you could look at the size of the jar as well as another heuristic ;)
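The jar-size idea could be sketched as packing classpath entries into size-bounded batches instead of fixed-count ones (illustrative only; function name and budget are mine):

```clojure
(defn batches-by-size
  "Packs paths into batches whose total on-disk size stays under max-bytes;
  an entry larger than the budget still gets its own batch."
  [paths max-bytes]
  (let [{:keys [batches cur]}
        (reduce (fn [{:keys [batches cur size]} path]
                  (let [s (.length (java.io.File. ^String path))]
                    (if (and (seq cur) (> (+ size s) max-bytes))
                      ;; current batch is full: close it, start a new one
                      {:batches (conj batches cur) :cur [path] :size s}
                      {:batches batches :cur (conj cur path) :size (+ size s)})))
                {:batches [] :cur [] :size 0}
                paths)]
    (if (seq cur) (conj batches cur) batches)))
```

This addresses the "several heavy jars land in the same batch" bottleneck mentioned above: cljs-sized jars end up alone, tiny jars get grouped.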
Thanks for the tests @borkdude. So, I think we could make the batch size dynamic by checking the available mem, and include the :no-warnings
I found one possible optimization, to drop the rewrite-clj expr from the call objects that are used for linting. But I'm not sure if this will affect analysis...
Using that I can lint 274 libs with 1g
I suspect it will be ok
It would be nice if lsp could do some unit tests to see if clj-kondo still delivers the correct analysis results using the :no-warnings true + :parallel true settings.
Try again with clj-kondo master (commit 175c48839299c445f6684fa15e5692b03c9bcb5a)
https://github.com/clj-kondo/clj-kondo/commit/175c48839299c445f6684fa15e5692b03c9bcb5a
We already have unit tests for most (all?) features; if there is a bug I think they will fail (I hope)
ok!
That check on Github is new for me 😮 cool
yeah, that's done using a github action
really cool
@ericdallo the deployed version: clj-kondo-2021.01.21-20210203.204227-15.jar
I'll test it
All tests passed 😄
and how about the RAM?
Testing with a project with a 250-file classpath and the batch size of 50, it took 58s, with RAM spikes of 18-19% (3.1GB) 😞
Let's wait for @claudius.nicolae to test it too to confirm that
Test with -Xmx1g, that's what it's about
The memory can go high when it just delays GC
Alright, I'll test it
I tested comparing master clojure-lsp with a clojure-lsp using that clj-kondo version and could not see any differences. I used Xmx1g for both, and I noticed I get no OOM with my 250 files project, even using master clojure-lsp. This is a print of the startup jvm heap:
we can see the spikes of each batch analysis
Try the deps that Clyfe posted
Yep, I think it was a good quick win 😄
We added it here: https://github.com/clojure-lsp/clojure-lsp/blob/master/src/clojure_lsp/crawler.clj#L193
Oh, & with batch size 1.
That shouldn't be necessary though, calling gc manually? If the JVM runs out of memory it will do so itself
We have this issue explaining why we added it: https://github.com/clojure-lsp/clojure-lsp/issues/229
Maybe change jvm-opts in uberjar? I see -Xmx2g there.
I propose a default of -J-Xmx500m
& batch size 1. Things look good with that in my trials (small codebase).
medium/huge codebases will throw an OOM
Or, leave it 2g & manually call System/gc then.
Right, the issue seems to be that with less than 2g Xmx, some people get OOMEs (something that I think is worth exploring more itself), but when that max isn't reached at startup, GC doesn't seem to be triggered, so mem can hover at something like 1.7g even though really only something like ~300+ MB is actually used. I've analyzed this with VisualVM and when I manually ran GC after startup, it dropped that much mem.
I think the real issue is the OOME, though. There may be ways to prevent that with code changes, though I haven't dug into it.
With batch size 1, OOME won't be a problem even with 500mb, methinks.
It is though
I believe Eric tried to drop it to 1.5 G and soon after got a report of an OOME
The idea is to not kondo too much at once. Little by little.
so batch size 1.
Right, that has been implemented somewhat so far, from what I know.
yep, I could reproduce the OOM with a mid-size project when I decreased the Xmx to 1.5
https://github.com/clojure-lsp/clojure-lsp/blob/master/src/clojure_lsp/crawler.clj#L92
and there was no batch analysis at that time
☝️
@claudius.nicolae Feel free to test it increasing or decreasing the batch size, testing it with big and small projects
to test it with a big project, you can just add a lot of deps to deps.edn
ok
@nicdaoraf may be interested here
Check that ☝️
I have had OOMs with the clj-kondo VSCode plugin once. The mistake that was made then was that the sci ctx (used for hooks) was memoized and thus forever kept into memory. But that got fixed.
That was clearly a leak. I'm not sure about this one.
If you don't see an improvement, the code that runs the batches may need to be restructured. https://github.com/clojure-lsp/clojure-lsp/blob/0b8b62bab312fdc0367cb7e63be5141052ebc369/src/clojure_lsp/crawler.clj#L116 Here ^ the paths are partitioned, but then all batches are mapped and reduced together. Just speculating, but this might be the same as running all paths at once as far as GC is concerned. If that were true, I don't know the solution off hand, but might be worth digging into.
Also with that code, there may be some unnecessary intermediate collections being built, as opposed to using a transducer to avoid that
(I'm no transducer expert but I know they're used to eliminate intermediate results.)
We could move the save in db code to the batch, so the batchs would be independent , and we would't need to merge all batchs
not so easy to do though
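A minimal sketch of the "save per batch instead of merging all batches" idea, with the analyze/save functions passed in as stand-ins for clojure-lsp's real ones (all names hypothetical):

```clojure
(defn analyze-and-save-batches!
  "Runs analyze-fn on each batch and persists its result immediately via
  save-fn, so no batch's analysis needs to be retained for a final merge."
  [analyze-fn save-fn paths batch-size]
  (doseq [batch (partition-all batch-size paths)]
    (save-fn (analyze-fn batch))))
```

Here analyze-fn would be something like run-kondo-on-paths! and save-fn the db-save step; since each batch result becomes garbage right after it is saved, the GC can reclaim it instead of it piling up in a deep-merge accumulator.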
I pondered on reducing with transients but didn't find obvious stuff; code seems pretty optimal as is.
I think you could also get rid of the :findings earlier maybe, since you're probably only interested in the :analysis?
@borkdude it'd be necessary to refactor the code actions code, which relies on :findings, which IMO is the right choice to check if we have an unused import for example
Not sure if it would improve performance a lot; we could test disabling the findings and check the mem difference
but this is only relevant for the currently visited file, I think?
and not for the saved analysis?
yeah, good point; are the findings enabled by default in clj-kondo?
is there an opt-out flag for findings?
yeah, that's more or less the primary use case. yeah, you can override the linters with level :off
nice, it's something to test it @claudius.nicolae
Maybe try:
{:linters ^:replace {}}
I think this should turn all the linters off
Nice, @claudius.nicolae feel free to test it, I can only check that tomorrow probably
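For testing, the suggestion might plug into a clj-kondo run like this (a sketch; assumes clj-kondo is on the classpath, and the :lint paths are placeholders):

```clojure
(require '[clj-kondo.core :as clj-kondo])

(clj-kondo/run!
  {:lint ["src"]                        ; placeholder paths
   :config {:linters ^:replace {}      ; all linters off, per the suggestion
            :output {:analysis true    ; keep the analysis data lsp needs
                     :canonical-paths true}}})
```

The ^:replace metadata makes the empty map replace clj-kondo's default linter config instead of being merged into it.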
With all https://github.com/tape-framework/versions deps, batch size 1, gc call on each kondo: it mostly stays under 250mb with occasional spikes to 500mb.
Check the logs, what are they saying?
I'ma try with no findings now.
looks good, could you try removing the batch logic and fully analyzing?
it's just calling run-kondo-analysis instead of the one with the batch suffix
spiky spiky 1.5gb and counting
Ok, how much time did the batch size 1 analysis take?
If you feel like trying a transducer as well to see if there's a difference (again, I'm no expert but it's something I'd want to try), you could try this:
(defn ^:private run-kondo-on-paths-batch!
  "Run kondo on paths in partitions; with this we call kondo more times,
  but with fewer paths to analyze each time, improving memory usage."
  [paths]
  (let [total (count paths)
        batch-count (int (Math/ceil (float (/ total clj-kondo-analysis-batch-size))))
        xf (comp
             (partition-all clj-kondo-analysis-batch-size)
             (map-indexed (fn [index batch-paths]
                            (log/info "Analyzing" (str (inc index) "/" batch-count) "batch paths with clj-kondo...")
                            (run-kondo-on-paths! batch-paths))))]
    (log/info "Analyzing" total "paths with clj-kondo in" batch-count "batches...")
    (if (<= total clj-kondo-analysis-batch-size)
      (run-kondo-on-paths! paths)
      ;; completing supplies the 1-arity that transduce calls on completion,
      ;; and the explicit {} init avoids calling (deep-merge) with no args
      (transduce xf (completing deep-merge) {} paths))))
(batch 1) took 206.38297 secs
yes, I think that should help as well. map-indexed might also be chunked (in non-transducer mode), which means there would be at least 32 of those in memory at once
vs 75.0604 secs on big mem
too much time..
yeah, a transducer on that seems to fit well
Right, it does seem that the intermediate aggregates are what's piling up in memory, since when we GC so much is obviously not being used
more tomorrow bc it's late here 🙂; gn for now.