lsp

:clojure-lsp: Clojure implementation of the Language Server Protocol: https://clojure-lsp.io/
ericdallo 2021-02-02T03:28:32.198700Z

@brandon.ringe would you mind testing it on your windows VM later?

borkdude 2021-02-02T10:23:06.199800Z

https://twitter.com/borkdude/status/1356545209157488640 🎉

🚀 3
🎉 1
borkdude 2021-02-02T10:27:54.200300Z

I sometimes notice that I get duplicate entries for find-references:

borkdude 2021-02-02T10:28:18.200400Z

2021-02-02T10:37:59.200900Z

Heyo, I tried to upgrade to new version and noticed that I cannot jump to definition of the re-frame events anymore. Is this something known? Or maybe something wrong on my side?

borkdude 2021-02-02T10:40:31.201500Z

@dmytro.bunin I think this will be re-introduced soon. clojure-lsp moved its analysis to clj-kondo and clj-kondo doesn't expose this yet, but will soon

borkdude 2021-02-02T10:40:44.201900Z

(hopefully)

2021-02-02T10:40:50.202100Z

oh okie, thought that was the reason 🙂

borkdude 2021-02-02T11:00:11.202800Z

I sometimes copy / paste expressions and now that I have enabled lsp, it takes a few seconds before my editor becomes responsive again. What can I do?

borkdude 2021-02-02T11:01:53.203200Z

also slurping and barfing using paredit takes long

borkdude 2021-02-02T11:02:17.203600Z

Maybe this has to do with formatting? Can this be turned off?

borkdude 2021-02-02T11:02:37.203900Z

lsp-toggle-on-type-formatting: I'll try this

borkdude 2021-02-02T11:02:55.204100Z

doesn't help :/

borkdude 2021-02-02T11:03:09.204400Z

Timeout while ... rangeFormatting, was in my mini-buffer

borkdude 2021-02-02T11:05:10.204700Z

oh it was in the config:

lsp-enable-indentation nil ; uncomment to use cider indentation instead of lsp

Eamonn Sullivan 2021-02-02T11:22:12.205900Z

You've probably mentioned this before, but are you doing this in emacs? Very recent versions of emacs have orders-of-magnitude improvements in parsing the (ginormous) json responses from LSP.

borkdude 2021-02-02T11:39:17.206100Z

yes, emacs

borkdude 2021-02-02T11:39:27.206300Z

oh then I should probably upgrade my emacs

borkdude 2021-02-02T11:39:38.206500Z

I'm running 26.3

borkdude 2021-02-02T11:39:58.206800Z

I'll do this next weekend or so

Eamonn Sullivan 2021-02-02T12:15:14.208200Z

It made a big difference to me (I rely a lot on LSP at work for Javascript/typescript and Scala). I believe they added it in emacs 27.1: https://www.i-programmer.info/news/90-tools/13945-emacs-adds-native-json-parsing.html

☝️ 1
Eamonn Sullivan 2021-02-02T12:15:59.210100Z

I use the bleeding edge 28 on Ubuntu and 27.1 on Mac. Both are nice and speedy now using lsp.

ericdallo 2021-02-02T12:16:27.210900Z

@borkdude emacs 27 should indeed improve things, but there is a performance bug in clojure-lsp's range formatting that we need to fix. Also, the duplicated references happen if the client is sending includeDeclaration too

ericdallo 2021-02-02T12:18:32.211100Z

https://github.com/clojure-lsp/clojure-lsp/issues/266

ericdallo 2021-02-02T12:31:25.211400Z

Oh, I think this bug is new actually, this should happen for some symbols that come from a cljc file

borkdude 2021-02-02T12:36:51.211600Z

oh right

borkdude 2021-02-02T12:37:08.211900Z

maybe it's something in the clj-kondo analysis because .cljc code is linted twice, once for each language

ericdallo 2021-02-02T12:37:15.212200Z

the issue is that clj-kondo returns an analysis for each lang, clj and cljs

ericdallo 2021-02-02T12:37:17.212400Z

exactly

ericdallo 2021-02-02T12:37:42.212600Z

we need to filter somehow if the element is the same, like same line/col/name

ericdallo 2021-02-02T12:47:08.212800Z

({:fixed-arities #{0},
  :name-end-col 6,
  :name-end-row 12,
  :name-row 12,
  :name blow,
  :lang :clj,
  :filename "/home/greg/dev/clojure-sample/src/clojure_sample/d.cljc",
  :from clojure-sample.d,
  :col 2,
  :name-col 2,
  :arity 0,
  :bucket :var-usages,
  :row 12,
  :to clojure-sample.d}
 {:fixed-arities #{0},
  :name-end-col 6,
  :name-end-row 12,
  :name-row 12,
  :name blow,
  :lang :cljs,
  :filename "/home/greg/dev/clojure-sample/src/clojure_sample/d.cljc",
  :from clojure-sample.d,
  :col 2,
  :name-col 2,
  :arity 0,
  :bucket :var-usages,
  :row 12,
  :to clojure-sample.d})

ericdallo 2021-02-02T12:47:19.213Z

this is the return of a function referenced in a cljc file

ericdallo 2021-02-02T12:47:44.213200Z

probably distinct by name/row/col/bucket should work, not sure it's a good approach though
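A sketch of that distinct idea (a hypothetical helper, not the actual clojure-lsp fix): key each element on everything except `:lang`, so the element from the first language wins and the duplicate from the other `.cljc` pass is dropped.

```clojure
;; Hypothetical sketch: collapse analysis elements that clj-kondo emits once
;; per language for .cljc files, keyed on position + name + bucket.
;; Order-preserving, so the first lang's element is kept.
(defn dedupe-cljc-elements [elements]
  (let [element-key (juxt :filename :name :row :col :bucket)]
    (second
     (reduce (fn [[seen acc] el]
               (let [k (element-key el)]
                 (if (seen k)
                   [seen acc]
                   [(conj seen k) (conj acc el)])))
             [#{} []]
             elements))))
```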

borkdude 2021-02-02T12:48:53.213400Z

yeah, probably

borkdude 2021-02-02T12:49:03.213600Z

maybe it's good to err on the side of duplicates

ericdallo 2021-02-02T12:49:30.213800Z

what do you mean?

borkdude 2021-02-02T12:54:55.214Z

to just leave it

borkdude 2021-02-02T12:55:20.214700Z

Is clojure-lsp supposed to give completions for e.g.:

(ns foo
  (:require [clojure.set :as set]))
(set/...)
?

ericdallo 2021-02-02T12:55:20.214800Z

not sure, we already had this kind of bug in the past with Calva

☝️ 1
ericdallo 2021-02-02T12:55:37.215Z

I made a safe distinct that I think should work, testing it...

ericdallo 2021-02-02T12:56:09.215200Z

yes, unless it's set/, since that's not parseable code; this is still WIP

ericdallo 2021-02-02T12:56:28.215400Z

https://github.com/clojure-lsp/clojure-lsp/issues/270

borkdude 2021-02-02T12:56:31.215700Z

(set/u)
also doesn't

borkdude 2021-02-02T12:57:00.215900Z

maybe this has to do with my project not being analyzed correctly

borkdude 2021-02-02T12:57:01.216100Z

not sure

borkdude 2021-02-02T12:57:23.216300Z

I made a /tmp/foo/deps.edn and tmp/foo/src/foo.clj

ericdallo 2021-02-02T12:57:26.216500Z

yeah, we depend on that, you can check if set was analyzed correctly with

ericdallo 2021-02-02T12:57:31.216700Z

lsp-clojure-cursor-info

borkdude 2021-02-02T12:57:34.216900Z

but it doesn't seem like it's being analyzed on startup

ericdallo 2021-02-02T12:57:39.217100Z

it'll show kondo analysis of that cursor

borkdude 2021-02-02T12:58:18.217300Z

how can I re-configure the project root when visiting a clojure file?

ericdallo 2021-02-02T12:58:48.217500Z

hum, lsp-mode should prompt it

borkdude 2021-02-02T12:58:59.217700Z

it doesn't :/

ericdallo 2021-02-02T12:59:14.217900Z

not sure if lsp-workspace-folders-remove would fix it

borkdude 2021-02-02T13:00:24.218100Z

that did something, thanks

👍 1
ericdallo 2021-02-02T13:13:03.218500Z

Fixed on master, thanks for the report 🙂 it was indeed a corner case that we didn't find during kondo integration

borkdude 2021-02-02T13:13:49.218800Z

:thumbsup:

clyfe 2021-02-02T13:54:04.222600Z

Do you folks think there's room to introduce more streaming in the initial cache building? And avoid the memory spike? If yes, should I look for it in kondo or lsp code? Or both?

ericdallo 2021-02-02T13:54:52.222700Z

probably kondo; clojure-lsp startup no longer has any scanning logic, so we rely on clj-kondo classpath analysis

ericdallo 2021-02-02T13:55:01.222900Z

c/c @borkdude

borkdude 2021-02-02T13:55:51.223100Z

clj-kondo doesn't scan anything unless you tell it to

borkdude 2021-02-02T13:56:26.223300Z

e.g. the clj-kondo VSCode plugin doesn't scan anything either. this just happens as a side effect of people visiting their files

borkdude 2021-02-02T13:56:39.223700Z

it's very passive in that regard

ericdallo 2021-02-02T13:56:53.223900Z

yeah, I meant on the clojure-lsp startup, we tell clj-kondo to scan the whole classpath

ericdallo 2021-02-02T13:57:18.224100Z

and that is when clojure-lsp has the memory-spike

ericdallo 2021-02-02T13:57:39.224300Z

we already do that in batches which improves a lot

ericdallo 2021-02-02T13:57:50.224500Z

I think @claudius.nicolae wants to improve that even more

borkdude 2021-02-02T13:57:57.224700Z

maybe the batch size can be made configurable

borkdude 2021-02-02T13:58:09.224900Z

I think the memory spike happens because of stuff accumulating

borkdude 2021-02-02T13:58:26.225100Z

and I don't think there's anything that should change in clj-kondo itself for that

borkdude 2021-02-02T13:59:02.225300Z

but feel free to investigate of course, using VisualVM or some other tool

ericdallo 2021-02-02T13:59:48.225500Z

yeah, good point, ATM we use a batch size of 50 files, and keeping the first 48 analyses in memory while analyzing the 49th probably keeps the memory high

ericdallo 2021-02-02T14:00:11.225700Z

if we could stream the analysis in clojure-lsp... not sure if that works, but it'd probably help a lot @claudius.nicolae

👍 1
clyfe 2021-02-02T17:26:51.226900Z

Why does the clojure-lsp crawler put kondo/run results in its mem db and disk db?

ericdallo 2021-02-02T17:27:18.227100Z

mem db for the process and every feature, disk db for next startups

ericdallo 2021-02-02T17:27:31.227300Z

with that, next startups take 2-3s

ericdallo 2021-02-02T17:27:40.227500Z

unless the project deps changed
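The disk-db idea described here can be sketched roughly like this (hypothetical names and file format; clojure-lsp's actual db code is more involved): reuse the saved analysis unless the classpath changed since the last run.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical sketch of the startup cache: return the persisted analysis
;; when the classpath is unchanged, otherwise re-analyze and persist.
(defn load-or-analyze! [db-file classpath analyze-fn]
  (let [cached (when (.exists (java.io.File. db-file))
                 (edn/read-string (slurp db-file)))]
    (if (= (:classpath cached) classpath)
      (:analysis cached)                      ; fast path: the 2-3s startup
      (let [analysis (analyze-fn classpath)]  ; slow path: full kondo scan
        (spit db-file (pr-str {:classpath classpath :analysis analysis}))
        analysis))))
```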

clyfe 2021-02-02T18:39:32.229700Z

Calling (System/gc) as the first statement in clojure-lsp.crawler/run-kondo-on-paths! keeps the memory under 200MB (as opposed to reaching 2GB) with a performance penalty of ~100ms on a project with cljs & sci as deps.

clyfe 2021-02-03T08:11:32.247600Z

No significant gains with transducer.

clyfe 2021-02-03T08:12:30.248Z

The transducer

clyfe 2021-02-03T08:14:59.248400Z

Im'a look at instructing kondo to return less; I noticed yesterday something along these lines: kondo returns ~100mb of data and lsp only uses ~10mb of that.

borkdude 2021-02-03T08:15:41.249500Z

I notice a concat in the deep-merge code. That might not be what you want

borkdude 2021-02-03T08:17:17.250800Z

@claudius.nicolae we already have some options to exclude things in the analysis. We can add more if that’s useful for LSP

clyfe 2021-02-03T10:24:48.251200Z

Calling gc after each kondo run, if mem is past 500mb, keeps it under 500 and time is 100s (vs 75s on big mem).
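The "gc only when past a threshold" idea is small enough to sketch (hypothetical helper names; the real PR would live in clojure-lsp.crawler):

```clojure
;; Used heap in MB, same idea as the Runtime-based helper posted in this thread.
(defn used-memory-mb []
  (let [rt (Runtime/getRuntime)]
    (quot (- (.totalMemory rt) (.freeMemory rt)) (* 1024 1024))))

;; Hint the JVM to collect only when used heap exceeds the threshold,
;; so we pay the GC pause rarely instead of after every kondo run.
(defn gc-if-needed! [threshold-mb]
  (when (> (used-memory-mb) threshold-mb)
    (System/gc)))
```

Note that `System/gc` is only a hint to the JVM, so the memory ceiling this buys is best-effort, not guaranteed.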

clyfe 2021-02-03T10:26:49.251400Z

Im'a look into kondo a bit: on some (most?) calls it ups the mem by 200mb for a 10mb return; I hope there's room for using transients here and there to allocate less.

borkdude 2021-02-03T10:27:26.251600Z

:thumbsup:

clyfe 2021-02-03T15:14:58.254700Z

Rabbit hole is deep here. I don't have anything of substance other than the idea I started with: analysis/analyze incrementally adds to context slots that could benefit from being transients (https://hypirion.com/musings/understanding-clojure-transients), as they would allocate less. Then again, there's some parallelism there that would be affected. I'ma stop here.

clyfe 2021-02-03T15:15:41.255100Z

Should I do an lsp PR with "gc after each kondo if mem past 500mb"?

ericdallo 2021-02-03T15:16:28.255300Z

What do you think @snoe?

snoe 2021-02-03T16:00:44.256200Z

Personally, I think we should leave it if we don't understand exactly what's happening.

borkdude 2021-02-03T16:08:02.256400Z

:thumbsup:

clyfe 2021-02-03T16:20:47.256600Z

what is happening (on project init):
• kondo allocates 100-500mb memory per jar to compute a 1-20mb analysis data return
• the memory (used at once, not just allocated and not gc'ed) goes to ~1.5gb when analyzing multiple jars at once (batch 50)
• low spec pc folks report freezes, whine and uninstall Calva (some had Macbooks).
my (potential) pr does:
• batch size 1 (jar by jar) keeps under 500mb (for the most part): low spec pc folk can use it.
• not too significant time penalty as proven by:
  • calling System/gc after each kondo slows to 200s from 75s.
  • however, calling System/gc after kondo only when memory past 500mb slows to just 100s from 75s.

clyfe 2021-02-03T16:22:17.256800Z

ballpark numbers

clyfe 2021-02-03T16:23:48.257Z

"used at once" because setting -Xmx lower -> OOME

clyfe 2021-02-03T16:24:32.257200Z

cc @snoe

ericdallo 2021-02-03T16:27:31.257400Z

Not sure batch size 1 is the way... increasing the time by 25s is not good, since for big projects it will take even more time

ericdallo 2021-02-03T16:28:40.257600Z

What about a batch size between 1 and 50? Is there any better batch size that could improve the time and reduce the mem?

borkdude 2021-02-03T16:31:17.257800Z

or is there something in clj-kondo that could be optimized? what about running without analysis, just for debugging this? does it also spike memory?

borkdude 2021-02-03T16:31:31.258Z

{:output {:analysis false}}

ericdallo 2021-02-03T16:32:14.258200Z

Probably it will not spike, but it's a good idea for the test @claudius.nicolae

clyfe 2021-02-03T16:34:46.258400Z

lsp makes use of kondo's :analysis output and puts it in a mem db and disk db

borkdude 2021-02-03T16:35:04.258600Z

yes, but for debugging if this is the cause

clyfe 2021-02-03T16:35:08.258800Z

I'll try the flag, but the logic may break

borkdude 2021-02-03T16:35:23.259Z

yes, understood. just for cause analysis

☝️ 1
clyfe 2021-02-03T16:57:07.259400Z

(assoc-in [:config :output :analysis] false) gives no improvement (still 1.5gb)

ericdallo 2021-02-03T17:00:09.259600Z

I think it's worth disabling the findings too

snoe 2021-02-03T17:02:21.260100Z

I would say that if you want to change batch size or force gc for low spec users @claudius.nicolae it could be sent in as a user-setting. Right now, our minimum spec is 2gb free mem; I wouldn't want to slow down processing for others just to accommodate people who can't get there.

clyfe 2021-02-03T17:07:39.260300Z

(-> (assoc-in [:config :output :analysis] false)
    (assoc-in [:config :findings] false)
    (assoc-in [:config :linters] ^:replace {}))
Is that the config shape? (couldn't spot it)

ericdallo 2021-02-03T17:09:32.260500Z

I think the second line is not necessary (will do nothing) could you confirm @borkdude?

borkdude 2021-02-03T17:16:11.260700Z

correct, it won't do anything

clyfe 2021-02-03T17:32:03.261400Z

still nothing significant

borkdude 2021-02-03T17:35:13.261600Z

maybe you can make a repro with clj-kondo only?

borkdude 2021-02-03T17:35:23.261800Z

as in, run clj-kondo/run! a 100 times on the same jar?

clyfe 2021-02-03T17:36:15.262Z

Let me see

borkdude 2021-02-03T17:37:42.262200Z

@claudius.nicolae I debugged a previous issue like here: https://github.com/clj-kondo/clj-kondo/issues/1036

👀 1
clyfe 2021-02-03T18:14:47.262700Z

This runs kondo on a bunch of jars (like lsp batch 50 does) and ups the memory to 1.5gb+ @borkdude

clyfe 2021-02-03T18:16:12.263100Z

Make sure to get the paths right, there.

ericdallo 2021-02-03T18:32:08.263600Z

Oh, so the mem spike is really on clj-kondo

clyfe 2021-02-03T18:54:44.263800Z

I used this little fn a lot:

(defn used-memory-mb []
  (let [mb (* 1024 1024)
        rt (Runtime/getRuntime)
        total (.totalMemory rt)
        free (.freeMemory rt)
        used (- total free)]
    (Math/round (double (/ used mb)))))
and https://github.com/clojure-goes-fast/clj-memory-meter - with some caveats (I think it realizes lazy seqs or something, I saw memory spike when using it in some cases).

borkdude 2021-02-03T18:55:11.264200Z

> Edit it: make :paths from :override-paths
I don't get this. Can you make a repro that I can just git clone and run?

clyfe 2021-02-03T18:55:35.264400Z

see deps.edn there

borkdude 2021-02-03T18:55:50.264600Z

I am staring at it

borkdude 2021-02-03T18:56:10.264800Z

I don't see override-paths, only override-deps.

clyfe 2021-02-03T18:57:19.265700Z

clyfe 2021-02-03T18:57:23.266100Z

put that one

clyfe 2021-02-03T18:57:39.266300Z

*override-deps my mistake

clyfe 2021-02-03T18:57:54.266500Z

gist is: to have a bunch of deps in it

borkdude 2021-02-03T18:58:00.266700Z

ok

clyfe 2021-02-03T19:04:00.266900Z

Also: use

(read-string (slurp (str project "paths.edn")))
instead of
[(read-string (slurp (str project "paths.edn")))]
in the fn there 🤦

borkdude 2021-02-03T19:06:52.267100Z

yeah, I already got that one. I can reproduce this with -Xmx1g:

$ clj -J-Xmx1g -X:clj-kondo/dev clj-kondo.core/runf
Exception in thread "pool-1-thread-3" Exception in thread "pool-1-thread-8" java.lang.OutOfMemoryError: Java heap space
But yeah, not sure if this is a bug or just a result of loading lots of data in one go.

borkdude 2021-02-03T19:07:25.267400Z

It's building up one giant data structure of analysis data and findings

clyfe 2021-02-03T19:16:35.267600Z

Why is it faster (75s) when I do one run on all the jars, and slower (100s) when I do many runs jar by jar? If the two were equally fast, we could go jar by jar with little memory.

clyfe 2021-02-03T19:17:10.267800Z

(and no extra Calva config)

bringe 2021-02-03T19:32:45.268Z

Not sure if it matters much, but does the stacktrace say at what point in the code the OOM occurred? @borkdude If not, maybe some logging could tell.

Though I do see what you mean: if all the data returned is needed, and of course we want it returned in a reasonable time span, then even if the returned result is fairly small, the intermediate aggregates of whatever operations built the data will have to all be in memory at once, unless somehow GC could run in the short time span in which the result is built in order to free up unused memory.

Having said the above, I'm not sure System/gc would be a good solution in any case, as I think it's not guaranteed to run when called, and results may differ depending on JDK version (speculating). So, if all efforts have been made to avoid intermediate aggregates in the building of the result in clj-kondo, and we know clojure-lsp needs all the info in the result, I'm not sure what else can be done to lower mem usage.

> if all efforts have been made to avoid intermediate aggregates in the building of the result in clj-kondo
I think this is worth looking more into, but maybe @borkdude has optimized the code in this respect as best as it can be (I haven't looked). If it is fully optimized, then the options lie in clojure-lsp: what can it avoid doing at startup, or what can it break into chunks better and do over a longer period of time? Or perhaps, if there is some speed trade-off, an option to enable the less-mem-intensive method would be good, as @snoe said. I somewhat refuse to believe though that we're stuck with the current startup mem usage, but I'm an optimist 😄.

👍 1
borkdude 2021-02-03T19:50:40.268300Z

bringe: yes, during linting, when it tries to add another thing to the findings map. Which makes sense, because that adds to the memory usage. What happens when you lint x libraries at once is that clj-kondo will hold on to a lot of data, so it can make conclusions about all the data as a whole. A solution is to lint in batches, which is already done in clojure-lsp using a magic number of 50 libraries / classpath segments, but maybe this number could be made configurable or calculated using a heuristic on the available JVM memory

👍 1
borkdude 2021-02-03T19:52:08.268500Z

There is room for some optimization. When you are linting only for analysis purposes, you must pass the :no-warnings true flag (which is also documented). When you pass this together with :parallel true, clj-kondo will skip already-seen jar files. Possibly we can also use this setting to put less data in the output, which I think is not yet done. I'll take a look

ericdallo 2021-02-03T19:53:19.268700Z

good point ☝️ @claudius.nicolae could you test it with that :no-warnings true flag? I didn't know about it (sorry 😅)

ericdallo 2021-02-03T19:54:09.268900Z

Is there any way to get the available mem on an OS via Clojure/Java? This way we could calculate a good batch size, I think

borkdude 2021-02-03T19:54:33.269100Z

I think clj-kondo doesn't fully exploit memory savings when you turn on :no-warnings true. It just lints as always, but it just skips the jars it's already seen

borkdude 2021-02-03T19:54:51.269300Z

Yes, you can read this from some properties, I'm pretty sure

borkdude 2021-02-03T19:55:08.269500Z

Note that even the GraalVM binaries honor -Xmx flags as command line args

👍 1
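Reading the JVM's own limits is straightforward via Runtime (a sketch; note maxMemory reflects -Xmx, whereas true OS-level free memory would need the com.sun.management beans):

```clojure
;; Sketch: the JVM's memory numbers via java.lang.Runtime.
;; maxMemory honors the -Xmx flag, including in GraalVM binaries.
(defn jvm-memory-mb []
  (let [rt (Runtime/getRuntime)
        mb #(quot % (* 1024 1024))]
    {:max-mb   (mb (.maxMemory rt))      ; heap ceiling (-Xmx)
     :total-mb (mb (.totalMemory rt))    ; currently committed heap
     :free-mb  (mb (.freeMemory rt))}))  ; free within the committed heap
```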
ericdallo 2021-02-03T19:55:22.269700Z

Yeah, for example the clojure.core jar that is present in multiple libs would be scanned only once, right?

borkdude 2021-02-03T19:56:04.270Z

that's already what a build tool does when you calculate a classpath: clojure core is only in there once

borkdude 2021-02-03T19:57:33.270200Z

the :no-warnings flag is about when you invoke clj-kondo multiple times with the same .jar files

ericdallo 2021-02-03T19:58:19.270400Z

Oh, got it, it makes sense

borkdude 2021-02-03T19:58:27.270600Z

the example clyfe gave has 274 libraries, which might be a bit extreme, I don't think most projects have this amount of libs, but I could be wrong

borkdude 2021-02-03T19:59:37.270800Z

btw, with :no-warnings true I can lint the 274 with Xmx2g

👍 1
borkdude 2021-02-03T20:00:01.271100Z

I'll also try with 1g...

borkdude 2021-02-03T20:00:49.271300Z

ok, there I run into problems

borkdude 2021-02-03T20:02:45.271500Z

I'll try some optimizations

borkdude 2021-02-03T20:11:36.271700Z

ok, this works with 1g:

(require '[clojure.edn :as edn])

(def project "/tmp/repro/")
(def paths (partition 30 (edn/read-string (slurp (str project "paths.edn")))))

(defn runf [_]
  (let [project "/tmp/repro/"
        args (fn [paths]
               {:cache true
                :cache-dir (str project ".clj-kondo/.cache")
                :no-warnings true
                :parallel true
                :lint paths
                :config {:output {:analysis {:arglists true
                                             :locals false}
                                  :canonical-paths true}}})]
    (clojure.core/run! #(run! (args %)) paths)))

borkdude 2021-02-03T20:11:44.271900Z

so I guess that should be sufficient

borkdude 2021-02-03T20:12:13.272100Z

1) use :no-warnings true 2) batch size depending on memory heuristic
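A memory heuristic for point 2 could look something like this (a sketch; the ~40mb-per-classpath-entry figure is an assumption for illustration, not a measured number, and the 1..50 clamp mirrors the current magic batch size):

```clojure
;; Hypothetical heuristic: scale the classpath batch size with the JVM's
;; max heap (-Xmx), clamped to the existing 1..50 range.
(defn batch-size-for-heap []
  (let [max-mb (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024))]
    (-> (quot max-mb 40)   ; assumption: ~40mb of headroom per classpath entry
        (max 1)
        (min 50))))
```

With -Xmx1g this would pick a batch size of about 25; with -Xmx500m, about 12.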

clyfe 2021-02-03T20:12:46.272300Z

out of the 274, some take very little and some a lot (like the cljs jar); when several heavy ones land in the same batch, that's the bottleneck.

borkdude 2021-02-03T20:14:18.272500Z

you could look at the size of the jar as well as another heuristic ;)

👍 2
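The jar-size heuristic could be sketched like so (hypothetical helper; sizes are passed in as [path size-mb] pairs rather than read from disk, to keep the sketch self-contained):

```clojure
;; Hypothetical: greedily partition [path size-mb] pairs into batches whose
;; total size stays under max-batch-mb, so several large jars (cljs etc.)
;; don't land in the same batch. A single oversized entry gets its own batch.
(defn batches-by-size [path-sizes max-batch-mb]
  (let [[batches cur _]
        (reduce (fn [[batches cur cur-mb] [p mb]]
                  (if (and (seq cur) (> (+ cur-mb mb) max-batch-mb))
                    [(conj batches cur) [p] mb]   ; close current batch
                    [batches (conj cur p) (+ cur-mb mb)]))
                [[] [] 0]
                path-sizes)]
    (if (seq cur) (conj batches cur) batches)))  ; flush the last batch
```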
ericdallo 2021-02-03T20:16:18.272900Z

Thanks for the tests @borkdude. So I think we could make the batch size dynamic by checking the available mem, and include :no-warnings

borkdude 2021-02-03T20:24:36.273100Z

I found one possible optimization, to drop the rewrite-clj expr from the call objects that are used for linting. But I'm not sure if this will affect analysis...

borkdude 2021-02-03T20:24:48.273300Z

Using that I can lint 274 libs with 1g

1
borkdude 2021-02-03T20:26:46.273700Z

I suspect it will be ok

borkdude 2021-02-03T20:36:04.273900Z

It would be nice if lsp could do some unit tests to see if clj-kondo still delivers the correct analysis results using the :no-warnings true + :parallel true settings. Try again with clj-kondo master (commit 175c48839299c445f6684fa15e5692b03c9bcb5a)

ericdallo 2021-02-03T20:38:41.274300Z

We already have unit tests for most (all?) features; if there is a bug I think they will fail (I hope)

borkdude 2021-02-03T20:38:55.274500Z

ok!

ericdallo 2021-02-03T20:39:25.274700Z

That check on Github is new for me 😮 cool

😮 1
borkdude 2021-02-03T20:40:06.275100Z

yeah, that's done using a github action

ericdallo 2021-02-03T20:40:11.275300Z

really cool

borkdude 2021-02-03T20:45:58.275500Z

@ericdallo the deployed version: clj-kondo-2021.01.21-20210203.204227-15.jar

ericdallo 2021-02-03T20:50:11.275700Z

I'll test it

ericdallo 2021-02-03T20:51:58.275900Z

All tests passed 😄

borkdude 2021-02-03T20:52:56.276100Z

and how about the RAM?

ericdallo 2021-02-03T20:56:55.276300Z

Testing with a project with a 250-file classpath, it took 58s with the 50 batch size, with RAM spikes of 18-19% (3.1GB) 😞

ericdallo 2021-02-03T20:57:24.276500Z

Let's wait for @claudius.nicolae to test it too to confirm that

borkdude 2021-02-03T21:20:56.276800Z

Test with -Xmx1g, that's what it's about

borkdude 2021-02-03T21:21:08.277Z

The memory can go high when it just delays GC

ericdallo 2021-02-03T21:21:27.277200Z

Alright, I'll test it

ericdallo 2021-02-04T03:56:08.277800Z

I tested comparing master clojure-lsp with clojure-lsp using that clj-kondo version and could not see any differences. I used Xmx1g for both, but I noticed I get no OOM with my 250-file project, even using master clojure-lsp. This is a print of the startup jvm heap:

ericdallo 2021-02-04T03:56:30.278200Z

we can see the spikes of each batch analysis

borkdude 2021-02-04T08:01:05.279100Z

Try the deps that Clyfe posted

ericdallo 2021-02-02T18:40:17.229800Z

Yep, I think it was a good quick win 😄

clyfe 2021-02-02T18:40:44.230200Z

Oh, & with batch size 1.

borkdude 2021-02-02T18:42:58.230500Z

That shouldn't be necessary though, calling gc manually? If the JVM runs out of memory it will do so itself

ericdallo 2021-02-02T18:44:00.230700Z

We have this issue explaining why we added it: https://github.com/clojure-lsp/clojure-lsp/issues/229

clyfe 2021-02-02T18:44:05.231Z

Maybe change jvm-opts in uberjar? I see -Xmx2g there.

clyfe 2021-02-02T18:54:28.231300Z

I propose a default of -J-Xmx500m & batch size 1. Things look good with that in my trials (small codebase).

ericdallo 2021-02-02T18:55:17.231500Z

medium/huge codebases will throw an OOM

clyfe 2021-02-02T18:57:50.231800Z

Or, leave it at 2g & manually call System/gc then.

bringe 2021-02-02T20:58:14.233400Z

Right, the issue seems to be that with less than 2g Xmx, some people get OOMEs (something that I think is worth exploring more itself), but when that max isn't reached at startup, GC doesn't seem to be triggered, so mem can hover at something like 1.7g even though really only something like ~300+ MB is actually used. I've analyzed this with VisualVM and when I manually ran GC after startup, it dropped that much mem.

bringe 2021-02-02T20:59:33.233600Z

I think the real issue is the OOME, though. There may be ways to prevent that with code changes, though I haven't dug into it.

clyfe 2021-02-02T21:00:14.233800Z

batch size 1, OOME won't be a problem even with 500mb methinks.

bringe 2021-02-02T21:00:24.234Z

It is though

bringe 2021-02-02T21:00:54.234200Z

I believe Eric tried to drop it to 1.5 G and soon after got a report of an OOME

clyfe 2021-02-02T21:00:55.234400Z

The idea is to not kondo too much at once. Little by little.

clyfe 2021-02-02T21:01:12.234600Z

so batch size 1.

bringe 2021-02-02T21:01:18.234800Z

Right, that has been implemented somewhat so far, from what I know.

ericdallo 2021-02-02T21:01:37.235Z

yep, I could reproduce the OOM with a mid-size project when I decreased the Xmx to 1.5

ericdallo 2021-02-02T21:01:50.235500Z

and there was no batch analysis at that time

ericdallo 2021-02-02T21:01:56.235700Z

☝️

ericdallo 2021-02-02T21:02:26.235900Z

@claudius.nicolae Feel free to test it increasing or decreasing the batch size, testing it with big and small projects

ericdallo 2021-02-02T21:02:41.236100Z

to test it with a big project, you can just add a lot of deps to deps.edn

clyfe 2021-02-02T21:02:49.236300Z

ok

ericdallo 2021-02-02T21:03:10.236500Z

@nicdaoraf may be interested here

👋 1
ericdallo 2021-02-02T21:03:48.236700Z

https://github.com/clojure-lsp/clojure-lsp/issues/268

ericdallo 2021-02-02T21:03:52.237100Z

Check that ☝️

borkdude 2021-02-02T21:05:45.237300Z

I have had OOMs with the clj-kondo VSCode plugin once. The mistake that was made then was that the sci ctx (used for hooks) was memoized and thus forever kept into memory. But that got fixed.

borkdude 2021-02-02T21:06:02.237500Z

That was clearly a leak. I'm not sure about this one.

bringe 2021-02-02T21:06:11.237700Z

If you don't see an improvement, the code that runs the batches may need to be restructured. https://github.com/clojure-lsp/clojure-lsp/blob/0b8b62bab312fdc0367cb7e63be5141052ebc369/src/clojure_lsp/crawler.clj#L116 Here ^ the paths are partitioned, but then all batches are mapped and reduced together. Just speculating, but this might be the same as running all paths at once as far as GC is concerned. If that were true, I don't know the solution off hand, but might be worth digging into.

bringe 2021-02-02T21:07:24.238Z

Also with that code, there may be some unnecessary intermediate collections being built, as opposed to using a transducer to avoid that

bringe 2021-02-02T21:07:53.238200Z

(I'm no transducer expert but I know they're used to eliminate intermediate results.)

ericdallo 2021-02-02T21:09:08.238500Z

We could move the save-in-db code into each batch, so the batches would be independent and we wouldn't need to merge all batches

ericdallo 2021-02-02T21:09:16.238700Z

not so easy to do though

clyfe 2021-02-02T21:10:11.238900Z

I pondered reducing with transients but didn't find obvious stuff; the code seems pretty optimal as is.

borkdude 2021-02-02T21:10:19.239100Z

I think you could also get rid of the :findings earlier maybe since you're probably only interested in the :analysis?

ericdallo 2021-02-02T21:11:21.239300Z

@borkdude it'd be necessary to refactor the code actions code, which relies on :findings, which IMO is the right choice to check if we have an unused import for example

ericdallo 2021-02-02T21:11:52.239500Z

Not sure if it would improve performance a lot; we could test disabling the findings and check the mem difference

borkdude 2021-02-02T21:11:59.239700Z

but this is only relevant for the currently visited file, I think?

borkdude 2021-02-02T21:12:06.239900Z

and not for the saved analysis?

ericdallo 2021-02-02T21:12:49.240100Z

yeah, good point, by default the findings are enabled in clj-kondo?

ericdallo 2021-02-02T21:13:03.240300Z

is there an opt-out flag for findings?

borkdude 2021-02-02T21:13:30.240500Z

yeah, that's more or less the primary use case. yeah, you can override the linters with level :off

ericdallo 2021-02-02T21:14:17.240700Z

nice, it's something to test @claudius.nicolae

borkdude 2021-02-02T21:14:29.240900Z

Maybe try:

borkdude 2021-02-02T21:15:15.241100Z

{:linters ^:replace {}}
I think this should turn all the linters off

👍 1
ericdallo 2021-02-02T21:17:42.241400Z

Nice, @claudius.nicolae feel free to test it, I can only check that tomorrow probably

👍 1
clyfe 2021-02-02T21:37:26.241700Z

With all https://github.com/tape-framework/versions deps, batch size 1, gc call on each kondo: it mostly stays under 250mb with occasional spikes to 500mb.

ericdallo 2021-02-02T21:39:08.242Z

Check the logs, what is it saying?

clyfe 2021-02-02T21:39:13.242200Z

clyfe 2021-02-02T21:39:57.242600Z

Im'a try with no findings now.

ericdallo 2021-02-02T21:40:17.242800Z

looks good, could you try removing the batch logic and doing a full analysis?

ericdallo 2021-02-02T21:40:32.243Z

just call run-kondo-analysis instead of the one with the batch suffix

clyfe 2021-02-02T21:43:15.243400Z

spiky spiky 1.5gb and counting

clyfe 2021-02-02T21:43:57.243600Z

ericdallo 2021-02-02T21:44:29.244Z

Ok, how much time did the batch size 1 analysis take?

bringe 2021-02-02T21:45:08.244200Z

If you feel like trying a transducer as well to see if there's a difference (again, I'm no expert but it's something I'd want to try), you could try this:

(defn ^:private run-kondo-on-paths-batch!
  "Run kondo on paths in partitions; we call kondo more times but with
  fewer paths per run, improving memory usage."
  [paths]
  (let [total (count paths)
        batch-count (int (Math/ceil (float (/ total clj-kondo-analysis-batch-size))))
        xf (comp
            (partition-all clj-kondo-analysis-batch-size)
            (map-indexed (fn [index batch-paths]
                           (log/info "Analyzing" (str (inc index) "/" batch-count) "batch paths with clj-kondo...")
                           (run-kondo-on-paths! batch-paths))))]
    (log/info "Analyzing" total "paths with clj-kondo in" batch-count "batches...")
    (if (<= total clj-kondo-analysis-batch-size)
      (run-kondo-on-paths! paths)
      ;; completing supplies the 1-arity completion step transduce expects
      (transduce xf (completing deep-merge) {} paths))))

clyfe 2021-02-02T21:45:38.244400Z

(batch 1) took 206.38297 secs

borkdude 2021-02-02T21:45:55.244600Z

yes, I think that should help as well. map-indexed might also be chunked (in non-transducer mode), which means there would be at least 32 of those in memory at once

clyfe 2021-02-02T21:45:57.244800Z

vs 75.0604 secs on big mem

ericdallo 2021-02-02T21:45:59.245Z

too much time..

ericdallo 2021-02-02T21:46:28.245200Z

yeah, a transducer on that seems to fit well

bringe 2021-02-02T21:46:50.245400Z

Right, it does seem that the intermediate aggregates are what's piling up in memory, since when we GC so much is obviously not being used

☝️ 2
2
clyfe 2021-02-02T21:53:46.245800Z

more tomorrow bc it's late here 🙂; gn for now.

2
👋 3