Is there anything in the new tools.deps stuff that would make removing the Clojure coming from the system deps file easier? My main use cases are using a fork of Clojure, and also using Deps to manage non-Clojure projects (i.e. parts of Cursive).
@cfleming Can't you just use `:override-deps` and an alias?
I don’t think that works.
I’ll play around with that and see, but IIRC it didn’t work last I tried.
As for non-Clojure stuff, nothing yet, but maybe `tools.build` might, fingers crossed?
tools.build?
I know you can select any released Clojure version via `:override-deps`, so I assume you could point it to a forked artifact...?
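Something like this is what I had in mind -- just a sketch, and `:pin-clj` is a made-up alias name (the coordinate could presumably be your forked artifact instead):
```
;; deps.edn -- sketch only; :pin-clj is a made-up alias name
{:aliases
 {:pin-clj
  {:override-deps {org.clojure/clojure {:mvn/version "1.10.1"}}}}}
```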
`tools.build` was mentioned in one of Alex's blog posts with a "shhh, secret" comment and it's popped up occasionally here. I assume it's something Cognitect have developed internally, probably to help with Datomic (like the recent `-X` option in the Clojure CLI with the `install` option for Datomic dev-local).
Maybe I will finally be able to get off Ant 🙂
@seancorfield Looking at the docs, `:override-deps` seems to only be about overriding versions, not changing the artifact.
I assumed your "artifact" would have the same group/artifact ID but the "version" would point to `:local/root` (a JAR)?
Wow, what do you know - it works!
~/d/tools-deps-test> cat deps.edn
{:override-deps {org.clojure/clojure {:local/root "/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar"}}}
~/d/tools-deps-test> clj -Spath
/Users/colin/.m2/repository/org/clojure/clojure/1.10.1/clojure-1.10.1.jar:/Users/colin/.m2/repository/org/clojure/spec.alpha/0.2.176/spec.alpha-0.2.176.jar:/Users/colin/.m2/repository/org/clojure/core.specs.alpha/0.2.44/core.specs.alpha-0.2.44.jar:src
Thanks @seancorfield, that may make my life much easier.
For modules where I don’t want Clojure, I should be able to use that trick to point Clojure to an empty jar, and then just filter that out when building.
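Roughly this shape, I'm thinking -- just a sketch, and the empty-jar path is made up:
```
;; deps.edn for a non-Clojure module -- sketch only; build/empty.jar is a made-up path
{:aliases
 {:no-clojure
  {:override-deps {org.clojure/clojure {:local/root "build/empty.jar"}}}}}
```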
You just made my week! 🙂
And mine 🙂
Oops, wait a minute - I didn’t look closely enough. The classpath still refers to the original Clojure 😞
Aw, sorry...
Actually, no, the root cause is that I’m an idiot. Here’s the working version, with override-deps in an alias as it should be:
~/d/tools-deps-test> cat deps.edn
{:aliases {:fork {:override-deps {org.clojure/clojure {:local/root "/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar"}}}}}
~/d/tools-deps-test> clj -Spath -A:fork
/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar:/Users/colin/.m2/repository/org/clojure/spec.alpha/0.2.176/spec.alpha-0.2.176.jar:/Users/colin/.m2/repository/org/clojure/core.specs.alpha/0.2.44/core.specs.alpha-0.2.44.jar:src
Oh, thank goodness!!
One risk of the new runtime basis is that programs get hard-coded against it, e.g. a repeat of the lein-figwheel problem, so the clj tools become mandatory for tooling. Setting a system property is easier than whatever assumptions lein plugins make about the environment, but we should perhaps dissuade ourselves from doing this 🙂
On another note, if I do store information inside of `deps.edn` (e.g. figwheel build info), how could I reload it? Is that part of `tools.build`?
the add-lib branch distinguishes between the initial basis (static, set at launch) and the runtime basis (mutable as you add libraries)
so I imagine you don't reload the basis
you take it as an initial value and stick it in an atom and mutate that
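roughly this kind of shape -- a sketch, not the actual add-lib code, and it assumes the lib map sits under a :libs key:
```
;; sketch only -- not the actual add-lib implementation
(require '[clojure.edn :as edn])

;; the clojure.basis property points at the edn file written at launch
(def initial-basis
  (some-> (System/getProperty "clojure.basis") slurp edn/read-string))

;; the runtime basis starts as a copy of the initial basis and is mutated as libs are added
(def runtime-basis (atom initial-basis))

;; adding a lib would then swap! the newly resolved libs in, e.g.
;; (swap! runtime-basis update :libs merge newly-resolved-libs)
```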
There are definitely tradeoffs involved. When we got started with Clojure a decade ago, we used `lein run` in production and continued to use that approach when we switched to Boot (2015) and later to the Clojure CLI. It's only been since that switch that we started building uberjars instead I think, and for a long while, we did not AOT code for those. In the end, we added the AOT step purely to improve startup time (it made a big difference for a couple of our processes, so they could come back into the cluster faster after a deployment). We've always generally had our app configuration in external files -- separate from the build/run tooling -- and/or via system properties (or environment variables). But I don't have a concern about requiring our programs to be run via `clojure` if there are enough benefits to be worth switching from `java -jar` or `java -cp`.
(add-lib2 I guess, add-lib still calls it the lib-map instead of the basis)
As for "clj tools become mandatory" -- we're sort of already in that place: to work with a given project pretty much requires you use whatever tooling that project has adopted, be it lein
, boot
, or clj
.
the basis is just a jvm property that points to a file that is an edn map. right now, clj makes that and injects it but nothing there is "special". lein could make a basis map and pass it too. or a user could make it and set it in jvm opts in lein.
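e.g. a hypothetical sketch of the lein side -- the file name is made up and lein does none of this today:
```
;; hypothetical project.clj fragment -- lein (or the user) would write
;; target/basis.edn itself and just point the property at it
(defproject my-app "0.1.0"
  :jvm-opts ["-Dclojure.basis=target/basis.edn"])
```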
> we're sort of already in that place:
That's true for projects, but not tools. To clarify, I mean that your project makes the Clojure CLI mandatory (although it could support lein alongside clj too; some projects do this). I'm worried about tooling like lein-figwheel or lein-sass, etc., which is needlessly tied to a particular JVM start tool.
@dominicm rather than expecting to reload the basis, you could have the basis include a reference to a different file that you watch and reload (or whatever)
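i.e. something like this -- a sketch, where the :fig alias and :figwheel/config key are just examples:
```
;; sketch -- in deps.edn an alias just carries a pointer to an external config file:
;;   {:aliases {:fig {:figwheel/config "figwheel.edn"}}}
(require '[clojure.edn :as edn])

(defn load-fig-config
  "Re-read the external config file referenced from the basis."
  [basis]
  (some-> (get-in basis [:aliases :fig :figwheel/config])
          slurp
          edn/read-string))
```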
tooling that is tied to a particular tool (i.e. a lein plugin) was never a good factoring of the problem. it is far better to write a tool that is independent and just code, and then adapt it to tooling as needed (and to prefer tools that levy minimal requirements, such that the adapter may be nothing)
some lein plugins are written well for this, some aren't
The clojure tools have encouraged a lot of this by encouraging adoption of clojure.main, so for many tools the coupling is reduced to Clojure itself. (Not mine, though; mine all read deps.edn files!)
I don't know enough about lein-figwheel or lein-sass or whatever to say on those (but having "lein" in the name seems like a flag)
One consideration will be what is standard/non-standard in the runtime basis. Lein can't tell you what aliases were loaded; it's a semantic mismatch. It could tell you some information about the deps, though.
https://clojure.github.io/tools.deps.alpha/#clojure.tools.deps.alpha/calc-basis docs the basis map that's returned (more detail is needed, and a bit of it is intentionally a little vague right now, but it's mostly pretty set). I have not yet updated the specs for tools.deps but I will get to it eventually
To me, those seem like they should be namespaced to tools.deps if you think it's OK for other tools to generate them. With the exception of the lib-map, which I could see being generated by other tools (boot/lein)
The overall map is the combined `deps.edn` hash map, with those four extra keys added, so you're already tied to the `deps.edn` structure in the `clojure.basis` property. Leiningen isn't going to be able to create that `master-edn` map anyway, unless it adopts t.d.a. and `deps.edn` as an option (alongside `project.clj`) and changes its dependency resolution approach. So it seems you're already tied to `clojure.basis` and those simple keys at this point -- no need to qualify them, IMO.
Perhaps Leiningen could/should produce a `lein.basis` file/property when running code, one that reflects the hash map structure of the `project.clj` file? But I don't think you can expect portability between tools (and I don't think you should).
"Lein can't tell you what aliases were loaded, it's a semantic mismatch." -- clojure.basis
doesn't tell you what aliases were used to run the program, only the resulting "basis" of those aliases, but it includes the whole merged master-edn
so you get "all" aliases available -- and lein
could provide the :aliases
structure from project.clj
(although it's a completely different beast).
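i.e. something like this (a sketch):
```
;; sketch -- read the runtime basis and look at the merged :aliases map
(require '[clojure.edn :as edn])

(def basis
  (some-> (System/getProperty "clojure.basis") slurp edn/read-string))

;; all aliases available in the merged deps.edn -- not the ones used to launch the program
(keys (:aliases basis))
```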
So interesting finding just now that might be useful to others. I operate a tiny little chat bot that runs on a cheap / low-spec PAAS. Earlier today I switched from running it via the Clojure CLI (i.e. via something like `clojure -m mybot.main`) to constructing an uberjar and running that (i.e. via something like `java -jar mybot.jar`), mostly in order to reduce startup times (the PAAS doesn’t provide blue/green deployment at the subscription level I’m on, and the brief outage during scheduled restarts had been noticed and commented on by some users).
As expected, startup time basically went to zero. What I didn’t expect was how much of a memory saving there’d be too. Previously, my bot had stabilised at around 340MB total (heap & off-heap) memory usage, which was quite close to the 512MB total provided by the PAAS, and which has to cover all processes’ memory use (not just that of my bot’s JVM).
Since the switch to an uberjar, total memory usage has stabilised at around 210MB. While I didn’t confirm this in detail, my preliminary conclusion is that the extra ~130MB is allocated by the `tools.deps` machinery (after all that’s the only thing that changed), and is not released after `tools.deps` has done its thing and is no longer in use.
Those are separate jvms so I don’t think I buy that if that was your theory
What is your theory?
Note: the uberjar is not AOT compiled (except for the ns with the `-main` function).
I would buy it for a container around the clj call, not sure what the PAAS is or how it works
Heroku dyno (“hobby” subscription level). It’s pretty simple - a VM, as best I can tell. I’m not using Docker or any other container-like deployment mechanism.
I guess I'm saying "is allocated by the `tools.deps` machinery" is not in your final process, but may consume resources in the container itself
Does that JVM stick around though? Once the “app” JVM is started?
no
the script waits for it to exit
before running the app
the clojure script forks a child process to run the first jvm, waits for it to exit, then exec's the app which replaces the script with the final jvm process
I’m out of ideas. I did look into whether the local “ephemeral” filesystem might be a RAM drive that counts towards the VM’s RAM total, but that’s not the case (the ephemeral filesystem is 6GB).
(since my app’s dependencies total around 90MB - somewhat close to the 130MB delta)
I can easily envision a container expanding its memory footprint and thus taking more time in the first case or something like that
but certainly the time savings makes sense
Right, but the dashboard I’m looking at shows total memory used within the VM (and doesn’t give me any idea of the total RAM of the physical hardware that VM is running on).
Yeah, and startup time savings was the impetus for making this change. I’m just surprised (and wish I could explain) the additional savings in RAM usage. Especially given how substantial those savings were (memory use dropped by ~40%).
Here’s what I see, fwiw, spanning the time of the deployment where I switched to an uberjar:
That graph is basically flat, going back a month or two (to when I first deployed the bot).
well, that could be reserved memory based on the max used (affected by the first jvm)
It isn’t though - that dashboard further breaks down the JVM’s heap and off-heap memory usage, and both of those have dropped by approximately the same percentage.
I don't really have any way to tell
I wouldn’t expect Heroku to “keep counting” a JVM that’s been terminated, indefinitely.
Let me see if I can repro locally…
Different numbers obvs, but I think I see a similar pattern locally (using visualvm to look at just heap & metaspace usage, so definitely not an apples-to-apples comparison with Heroku).
Here’s `clojure -m mybot.main`:
And here’s `java -jar mybot.jar`:
(look at “used”, specifically - the JVM has allocated different total amounts from the OS in these runs, which throws off the vertical scales between the two runs)
For easier reading, here are the key numbers:
            |  clj   |  jar   |
------------+--------+--------+
 Heap       |  83MB  |  39MB  |
 Metaspace  |  71MB  |  60MB  |
------------+--------+--------+
aot compilation is transitive, so unless you work to avoid it, aot compiling a single namespace will also aot all the namespaces it depends on
(similar pattern, albeit different numbers, on successive runs, and with a GC requested before taking measurements)
No clue what’s going on, tbh. I’m not one to look a gift horse in the mouth, but then unexpected changes of this magnitude also make me suspicious. 😉
Is that true of `(gen-class)`? That’s the only thing I’m using (since it’s required in order to generate an “executable JAR” that `java -jar` can run directly).
one difference is that the jvm memory maps the jars - and here you've got many jars vs 1
That said, this is my first use of `depstar`, so perhaps it’s doing something like that…
might be some per-jar overhead you're avoiding?
another thing to try would be to use `-Scp`: `clojure -Scp the.jar -m your.namespace`
that would still be using the script, not computing a classpath but using 1 jar
guessing that's still the lower number
and the next test would be to get the script out of the way. grab the computed classpath with `-Spath`, then `java -jar` vs `java -cp` with that
it is complicated, but if you are using gen-class in the ns form, then yes, everything is being aot compiled
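i.e. this pattern -- a sketch, where mybot.core is a made-up namespace:
```
;; sketch -- mybot.core is a made-up namespace
(ns mybot.main
  (:require [mybot.core :as core])
  (:gen-class))

(defn -main [& args]
  (core/run args))

;; AOT compiling just this namespace, e.g. (compile 'mybot.main) or the
;; equivalent in your build tool, also compiles mybot.core and everything
;; it requires, so .class files for the whole app end up in the jar
```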
I think you’re onto something. I cracked open the uberjar and sure enough my code is in there twice, in `.clj` and `.class` form.
Yeah I was wondering about total # of open file handles, but it’s hard to come up with 130MB that way. 😉
@hiredman found something though - it looks like all my code is being AOT compiled (the uberjar contains both `.clj` and `.class` versions of all of my namespaces). I don’t know why that’s happening (this is my first time using `depstar`, so I may have messed something up there), though apparently `(gen-class)` (which is used on my “-main” ns) can cause transitive AOT compilation too.
Would runtime compilation of Clojure generate a lot of long-lived memory usage like that though? That’s the bit I can’t quite wrap my head around - this extra memory isn’t being GCed.
(as part of trying to keep memory usage down in this restricted runtime environment, the bot calls `(System/gc)` every hour - I know that doesn’t guarantee that the GC will kick in, but I figure it can’t hurt, especially as this particular bot is 99% batch/offline functions, so it doesn’t matter if it’s not highly responsive to user activity)
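For what it’s worth, the hourly GC call is just something like this (a sketch, not my exact code):
```
;; sketch -- schedule an hourly System/gc hint on a background thread
(import '(java.util.concurrent Executors TimeUnit))

(def gc-scheduler
  (doto (Executors/newSingleThreadScheduledExecutor)
    (.scheduleAtFixedRate #(System/gc) 1 1 TimeUnit/HOURS)))
```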
have you set a max memory on the jvm?
When running locally, yes: `-Xmx300m` (I keep it low to try to emulate the memory-limited Heroku environment).
On Heroku it’s container managed: -XX:+UseContainerSupport
(locally my JVM is 11.0.2+9, while Heroku is JVM 11.0.8, btw)