tools-deps

Discuss tools.deps.alpha, tools.build, and the clj/clojure command-line scripts! See also #depstar #clj-new
cfleming 2020-07-30T02:14:36.358200Z

Is there anything in the new tools.deps stuff that would make removing the Clojure coming from the system deps file easier? My main use cases are using a fork of Clojure, and also using Deps to manage non-Clojure projects (i.e. parts of Cursive).

seancorfield 2020-07-30T02:16:23.358600Z

@cfleming Can't you just use :override-deps and an alias?

cfleming 2020-07-30T02:16:42.359100Z

I don’t think that works.

cfleming 2020-07-30T02:17:17.360300Z

I’ll play around with that and see, but IIRC it didn’t work last I tried.

seancorfield 2020-07-30T02:17:17.360400Z

As for non-Clojure stuff, nothing yet, but maybe tools.build might (fingers crossed)?

cfleming 2020-07-30T02:17:39.360900Z

tools.build?

seancorfield 2020-07-30T02:18:07.361500Z

I know you can select any released Clojure version via :override-deps so I assume you could point it to a forked artifact...?

seancorfield 2020-07-30T02:18:35.362100Z

tools.build was mentioned in one of Alex's blog posts with a "shhh, secret" comment and it's popped up occasionally here

1🤫
seancorfield 2020-07-30T02:19:17.363Z

I assume it's something Cognitect have developed internally, probably to help with Datomic (like the recent -X option in the Clojure CLI with the install option for Datomic dev-local).

cfleming 2020-07-30T02:19:48.363300Z

Maybe I will finally be able to get off Ant 🙂

cfleming 2020-07-30T02:21:57.364100Z

@seancorfield Looking at the docs, :override-deps seems to only be about overriding versions, not changing the artifact.

seancorfield 2020-07-30T02:22:48.364900Z

I assumed your "artifact" would have the same group/artifact ID but the "version" would point to :local/root (a JAR)?

cfleming 2020-07-30T02:27:27.366400Z

Wow, what do you know - it works!

~/d/tools-deps-test> cat deps.edn
{:override-deps {org.clojure/clojure {:local/root "/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar"}}}

~/d/tools-deps-test> clj -Spath
/Users/colin/.m2/repository/org/clojure/clojure/1.10.1/clojure-1.10.1.jar:/Users/colin/.m2/repository/org/clojure/spec.alpha/0.2.176/spec.alpha-0.2.176.jar:/Users/colin/.m2/repository/org/clojure/core.specs.alpha/0.2.44/core.specs.alpha-0.2.44.jar:src

cfleming 2020-07-30T02:28:53.366800Z

Thanks @seancorfield, that may make my life much easier.

cfleming 2020-07-30T02:29:30.367500Z

For modules where I don’t want Clojure, I should be able to use that trick to point Clojure to an empty jar, and then just filter that out when building.

seancorfield 2020-07-30T02:42:26.367900Z

You just made my week! 🙂

cfleming 2020-07-30T02:43:17.368100Z

And mine 🙂

cfleming 2020-07-30T02:44:34.368800Z

Oops, wait a minute - I didn’t look closely enough. The classpath still refers to the original Clojure 😞

seancorfield 2020-07-30T02:44:43.369Z

Aw, sorry...

cfleming 2020-07-30T02:46:28.369700Z

Actually, no, the root cause is that I’m an idiot. Here’s the working version, with override-deps in an alias as it should be:

~/d/tools-deps-test> cat deps.edn
{:aliases {:fork {:override-deps {org.clojure/clojure {:local/root "/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar"}}}}}

~/d/tools-deps-test> clj -Spath -A:fork
/Users/colin/.m2/repository/com/cursive-ide/clojure/1.10.1/clojure-1.10.1.jar:/Users/colin/.m2/repository/org/clojure/spec.alpha/0.2.176/spec.alpha-0.2.176.jar:/Users/colin/.m2/repository/org/clojure/core.specs.alpha/0.2.44/core.specs.alpha-0.2.44.jar:src
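Long classpaths like the two above are easy to misread, which is how the first attempt looked like it worked. A minimal sketch of a check, assuming a POSIX shell and a `:`-separated classpath; the helper name `clojure_jars` is made up for illustration and is not part of the Clojure CLI:

```shell
# Print every classpath entry whose file name looks like a Clojure core jar
# (clojure-<version>.jar). Hypothetical helper, not a CLI feature.
clojure_jars() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -E '/clojure-[0-9][^/]*\.jar$'
}

# Example use (requires the Clojure CLI and the :fork alias shown above):
#   clojure_jars "$(clj -Spath -A:fork)"
```

With the alias active, the only entry printed should be the jar under com/cursive-ide rather than the one under org/clojure/clojure.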

1😄
seancorfield 2020-07-30T02:47:33.370Z

Oh, thank goodness!!

2020-07-30T09:11:14.370300Z

I also had a similar thought when I stumbled across the basis property etc. Though I figured you could spit out the basis to a clojure.basis file inside the uberjar, and slurp it as an io/resource in your -main. However the big disadvantage to this is that the basis would be generated at build time rather than run time, so it would be more like a manifest, than what the basis is really supposed to be. On reflection I think your idea is better — though I’m not entirely sure what you’d use the basis for in that case, other than as a means of using the clojure tool to provide config to your app.

2020-07-30T09:21:22.370900Z

I guess the only real advantage is using the clojure tool and a deps.edn to manage the classpath in production code, rather than the underlying java command line.

2020-07-30T09:25:40.371100Z

I have in the past done things like java -cp myapp-uberjar.jar:some/server/resources/* myapp.main to, for example, add assets to an app’s resource path as a production overlay… so I guess using the clojure tool itself to help manage these kinds of things with aliases might be useful, though I’m not convinced the benefits are huge.

2020-07-30T09:26:32.371300Z

though I guess it might sometimes be useful to for example provide profiles in production for enabling socket servers etc :thinking_face:

2020-07-30T09:30:38.371700Z

Another use case: you could provide extra tooling for prod systems in this manner, e.g. a set of aliases for various production tasks, such as triggering a -A:backup alias over cron.

2020-07-30T09:31:42.371900Z

and it would mean you don’t need to waste time inventing new command line parsers and config formats etc

dominicm 2020-07-30T16:36:45.374300Z

One risk of the new runtime basis is that programs are hard-coded against them, e.g. a repeat of the lein-figwheel problem. So the clj tools become mandatory for tooling. Setting a system property is easier than whatever assumptions lein plugins make about the environment. We should perhaps dissuade ourselves from doing this 🙂

dominicm 2020-07-30T16:37:09.374900Z

On another note, if I do store information inside of deps.edn (e.g. a figwheel build info), how could I reload it? Is that part of tools.build?

2020-07-30T16:41:06.377300Z

the add-lib branch distinguishes between the initial basis (static, set at launch) and the runtime basis (mutable as you add libraries)

2020-07-30T16:41:21.377700Z

so I imagine you don't reload the basis

2020-07-30T16:41:43.378500Z

you take it as an initial value and stick it in an atom and mutate that

seancorfield 2020-07-30T16:47:52.382400Z

There are definitely tradeoffs involved. When we got started with Clojure a decade ago, we used lein run in production and continued to use that approach when we switched to Boot (2015) and later to the Clojure CLI. It's only been since that switch that we started building uberjars instead I think, and for a long while, we did not AOT code for those. In the end, we added the AOT step purely to improve startup time (it made a big difference for a couple of our processes, so they could come back into the cluster faster after a deployment). We've always generally had our app configuration in external files -- separate from the build/run tooling -- and/or via system properties (or environment variables). But I don't have a concern about requiring our programs to be run via clojure if there are enough benefits to be worth switching from java -jar or java -cp.

2020-07-30T16:47:54.382600Z

(add-lib2 I guess, add-lib still calls it the lib-map instead of the basis)

seancorfield 2020-07-30T16:49:33.384300Z

As for "clj tools become mandatory" -- we're sort of already in that place: to work with a given project pretty much requires you use whatever tooling that project has adopted, be it lein, boot, or clj.

alexmiller 2020-07-30T16:57:09.385800Z

the basis is just a jvm property that points to a file that is an edn map. right now, clj makes that and injects it but nothing there is "special". lein could make a basis map and pass it too. or a user could make it and set it in jvm opts in lein.

dominicm 2020-07-30T16:59:01.388Z

> we're sort of already in that place:

That's true for projects, but not tools. To clarify, I mean that your project makes clojure cli mandatory (although it could support lein alongside clj too, some projects do this). I'm worried about tooling like lein-figwheel or lein-sass, etc. which is tied to a particular JVM start tool needlessly.

alexmiller 2020-07-30T16:59:06.388100Z

@dominicm rather than expecting to reload the basis, you could have the basis include a reference to a different file that you watch and reload (or whatever)

alexmiller 2020-07-30T17:00:43.389500Z

tooling that is tied to a particular tool (ie a lein plugin) was always not a good factoring of the problem. it is far better to write a tool that is independent and just code and then adapt it to tooling as needed (and to prefer tools that levy minimal requirements such that the adapter may be nothing)

alexmiller 2020-07-30T17:01:07.390100Z

some lein plugins are written well for this, some aren't

dominicm 2020-07-30T17:02:09.391800Z

The clojure tools have inherently encouraged a lot of this through encouraging adoption of clojure main, so there's an inherent reduction of coupling to only be Clojure itself for many tools. (Not mine, mine all read deps.edn files!)

alexmiller 2020-07-30T17:02:35.392100Z

I don't know enough about lein-figwheel or lein-sass or whatever to say on those (but having "lein" in the name seems like a flag)

dominicm 2020-07-30T17:04:56.393200Z

One consideration will be what is standard/non-standard in the runtime basis. Lein can't tell you what aliases were loaded, it's a semantic mismatch. It could tell you some information about the deps though.

alexmiller 2020-07-30T17:16:29.394800Z

The docs describe the basis map that is returned (more detail is needed, and a bit of it is intentionally a little vague right now, but it's mostly pretty set). I have not yet updated the specs for tools.deps but I will get to it eventually

dominicm 2020-07-30T17:17:33.395700Z

To me, those seem like they should be namespaced to tools deps if you think it's OK for other tools to generate. With the exception of lib-map, which I could see being generated by other tools (boot/lein)

seancorfield 2020-07-30T17:29:19.399200Z

The overall map is the combined deps.edn hash map, with those four extra keys added, so you're already tied to the deps.edn structure in the clojure.basis property. Leiningen isn't going to be able to create that master-edn map anyway, unless it adopts t.d.a. and deps.edn as an option (alongside project.clj) and changes its dependency resolution approach. So it seems you're already tied to clojure.basis and those simple keys at this point -- no need to qualify them, IMO.

seancorfield 2020-07-30T17:30:41.400600Z

Perhaps Leiningen could/should produce a lein.basis file/property when running code, that reflects the hash map structure of the project.clj file? But I don't think you can expect portability between tools (and I don't think you should).

1💯
seancorfield 2020-07-30T17:33:11.402600Z

"Lein can't tell you what aliases were loaded, it's a semantic mismatch." -- clojure.basis doesn't tell you what aliases were used to run the program, only the resulting "basis" of those aliases, but it includes the whole merged master-edn so you get "all" aliases available -- and lein could provide the :aliases structure from project.clj (although it's a completely different beast).

2020-07-30T22:25:09.415200Z

So interesting finding just now that might be useful to others. I operate a tiny little chat bot, that runs on a cheap / low-spec PAAS. Earlier today I switched from running it via the Clojure CLI (i.e. via something like clojure -m mybot.main) to constructing an uberjar and running that (i.e. via something like java -jar mybot.jar), mostly in order to reduce startup times (the PAAS doesn’t provide blue/green deployment at the subscription level I’m on, and the brief outage during scheduled restarts had been noticed and commented on by some users). As expected, startup time basically went to zero. What I didn’t expect was how much of a memory saving there’d be too. Previously, my bot had stabilised at around 340MB total (heap & off-heap) memory usage, which was quite close to the 512MB total provided by the PAAS, and which has to cover all processes’ memory use (not just that of my bot’s JVM). Since the switch to an uberjar, total memory usage has stabilised at around 210MB. While I didn’t confirm this in detail, my preliminary conclusion is that the extra ~130MB is allocated by the tools.deps machinery (after all that’s the only thing that changed), and is not released after tools.deps has done its thing and is no longer in use.

alexmiller 2020-07-30T22:37:34.416Z

Those are separate jvms so I don’t think I buy that if that was your theory

2020-07-30T22:37:58.416200Z

What is your theory?

2020-07-30T22:38:33.416800Z

Note: the uberjar is not AOT compiled (except for the ns with the -main function).

alexmiller 2020-07-30T22:39:39.418Z

I would buy it for a container around the clj call, not sure what the PAAS is or how it works

2020-07-30T22:40:07.418700Z

Heroku dyno (“hobby” subscription level). It’s pretty simple - a VM, as best I can tell. I’m not using Docker or any other container-like deployment mechanism.

alexmiller 2020-07-30T22:40:29.419200Z

I guess I'm saying "is allocated by the `tools.deps` machinery" is not in your final process, but may consume resources in the container itself

2020-07-30T22:40:50.419700Z

Does that JVM stick around though? Once the “app” JVM is started?

alexmiller 2020-07-30T22:40:54.420Z

no

alexmiller 2020-07-30T22:41:40.421300Z

the script waits for it to exit

alexmiller 2020-07-30T22:41:49.421500Z

before running the app

alexmiller 2020-07-30T22:48:31.425Z

the clojure script forks a child process to run the first jvm, waits for it to exit, then exec's the app which replaces the script with the final jvm process
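The launch sequence described here (run a child, wait for it, then exec the real program) can be sketched in plain shell. This is only an illustration of the pattern, not the actual clojure script: both stages are stand-in `sh -c` commands instead of JVMs, and the classpath string is invented.

```shell
#!/bin/sh
# Sketch of the fork / wait / exec pattern. Stage 1 stands in for the
# short-lived JVM that computes the classpath; stage 2 stands in for the app.

cp_file=$(mktemp)

# Stage 1: child process writes its result to a file; the shell waits for it
# to exit before continuing, so it is gone before the app starts.
sh -c 'echo "src:lib/app.jar" > "$1"' stage1 "$cp_file"
stage1_status=$?

classpath=$(cat "$cp_file")
rm -f "$cp_file"

# Stage 2: exec replaces the calling shell process with the final program,
# so no launcher process is left running alongside it.
launch() {
  exec sh -c 'echo "running with classpath: $1"' stage2 "$1"
}

# Run the launch in a subshell here so this demo script itself keeps going.
output=$(launch "$classpath")
echo "$output"
```

Because stage 1 has exited and the launcher has been replaced by the time the app runs, nothing from the first stage should remain resident in the final process.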

2020-07-30T22:48:39.425200Z

I’m out of ideas. I did look into whether the local “ephemeral” filesystem might be a RAM drive that counts towards the VM’s RAM total, but that’s not the case (the ephemeral filesystem is 6GB).

2020-07-30T22:49:08.426Z

(since my app’s dependencies total around 90MB - somewhat close to the 130MB delta)

alexmiller 2020-07-30T22:49:39.426700Z

I can easily envision a container expanding its memory footprint and thus taking more time in the first case or something like that

alexmiller 2020-07-30T22:50:38.427900Z

but certainly the time savings makes sense

2020-07-30T22:50:44.428200Z

Right, but the dashboard I’m looking at shows total memory used within the VM (and doesn’t give me any idea of the total RAM of the physical hardware that VM is running on).

2020-07-30T22:51:27.428800Z

Yeah, and startup time savings was the impetus for making this change. I’m just surprised (and wish I could explain) the additional savings in RAM usage. Especially given how substantial those savings were (memory use dropped by ~40%).

2020-07-30T22:55:19.429600Z

Here’s what I see, fwiw, spanning the time of the deployment where I switched to an uberjar:

2020-07-30T22:56:05.430100Z

That graph is basically flat, going back a month or two (to when I first deployed the bot).

alexmiller 2020-07-30T22:56:49.430500Z

well, that could be reserved memory based on the max used (affected by the first jvm)

2020-07-30T22:57:30.431400Z

It isn’t though - that dashboard further breaks down the JVM’s heap and off-heap memory usage, and both of those have dropped by approximately the same %ages.

alexmiller 2020-07-30T22:57:46.432100Z

I don't really have any way to tell

2020-07-30T22:57:51.432400Z

I wouldn’t expect Heroku to “keep counting” a JVM that’s been terminated, indefinitely.

2020-07-30T22:58:32.433Z

Let me see if I can repro locally…

2020-07-30T23:05:10.434800Z

Different numbers obvs, but I think I see a similar pattern locally (using visualvm to look at just heap & metaspace usage, so definitely not an apples-to-apples comparison with Heroku).

2020-07-30T23:05:31.435100Z

Here’s clojure -m mybot.main:

2020-07-30T23:05:59.435800Z

And here’s java -jar mybot.jar:

2020-07-30T23:06:37.437100Z

(look at “used”, specifically - the JVM has allocated different total amounts from the OS in these runs, which throws off the vertical scales between the two runs)

2020-07-30T23:08:59.438600Z

For easier reading, here are the key numbers:

            | clj    | jar    |
------------+--------+--------+
Heap        | 83MB   | 39MB   |
Metaspace   | 71MB   | 60MB   |
------------+--------+--------+

2020-07-30T23:09:06.438700Z

aot compilation is transitive, so unless you work to avoid it, aot compiling a single namespace will also aot all namespaces it depends on as well

2020-07-30T23:12:04.440900Z

(similar pattern, albeit different numbers, on successive runs, and with a GC requested before taking measurements)

2020-07-30T23:13:20.441800Z

No clue what’s going on, tbh. I’m not one to look a gift horse in the mouth, but then unexpected changes of this magnitude also make me suspicious. 😉

2020-07-30T23:14:27.441900Z

Is that true of (gen-class) ?

2020-07-30T23:15:06.442300Z

That’s the only thing I’m using (since it’s required in order to generate an “executable JAR” that java -jar can run directly).

alexmiller 2020-07-30T23:15:27.442900Z

one difference is that the jvm memory maps the jars - and here you've got many jars vs 1

2020-07-30T23:15:40.443200Z

That said, this is my first use of depstar, so perhaps it’s doing something like that…

alexmiller 2020-07-30T23:15:44.443500Z

might be some per-jar overhead you're avoiding?

alexmiller 2020-07-30T23:16:07.443900Z

another thing to try would be to use -Scp

alexmiller 2020-07-30T23:16:39.444600Z

clojure -Scp the.jar -m your.namespace

alexmiller 2020-07-30T23:17:01.445100Z

that would be still using the script, not computing a classpath but using 1 jar

alexmiller 2020-07-30T23:18:12.446100Z

guessing that's still the lower number

alexmiller 2020-07-30T23:18:57.447Z

and the next test would be to get the script out of the way. grab the computed classpath with -Spath, then java -jar vs java -cp with that
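The comparison runs suggested in the last few messages could be written down roughly as follows. A sketch only: mybot.jar and mybot.main are the names used earlier in the conversation, the function and mode names are made up, and the `-m` form matches the CLI as used in this thread (newer CLI releases expect `-M -m`).

```shell
#!/bin/sh
# Three ways to start the same app, to separate "script vs no script" from
# "many jars vs one jar". Nothing is launched unless a mode is passed.

# 1. Still via the script, but with the single uberjar as the classpath.
run_script_one_jar() { clojure -Scp mybot.jar -m mybot.main; }

# 2. No script at launch: reuse the computed classpath with plain java.
run_java_many_jars() { java -cp "$(clojure -Spath)" clojure.main -m mybot.main; }

# 3. No script, one jar.
run_java_one_jar() { java -jar mybot.jar; }

case "${1:-}" in
  script-one-jar) run_script_one_jar ;;
  java-many-jars) run_java_many_jars ;;
  java-one-jar)   run_java_one_jar ;;
  *) echo "usage: $0 script-one-jar|java-many-jars|java-one-jar" ;;
esac
```

Comparing heap and metaspace across the three modes, with the same JVM options each time, would show whether the difference follows the launcher or the shape of the classpath.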

2020-07-30T23:20:57.447100Z

it is complicated, but if you are using gen-class in the ns form, then yes, everything is being aot compiled

2020-07-30T23:22:00.447300Z

I think you’re onto something. I cracked open the uberjar and sure enough my code is in there twice in .clj and .class form.
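One way to make that check repeatable is to count source and compiled entries in the jar listing. A minimal sketch, assuming the listing comes from `jar tf` (one entry per line); the helper name `summarise_listing` is hypothetical:

```shell
# Read a jar listing on stdin and report how many Clojure source files and
# how many .class files it contains. Hypothetical helper for illustration.
summarise_listing() {
  awk '
    /\.cljc?$/ { clj++ }
    /\.class$/ { cls++ }
    END { printf "clj=%d class=%d\n", clj, cls }
  '
}

# Example use (requires a JDK for the jar tool):
#   jar tf mybot.jar | summarise_listing
```

A large class count alongside the app's own .clj files would be consistent with the namespaces having been AOT compiled into the uberjar.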

2020-07-30T23:37:52.447800Z

Yeah I was wondering about total # of open file handles, but it’s hard to come up with 130MB that way. 😉

2020-07-30T23:38:16.448200Z

@hiredman found something though - it looks like all my code is being AOT compiled (the uberjar contains both .clj and .class versions of all of my namespaces). I don’t know why that’s happening (this is my first time using depstar, so I may have messed something up there), though apparently (gen-class) (which is used on my “-main” ns) can cause transitive AOT compilation too.

2020-07-30T23:41:01.450900Z

Would runtime compilation of Clojure generate a lot of long-lived memory usage like that though? That’s the bit I can’t quite wrap my head around - this extra memory isn’t being GCed.

2020-07-30T23:42:06.451900Z

(as part of trying to keep memory usage down on this restricted runtime environment, the bot calls (System/gc) every hour - I know that doesn’t guarantee that the GC will kick in but I figure it can’t hurt, especially as this particular bot is 99% batch/offline functions so it doesn’t matter if it’s not highly responsive to user activity)

2020-07-30T23:49:35.452500Z

have you set a max memory on the jvm?

2020-07-30T23:49:58.453100Z

When running locally, yes: -Xmx300m (I keep it low to try to emulate the memory-limited Heroku environment).

2020-07-30T23:50:15.453500Z

On Heroku it’s container managed: -XX:+UseContainerSupport

2020-07-30T23:50:41.453700Z

(locally my JVM is 11.0.2+9, while Heroku is JVM 11.0.8, btw)