clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
Marc O'Morain 2020-06-02T17:53:02.183900Z

Starting the main app at CircleCI takes 60 seconds on my machine (3.5 GHz Dual-Core Intel Core i7). I’ve been wondering for some years what is happening during those 60 seconds, and today I spent a little time looking into it. In the past, I have booted our app with a sampling profiler attached, but the data produced is very difficult to work with: loading a Clojure app is recursive in nature, since evaluating the top-level namespace will compile + eval the required namespaces, which in turn compile and eval their own dependencies. The resulting profile has a call-tree which is very deep and has a very low branching factor. This time around I added some coarse instrumentation code to src/jvm/clojure/lang/Compiler.java. From the data that I collected, I was able to confirm that there are no bottlenecks in the launch of our app; we are just loading many files, which all take time. This led me to wonder about two approaches to speeding up the time to load our app:
• Parallelising the loading of code
• Caching the result of evaluations on disk
Has anyone ever tried either of these approaches? I was searching JIRA to find any prior art in this area, but I could not find anything.
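For readers who want a rough first look without patching Compiler.java, coarse timings can also be taken from the Clojure side. The sketch below is not the instrumentation described above; it only wraps `require` with a wall-clock timer. The helper name `timed-require` is made up for illustration, and `clojure.set` and `clojure.data` stand in for real application namespaces. The numbers are inclusive: each one covers every not-yet-loaded namespace that the required namespace pulls in, which is the same recursion that makes profiles hard to read.

```clojure
(defn timed-require
  "Requires ns-sym and returns a map with the elapsed wall-clock time in ms.
  The time is inclusive: it also covers any dependencies of ns-sym that
  were not already loaded."
  [ns-sym]
  (let [start (System/nanoTime)]
    (require ns-sym)
    {:ns ns-sym
     :ms (/ (- (System/nanoTime) start) 1e6)}))

;; Placeholder namespaces that ship with Clojure; substitute your own.
(def timings
  (->> '[clojure.set clojure.data]
       (mapv timed-require)
       (sort-by :ms >)))

;; Print the slowest namespaces first.
(doseq [{lib :ns ms :ms} timings]
  (println (format "%-20s %8.2f ms" lib ms)))
```

A namespace that is already loaded returns almost immediately, so this is only meaningful in a fresh JVM.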

alexmiller 2020-06-02T17:56:10.184200Z

Re caching, https://clojure.org/guides/dev_startup_time

Marc O'Morain 2020-06-02T17:56:46.185Z

Thanks Alex.

Marc O'Morain 2020-06-02T17:58:17.186700Z

I’ll try that approach later on tonight or tomorrow. I’m excited to try it out.

alexmiller 2020-06-02T17:58:38.187200Z

Parallel compilation (really load) is something I’ve looked at but it’s inherently tricky due to the lack of immutable namespaces and locking around loads

alexmiller 2020-06-02T18:01:01.189900Z

Those are both things that we’ve talked about changing in a hand-wavey long term sort of way. Really, there are a bunch of sub problems around making load faster which are probably more tractable (and useful whether parallel or not).

Marc O'Morain 2020-06-02T18:06:51.192200Z

FWIW, in my crude data, the time to eval namespaces dominates the time to compile them. I had difficulty instrumenting the time taken to load Java dependencies in isolation from the time taken to eval a form.

ghadi 2020-06-02T18:09:42.192400Z

your compilation speed will benefit from not compiling, as in the link posted above

ghadi 2020-06-02T18:11:33.194Z

what is the distinction between eval namespace and compile them?

slipset 2020-06-02T18:11:34.194200Z

FWIW @dnolen had some thoughts on caching compiled deps in a recent defn-podcast.

dnolen 2020-06-02T18:44:17.194400Z

well not super concrete thoughts for Clojure - just that ClojureScript had to solve this problem out of necessity since we're fundamentally AOT

Marc O'Morain 2020-06-02T20:52:06.194600Z

@ghadi when I say “compile” and “eval” I guess I mean Read and Eval as in REPL. I added some timers around the call to eval here: https://github.com/clojure/clojure/blob/30a36cbe0ef936e57ddba238b7fa6d58ee1cbdce/src/jvm/clojure/lang/Compiler.java#L7636

Marc O'Morain 2020-06-02T20:54:09.194900Z

In my tests, no file took longer than 10ms in the read phase, and the slowest file to eval was clojure/core/async.clj, which took 1.52 seconds to eval (plus a further 2.07 seconds to read and eval the files that clojure/core/async.clj itself requires).

Marc O'Morain 2020-06-02T20:56:03.195200Z

@slipset - yup, I listened to that podcast recently after having had to wait for my app to boot many times that day, which gave me the impetus to do some research.

alexmiller 2020-06-02T21:04:07.195500Z

the steps involved here are really read, compile, and I'd call it load. I think the eval you're timing there really includes compile (which you can remove using the guide above). core.async is particularly painful due to the giant macro, but that's all cost paid at compile time.

alexmiller 2020-06-02T21:04:34.195700Z

load itself includes both loading classes and initializing vars
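The read / compile / load split can be seen on a single form at the REPL. This is a minimal sketch, assuming a plain REPL session; `elapsed-ms` and `add-one` are names invented for the example. Note that `eval` here covers both compiling the form to bytecode and loading it (defining the class and initializing the var), so timing `eval` alone cannot separate those two steps.

```clojure
(defn elapsed-ms
  "Calls the zero-argument fn f and returns [result elapsed-milliseconds]."
  [f]
  (let [start  (System/nanoTime)
        result (f)]
    [result (/ (- (System/nanoTime) start) 1e6)]))

(def source "(defn add-one [x] (+ x 1))")

;; Read: text -> data structure. No bytecode is produced here.
(def read-result (elapsed-ms #(read-string source)))

;; Eval: compiles the form to bytecode, loads the resulting class,
;; and initializes the var. Returns the var that was defined.
(def eval-result (elapsed-ms #(eval (first read-result))))

(println "read ms:" (second read-result))
(println "eval ms:" (second eval-result))

;; The var returned by eval is callable once dereferenced.
(println (@(first eval-result) 41))
```

With AOT-compiled classes on the classpath, as in the guide linked earlier, the compile portion is skipped and only the load portion remains.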

alexmiller 2020-06-02T21:06:36.196Z

compile is probably the slowest part of this, but I don't know of any specific part that's slow, it's just the cost of emitting a lot of bytecode

alexmiller 2020-06-02T21:08:21.196200Z

loading vars is something with known costs and there are some options there to make that stuff lazier (Rich made a lazy-vars impl a while back and Ghadi has a version using some of the newer indy guards that mitigates some of the downsides of laziness)

alexmiller 2020-06-02T21:09:56.196400Z

pulling way back from an app perspective, I find it helpful to think about what you actually need to do as an app to be "up" and whether you can defer parts of that loading until later. can you load 100 namespaces instead of a 1000 before you are "ready" and then load the rest as needed? (usually you can)
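One way to defer part of the loading is to resolve rarely used dependencies on first use rather than in the `ns` form. A minimal sketch, assuming Clojure 1.10 or later for `requiring-resolve`; `clojure.set` stands in for a heavy namespace, and `lazy-union` and `merge-tags` are hypothetical names.

```clojure
;; The namespace is loaded the first time the delay is dereferenced,
;; not when this file is loaded. `requiring-resolve` needs Clojure 1.10+.
(def lazy-union
  (delay (requiring-resolve 'clojure.set/union)))

(defn merge-tags
  "Hypothetical rarely used code path that needs the deferred namespace."
  [a b]
  ;; requiring-resolve returns a var, and vars can be invoked like fns.
  (@lazy-union a b))

(println (merge-tags #{:a} #{:b}))
```

The trade-off is that the load cost moves to the first call, so a latency-sensitive path may want to force the delay in the background once the app is "up".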

Marc O'Morain 2020-06-02T21:25:12.196600Z

Thanks Alex, I really appreciate your time on this.
> can you load 100 namespaces instead of a 1000
Yeah, I’ve been thinking along the same lines. The situation that I find myself in is that our main Clojure app has grown to be cumbersome and resistant to change. It’s slow to start, the tests are slow to run, and it is hard to run the tests reliably locally. This hinders refactoring efforts. So yes, I could factor the app better, but making those changes will be slow, and the slow compile time feeds back into that cycle.

Marc O'Morain 2020-06-02T21:28:26.196800Z

I think it’s time for bed - my experimenting with emitting classes has hit some road-blocks which have dampened my spirits! Someone decided a namespace should have the following two functions:

(defn- identity->Identity

...

(defn- Identity->identity
Which (I assume) results in conflicting filenames for the emitted .class files, producing this error:
Execution error (NoClassDefFoundError) at java.lang.ClassLoader/defineClass1 (ClassLoader.java:-2).
federations_service_client/core$identity__GT_Identity (wrong name: federations_service_client/core$Identity__GT_identity)

alexmiller 2020-06-02T21:58:45.197200Z

yes, that's pretty evil

alexmiller 2020-06-02T21:59:18.197400Z

it's actually totally fine but bad things will happen on case insensitive file systems

alexmiller 2020-06-02T22:01:42.197600Z

when compiling in-memory, nothing hits disk and this isn't an issue, but you'd have this same problem if you AOT compiled or uberjarred on a case-insensitive file system like the default one on macOS
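Such collisions can be spotted before AOT compiling by munging each name the way the compiler does and comparing the results case-insensitively. The sketch below is an illustration, not part of Clojure's tooling: `case-colliding-names` is a made-up helper, and it only looks at the names it is given (for example, top-level defs), not at every class the compiler may emit.

```clojure
(require '[clojure.string :as str])

(defn case-colliding-names
  "Returns groups of names whose munged forms differ only by case,
  and which would therefore map to the same .class file on a
  case-insensitive file system."
  [names]
  (->> names
       (group-by (comp str/lower-case munge str))
       vals
       (filter #(> (count %) 1))))

;; munge shows the class-name suffix seen in the error message.
(println (munge "identity->Identity"))

(println (case-colliding-names
          '[identity->Identity Identity->identity other-fn]))

;; Against a loaded namespace (my.app.core is a placeholder):
;; (case-colliding-names (keys (ns-interns 'my.app.core)))
```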