graalvm

Discuss GraalVM related topics. Use clojure 1.10.2 or newer for all new projects. Contribute to https://github.com/clj-easy/graal-docs and https://github.com/BrunoBonacci/graalvm-clojure. GraalVM slack: https://www.graalvm.org/slack-invitation/.
borkdude 2021-06-17T20:37:07.146300Z

Please upvote or join this discussion: https://github.com/oracle/graal/discussions/3476

👍 5
😨 5
chrisn 2021-06-18T23:54:52.148100Z

Wow. Thanks for getting in there for the rest of us.

2021-06-21T10:47:00.166Z

Can anyone explain some of the finer details here? I know Java static initializers are code blocks that execute at class initialization, so I'm assuming that for native-image, --initialize-at-build-time just means those blocks are executed when the native image is compiled (hence, for example, any side effects there will happen at compile time, not run time). What I don't fully understand is how the Clojure compiler uses static initializers, and exactly what the implications of this are for Clojure. I'm assuming it's because Clojure's a Lisp, and even AOT'd Clojure code still has to apply effects at run time, e.g. ns initialisation presumably runs as a static initializer.

2021-06-21T10:48:50.166200Z

so presumably that's why this impacts all Clojure code
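
For reference, a minimal sketch of the static-initializer behaviour being discussed, in plain Java. The class names (InitLog, Greeter, StaticInitDemo) are made up for illustration. The static block runs once, the first time the class is actually used. With --initialize-at-build-time, the assumption in the question above is that native-image runs such blocks while building the image rather than at startup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, only to illustrate initialization order.
class InitLog {
    // Events recorded in the order they happen.
    static final List<String> EVENTS = new ArrayList<>();
}

class Greeter {
    static final String GREETING;

    static {
        // Runs once, when Greeter is first initialized.
        InitLog.EVENTS.add("Greeter static initializer");
        GREETING = "hello";
    }

    static String greet() {
        return GREETING;
    }
}

class StaticInitDemo {
    static List<String> run() {
        InitLog.EVENTS.add("before first use");
        Greeter.greet(); // first use triggers the static block
        Greeter.greet(); // second call does not run it again
        InitLog.EVENTS.add("after use");
        return InitLog.EVENTS;
    }
}
```

Any side effect inside that static block (I/O, reading the clock, opening a connection) happens wherever initialization happens, which is why the build-time vs run-time choice matters.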

borkdude 2021-06-21T10:50:05.166400Z

I've tried several times to build without this option. It's educational. I can't really put it into words well enough to explain it in detail, but due to how Clojure is set up and compiles things, it's needed.

borkdude 2021-06-21T10:50:19.166600Z

I would say: try without the option and perhaps come up with a better explanation, I think that would be very useful.

2021-06-21T10:51:25.166800Z

so build a clojure hello world, with and without this option? And inspect the clojure generated class files with javap to see if we can explain it?

borkdude 2021-06-21T10:52:10.167100Z

yes

borkdude 2021-06-21T10:52:37.167300Z

Clojure emits these kinds of things for functions:

static {
    const__0 = RT.var("clojure.core", "println");
}

borkdude 2021-06-21T10:52:46.167500Z

(https://github.com/clojure-goes-fast/clj-java-decompiler)

borkdude 2021-06-21T10:52:50.167800Z

which could be related to this

2021-06-21T10:53:26.168100Z

yeah I've seen those before… presumably static linking changes those too?

borkdude 2021-06-21T10:53:53.168300Z

Try to build a graalvm hello world program without the option and see how far you get

borkdude 2021-06-21T10:54:27.168500Z

I've done such a process here: https://github.com/oracle/graal/issues/3251#issuecomment-842305171

borkdude 2021-06-21T10:55:26.168900Z

If you don't initialize at build time then you will get:

Caused by: java.io.FileNotFoundException: Could not locate clojure/core__init.class, clojure/core.clj or clojure/core.cljc on classpath.
	at clojure.lang.RT.load(RT.java:462)

borkdude 2021-06-21T10:57:03.169400Z

this can be "fixed" by including clojure.core (init.class) in the resources

2021-06-21T10:58:16.169600Z

interesting

2021-06-21T10:58:42.169800Z

Firstly it's unsurprising that the line in RT is in a static initializer 🙂

borkdude 2021-06-21T10:59:29.170Z

but that uses the DynamicClassLoader, etc., which won't work in a native-image anymore anyway

borkdude 2021-06-21T10:59:38.170200Z

so I think that already explains why you need build time

borkdude 2021-06-21T10:59:55.170400Z

so we can quit here already

2021-06-21T11:02:29.170600Z

yeah, I was going to say something similar, though hopefully you can clear it up for me if I'm inaccurate/wrong… My reasoning was essentially: 1. Clojure is a single-pass compiler, i.e. essentially your whole program is a flattened require tree / REPL session, with all deps "essentially concatenated". 2. Therefore if we're loading clojure/core there, at some point after that we'll be loading all of your app's dependencies in a similar initializer block.

2021-06-21T11:02:43.170800Z

3. hence we can quit here

2021-06-21T11:03:52.171Z

Though I guess the dynamic class loader essentially just implements what I said

2021-06-21T11:05:12.171300Z

i.e. resolving clj / class files, and compiling clj into .class etc… essentially controlling "Read (compile) Eval"

borkdude 2021-06-21T11:07:41.171600Z

yeah. another way to put it: you can't "dynamically load classes" at runtime in GraalVM native-image, but Clojure does this in static initializer blocks, hence these must be initialized at build time.

2021-06-21T11:08:20.171800Z

Yeah that's a good way to put it
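
A rough Java analogy of that point, not Clojure's actual code: the static initializer below looks a class up by a name that is only computed at run time. DemoNsInit and DemoRuntime are hypothetical names, with DemoNsInit standing in for a compiled namespace class such as core__init. On a regular JVM this works. The assumption here is that a closed-world native image cannot resolve such a lookup at run time unless the class was already known at build time, which is the situation described above.

```java
// Hypothetical stand-in for a compiled namespace class.
class DemoNsInit {
    static String describe() {
        return "loaded";
    }
}

class DemoRuntime {
    static final Class<?> LOADED;

    static {
        // The class name is assembled from strings at run time, so the
        // target of the lookup is not visible as a direct reference.
        String name = DemoRuntime.class.getName().replace("DemoRuntime", "DemoNsInit");
        try {
            LOADED = Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
}
```

If DemoRuntime is initialized while the image is being built, the lookup has already happened by the time the binary starts, and only its result needs to exist at run time.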

borkdude 2021-06-21T11:10:17.172200Z

Perhaps this could be resolved if you made a Java class which does all the loading in a static initializer block, and you only initialized that one at build time while the rest of your classes were initialized at run time. But this would probably require changes to Clojure itself.

borkdude 2021-06-21T11:12:30.172400Z

But interestingly these changes can be accomplished using substitutions as well perhaps.

borkdude 2021-06-21T11:12:57.172600Z

Here is an example: https://github.com/borkdude/clj-reflector-graal-java11-fix#the-solution

2021-06-21T11:16:34.173Z

I was wondering why the Clojure compiler needs to use the dynamic class loader for AOT'd code? Presumably it could (at least in a Graal compilation context) avoid that? Or is that essentially what you're describing?

borkdude 2021-06-21T11:16:52.173200Z

yeah, that's what I was trying to describe

👍 1
borkdude 2021-06-21T11:17:47.173500Z

in an AOT-ed (native-image) setting you know which namespaces you want, so you could just write that out explicitly

2021-06-21T11:17:55.173700Z

Yeah

2021-06-21T11:19:43.173900Z

Backing up for a second to the graal issue. Re thomaswue's point:
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?
Why do we need to bundle into an uberjar? Could we not also just give them a classpath?

borkdude 2021-06-21T11:20:37.174200Z

yes, you can already do this, but the original problem in that topic is that they want to get rid of the option without explicitly specifying the classes for which you want build time initialization

borkdude 2021-06-21T11:21:08.174400Z

and here he offers some kind of compromise to be able to say for which .jar you want it. so if you provide an uberjar you will get all the classes again

borkdude 2021-06-21T11:21:49.174600Z

Feel free to follow up the discussion

2021-06-21T11:37:48.174800Z

Yeah… I'm just trying to understand the toing and froing of the conversation. So to summarise the thread: they don't want people to mark every class for build-time initialization, because for some classes it'll screw things up. For most idiomatic Clojure code we need build-time initialization, though for some Clojure code that also won't work (e.g. an ns with (def data (fetch-data-from-postgres ,,,)) will need to either be rewritten or opt in to run-time initialisation). If we default a whole classpath into build-time initialization, we may bake invalid "build time" state into the runtime (adinn's point) for Java deps etc.

2021-06-21T11:37:54.175Z

Is that the general gist of it?
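
To make the (def data (fetch-data-from-postgres ,,,)) case from the summary concrete, here is a small Java sketch under stated assumptions: DataSource.fetchData is a hypothetical stand-in for the database call and only counts invocations. EagerConfig computes its value during class initialization, so with build-time initialization the value from the build machine would end up in the image. LazyConfig defers the work to first use.

```java
// Hypothetical example: fetchData stands in for something like
// (fetch-data-from-postgres ,,,) and just counts how often it is called.
class DataSource {
    static int calls = 0;

    static String fetchData() {
        calls++;
        return "rows@" + calls;
    }
}

class EagerConfig {
    // Evaluated as part of class initialization. If this class were
    // initialized at build time, this value would be computed during the build.
    static final String DATA = DataSource.fetchData();
}

class LazyConfig {
    private static String cached;

    // Nothing is fetched until the first call, so initializing the class
    // has no side effects of its own.
    static synchronized String data() {
        if (cached == null) {
            cached = DataSource.fetchData();
        }
        return cached;
    }
}
```

In Clojure, the analogous rewrite would be to wrap the value in a delay and deref it where it is needed.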

borkdude 2021-06-21T11:39:57.175200Z

correct. but imo taking away this option will make it harder on Clojure developers since for most CLIs I built this stuff worked fine (or I was able to work around it). Occasionally a library like httpkit would give problems: https://github.com/http-kit/http-kit#native-image

borkdude 2021-06-21T11:40:35.175600Z

But perhaps listing all clojure-related namespaces through some script is possible, I was just trying to make sure Clojure projects would still be able to run

2021-06-21T11:40:39.175800Z

Yeah I agree with that. The default for .clj(c) files should be build time, because of the nature of Clojure

2021-06-21T11:41:15.176Z

@borkdude: Yeah I was literally typing: presumably we could use something like mranderson to move all clj code under a new top level package/ns, and then flag that to default as build time

borkdude 2021-06-21T11:42:48.176500Z

blegh, I don't like that solution. I like the uberjar solution much better

borkdude 2021-06-21T11:43:07.176700Z

mranderson is a hack to make multiple versions of the same library work together

2021-06-21T11:43:45.176900Z

how do you avoid mixing java library / classes into the uberjar though?

borkdude 2021-06-21T11:44:28.177300Z

@rickmoynihan well you don't have to, you could of course just make a jar with your project code + clojure libs and put the Java code into another jar

2021-06-21T11:44:31.177500Z

(I agree the mranderson thing would be a hack)

borkdude 2021-06-21T11:44:55.177700Z

but personally I would just go with build time for everything and figure out the exceptions

borkdude 2021-06-21T11:45:17.177900Z

the tools I build are usually CLI tools and not huge micronaut web server things which I think the issue is more concerned about

borkdude 2021-06-21T11:46:54.178100Z

Perhaps we can figure out a good pattern to initialize only Clojure classes at build time

2021-06-21T11:46:56.178300Z

> but personally I would just go with build time for everything and figure out the exceptions
well, to be fair, that is how any approach in clj will eventually end up working. The main difference would be starting from a point where you didn't pick the wrong default for Java libs.

2021-06-21T11:49:26.178600Z

@borkdude: Yeah, I was going to say the issue is that there's no tooling that knows what a Clojure lib is vs a Java lib. We'd need something that knew how to biject clj files into their class files… essentially mapping munge over the .clj(c) classpath.

borkdude 2021-06-21T11:50:41.178900Z

well, that is certainly doable

2021-06-21T11:50:49.179100Z

indeed

borkdude 2021-06-21T11:51:12.179300Z

but I was trying to avoid getting into this, it all works beautifully now

2021-06-21T11:51:19.179500Z

yeah

2021-06-21T11:51:35.179700Z

it would be nice to avoid having to have another step

borkdude 2021-06-21T11:54:49.180Z

btw, I'm trying these flags:

"--initialize-at-build-time=clojure."
"--initialize-at-build-time=clojure.core.server"

but I'm still getting errors about clojure.core.server

borkdude 2021-06-21T11:56:27.180200Z

(trying this in refl)

2021-06-21T11:56:59.180400Z

with graal 22?

borkdude 2021-06-21T11:59:28.180600Z

no, 21

borkdude 2021-06-21T12:04:57.180800Z

ok, for refl this seems to work:

"--initialize-at-build-time=clojure,refl"

borkdude 2021-06-21T12:05:03.181Z

but it's a small project without any dependencies

2021-06-21T12:06:53.181300Z

yeah that's essentially equivalent to listing all of the top level namespaces you use there.

2021-06-21T12:07:31.181500Z

it's good to prove what they're suggesting will work for us… it's just a shame it's more clunky.

borkdude 2021-06-21T12:09:42.181800Z

I will try with the httpkit library

borkdude 2021-06-21T12:11:16.182Z

unfortunately there the Clojure and Java packages overlap

borkdude 2021-06-21T12:11:21.182200Z

:/

borkdude 2021-06-21T12:15:43.182400Z

@rickmoynihan yeah, so this works with httpkit (2.5.3):

"--initialize-at-build-time=clojure,refl,org.httpkit"
"--initialize-at-run-time=org.httpkit.client"

borkdude 2021-06-21T12:16:16.182600Z

which doesn't buy you anything really

borkdude 2021-06-21T12:16:32.182800Z

since you still have to make it explicit because of the overlapping package name

borkdude 2021-06-21T12:17:41.183Z

but at least, it seems doable, but annoying

borkdude 2021-06-21T12:18:02.183200Z

I might try for babashka, which is a way bigger project

borkdude 2021-06-21T12:18:06.183400Z

later this week

borkdude 2021-06-21T12:20:08.183600Z

it seems a namespace refl.main makes a package refl and a class main inside of it

borkdude 2021-06-21T12:20:25.183800Z

so you have to use the package name refl to get all the related classes refl.main__init, etc.

borkdude 2021-06-21T12:20:47.184Z

so perhaps a "simple" all-ns with some munging/post-processing could be all that's needed

borkdude 2021-06-21T12:33:19.184400Z

@rickmoynihan Something like this:

user=> (require '[clojure.string :as str])
nil
user=> (->> (map ns-name (all-ns)) (remove #(str/starts-with? (str %) "clojure")) (map #(str/split (str %) #"\.")) (keep butlast) (map #(str/join "." %)) distinct (map munge) (cons "clojure"))
("clojure" "refl" "org.httpkit")

👍 1
borkdude 2021-06-21T12:33:30.184600Z

which is what I used for refl + httpkit
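
The same idea as the REPL one-liner above, sketched in Java so it could run as a separate build step. This is an illustration under assumptions, not an existing tool: InitPrefixes and its methods are made-up names, the namespace list is passed in rather than read from all-ns, and munge here only handles the hyphen-to-underscore rule (Clojure's real munge covers more characters).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InitPrefixes {
    // Simplified munge: only the hyphen-to-underscore rule is handled here.
    static String munge(String s) {
        return s.replace('-', '_');
    }

    // Drop the last segment of each namespace name to get its package,
    // skip anything starting with "clojure" (covered by the "clojure"
    // entry), and de-duplicate while keeping first-seen order.
    static List<String> packagePrefixes(List<String> namespaces) {
        Set<String> prefixes = new LinkedHashSet<>();
        prefixes.add("clojure");
        for (String ns : namespaces) {
            if (ns.startsWith("clojure")) {
                continue;
            }
            int lastDot = ns.lastIndexOf('.');
            if (lastDot < 0) {
                continue; // a single-segment namespace has no package part
            }
            prefixes.add(munge(ns.substring(0, lastDot)));
        }
        return new ArrayList<>(prefixes);
    }

    // Join the prefixes into a single native-image argument.
    static String buildTimeFlag(List<String> prefixes) {
        return "--initialize-at-build-time=" + String.join(",", prefixes);
    }
}
```

Fed the namespaces of refl plus httpkit, this would produce the same three prefixes as the REPL output above, joined into the flag used earlier in the thread.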

borkdude 2021-06-21T12:34:21.184800Z

for babashka:

("clojure" "sci.impl" "selmer" "babashka.nrepl" "babashka.impl.clojure.java" "babashka.impl" "rewrite_clj.node" "bencode" "rewrite_clj.parser" "babashka.impl.clojure" "org.httpkit" "rewrite_clj.custom_zipper" "rewrite_clj.zip" "borkdude.graal" "babashka.nrepl.impl" "babashka.pods" "cognitect" "babashka" "edamame.impl" "cheshire" "rewrite_clj" "hiccup" "sci" "borkdude" "flatland.ordered" "babashka.pods.impl" "clj_yaml" "babashka.impl.clojure.core" "datascript" "hf.depstar" "babashka.impl.tools" "sci.addons" "babashka.impl.clojure.test")

borkdude 2021-06-21T12:35:20.185Z

(could probably clean this up by looking at the existence of a prefix in others)

borkdude 2021-06-21T12:35:33.185200Z

but you get the gist
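
The clean-up mentioned above (dropping entries that already have a shorter prefix in the list) could look roughly like this. PrefixPruner is a hypothetical helper. It relies on the assumption, consistent with the flags tried earlier in the thread, that a package prefix passed to --initialize-at-build-time also covers its sub-packages.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixPruner {
    // True if pkg equals prefix or lies in a sub-package of it.
    static boolean covers(String prefix, String pkg) {
        return pkg.equals(prefix) || pkg.startsWith(prefix + ".");
    }

    // Keep only entries that are not covered by a different entry in the list.
    static List<String> prune(List<String> prefixes) {
        List<String> kept = new ArrayList<>();
        for (String candidate : prefixes) {
            boolean covered = false;
            for (String other : prefixes) {
                if (!other.equals(candidate) && covers(other, candidate)) {
                    covered = true;
                    break;
                }
            }
            if (!covered && !kept.contains(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

Pruning does not widen what gets initialized at build time, since the shorter prefix was already in the list. It only shortens the flag.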

borkdude 2021-06-21T12:40:27.185600Z

ok, that leads to:

Exception raised in scope ForkJoinPool-2-worker-25.ClosedWorldAnalysis.AnalysisGraphBuilderPhase: org.graalvm.compiler.java.BytecodeParser$BytecodeParserError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: No instances of com.fasterxml.jackson.core.io.SerializedString are allowed in the image heap as this class should be initialized at image runtime. To see how this object got instantiated use --trace-object-instantiation=com.fasterxml.jackson.core.io.SerializedString.

borkdude 2021-06-21T12:41:23.185800Z

kind of demonstrating that it would be painful to have to do this exercise for every graalvm project

borkdude 2021-06-21T12:50:39.186Z

This jackson thing seems to be the only problem though

borkdude 2021-06-21T12:53:16.186200Z

so here's what I ended up with:

borkdude 2021-06-21T12:53:26.186600Z

"--initialize-at-build-time=clojure,sci.impl,selmer,babashka.nrepl,babashka.impl.clojure.java,babashka.impl,rewrite_clj.node,bencode,rewrite_clj.parser,babashka.impl.clojure,org.httpkit,rewrite_clj.custom_zipper,rewrite_clj.zip,borkdude.graal,babashka.nrepl.impl,babashka.pods,cognitect,babashka,edamame.impl,cheshire,rewrite_clj,hiccup,sci,borkdude,flatland.ordered,babashka.pods.impl,clj_yaml,babashka.impl.clojure.core,datascript,hf.depstar,babashka.impl.tools,sci.addons,babashka.impl.clojure.test"
       "--initialize-at-build-time=com.fasterxml.jackson"

borkdude 2021-06-21T13:04:22.187Z

so it seems it's feasible

2021-06-21T13:08:56.187300Z

Sorry, was afk for lunch 🙂
> unfortunately there the Clojure and Java packages overlap
What do you mean? Clojure and Java code inhabiting the same package/ns? Meaning the Java classes are defaulted into build-time init?

borkdude 2021-06-21T13:09:19.187500Z

yes, for org.httpkit for example

2021-06-21T13:09:59.187700Z

I'm guessing for babashka you just ran that at a REPL and pasted the output into the shell script, but would plan to automate it at some point (or convince the Graal folk to do something different)

borkdude 2021-06-21T13:10:18.187900Z

yes

👍 1
borkdude 2021-06-21T13:13:13.188200Z

@rickmoynihan are you on linux btw?

2021-06-21T13:13:28.188400Z

macos

borkdude 2021-06-21T13:13:59.188600Z

ok. in #babashka-circleci-builds there are new binaries compiled on the init-at-build-time branch. I wonder if this would impact startup time

borkdude 2021-06-21T13:14:08.188800Z

I don't see a real difference on macos yet

2021-06-21T13:15:08.189Z

> if this would impact startup time
In which direction were you thinking?

borkdude 2021-06-21T13:15:23.189200Z

perhaps it's slower if more work has to be done at run time?

2021-06-21T13:18:44.189400Z

Shouldn't we be expecting essentially the same coverage? i.e. all Clojure code (except the few exceptions) to be initialised at build time?

borkdude 2021-06-21T13:18:58.189600Z

yes

borkdude 2021-06-21T13:19:09.189800Z

perhaps when you're doing interop it's going to be different

borkdude 2021-06-21T13:19:21.190Z

but perhaps it's not really significant

borkdude 2021-06-21T13:19:43.190200Z

so it's good to have a working solution now and be prepared for 22

👍 1
2021-06-21T13:21:44.190400Z

yeah, assuming both builds behave the same wrt correctness, I'd expect there not to be a significant difference in startup time… If there were, it'd probably mean we weren't covering everything we needed to.

2021-06-21T13:24:06.190900Z

Do you think any of this changes how the graal thread has been left?
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?

borkdude 2021-06-21T13:24:25.191100Z

I already responded in that thread

borkdude 2021-06-21T13:24:44.191300Z

He seems to be in favor of that

2021-06-21T13:24:47.191500Z

ah thanks, just refreshed

2021-06-21T13:24:49.191700Z

👀

2021-06-21T13:31:05.191900Z

What are the use cases for the uberjar case thomaswue is pushing for? I'm not even sure it's sufficient for Clojure

borkdude 2021-06-21T13:31:38.192200Z

I usually tend to compile and collect all the code into an uberjar first and then feed that to graalvm

borkdude 2021-06-21T13:31:59.192400Z

you don't have to do this, but I find this easier, since you just know what code you're dealing with after the uberjar step

borkdude 2021-06-21T13:32:12.192600Z

also I distribute the uberjars so people who want to make nixos derivations etc can use them

borkdude 2021-06-21T13:34:07.192800Z

I could also say in case of an issue to a graalvm dev: here you have the uberjar, I do this to compile it, but it doesn't work

borkdude 2021-06-21T13:34:13.193Z

without him/her having to install clojure, etc

2021-06-21T13:34:44.193200Z

Yeah, I get that it's useful for your other requirements (you want uberjars anyway etc). But an uberjar is just a reified/flattened classpath… so why can't they just take a classpath?

borkdude 2021-06-21T13:35:19.193400Z

You should ask Thomas this, I don't know his reasoning

2021-06-21T13:35:22.193600Z

I should probably ask them 🙂

2021-06-21T13:35:26.193800Z

jinx

borkdude 2021-06-21T13:35:43.194Z

His reasoning could be:

2021-06-21T13:35:54.194200Z

Just want to check that I'm not arguing against what you want 🙂

borkdude 2021-06-21T13:35:55.194400Z

Libraries aren't allowed to say: everything at build time

borkdude 2021-06-21T13:36:07.194600Z

but if you have a fat jar, you're not a library owner saying this, you are the end user

2021-06-21T13:36:19.194800Z

yeah ok

2021-06-21T13:36:24.195Z

that makes sense

2021-06-21T13:38:16.195300Z

(actually I was meaning to ask you about this for another reason… I'll start another thread on the channel for it though, as it's a change of topic)

chrisn 2021-06-29T15:00:29.238200Z

I think this also relates to clojure startup time. Perhaps we attempt a clojure-side compilation flag that solves (or makes progress towards) both issues at AOT time?

chrisn 2021-06-29T15:01:16.238400Z

meaning when this flag is in effect the clojure compiler generates different byte code and this byte code both starts up faster and works with graal native without needing --initialize-at-build-time.

borkdude 2021-06-29T15:04:15.238600Z

@chris441 do you have any concrete ideas of what can be done differently?

chrisn 2021-06-29T15:04:34.238800Z

Not without more careful consideration I do not.

borkdude 2021-06-29T15:05:06.239Z

what Clojure does in static initializers is resolve vars, load classes, etc.

borkdude 2021-06-29T15:05:25.239200Z

delaying the class loading to run time won't work in a native image

chrisn 2021-06-29T15:05:27.239400Z

I will look a lot more closely; I just know those are two related things and my profilers always show var initialization as one of the startup issues so somehow compiling that data down into something perhaps more concrete that loads faster is an interesting issue that seems related.

chrisn 2021-06-29T15:05:38.239600Z

Also interesting for dalvik.

chrisn 2021-06-29T15:06:03.239800Z

I know this is an area smart people have looked at before.

borkdude 2021-06-29T15:06:08.240Z

delaying var initialization to run time will make native images slower to start up, right?

chrisn 2021-06-29T15:06:48.240200Z

I don't want to delay anything, I want AOT to produce data as a side effect that can be quickly loaded to initialize vars during runtime initialization.

borkdude 2021-06-29T15:07:34.240400Z

ok, but now these are already initialized in the image heap, so that work has already been done when starting the image

chrisn 2021-06-29T15:07:35.240600Z

Exact opposite of delaying.

borkdude 2021-06-29T15:11:10.240800Z

My point is: moving work from build to run time makes things slower

chrisn 2021-06-29T15:11:31.241Z

Well, for example in your javap above:

const__0 = RT.var("clojure.core", "println");

chrisn 2021-06-29T15:12:27.241200Z

Yes, I agree, and that is not what I am suggesting. const__0 being initialized to a static class instance in your example above would make things faster, as it would bypass the RT.var mechanism.

borkdude 2021-06-29T15:13:11.241400Z

right

borkdude 2021-06-29T15:13:41.241600Z

it could directly reference the AOT-ed class which represents the println var right?
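
A toy Java model of the two options being compared here. Everything in it is hypothetical: VarRegistry is a stand-in for a name-based lookup in the spirit of RT.var, not Clojure's real machinery, and ShoutFn stands in for an AOT-compiled function class. ViaLookup resolves its constant by name during class initialization, while ViaDirectRef points straight at the compiled instance and skips the lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in for an AOT-compiled function class.
class ShoutFn implements Function<String, String> {
    static final ShoutFn INSTANCE = new ShoutFn();

    @Override
    public String apply(String s) {
        return s.toUpperCase();
    }
}

// Stand-in for a name-based var registry; counts lookups for illustration.
class VarRegistry {
    static final Map<String, Function<String, String>> VARS = new HashMap<>();
    static int lookups = 0;

    static {
        VARS.put("demo.core/shout", ShoutFn.INSTANCE);
    }

    static Function<String, String> var(String ns, String name) {
        lookups++;
        return VARS.get(ns + "/" + name);
    }
}

class ViaLookup {
    // Resolved by name in the static initializer, like the decompiled output.
    static final Function<String, String> const__0 = VarRegistry.var("demo.core", "shout");

    static String call(String s) {
        return const__0.apply(s);
    }
}

class ViaDirectRef {
    // References the compiled class directly, so no registry lookup is needed.
    static final Function<String, String> const__0 = ShoutFn.INSTANCE;

    static String call(String s) {
        return const__0.apply(s);
    }
}
```

The direct form gives up the indirection that lets a var be redefined later, which is presumably part of why this would need a compiler flag rather than being the default.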

chrisn 2021-06-29T15:16:14.241800Z

Yes, in this case. You also have the case where something is initialized via a complex function that produces a persistent datastructure and in this case the data can be saved in resources and found via a hashtable lookup or straight array lookup in constant time eliding the generating code. I haven't looked at this in huge depth but for instance I was extremely careful with dtype-next and it still takes some time even after an AOT run to pull in, for instance, the ND system via require. This is a solvable problem.

chrisn 2021-06-29T15:17:43.242Z

My thought is more of the form move --initialize-at-build-time into the clojure compiler and allow anything that it did in the graal vm system to be done during the AOT step. Then --init-at-build-time should be a noop if done during graal vm compilation.

borkdude 2021-06-29T15:17:59.242200Z

As clojure.core is AOT-ed by default anyway, I guess Compiler could be instrumented in such a way that it can reference these classes directly when generating more code. For core vars only it would already be a win

chrisn 2021-06-29T15:18:45.242500Z

This is complicated by the fact that bytecode files aren't general data storage mechanisms (at least as far as I know) which means you need some level of sidecar file generated at build time for pure data.

borkdude 2021-06-29T15:18:56.242700Z

The Compiler could keep track of what vars map to which classes

chrisn 2021-06-29T15:19:20.242900Z

It would be nasty and error prone. Definitely a YMMV pathway but with time it could work well.

chrisn 2021-06-29T15:20:02.243100Z

If it opened up both simpler Graal native and more dalvik development that would IMO be a very solid win worth real investment.

chrisn 2021-06-29T15:21:11.243300Z

Well, I guess if nubank agrees it may be worth real investment 🙂.

borkdude 2021-06-29T15:43:33.243500Z

What problems does Dalvik currently have with Clojure?

borkdude 2021-06-29T15:44:08.243700Z

I'm seeing that Dalvik is now replaced with something else

borkdude 2021-06-29T15:45:14.243900Z

maybe just a detail. Does ART (Android Runtime) interpret Java bytecode directly?

chrisn 2021-06-29T16:11:12.244100Z

I am referring to this article:

chrisn 2021-06-29T16:11:15.244300Z

https://blog.ndk.io/solving-clojure-boot-time.html

borkdude 2021-06-29T16:16:09.244500Z

Interesting article, thanks for sharing