graalvm

Discuss GraalVM related topics. Use clojure 1.10.2 or newer for all new projects. Contribute to https://github.com/clj-easy/graal-docs and https://github.com/BrunoBonacci/graalvm-clojure. GraalVM slack: https://www.graalvm.org/slack-invitation/.
2021-06-21T10:47:00.166Z

Can anyone explain some of the finer details here? I know java static initializers are code blocks that execute at class initialization; so I’m assuming for native image that --initialize-at-build-time just means those blocks are executed when the native image is compiled (hence for example any side effects etc there will happen at compile time not runtime). What I don’t fully understand is how the clojure compiler uses static initializers, and exactly what the implications of this are for clojure. I’m assuming it’s because clojure’s a lisp, and even aot’d clojure code still has to apply effects at runtime. e.g. ns initialisation presumably runs as a static initializer etc.
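
A minimal Java sketch of the static-initializer timing being asked about, assuming a plain JVM run. The class names StaticInitDemo and Lazy are hypothetical and are not taken from Clojure or GraalVM:

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitDemo {
    // Records the order in which things happen.
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static final long STAMP;

        // Runs once, when Lazy is initialized (first use), not on every call.
        static {
            EVENTS.add("Lazy initialized");
            STAMP = System.nanoTime();
        }

        static long stamp() {
            return STAMP;
        }
    }

    public static void main(String[] args) {
        EVENTS.add("main started");
        long first = Lazy.stamp();   // first use triggers the static block
        long second = Lazy.stamp();  // already initialized, block does not rerun
        EVENTS.add("stamps equal: " + (first == second));
        EVENTS.forEach(System.out::println);
    }
}
```

On a JVM this prints "main started", then "Lazy initialized", then "stamps equal: true". If Lazy were initialized at image build time instead, the static block would presumably have run during the build, so its side effects and the captured STAMP value would be fixed before the program ever starts.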

2021-06-21T10:48:50.166200Z

so presumably that’s why this impacts all clojure code

borkdude 2021-06-21T10:50:05.166400Z

I've tried several times to build without this option. It's educational, but I can't really put words to this to explain it in detail, but due to how Clojure is set up / compiles things, it's needed.

borkdude 2021-06-21T10:50:19.166600Z

I would say: try without the option and perhaps come up with a better explanation, I think that would be very useful.

2021-06-21T10:51:25.166800Z

so build a clojure hello world, with and without this option? And inspect the clojure generated class files with javap to see if we can explain it?

borkdude 2021-06-21T10:52:10.167100Z

yes

borkdude 2021-06-21T10:52:37.167300Z

Clojure emits these kinds of things for functions:

static {
    const__0 = RT.var("clojure.core", "println");
}

borkdude 2021-06-21T10:52:46.167500Z

(https://github.com/clojure-goes-fast/clj-java-decompiler)

borkdude 2021-06-21T10:52:50.167800Z

which could be related to this

2021-06-21T10:53:26.168100Z

yeah I’ve seen those before… presumably static linking changes those too?

borkdude 2021-06-21T10:53:53.168300Z

Try to build a graalvm hello world program without the option and see how far you get

borkdude 2021-06-21T10:54:27.168500Z

I've done such a process here: https://github.com/oracle/graal/issues/3251#issuecomment-842305171

borkdude 2021-06-21T10:55:26.168900Z

If you don't initialize at build time then you will get:

Caused by: java.io.FileNotFoundException: Could not locate clojure/core__init.class, clojure/core.clj or clojure/core.cljc on classpath.
	at clojure.lang.RT.load(RT.java:462)

borkdude 2021-06-21T10:57:03.169400Z

this can be "fixed" by including clojure.core (init.class) in the resources

2021-06-21T10:58:16.169600Z

interesting

2021-06-21T10:58:42.169800Z

Firstly it’s unsurprising that the line in RT is in a static initializer 🙂

borkdude 2021-06-21T10:59:29.170Z

but that uses the dynamicclassloader, etc, which won't work in a native-image anymore anyway

borkdude 2021-06-21T10:59:38.170200Z

so I think that already explains why you need build time

borkdude 2021-06-21T10:59:55.170400Z

so we can quit here already

2021-06-21T11:02:29.170600Z

yeah, was going to say something similar, though hopefully you can clear it up for me if I’m inaccurate/wrong… My reasoning was essentially:
1. clojure is a single pass compiler… i.e. essentially your whole program is a flattened require tree / repl session… all deps “essentially concatenated”.
2. therefore if we’re loading clojure/core there, at some point after that we’ll be loading all of your app’s dependencies in a similar initializer block.

2021-06-21T11:02:43.170800Z

3. hence we can quit here

2021-06-21T11:03:52.171Z

Though I guess the dynamic class loader essentially just implements what I said

2021-06-21T11:05:12.171300Z

i.e. resolving clj / class files, and compiling clj into .class etc… essentially controlling “Read (compile) Eval”

borkdude 2021-06-21T11:07:41.171600Z

yeah. another way to put it: you can't "dynamically load classes" at runtime in GraalVM native-image, but Clojure does this in static initializer blocks, hence these must be initialized at build time.

2021-06-21T11:08:20.171800Z

Yeah that’s a good way to put it

borkdude 2021-06-21T11:10:17.172200Z

Perhaps this could be resolved if you make a Java class which does all the loading in a static initializer block, and you only initialize that one at build time and the rest of your classes could be initialized at runtime, but this would probably require changes to Clojure itself

borkdude 2021-06-21T11:12:30.172400Z

But interestingly these changes can be accomplished using substitutions as well perhaps.

borkdude 2021-06-21T11:12:57.172600Z

Here is an example: https://github.com/borkdude/clj-reflector-graal-java11-fix#the-solution

2021-06-21T11:16:34.173Z

I was wondering why the clojure compiler needs to use the dynamic class loader for AOT’d code? Presumably it could (at least in a graal compilation context) avoid that? Or is that essentially what you’re describing?

borkdude 2021-06-21T11:16:52.173200Z

yeah, that's what I was trying to describe

1👍
borkdude 2021-06-21T11:17:47.173500Z

in an AOT-ed (native-image) setting you know which namespaces you want, so you could just write that out explicitly

2021-06-21T11:17:55.173700Z

Yeah

2021-06-21T11:19:43.173900Z

Backing up for a second to the graal issue, re thomaswue’s point:
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?
Why do we need to bundle into an uberjar? Could we not also just give them a classpath?

borkdude 2021-06-21T11:20:37.174200Z

yes, you can already do this, but the original problem in that topic is that they want to get rid of the option without explicitly specifying the classes for which you want build time initialization

borkdude 2021-06-21T11:21:08.174400Z

and here he offers some kind of compromise to be able to say for which .jar you want it. so if you provide an uberjar you will get all the classes again

borkdude 2021-06-21T11:21:49.174600Z

Feel free to follow up the discussion

2021-06-21T11:37:48.174800Z

Yeah… I’m just trying to understand the toing and froing of the conversation. So to summarise the thread: they don’t want people to mark every class for build-time initialization, because for some classes it’ll screw things up. For most idiomatic clojure code we need build-time initialization, though for some clojure code that also won’t work (e.g. an ns with (def data (fetch-data-from-postgres ,,,)) will need to either be rewritten or opt in to runtime initialisation). If we default a whole classpath into build-time initialization we may build invalid “build time” state into the runtime (adinn’s point) for java deps etc.

2021-06-21T11:37:54.175Z

Is that the general gist of it?

borkdude 2021-06-21T11:39:57.175200Z

correct. but imo taking away this option will make it harder on Clojure developers since for most CLIs I built this stuff worked fine (or I was able to work around it). Occasionally a library like httpkit would give problems: https://github.com/http-kit/http-kit#native-image

borkdude 2021-06-21T11:40:35.175600Z

But perhaps listing all clojure-related namespaces through some script is possible, I was just trying to make sure Clojure projects would still be able to run

2021-06-21T11:40:39.175800Z

Yeah I agree with that. The default for .clj(c) files should be build time, because of the nature for clojure

2021-06-21T11:41:15.176Z

@borkdude: Yeah I was literally typing: presumably we could use something like mranderson to move all clj code under a new top level package/ns, and then flag that to default as build time

borkdude 2021-06-21T11:42:48.176500Z

blegh, I don't like that solution. I like the uberjar solution much better

borkdude 2021-06-21T11:43:07.176700Z

mranderson is a hack to make multiple versions of the same library work together

2021-06-21T11:43:45.176900Z

how do you avoid mixing java library / classes into the uberjar though?

borkdude 2021-06-21T11:44:28.177300Z

@rickmoynihan well you don't have to, you could of course just make a jar with your project code + clojure libs and put the Java code into another jar

2021-06-21T11:44:31.177500Z

(I agree the mranderson thing would be a hack)

borkdude 2021-06-21T11:44:55.177700Z

but personally I would just go with build time for everything and figure out the exceptions

borkdude 2021-06-21T11:45:17.177900Z

the tools I build are usually CLI tools and not huge micronaut web server things which I think the issue is more concerned about

borkdude 2021-06-21T11:46:54.178100Z

Perhaps we can figure out a good pattern to build only clojure classes at build-time

2021-06-21T11:46:56.178300Z

> but personally I would just go with all build at runtime for everything and figure out the exceptions well to be fair that is how any approach in clj will eventually end up working — the main difference would be starting from a point where you didn’t picking the wrong default for java libs.

2021-06-21T11:49:26.178600Z

@borkdude: Yeah I was going to say the issue is that there’s no tooling that knows what a clojure lib is vs a java lib. We’d need something that knew how to biject clj files into their class files…. essentially mapping munge over the .clj(c) classpath.

borkdude 2021-06-21T11:50:41.178900Z

well, that is certainly doable

2021-06-21T11:50:49.179100Z

indeed

borkdude 2021-06-21T11:51:12.179300Z

but I was trying to avoid getting into this, it all works beautifully now

2021-06-21T11:51:19.179500Z

yeah

2021-06-21T11:51:35.179700Z

it would be nice to avoid having to have another step

borkdude 2021-06-21T11:54:49.180Z

btw, I'm trying these flags:

"--initialize-at-build-time=clojure."
           "--initialize-at-build-time=clojure.core.server"
           
but I'm still getting errors about clojure.core.server

borkdude 2021-06-21T11:56:27.180200Z

(trying this in refl)

2021-06-21T11:56:59.180400Z

with graal 22?

borkdude 2021-06-21T11:59:28.180600Z

no, 21

borkdude 2021-06-21T12:04:57.180800Z

ok, for refl this seems to work:

"--initialize-at-build-time=clojure,refl"

borkdude 2021-06-21T12:05:03.181Z

but it's a small project without any dependencies

2021-06-21T12:06:53.181300Z

yeah that’s essentially equivalent to listing all of the top level namespaces you use there.

2021-06-21T12:07:31.181500Z

it’s good to prove what they’re suggesting will work for us… it’s just a shame it’s more clunky.

borkdude 2021-06-21T12:09:42.181800Z

I will try with the httpkit library

borkdude 2021-06-21T12:11:16.182Z

unfortunately there the clojure and java package overlaps

borkdude 2021-06-21T12:11:21.182200Z

:/

borkdude 2021-06-21T12:15:43.182400Z

@rickmoynihan yeah, so this works with httpkit (2.5.3):

"--initialize-at-build-time=clojure,refl,org.httpkit"
           "--initialize-at-run-time=org.httpkit.client"
           

borkdude 2021-06-21T12:16:16.182600Z

which doesn't buy you anything really

borkdude 2021-06-21T12:16:32.182800Z

since you still have to make explicit because of the overlapping package name

borkdude 2021-06-21T12:17:41.183Z

but at least, it seems doable, but annoying

borkdude 2021-06-21T12:18:02.183200Z

I might try for babashka, which is a way bigger project

borkdude 2021-06-21T12:18:06.183400Z

later this week

borkdude 2021-06-21T12:20:08.183600Z

it seems a namespace refl.main makes a package refl and a class main inside of it

borkdude 2021-06-21T12:20:25.183800Z

so you have to use the package name refl to get all the related classes refl.main__init, etc.

borkdude 2021-06-21T12:20:47.184Z

so perhaps a "simple" all-ns with some munging/post-processing could be all that's needed

borkdude 2021-06-21T12:33:19.184400Z

@rickmoynihan Something like this:

user=> (require '[clojure.string :as str])
nil
user=> (->> (map ns-name (all-ns)) (remove #(str/starts-with? % "clojure")) (map #(str/split (str %) #"\.")) (keep butlast) (map #(str/join "." %)) distinct (map munge) (cons "clojure"))
("clojure" "refl" "org.httpkit")

1👍
borkdude 2021-06-21T12:33:30.184600Z

which is what I used for refl + httpkit

borkdude 2021-06-21T12:34:21.184800Z

for babashka:

("clojure" "sci.impl" "selmer" "babashka.nrepl" "babashka.impl.clojure.java" "babashka.impl" "rewrite_clj.node" "bencode" "rewrite_clj.parser" "babashka.impl.clojure" "org.httpkit" "rewrite_clj.custom_zipper" "rewrite_clj.zip" "borkdude.graal" "babashka.nrepl.impl" "babashka.pods" "cognitect" "babashka" "edamame.impl" "cheshire" "rewrite_clj" "hiccup" "sci" "borkdude" "flatland.ordered" "babashka.pods.impl" "clj_yaml" "babashka.impl.clojure.core" "datascript" "hf.depstar" "babashka.impl.tools" "sci.addons" "babashka.impl.clojure.test")

borkdude 2021-06-21T12:35:20.185Z

(could probably clean this up by looking at the existence of a prefix in others)
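
The prefix cleanup mentioned here could look roughly like the sketch below. It is written in Java rather than as a REPL one-liner, and it assumes that a package name passed to --initialize-at-build-time also covers its subpackages (consistent with "clojure" covering clojure.core.server earlier in the thread). PrefixPruner and prune are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PrefixPruner {
    // Drops every package that is already covered by a shorter package in the list.
    static List<String> prune(List<String> packages) {
        List<String> kept = new ArrayList<>();
        // Sorted order guarantees a parent package is visited before its children.
        for (String candidate : new TreeSet<>(packages)) {
            boolean covered = false;
            for (String parent : kept) {
                // Match on a segment boundary so "sci" does not swallow "scientist".
                if (candidate.startsWith(parent + ".")) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "clojure", "sci.impl", "babashka.impl", "org.httpkit",
                "babashka", "sci", "rewrite_clj.node", "rewrite_clj");
        System.out.println("--initialize-at-build-time=" + String.join(",", prune(input)));
    }
}
```

For that sample input it prints --initialize-at-build-time=babashka,clojure,org.httpkit,rewrite_clj,sci. The result is sorted alphabetically, which assumes the order of entries in the flag does not matter. Packages that mix Clojure and Java classes, like org.httpkit, would still need their run-time exceptions listed separately.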

borkdude 2021-06-21T12:35:33.185200Z

but you get the gist

borkdude 2021-06-21T12:40:27.185600Z

ok, that leads to:

Exception raised in scope ForkJoinPool-2-worker-25.ClosedWorldAnalysis.AnalysisGraphBuilderPhase: org.graalvm.compiler.java.BytecodeParser$BytecodeParserError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: No instances of com.fasterxml.jackson.core.io.SerializedString are allowed in the image heap as this class should be initialized at image runtime. To see how this object got instantiated use --trace-object-instantiation=com.fasterxml.jackson.core.io.SerializedString.

borkdude 2021-06-21T12:41:23.185800Z

kind of demonstrating that it would be painful to have to do this exercise for every graalvm project

borkdude 2021-06-21T12:50:39.186Z

This jackson thing seems to be the only problem though

borkdude 2021-06-21T12:53:16.186200Z

so here's what I ended up with:

borkdude 2021-06-21T12:53:26.186600Z

"--initialize-at-build-time=clojure,sci.impl,selmer,babashka.nrepl,babashka.impl.clojure.java,babashka.impl,rewrite_clj.node,bencode,rewrite_clj.parser,babashka.impl.clojure,org.httpkit,rewrite_clj.custom_zipper,rewrite_clj.zip,borkdude.graal,babashka.nrepl.impl,babashka.pods,cognitect,babashka,edamame.impl,cheshire,rewrite_clj,hiccup,sci,borkdude,flatland.ordered,babashka.pods.impl,clj_yaml,babashka.impl.clojure.core,datascript,hf.depstar,babashka.impl.tools,sci.addons,babashka.impl.clojure.test"
       "--initialize-at-build-time=com.fasterxml.jackson"

borkdude 2021-06-21T13:04:22.187Z

so it seems it's feasible

2021-06-21T13:08:56.187300Z

Sorry was afk for lunch 🙂
> unfortunately there the clojure and java package overlaps
What do you mean? Clojure and java code inhabiting the same package/ns? Meaning the java classes are defaulted into build time init?

borkdude 2021-06-21T13:09:19.187500Z

yes, for org.httpkit for example

2021-06-21T13:09:59.187700Z

I’m guessing for babashka you just ran that at a repl and pasted the output into the shell script; but would plan to automate it at some point (or convince the graal folk to do something different)

borkdude 2021-06-21T13:10:18.187900Z

yes

1👍
borkdude 2021-06-21T13:13:13.188200Z

@rickmoynihan are you on linux btw?

2021-06-21T13:13:28.188400Z

macos

borkdude 2021-06-21T13:13:59.188600Z

ok. in #babashka-circleci-builds there are new binaries compiled on the init-at-build-time branch. I wonder if this would impact startup time

borkdude 2021-06-21T13:14:08.188800Z

I don't see a real difference on macos yet

2021-06-21T13:15:08.189Z

> if this would impact startup time In which direction were you thinking?

borkdude 2021-06-21T13:15:23.189200Z

perhaps it's slower if more work has to be done at run time?

2021-06-21T13:18:44.189400Z

Shouldn’t we be expecting essentially the same coverage? i.e. all clojure code (except the few exceptions) to be initialised at build time?

borkdude 2021-06-21T13:18:58.189600Z

yes

borkdude 2021-06-21T13:19:09.189800Z

perhaps when you're doing interop it's going to be different

borkdude 2021-06-21T13:19:21.190Z

but perhaps it's not really significant

borkdude 2021-06-21T13:19:43.190200Z

so it's good to have a working solution now and be prepared for 22

1👍
2021-06-21T13:21:44.190400Z

yeah assuming both builds behave the same wrt correctness, I’d expect there not to be a significant difference in startup time… If there were it’d probably mean we weren’t covering everything we needed to.

2021-06-21T13:24:06.190900Z

Do you think any of this changes how the graal thread has been left?
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?

borkdude 2021-06-21T13:24:25.191100Z

I already responded in that thread

borkdude 2021-06-21T13:24:44.191300Z

He seems to be in favor of that

2021-06-21T13:24:47.191500Z

ah thanks — just refreshed

2021-06-21T13:24:49.191700Z

👀

2021-06-21T13:31:05.191900Z

What are the use cases for the uberjar case thomaswue is pushing for? I’m not even sure for clojure it’s sufficient

borkdude 2021-06-21T13:31:38.192200Z

I usually tend to compile and collect all the code into an uberjar first and then feed that to graalvm

borkdude 2021-06-21T13:31:59.192400Z

you don't have to do this, but I find this easier, since you just know what code you're dealing with after the uberjar step

borkdude 2021-06-21T13:32:12.192600Z

also I distribute the uberjars so people who want to make nixos derivations etc can use them

borkdude 2021-06-21T13:34:07.192800Z

I could also say in case of an issue to a graalvm dev: here you have the uberjar, I do this to compile it, but it doesn't work

borkdude 2021-06-21T13:34:13.193Z

without him/her having to install clojure, etc

2021-06-21T13:34:44.193200Z

Yeah I get that it’s useful for your other requirements (you want uberjars anyway etc). But an uberjar is just a reified/flattened classpath… so why can’t they just take a classpath?

borkdude 2021-06-21T13:35:19.193400Z

You should ask this to Thomas, I don't know his reasoning

2021-06-21T13:35:22.193600Z

I should probably ask them 🙂

2021-06-21T13:35:26.193800Z

jinx

borkdude 2021-06-21T13:35:43.194Z

His reasoning could be:

2021-06-21T13:35:54.194200Z

Just want to check that I’m not arguing against what you want 🙂

borkdude 2021-06-21T13:35:55.194400Z

Libraries aren't allowed to say: everything at build time

borkdude 2021-06-21T13:36:07.194600Z

but if you have a fat jar, you're not a library owner saying this, you are the end user

2021-06-21T13:36:19.194800Z

yeah ok

2021-06-21T13:36:24.195Z

that makes sense

2021-06-21T13:38:16.195300Z

(actually I was meaning to ask you about this for another reason… I’ll start another thread on the channel for it though as it’s a change of topic)

2021-06-21T13:38:37.195800Z

@borkdude: I was wondering if you’ve seen this:

ericdallo 2021-06-22T14:20:56.236700Z

We use that way on clojure-lsp (and cljfmt I think) seems a good way indeed

borkdude 2021-06-21T13:39:12.196800Z

I have seen that

2021-06-21T13:39:20.197Z

Which looks to me like you can essentially generate the graal reflect configs etc and bundle them in library jars

borkdude 2021-06-21T13:39:28.197200Z

yes, true

2021-06-21T13:39:39.197400Z

This seems a fundamentally better way to do things

2021-06-21T13:40:12.197600Z

e.g. httpkit could just bundle that, rather than listing config for users to use in their README

2021-06-21T13:41:08.198300Z

hehe ok always one (thousand) step(s) ahead 🙂

borkdude 2021-06-21T13:41:26.198500Z

If you feel like doing a PR :)

2021-06-21T13:45:37.198700Z

lol 😆 if I had a dependency on http-kit right now I might just

2021-06-21T14:09:42.204300Z

@borkdude: So I’m thinking it would be nice for clojure graal library templates to bundle this sort of thing by default. If every lib created by something like https://github.com/seancorfield/clj-new/blob/develop/src/clj/new/lib.clj included generated config under META-INF/native-image/ that stated something like --initialize-at-build-time={{package-namespace}}, then most clojure libraries (at least ones without interop) would work in graal apps out of the box, with less burden on the app creator
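
A rough sketch of what such a template step might render, assuming the config is a native-image.properties file with an Args entry placed under META-INF/native-image/ in a group/artifact subdirectory. NativeImageProps, the com.example group and the my-lib artifact are hypothetical, and the hyphen-to-underscore replacement is only a simplification of Clojure's munge:

```java
public class NativeImageProps {
    // Resource path inside the library jar where native-image looks for bundled config.
    static String resourcePath(String group, String artifact) {
        return "META-INF/native-image/" + group + "/" + artifact + "/native-image.properties";
    }

    // File content: one Args line marking the library's root package for build-time init.
    static String render(String rootNamespace) {
        // Simplified munge: Clojure namespaces use '-', the generated packages use '_'.
        String pkg = rootNamespace.replace('-', '_');
        return "Args = --initialize-at-build-time=" + pkg + "\n";
    }

    public static void main(String[] args) {
        System.out.println(resourcePath("com.example", "my-lib"));
        System.out.print(render("my-lib"));
    }
}
```

A library whose namespaces all sit under one root would then need this generated only once, when the template is instantiated, not for every new namespace.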

borkdude 2021-06-21T14:10:43.204800Z

yeah. on the other hand, it's brittle to assume that people are going to do this. I think I'll just add that little snippet to bb.edn here: https://github.com/borkdude/jayfu

borkdude 2021-06-21T14:11:50.205200Z

also it's going to be tedious if every clojure maintainer should have to do this, for every new namespace. like documentation it's always going to be out of sync

borkdude 2021-06-21T14:12:30.205600Z

instead I'll probably make a template out of jayfu if I'm satisfied enough with it

2021-06-21T14:14:16.205800Z

yeah I agree we also need tooling… But is it really for every new namespace? Most libs are bundled with all their namespaces inside a common parent, so wouldn’t it be sufficient for most libs to just generate that at template instantiation time?

2021-06-21T14:15:15.206Z

jayfu is new to me… If I’d known about this a month ago 😆

borkdude 2021-06-21T14:15:54.206200Z

yeah, that's true, but still. only a small portion of Clojurians are using graalvm native-image so for many people this will just be something confusing maybe

borkdude 2021-06-21T14:16:02.206400Z

there is a talk coming by ClojureD

borkdude 2021-06-21T14:16:09.206600Z

soon online, I mean

borkdude 2021-06-21T14:18:15.206800Z

I suspect DynamicClassLoader can be substituted to "do nothing" https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/DynamicClassLoader.java Perhaps that results in something useful

borkdude 2021-06-21T14:18:28.207200Z

so we're able to do everything "at runtime"

2021-06-21T14:20:20.207400Z

🤔 Yeah just changing to use SecureClassLoader or URLClassLoader might be sufficient

borkdude 2021-06-21T14:21:54.207600Z

ah well, this is a research project for another time