MILESTONE REACHED! gostd2joker is now able to build with Joker (on my fork's gostd2joker branch, mind you!) against all of the Go 1.11.2 source code, picking up all those (non-vendor/etc.) packages. I threw the resulting joker/docs directory over here: https://burleyarch.com/joker/docs/
this is pretty cool! I did notice a lot of empty namespaces, but that should be an easy fix. The main challenge is still dealing with interfaces and methods. I looked into Go's reflection capabilities, and I think it's technically possible to use them for method calls, but I am not sure this would be the right direction for Joker to take. I want Joker to expose a very high-level and easy-to-use API, and one-to-one Go bindings involving reflection might be too low-level and cumbersome to use for scripting. This is not an all-or-nothing proposition, though. We can give access to (some) one-to-one Go bindings while still exposing a higher-level API separately. So I really appreciate all the work you are doing in this area!
Thanks! I still think the higher-level stuff can and should be done, but ideally the automatic exposure of low-level "primitive" access to Go would ease that without requiring constant changes to Joker internals.
how do all these bindings affect executable size and startup time?
I'm on vacation next week, but am planning to spend some time this week learning more about the ins and outs of Clojure hosting (classes, types, etc.) to get a sense of what next steps might look like.
I tried measuring startup time but couldn't see any difference. Here's the executable file size difference on my amd64-linux machine:
$ ls -l joker
-rwxrwxr-x 1 craig craig 18785256 Nov 14 00:37 joker
$ ls -l $(which joker)
-rwxrwxr-x 1 craig craig 12593253 Oct 31 18:02 /home/craig/.go/bin/joker
$ joker
Welcome to joker v0.10.0. Use EOF (Ctrl-D) or SIGINT (Ctrl-C) to exit.
user=> (loaded-libs)
#{joker.core joker.os joker.base64 joker.json joker.string joker.yaml joker.go.net}
user=>
A non-trivial increase in executable size! More details on size differences:
$ size joker
    text    data     bss      dec    hex filename
16087577  262753  149864 16500194 fbc5e2 joker
$ size $(which joker)
    text    data     bss      dec    hex filename
 9633270  229505  146288 10009063 98b9e7 /home/craig/.go/bin/joker
$
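To put the two listings side by side, here is a minimal sketch that computes the growth from the figures quoted above. The helper name `growth` is made up for illustration; the byte counts are copied from the `ls -l` and `size` output.

```go
package main

import "fmt"

// growth returns the absolute and percentage increase from before to after.
// (Hypothetical helper, not part of Joker or gostd2joker.)
func growth(before, after int64) (delta int64, pct float64) {
	delta = after - before
	pct = 100 * float64(delta) / float64(before)
	return delta, pct
}

func main() {
	// File sizes from `ls -l`: stock joker vs. joker built with the bindings.
	fileDelta, filePct := growth(12593253, 18785256)
	// Text segment sizes from `size`.
	textDelta, textPct := growth(9633270, 16087577)

	fmt.Printf("file: +%d bytes (%.1f%%)\n", fileDelta, filePct)
	fmt.Printf("text: +%d bytes (%.1f%%)\n", textDelta, textPct)
}
```

By these numbers the executable grows by 6,192,003 bytes, a bit under 50%, and most of that is in the text segment.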
I do also plan to look into Go reflection capabilities, but have read some discussion of that, and reportedly it's rather slow (not surprising)....
With the current approach I'm using, it's pretty easy to support arbitrary exclusion/inclusion of Go std libraries via command-line options and such, in case people want that. And of course nothing gets linked in unless they create the GO.link file to the Go source tree in the first place -- so what I've thrown together would have essentially zero impact on people who want a linter-only Joker that's small and fast.
(But I want a full-featured Joker that's small and fast. 😉 )
I do have a gostd2joker Issue about code size, so I think it'd be possible to reduce some bloat as it is. But it's tightly related to the typename issue (exposing Go std types to Joker code), which I'd rather figure out before prematurely optimizing the current approach.
the increase in size is not as bad as I expected actually... Is your idea to have these Go bindings as an option at build time? I've thought about this approach. Like you said, it makes Joker small and fast for the majority of users who just want a linter and allows power users to build the "full" version, perhaps even selectively choosing the bindings they want. On the other hand, it makes Joker cumbersome to use as soon as you need something outside of what's included in the base version.... I doubt many people will be willing to build from source...
The current approach is definitely build-time focused: gostd2joker trawls the Go source tree (which must match the version of Go currently installed), then modifies the Joker source tree to add the bindings (`.joke` and `_native.go` files, plus changes to three source files).
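For a rough idea of what trawling Go source can look like, here is a minimal sketch using the standard go/parser and go/ast packages. It is not gostd2joker's actual code: the `src` sample and the `countFuncs` helper are invented for illustration, and it only separates exported standalone functions from exported methods.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up stand-in for one file of a Go package.
const src = `package demo

type T struct{}

func Standalone(s string) int { return len(s) }
func (t *T) Method() {}
func unexported() {}
`

// countFuncs parses Go source text and counts exported standalone
// functions and exported methods (functions declared with a receiver).
func countFuncs(source string) (standalone, methods int, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return 0, 0, err
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || !fn.Name.IsExported() {
			continue // skip non-function declarations and private functions
		}
		if fn.Recv != nil {
			methods++
		} else {
			standalone++
		}
	}
	return standalone, methods, nil
}

func main() {
	s, m, err := countFuncs(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("standalone=%d methods=%d\n", s, m)
}
```

A real generator would walk every package directory and inspect each function's parameter and result types before deciding whether a binding can be emitted.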
Personally, I like the result of that approach: a Joker executable that is (at least in theory) as lean and mean as possible, when it comes to providing (low-level) access to those Go APIs.
But there are other possibilities to consider, such as having go generate do much of that work, ideally without needing the Go source tree -- perhaps leveraging go.reflect somehow; and/or, maybe there's some way to dynamically pull in the necessary bindings at run time, via Go plugins or something?
So the way things currently work is probably not ideal, but it was a proof of concept based on my existing knowledge plus what I felt I could understand and execute on in a reasonable amount of time.
I think dynamic loading is not possible without going into CGO land, but I am not sure
Yes, I'm also ignorant of CGO stuff at this point.
I think go generate time might be the cleanest approach to this, assuming go.reflect can replace the Go-source-tree-trawling approach currently used.
I'm glancing at the go.lang docs, and a blog post to which they link, right now -- and am not seeing anything (yet) about reflecting on Go std packages.
I understand that gostd2joker works with Go source code, but in theory nothing prevents us from running it at "release time" and shipping Joker executables with all the bindings included, right? This is not necessarily what we want to do, but it's certainly an option...
Go's reflection cannot reflect on packages, only on types and values
Yes, I think you're right: since prebuilt executables already depend on specific arch/OS and Go-version combinations, including the specific Go tree and running gostd2joker on all that should work.
Ok (re Go's reflection) -- that rules out using that approach, though not moving gostd2joker (or something similar) into the go generate phase AFAIK....
Reviewing https://golang.org/pkg/plugin/, I don't think that's quite right for us either (though of course providing access to it from Joker might be pretty cool for somebody else's use case!).
(Plugins also aren't supported on Windows, for what that's worth....)
my understanding is that plugins are an experimental feature, but I have not looked at them in a while
re: reflection. It can still be used for method calls, but not for function calls.
Then, at the moment, I don't see how reflection helps expose these APIs...but I still have lots to learn.
In the meantime, as "empty" as many of those packages are, I did have some fun last night exploring joker.go.syscall -- not all of which is already provided by joker.os. Couldn't get anything to actually break, which I guess is good news!
My current operating expectation is that gostd2joker could be enhanced to inject std types into Joker, so we'd have things like joker.go.net.MX as an actual type, and joker.go.net.LookupMX() would return (a two-element vector whose first element is) a vector of ref objects with joker.go.net.MX instances underlying them.
That wouldn't be strictly necessary for that API (but gostd2joker would have a hard time figuring that out), but for other APIs, the reason for *<type> results is precisely so the caller can modify them, and/or observe them changing, after the call completes.
And then having joker.go.net.Resolver support methods (that translate to Go methods on the Go type as receivers), such that (.LookupMX <instanceofResolver> ...) would work, would be amazing.
Having played around with adding the Byte type to my gostd2joker branch of Joker (in my fork), I realize that's all probably a tall order, and I still don't know Clojure well enough to have a clear idea of how to do it and keep it all reasonably compatible with Clojure as a whole, including its users' expectations.
But my use case(s) involve having significant Clojure code that can be leveraged by Clojure, ClojureScript, and/or Joker (or some variation thereof), which is why I'm so interested in this whole area.
While there are still a bunch of little and big things to do, for now my next step is to resume exploratory testing, to find APIs that crash or otherwise misbehave. (I won't try all of them.)
And, of course, not all of the APIs themselves are converted -- gostd2joker refuses to convert those that it doesn't support. Plus, it converts some it probably shouldn't, like net.LookupMX(), because they return pointers to things that are not converted to refs; instead, gostd2joker reaches through those indirections and pulls out the inner values, which is sometimes helpful, sometimes not.
Plus, some things are just not returned fully due to involving private members, etc.
I think I'll add some basic stats-gathering on APIs converted versus not converted (probably separating out receivers), numbers of hits on different rejection codes ("ABENDs"), and so on, so it's easier to see what might be the lowest-hanging fruit worth tackling.
Basic stat-gathering added:
Totals: types=964 functions=7135 methods=5623 (78.81%) standalone=1512 (21.19%) generated=321 (21.23%)
That means 79% of the functions in Go are really methods (not handled at all); and, of the remaining 1,512/21% non-method ("standalone") functions, 21% are converted into at least partially usable forms. Not too shabby for about two weeks' worth of work!
(That's on an amd64-linux system; other OS/arch combos can, and some do, produce slightly different results.)
BTW, I did do some work on porting this to Windows over the weekend; it probably still works there. Of course, --no-readline is needed to run joker interactively, due to readline not working properly on Windows.
Note that, while the list of libraries seems large, having gone through and spot-checked some to find bugs (didn't find any yet), many of them are actually empty! It might make sense to have gostd2joker inhibit generation of empty packages.
oh my
BTW, the latest version of gostd2joker now (by default) no longer generates empty libraries/packages. I've synced the latest generated docs to the above URL -- lots fewer packages!