babashka

https://github.com/babashka/babashka. Also see #sci, #nbb and #babashka-circleci-builds.
2021-01-14T08:19:08.002300Z

I’m observing something interesting, maybe you already know though. When I run a babashka script I noticed that subsequent runs are faster. When I make some whitespace changes, this first-run slowness comes back. As if something is caching something. Could it be the GraalVM underlying Babashka?

2021-01-14T08:20:11.003100Z

(it’s 25 vs 60 ms, so 30 ms difference on my machine for a hello world program)

borkdude 2021-01-14T08:20:35.003800Z

Maybe file cache? Babashka itself isn’t caching anything

2021-01-14T08:20:57.004100Z

ah yeah true, many possibilities

2021-01-14T08:21:40.004800Z

I was benchmarking my program's startup time manually and then I noticed this. If you are not careful you think it’s the program that is faster and not this caching

borkdude 2021-01-14T08:58:11.005200Z

@jeroenvandijk for benchmarking I usually use multitime

borkdude 2021-01-14T08:58:33.005700Z

and/or time inside a bb program so you account for the graal startup time
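
A minimal sketch of the repeated-run idea, for anyone without multitime installed. The function name `bench_runs` is made up for illustration, and `date +%s%N` assumes GNU date (the BSD date shipped with macOS has no `%N`). It prints the wall time of every run separately, so the slow first run stays visible instead of disappearing into an average:

```shell
# bench_runs N CMD...: run CMD N times and print each run's wall time in ms.
# Hypothetical helper; assumes GNU date for nanosecond timestamps (%N).
bench_runs() {
  n=$1; shift
  i=1
  while [ "$i" -le "$n" ]; do
    start=$(date +%s%N)
    # Discard the command's own output; ignore its exit status.
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo "run $i: $(( (end - start) / 1000000 )) ms"
    i=$((i + 1))
  done
}
```

Something like `bench_runs 5 bb myscripthere.clj` would then show whether only the first run after an edit is slow. This measures the whole process from the outside, so it includes bb's own startup time.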

2021-01-14T09:37:24.007300Z

Yeah that makes sense. It caught me by surprise. Hadn’t consciously thought about filesystem caching though, but that explains a few patterns. E.g. the first time you run a graalvm image is also a lot slower than subsequent times. Good to be aware of this

borkdude 2021-01-14T09:42:37.009Z

@jeroenvandijk if you are on macOS: macOS also has some checks for binaries and they do an http request to verify if this binary is allowed to run, the first time. I would be surprised if this affects bb scripts as well, but maybe if you are running them using a shebang? Turning internet off should help ;)

borkdude 2021-01-14T09:43:49.009500Z

Bb starts way faster on linux too btw, because of dynamic linking on macOS being slow

2021-01-14T10:45:25.012Z

I didn’t test with shebang yet. This was all via bb myscripthere.clj . So I like to believe in your hypothesis about the file cache

jumar 2021-01-14T19:16:30.014600Z

Without knowing specifics about Bb and file cache I'd be very surprised if a whitespace change invalidated OS page cache (or rather made it slower)

borkdude 2021-01-14T20:23:17.015900Z

@jumar Not that I know anything about this, but why would a space not invalidate a page cache? Is the OS cache this clever that it checks on non-significant whitespace changes? Now that would surprise me.

jumar 2021-01-15T10:07:38.019600Z

"Invalidate" was a bad term. The purpose of the OS page/file cache is to speed up potentially slow IO operations, so instead of a few hundred MB/s you can achieve several GB/s or more; every read and write goes through the cache. Writing a new whitespace character to the file shouldn't make it any slower. Now, many editors actually copy the file (creating a new inode), so I was wondering how that affects caching. A quick experiment suggests that both files have similar portions cached in RAM (`RES`):

dd if=/dev/zero of=file.txt count=1024 bs=1048576

fincore file.txt
   RES  PAGES SIZE FILE
700.8M 179393   1G file.txt

cp file.txt file2.txt
fincore file2.txt
   RES  PAGES SIZE FILE
688.6M 176282   1G file2.txt

Do you know about any resources that mention how OS cache affects startup time of (java) programs? I'd be happy to learn more about this stuff.

borkdude 2021-01-15T10:09:05.019800Z

@jumar isn't the idea of a cache that it invalidates on a change?

borkdude 2021-01-15T10:09:40.020Z

or does it "cache" files that you have recently changed into a faster part of the OS?

borkdude 2021-01-15T10:09:46.020200Z

so the next change will be faster?

jumar 2021-01-15T10:09:52.020400Z

I said that "invalidates" was a bad term 🙂. It doesn't matter - your changes will be written to the underlying storage at one point and it shouldn't affect speed much - especially if it's such a simple change

jumar 2021-01-15T10:10:04.020600Z

It caches files in RAM

borkdude 2021-01-15T10:10:32.020900Z

but also for writes?

borkdude 2021-01-15T10:11:08.021200Z

so if my laptop battery would go dead, chances are it wouldn't have been written to disk yet, on save?

jumar 2021-01-15T10:11:27.021400Z

Yes, it's even more important for writes - reads may (initially) go to the underlying storage but writes always go to the cache first (unless you use direct/raw io)

jumar 2021-01-15T10:11:36.021600Z

Yeah, that's possible

borkdude 2021-01-15T10:11:53.021800Z

ok, thanks, TIL

jumar 2021-01-15T10:12:52.022Z

It's a so-called "write-back" cache
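
A small sketch of what that means in practice, assuming Linux for the `/proc/meminfo` part (the `grep` is skipped quietly elsewhere). A freshly written file can be read back immediately, even though its data may still be sitting in RAM as dirty pages; `sync` asks the kernel to flush pending writes to storage:

```shell
# Write a small file; the data may initially live only in the page cache.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"

# Reading it back works right away, whether or not it has reached the disk.
cat "$tmp"

# Ask the kernel to write all pending (dirty) data to the underlying storage.
sync

# Linux only: amount of data still dirty or currently being written back.
grep -E '^(Dirty|Writeback):' /proc/meminfo 2>/dev/null || true
```

Without the `sync` (or an editor that calls fsync on save), a sudden power loss in that window could lose the write, which is the laptop-battery case discussed above.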

lread 2021-01-14T23:15:45.016600Z

Writing all my project scripts in babashka gives me a satisfying GitHub stat of:
