cljdoc

https://cljdoc.org/ & https://github.com/cljdoc/cljdoc
lread 2019-07-17T19:13:58.222600Z

hello cljdocists! On a fresh pull of cljdoc from GitHub, when I run clojure -A:test all tests pass, but then I get a RejectedExecutionException from kaocha.

unit:   100% [==================================================] 18/18
Exception in thread "main" java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3a2741f9 rejected from java.util.concurrent.ThreadPoolExecutor@3e3de00b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
	at clojure.core$future_call.invokeStatic(core.clj:6893)
	at clojure.java.shell$sh.invokeStatic(shell.clj:124)
	at clojure.java.shell$sh.doInvoke(shell.clj:79)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invokeStatic(core.clj:657)
	at clojure.core$apply.invoke(core.clj:652)
	at kaocha.plugin.notifier$run_command.invokeStatic(notifier.clj:80)
	at kaocha.plugin.notifier$run_command.invoke(notifier.clj:70)
	at kaocha.plugin.notifier$notifier_post_run_hook.invokeStatic(notifier.clj:113)
	at kaocha.plugin.notifier$notifier_post_run_hook.invoke(notifier.clj:91)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.core$apply.invokeStatic(core.clj:659)
	at clojure.core$apply.invoke(core.clj:652)
	at kaocha.plugin$run_hook_STAR_$fn__1233.invoke(plugin.clj:42)
	at clojure.lang.PersistentVector.reduce(PersistentVector.java:341)
	at clojure.core$reduce.invokeStatic(core.clj:6747)
	at clojure.core$reduce.invoke(core.clj:6730)
	at kaocha.plugin$run_hook_STAR_.invokeStatic(plugin.clj:40)
	at kaocha.plugin$run_hook_STAR_.doInvoke(plugin.clj:39)
	at clojure.lang.RestFn.invoke(RestFn.java:445)
	at clojure.lang.AFn.applyToHelper(AFn.java:160)
	at clojure.lang.RestFn.applyTo(RestFn.java:132)
	at clojure.core$apply.invokeStatic(core.clj:663)
	at clojure.core$apply.invoke(core.clj:652)
	at kaocha.plugin$run_hook.invokeStatic(plugin.clj:51)
	at kaocha.plugin$run_hook.doInvoke(plugin.clj:50)
	at clojure.lang.RestFn.invoke(RestFn.java:425)
	at kaocha.api$run$fn__2929.invoke(api.clj:97)
	at clojure.core$with_redefs_fn.invokeStatic(core.clj:7434)
	at clojure.core$with_redefs_fn.invoke(core.clj:7418)
	at kaocha.api$run.invokeStatic(api.clj:88)
	at kaocha.api$run.invoke(api.clj:71)
	at kaocha.runner$run.invokeStatic(runner.clj:118)
	at kaocha.runner$run.invoke(runner.clj:68)
	at kaocha.runner$_main_STAR_.invokeStatic(runner.clj:136)
	at kaocha.runner$_main_STAR_.doInvoke(runner.clj:122)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.RestFn.applyTo(RestFn.java:132)
	at clojure.core$apply.invokeStatic(core.clj:657)
	at clojure.core$apply.invoke(core.clj:652)
	at kaocha.runner$_main.invokeStatic(runner.clj:147)
	at kaocha.runner$_main.doInvoke(runner.clj:145)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.RestFn.applyTo(RestFn.java:132)
	at clojure.lang.Var.applyTo(Var.java:702)
	at clojure.core$apply.invokeStatic(core.clj:657)
	at clojure.main$main_opt.invokeStatic(main.clj:317)
	at clojure.main$main_opt.invoke(main.clj:313)
	at clojure.main$main.invokeStatic(main.clj:424)
	at clojure.main$main.doInvoke(main.clj:387)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.lang.Var.applyTo(Var.java:702)
	at clojure.main.main(main.java:37)
If I delete tests.edn (which turfs the kaocha reporter and the notifier plugin) and try again, I see no problems:
[(.)(.)(.)(..)(............)(....................)(.................................)]
18 tests, 70 assertions, 0 failures.

lread 2019-07-17T19:14:22.222800Z

I dug up a recent build on CircleCI https://circleci.com/gh/cljdoc/cljdoc/2200#tests/containers/0 and don’t see evidence of this issue.

lread 2019-07-17T19:15:00.223600Z

Any other cljdoc developers seeing this exception?

martinklepsch 2019-07-17T19:16:32.225300Z

I think I recently had a similar thing in a different codebase, and the issue was that shutdown-agents was called at some point when the process was not about to exit.

lread 2019-07-17T19:17:26.226Z

thanks for the lead, I shall poke around.

lread 2019-07-17T19:22:21.227500Z

yup, there is a call to shutdown-agents in cljdoc tests, which kaocha is just not happy about!
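For anyone reading along later, here is a minimal sketch of why that call trips kaocha up. In the stack trace above, the notifier plugin's post-run hook calls clojure.java.shell/sh, which goes through future-call and submits a task to Clojure's agent executor. Once shutdown-agents has run, that executor is terminated and refuses new tasks. The namespace and function names below are hypothetical and are not taken from cljdoc or kaocha.

```clojure
(ns shutdown-agents-demo
  (:import (java.util.concurrent RejectedExecutionException)))

(defn future-after-shutdown
  "Shuts down Clojure's agent thread pools, then tries to start a future.
  Returns :rejected if the executor refuses the task, otherwise the
  value produced by the future."
  []
  ;; Terminates the executors that back agents and futures.
  (shutdown-agents)
  (try
    ;; `future` submits its body to the (now terminated) executor.
    @(future :ran)
    (catch RejectedExecutionException _
      :rejected)))
```

Calling (future-after-shutdown) should return :rejected. It is best run in a throwaway JVM, since the agent pools stay shut down for the rest of the process.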

seancorfield 2019-07-17T22:18:18.228100Z

Looks like http://cljdoc.org is down...? @martinklepsch

martinklepsch 2019-07-17T22:38:40.229700Z

Should be back up. I’ve spent some more time looking into this problem recently and it seems it has little to do with Consul and/or Nomad and instead it’s just the instance that’s crashing completely. It also then fails to restart properly but I guess these are two different problems.

martinklepsch 2019-07-17T22:41:16.231900Z

I’m not super well-versed in Linux server admin stuff so definitely winging it a little bit over here 🙂 Looking at the files in /var/crash seems to indicate that the instance runs out of memory, but the DigitalOcean monitoring agent pretty consistently reports 50% memory utilization 🤔

[  768.241566] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[  768.243343]   cache: kmalloc-64(1:65d047087234dff92ea3990405519ae3b3d48b4417448cccced03e009003d1d9), object size: 64, buffer size: 64, default order: 0, min order: 0
[  768.248041]   node 0: slabs: 207, objs: 13248, free: 0
[  809.694213] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010

martinklepsch 2019-07-17T22:44:21.232700Z

It’s a CentOS server btw, if anyone has experience debugging this kind of stuff, “welcome to my crib” 😄

seancorfield 2019-07-17T22:44:45.233Z

This is why we have managed servers 🙂
