where does clj/clojure download dependencies?
your maven .m2 directory
it also caches deps in your local project under .cpcache
@jeeq ^
Ah. Thank you @marshall
It caches the computed classpath in .cpcache (not deps)
oh right, sorry 🙂
http://http-kit.github.io/migration.html#reload why does it suggest running #'all-routes instead of just all-routes? if wrap-reload reloads the whole namespace on every change, then won't all-routes contain its new definition by the time -main is also reloaded?
is this just for repl purposes? so you can redefine all-routes at the repl and not necessarily reload the entire namespace?
also, why is there ring-reload but not (at least i couldn't find it) something like http-kit-reload? why is reloading the namespace done at the ring level and not the server level?
@ozzloy_clojurians_net My recommendation is: do not use any of these auto-reload things. Just learn how to write code that's amenable to the REPL -- which is why you'd write #'all-routes: because that's a Var reference, which introduces a layer of indirection, so if you re-`defn` the function (via the REPL) the new definition will be picked up immediately.
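A minimal sketch of the Var indirection described here (the handler and server names are illustrative, not http-kit's actual API):

```clojure
;; Passing the Var keeps a layer of indirection; passing the value does not.
(defn handler [req] "v1")

(def by-value handler)    ; captures the current function object
(def by-var   #'handler)  ; captures the Var, the mutable "box"

(defn handler [req] "v2") ; simulate a re-defn at the REPL

(by-value {}) ; still "v1" -- the old function object was captured
(by-var {})   ; "v2" -- invoking a Var derefs it on every call
```

This is why a server started with `(run-server #'handler ...)` picks up redefinitions live, while one started with `(run-server handler ...)` keeps serving the old function.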
This gave me an aha moment @seancorfield thank you
I've recently been trying to figure out the difference of using a symbol vs a var reference in a repl and how it interacts with redefining a function -- are there any resources that dive into that a bit more? Probably help me understand the var indirection mechanism a bit more too then...
@matthew.pettis https://clojure.org/guides/repl/enhancing_your_repl_workflow#writing-repl-friendly-programs talks specifically about #'
Thank you! I read that link. A follow-up question... in the 4 code examples they have at that link/anchor, is #3 not REPL-friendly because the value of print-number-and-wait is inlined into the future call, and so cannot be redirected, while #2 is REPL-friendly because print-number-and-wait is not inlined into anonymous functions, and has to be looked up upon every invocation of the anonymous function? #4 seems to work on that same principle, in that the var has to be looked up and resolved upon every invocation, and so if you change what's in the var, it will use the new value you redef the var to after the change... does this sound correct?
If this is so, I can see why this is REPL-friendly for development. I can also see that if you want to keep your functions as pure as possible, you will not use #' to call functions inside of other functions, because that exposes your function to the possibility of becoming impure, as now your function depends on vars, which are mutable, rather than the function values they would get...
Re: #3 -- correct; #2 -- this works because the function appears in the "call" slot, not in a "value" slot, and calls are always dereferenced (so the current binding of the function is always used); #4 -- works because what is passed in is a Var: the "box" that contains the function's value, so when you invoke a Var it always resolves to the current binding.
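The three slots described above can be sketched with a hypothetical function f that gets redefined mid-way:

```clojure
(defn f [] :old)

(def value-slot f)        ; value slot: the function object is captured now
(defn call-slot [] (f))   ; call slot: `f` is looked up at call time
(def var-slot #'f)        ; the Var itself: dereferenced on every invocation

(defn f [] :new)          ; redefine, as one would at the REPL

(value-slot)  ; => :old  -- stale captured value
(call-slot)   ; => :new  -- the call slot sees the current binding
(var-slot)    ; => :new  -- invoking a Var resolves to the current binding
```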
Purity really isn't relevant here. Whether a function is pure or impure is about side-effects.
Passing #'some-func and (fn [x] (some-func x)) are pretty much equivalent from a REPL redefinition p.o.v. You could also say (var some-func) (since that's what #' is shorthand for).
About the only time you will be "mutating" a Var is via redefinition in the REPL -- so it's not like #' is going to make your code impure because someone is nefariously modifying top-level bindings behind the scenes. There are a handful of valid uses for alter-var-root, for example, but Clojure programmers don't treat Vars like other languages treat "variables".
Hope that helps clarify @matthew.pettis?
The 'call-slot' and 'value-slot' distinctions are very helpful, thanks. I was thinking about purity not in the sense of side effects, but in the sense that if you call a function twice with the same arguments in a program, you should get the same result. This is not the case if, for #2 or #4, you change out the definition of print-number-and-wait between calls. So I am probably abusing the idea of "pure function", but those are the notions I have about them...
Given the prevalence of 'call-slot' usage, almost all code would be "impure" by your definition 🙂
(defn foo [x]
  (+ (bar x) 13))
that would be susceptible to people redefining the bar function in between calls to foo -- but that's not how we think about it.
user=> (defn bar [x] (* 2 x))
#'user/bar
user=> (defn foo [x] (+ (bar x) 13))
#'user/foo
user=> (foo 1)
15
user=> (defn bar [x] (* 3 x))
#'user/bar
user=> (foo 1)
16
user=>
But this is the behavior we want in the REPL/while developing. Which is why it is really helpful now to be aware of that distinction. I really do think that helps me think about immutability, at least which things are and are not immutable. I've read that 'functions are values', but now I'm not sure what to think when things in the call slot can point to different things. What makes a function a value if, in the program, sub-components in the call-slot can get rebound and change the behavior of a function?
Note that there is also a compiler option called "direct linking" which effectively prevents the 'call-slot' indirection. Clojure itself is compiled that way, so you can't redefine core functions on the fly. You can also compile all your own code that way: we do it as part of AOT-compiling our code when we build applications for deployment as uberjar files. It does have the "downside" that you can no longer patch code running live in production via a REPL -- which, yes, we occasionally do for one process where we do not compile with direct linking.
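Direct linking is enabled with the JVM system property clojure.compiler.direct-linking. A sketch of a deps.edn alias carrying it (the alias name :direct is illustrative; the property itself is Clojure's real compiler flag):

```clojure
;; deps.edn (sketch): compile/run with direct linking, e.g. for AOT builds.
{:aliases
 {:direct {:jvm-opts ["-Dclojure.compiler.direct-linking=true"]}}}
```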
yep, that makes sense to want that REPL behavior. I am definitely not a purist on function purity ( 🙂 ), but I just want to make sure I have a solid grasp of what immutability means for values when it comes to functions, and what it means when sub-components can change and alter a function's behavior (when a function is a value).
so, to be precise, in your foo/bar example above, I'd figure that the function that foo points at is an immutable value, as functions are values (right?), but by redefining bar, you changed the behavior of a function value...
Right, in particular, (defn bar [x] (* 2 x)) is shorthand for (def bar (fn [x] (* 2 x))) -- so (fn [x] (* 2 x)) is the value here (actually an object with an .invoke() method) and bar is a symbol that is bound to a Var, and the content of that Var is a reference to the actual value.
Then when you have (defn bar [x] (* 3 x)) you get a new value (fn [x] (* 3 x)) and, because bar is already bound to a Var, the content is updated to be a reference to the new value.
So bar's binding to the Var is essentially immutable, and each of the different function values is immutable. Only the Var itself is mutable.
See https://clojure.org/reference/vars for a long description of Var and its siblings.
So def either creates a Var (with a reference to the value) and binds the symbol to it (if no such binding existed), or it just updates the Var to contain a reference to the new value.
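A small sketch of this: the second def updates the root of the existing Var rather than creating a new one.

```clojure
(def x 1)
(def v1 #'x)        ; grab the Var object itself
(def x 2)           ; same symbol: the existing Var's root is updated

(identical? v1 #'x) ; => true -- still the very same Var object
@v1                 ; => 2 -- deref now sees the new root value
```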
Makes sense. So, to restate what you said, the symbol -> var mapping is immutable (`bar` to a particular Var), and any function is a value, and further, any given Var can change what function (value) it points to. I think I get this all (via this discussion, thanks). The rest I guess is more philosophical, and not practical, but of interest to me -- it seems more appropriate to call functions 'values' in the case that you have 'direct linking' in force, when the behavior of a function value truly cannot be changed. Again, I am not at all versed in type theory to really grok values... I'm just trying to map out all of the online and book descriptions of what a value is to how it is being used here.
In a very practical sense, however, though, really, the call-slot/value-slot distinction is a huge thing to have learned.
One more clarification -- once def binds a symbol to a Var, it stays bound to that Var for the life of the program, correct? Except in the cases where Vars are shadowed with a binding form?
@matthew.pettis Sorry, was deep in code... unless you explicitly unbind the symbol in that ns, yes, the def binding stays bound to the same Var. You can ns-unmap a binding and you can also remove a namespace completely.
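A quick sketch of ns-unmap (the var name temp-thing is illustrative):

```clojure
(def temp-thing 42)
(some? (resolve 'temp-thing))  ; => true -- the symbol maps to a Var

(ns-unmap *ns* 'temp-thing)    ; remove the symbol->Var mapping from this ns
(resolve 'temp-thing)          ; => nil -- the symbol no longer resolves

;; remove-ns goes further and drops an entire namespace:
;; (remove-ns 'my.scratch.ns)
```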
no apologies necessary -- thanks, I forgot about ns-unmap...
There's a subtle issue around def vs defonce and reloading namespaces (`def` will recompute the value and update the Var on reloading a ns, defonce will not). Again, tho', you can still remove the ns to force defonce to recompute.
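This can be sketched by re-evaluating the forms, which is essentially what reloading a namespace does:

```clojure
(def reloaded-val 1)
(defonce once-val 1)

;; "reload": evaluate the defs again with new values
(def reloaded-val 2)    ; def recomputes and updates the Var
(defonce once-val 2)    ; defonce sees an existing binding and does nothing

[reloaded-val once-val] ; => [2 1]
```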
And then binding changes the contents of the Var box (not the symbol binding) and restores it later (to the previous value) -- but there's a subtlety there in terms of thread-local bindings etc.
And then there's with-redefs which affects multiple threads (and therefore is not thread-safe).
(`binding` can only be used with Vars that have been declared as ^:dynamic)
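A sketch of the two mechanisms side by side: binding is thread-local and restored on exit; with-redefs swaps the root binding (visible to all threads) and restores it afterwards.

```clojure
(def ^:dynamic *level* :root)

*level*                             ; => :root
(binding [*level* :inner] *level*)  ; => :inner -- thread-local override
*level*                             ; => :root  -- previous value restored

(defn greet [] :hello)
(with-redefs [greet (fn [] :redefed)]
  (greet))                          ; => :redefed -- root binding swapped
(greet)                             ; => :hello   -- restored afterwards
```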
cool. good to have confirmation
yeah, i suppose leaving the #' doesn't hurt the reload and does allow for redefining at the repl
I see folks get into all sorts of trouble with reloading namespaces... 😞
makes sense
i could see that being annoying to troubleshoot, and the behavior being surprising. something in figwheel caught me with reloading... i think it was with an on-click thing...
thanks for the responses, @seancorfield you've been helpful for me a few times
if i have the same file in both resources and target, then the one from target wins. is it up to me to make sure i don't have 2 files with the same name on those different paths?
"same file" -> file with the same name
It is up to you
Target is the kind of scratch space lein uses for generated stuff like the results of builds (jars, uberjars, etc); you shouldn't be putting anything in there
yeah, i'm worried about creating a file that gets shadowed by the build process and then wondering what i did wrong when the page doesn't load right. might be difficult to tell that that's what's going on when it happens
@ozzloy_clojurians_net lein is probably putting files from resources into target. I don't use lein any more so I never have to worry about the target folder but, in general, never put anything in target yourself and then just ignore it 🙂
i'm not using lein either. figwheel.main puts stuff in target though
actually, idk what's putting stuff into target, but i think it's figwheel.main
same thing applies to src and resources though. and relying on myself to know that there's a resources/a/b and a src/a/b is a known buggy process
but knowing that that is a potential thing is good enough for me. i was curious if there was some tool to address this. sounds like there is not. i can live with that
Are you using figwheel.main via the Clojure CLI then? https://figwheel.org/ shows lein and clj invocations (but I'm not doing ClojureScript so I haven't tried Figwheel -- I've only used shadow-cljs a bit).
yep
well... i'm using cider, so ... i think it's using clj under the hood. there's no project.clj and there is a deps.edn in my projects so far.
yep, cider calls "clojure"
how is this affecting you? (the actual command will be at the top of your repl). figwheel needs to compile your files, and for each namespace.cljs you'll end up with a namespace.js, namespace.js.map and namespace.cljs in the compiled output directory for figwheel to serve and hot load
@dpsutton at the moment, this is hypothetical. i don't currently have a resources/a/b AND a src/a/b. so ... it's affecting me by making me worry about a future me that has a hard-to-diagnose bug. and present me doesn't think that guy is likely to exist, so doesn't care too much.
not sure what to say. every cljs project will have compiled files which mimic the source tree during dev. it's never been a problem for me. those files are almost always gitignored (as compilation output should be) and therefore never a problem with ag/grep. Not sure what issue you think you'll have but i haven't had it in 4 years of clojurescript development
yeah, you're right. it will almost certainly not be an issue. and it's nice to have a concrete example of it never coming up in 4 years
Thanks for all of the valuable feedback. I know my question was very broad, still... Some great resources and perspectives. I really appreciate it!
Just looking over Clojure Applied. That seems to be right in line what I was looking for. Thank you!
what is everyone using to profile your programs? i.e. to see where the time is being spent. My experience over 35 years of programming is that you're usually surprised where the time is being spent.
Jim Newton's 3 rules (of thumb) of programming:
1. every unoptimized program can be doubled in speed (this rule is not recursive)
2. every untested program has a bug. this is especially true for one-line programs.
3. some problems are really hard, but the problems are less hard if you take a break and have some ice cream.
I'm trying to experiment with https://github.com/clojure-goes-fast/clj-async-profiler . The documentation says
;; The resulting flamegraph will be stored in /tmp/clj-async-profiler/results/
;; You can view the SVG directly from there or start a local webserver:
(prof/serve-files 8080) ; Serve on port 8080
If I start a web server, how can I view the graph? I'll need some URL right? which URL?
@jumar what is OOM?
localhost:8080
I usually just open the /tmp/clj-async-profiler folder
OOM = out of memory
oic
So here is the file I generated. I don't really understand how to interpret the results. can someone help?
what is it telling me?
There's not much clue about why I'm getting out of heap space?
197/500: trying (:and :epsilon :empty-set :sigma :sigma)
198/500: trying (:or (:cat (:cat (:cat :empty-set) (:cat :empty-set)) (:* :epsilon) (:and (:not (:cat)) (:* (:not (:or))))) (:* (:or :epsilon (:cat (= a)))) (:not (:not (:or :sigma))) (satisfies decimal?))
199/500: trying (:cat (:+ (:not (:and (:and)))) (:or (:not :epsilon) (:not :epsilon) (:+ (:and (:cat)))) (:not (:cat (:or (:or)) :sigma)) (:or :epsilon (:or (:not :sigma) (:and (:or))) (:cat (:cat (:+ (:? (:cat)))) (:? (:? (:* (:? (:not (:* (:cat))))))))))
Execution error (OutOfMemoryError) at clojure-rte.util/fixed-point (util.clj:216).
Java heap space
clojure-rte.rte-core=>
However, fortunately it does seem that clj-async-profiler does dump its results even if the expression being profiled encounters an OutOfMemoryError exception.
it’s a flame chart showing where CPU is being spent
there is a lot of GC in native code (around 90% of CPU)
tower on the left is your code, you can click on nodes to focus on them
I wonder whether anyone might be keen to help me look into this? Particularly someone who can run the tools on a non-mac?
git clone https://gitlab.lrde.epita.fr/jnewton/clojure-rte.git
cd clojure-rte
git checkout 7088418cb7078032309959ebfd01a88afbe7f380
lein repl
(require 'clojure-rte.rte-tester)
(clojure-rte.rte-tester/-main)
I clicked in the middle of the tower so it showed only that, and it seems a lot of CPU in user code is spent in clojure.core/memoize
hmm, even when I comment out all the calls to memoize, I still get out of memory error
in my experience with various lisps, you need an allocation profiler to find what is overburdening the garbage collector. I suppose that's the same with clojure ?
more specifically, a lot of CPU regarding memoize is in clojure-rte.util/fixed-point
I think the memoizing of fixed-point was an attempt to fix the problem. there is a recursive function which tries to keep simplifying an expression until it stops changing.
I think this function often simplifies the same almost-leaf-level expressions.
In my testing code I generate lots of random expressions, and give them to the simplifier. The thing that I see is that the more expressions I give, the more likely the OOM error is. Either my memoization is really causing the problem (which I doubt, because I see the same problem when I comment it out) or i'm allocating too much for the Java GC to keep up, which I also doubt because the java GC is arguably the best in the history of the world.
After every simplification there should be no remaining allocation.
is there a possibility of endless loop of simplification that just grows and grows?
this is indeed possible, and I've asked myself that. I don't think it is the case. Here is my reasoning. I print the expression before simplifying. when I get an OOM error, I can then simplify the offending expression in a fresh repl and it simplifies without problem.
although putting a debug counter in the fixed-point function which warns or errors after 10000 iterations might not be a bad idea.
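The debug-counter idea can be sketched as a fixed-point loop that throws once it exceeds an iteration budget, so a divergent simplification fails fast instead of slowly exhausting the heap. (This fixed-point is a stand-in, not the project's actual clojure-rte.util/fixed-point.)

```clojure
(defn fixed-point
  "Repeatedly applies f to v until the value stops changing, throwing
  after max-iters iterations as a divergence guard."
  [f v max-iters]
  (loop [v v, n 0]
    (when (> n max-iters)
      (throw (ex-info "fixed-point failed to converge"
                      {:iterations n :last-value v})))
    (let [v' (f v)]
      (if (= v v')
        v
        (recur v' (inc n))))))

(fixed-point #(quot % 2) 100 10000) ; => 0 (100 50 25 ... 1 0, then stable)
```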
I wrote the simplifier functions in a very wasteful way assuming the GC was good. I.e., in some cases it copies a list using (map ...) and then tests whether it got a different result. I could change those to first try to find an element of the list which would change under the map, and only then allocate a new list. This would mean I iterate over the list twice -- trading speed for space. I hoped to avoid that as it makes the code uglier.
However, that is shooting in the dark, since I really don't know the culprit without an allocation profile.
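The "iterate twice, allocate only on change" idea sketched above might look like this (map-if-changed is a hypothetical helper, not project code): return the original list untouched when f would change nothing, so repeated no-op simplification passes stop producing garbage.

```clojure
(defn map-if-changed
  "Returns (doall (map f xs)) when f changes at least one element,
  otherwise returns xs itself with no new allocation."
  [f xs]
  (if (some (fn [x] (not= x (f x))) xs)
    (doall (map f xs))
    xs))

(let [xs '(1 2 3)]
  (identical? xs (map-if-changed identity xs))) ; => true, same list back
(map-if-changed inc '(1 2 3))                   ; => (2 3 4)
```

Note this calls f twice per element in the changed case, which is exactly the speed-for-space trade described above.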
I am making an assumption after asking clojure experts. The assumption is that if i compute a function such as (fn ...) within a (let ...) which allocates lots of temporary variables whose values are huge allocations, but the returned function does not explicitly reference any of those huge allocations, then the GC will de-allocate them. For example.
(defn create-funny-function [n f g]
(let [x (allocate-huge-list n)
y (count (filter f x))]
(fn [] (g y))))
In this case create-funny-function allocates a huge list and returns a function which only references an integer which is the length of that list, but does not reference the list itself. Will the GC deallocate x or not? I am supposing that it will. I should be able to construct an experiment to confirm this supposition: create longer and longer lists of the return values of create-funny-function until OOM, then do it again with a different size of allocate-huge-list. If the length of the list of functions is shorter when allocate-huge-list is longer, that would falsify the claim. right?
What specific OOM error do you get, and how do you configure memory for the JVM (especially "Max heap" size)? Did you try the flag I suggested? (`-XX:+HeapDumpOnOutOfMemoryError` - look e.g. here: https://stackoverflow.com/questions/542979/using-heapdumponoutofmemoryerror-parameter-for-heap-dump-for-jboss)
JVM GCs are often quite good, but they cannot free memory that is still being referenced, of course. I believe you should only get OOM exception if there is more still-referenced memory than the configured max heap size when you started the JVM (or the default max heap size the JVM calculated by default when it started, if you did not specify one). I don't think any JVMs give that exception because you are allocating memory "too quickly" that it cannot keep up -- instead your program slows down when GC is running a lot.
Regarding your create-funny-fn above, I have not looked at the JVM byte code generated by the Clojure compiler for that to confirm, but I have looked at a version of that JVM byte code that was decompiled to Java source code (that gives less certain results, because in some cases the Clojure compiler produces JVM byte code that has no good representation in Java source code). It appears that the function (fn [] (g y)) is given references to g and y when the JVM object representing that function is created, but not to x. If that is true, then I do not see anything that could be holding on to a reference to x there.
In your create-funny-fn, if you passed it a function f that was memoized, and memoized in a way that it never removed entries from its cache, then f's memoization cache size will grow proportionally to the number of distinct elements in the list x and retain references to those elements of x
Again, I have not used it myself for this purpose, but YourKit Java Profiler advertises having a memory profiler that can help analyze the current non-GC'ed objects in a running JVM. Sure, it can be a pain to learn new tools, and it is difficult to know in advance whether they will end up saving you more time or costing you more time, ...
@jumar, Re "Did you try the flag I suggested? (`-XX:+HeapDumpOnOutOfMemoryError`)", I don't know how to do what you're suggesting. Is that something in the project.clj file?
@andy.fingerhut, my experimentation seems to confirm what you're saying. If I make the function also reference x, then memory fills up an order of magnitude quicker.
If you run your app via java ... (as a JAR e.g.) then you just do java -XX:+HeapDumpOnOutOfMemoryError ...
With leiningen you can use :jvm-opts: https://github.com/technomancy/leiningen/blob/master/sample.project.clj#L295
@jumar I installed the flag as you suggest, and then triggered the out-of-memory error. But I don't find any heap dump file anywhere.
Is your code published somewhere that others could try it out, e.g. to confirm whether the flag is set up in a way that it is actually being used when the JVM is started? Or have you used a command like ps axguwww | grep java while your process is running to confirm that the command line option is present?
I suspect most JVMs implement that option, but there are many different JVMs from different providers, and I don't plan to check if they all do. What is the output of java -version on your system?
git clone https://gitlab.lrde.epita.fr/jnewton/clojure-rte.git
cd clojure-rte
git checkout 72fe231debcc45095a78fcecc07310bcbf725071
lein repl
(require 'clojure-rte.rte-tester)
(clojure-rte.rte-tester/test-oom)
@andy.fingerhut i've tried to prepare a branch for you
if you'd like to commit to this repo, I can give you permission. I just need your email address.
I tried those steps on a macOS 10.14.6 system, Leiningen version 2.9.3, AdoptOpenJDK 11.0.4, and it gives an error when clj-async-profiler tries to do some initialization. Not sure if you saw that already and worked around it, or perhaps you are using a different JDK version that works better with this.
Ah, I avoid that error with AdoptOpenJDK 8 -- will try with that.
you can comment out the profiler code, you'll still get the oom error
[geminiani:~/Repos/clojure-rte] jimka% lein repl
nREPL server started on port 64093 on host 127.0.0.1 - <nrepl://127.0.0.1:64093>
REPL-y 0.4.4, nREPL 0.7.0
Clojure 1.10.0
OpenJDK 64-Bit Server VM 11.0.7+10
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
Confirmed I get OOM with those commands after a while.
Will try adding the extra -XX option mentioned above, and maybe reduce the heap size a bit so the resulting heap before OOM is smaller
OOM exception occurred after I added that option, and it created a file named java_pid27470.hprof in the clojure-rte directory, i.e. the root directory of the project where I ran the lein repl command.
The only change I made was in the project.clj file, in which I changed the line containing :jvm-opts to the following: :jvm-opts ["-Xmx512m" "-XX:+HeapDumpOnOutOfMemoryError"]
Even though I changed the max heap size to 512 Mbytes, the dump file created was nearly 1 Gbyte in size, probably due to some extra data it writes that isn't simply a copy of what is in memory at the time of the exception
It is pretty easy with a free tool like jhat to determine that most of the memory is occupied by objects of classes clojure.lang.Cons and clojure.lang.LazySeq, but I do not yet know if there is a convenient way to determine where in the code most of those objects are being allocated from.
Out of curiosity in trying to narrow down the place where OOM occurs, I tried putting a cl-format call inside of the function -canonicalize-pattern-once. It prints many times merely because of doing require on any of several of your namespaces. Is that intentional?
And it probably has nothing to do with the OOM, but it is pretty unusual to see so many namespaces that all in-ns and defn things in other namespaces. Not sure if you felt you needed that for some reason, or just prefer it for some reason.
Do you expect canonicalize-pattern to take the same amount of work on the same expression given to it no matter what calls have been made before? Or is there state left behind from each one that can affect how much future calls do?
I ask because I can run (clojure-rte.rte-tester/-main) in a REPL session, see the OOM and the output of this uncommented print statement in my copy: (cl-format true "canonicalizing:~%"), and copy and paste the last expression printed before the OOM, quit that REPL, start a new one, and do the following things only:
(require 'clojure-rte.rte-tester :verbose)
(require '[clojure-rte.rte-core :as co] :verbose)
(def maybe-troublesome-pattern3
'(:cat :empty-set (member a b c a b c) (:cat :empty-set (:not :empty-set) (:* (:* (:and)))) :epsilon))
(co/canonicalize-pattern maybe-troublesome-pattern3)
and it typically returns very quickly, with very few calls to canonicalize-pattern, whereas when it was going OOM, it made millions of calls to canonicalize-pattern.
It seems like for the same parameter value, in some situations calling canonicalize-pattern goes into infinite recursion, but in other situations it does not.
Here is a file of REPL inputs I used on a slightly modified version of your code, with a little bit of extra code to count how many times a few functions were called, and print out debug output on every 100,000 calls to canonicalize-pattern, which before an OutOfMemory exception occurs, is called millions of times: https://github.com/jafingerhut/clojure-rte/blob/andy-extra-debugs2/doc/scratch.clj
It seems that some inputs can sometimes cause canonicalize-pattern to call itself recursively with lists that approximately double in length, looking similar to this: https://github.com/jafingerhut/clojure-rte/blob/andy-extra-debugs2/doc/repl-session1.txt#L3004
Because of the random generation of inputs, it doesn't always happen in 500 runs, but it usually does. The common theme I saw when it does cause OOM is that the parameter to canonicalize-pattern is a list ending with many repetitions of the subexpression (:* :sigma), or something that contains that.
I do not know why your code can call itself with lists that get about twice as long as one called recently -- hopefully this extra debug output might give you some ideas on where that might be happening in your code. Eventually the list gets so long that you run out of memory.
I made a copy of your repo on my github.com account, which you can clone and check out the branch I created with my additions here:
git clone https://github.com/jafingerhut/clojure-rte
cd clojure-rte
git checkout andy-extra-debugs2
lein repl
(require 'clojure-rte.rte-tester)
(clojure-rte.rte-tester/-main)
@andy.fingerhut no doubt that is really useful information. the curious thing though is that even if certain lists trigger the out of memory error in the random tests, the same lists cause no such problem when I call it directly from the repl.
about the multiple occurrences of (:* :sigma): there is a reduction step in the canonicalization code which recognizes these duplications and reduces them in different ways depending on which context they appear in. This is mathematically elegant, but in terms of computation effort may need to be refactored to reduce sooner, or avoid such generation in the first place
I can also think about how to reduce the number of re-allocations of (:* :sigma); this is a pair which appears extremely often. rather than reallocating it in the code, I could define it as a constant and try to reference it rather than copying it.
I did not attempt to learn why sometimes an expression goes into an infinite loop and sometimes finishes quickly. If you think the code is a deterministic function of its input, clearly it is not, for some reason
It is really curious to me that i get the OOM error trying to canonicalize an expression which canonicalizes very easily when I try it stand-alone. For example: on a recent attempt I got OOM on the expression: (:cat (:+ :sigma) (:? (:* (:cat (:and)))) (:or (:+ (:+ (:cat))) (:+ (:and :sigma)) (:cat (:? :epsilon) :sigma)) :epsilon) but when I call canonicalize-pattern-once 5 times it reduces to the concise form: (:cat :sigma (:* :sigma))
it really looks to me like something is filling up memory before it gets to 118/500
118/500: trying (:cat (:+ :sigma) (:? (:* (:cat (:and)))) (:or (:+ (:+ (:cat))) (:+ (:and :sigma)) (:cat (:? :epsilon) :sigma)) :epsilon)
Execution error (OutOfMemoryError) at clojure-rte.rte-core/canonicalize-pattern (rte_construct.clj:613).
Java heap space
clojure-rte.rte-core=> (clojure-rte.rte-core/canonicalize-pattern-once '(:cat (:+ :sigma) (:? (:* (:cat (:and)))) (:or (:+ (:+ (:cat))) (:+ (:and :sigma)) (:cat (:? :epsilon) :sigma)) :epsilon))
(:cat (:cat :sigma (:* :sigma)) (:* :sigma) :epsilon)
clojure-rte.rte-core=> (clojure-rte.rte-core/canonicalize-pattern-once '(:cat (:cat :sigma (:* :sigma)) (:* :sigma) :epsilon))
(:cat :sigma (:* :sigma) (:* :sigma) :epsilon)
clojure-rte.rte-core=> (clojure-rte.rte-core/canonicalize-pattern-once '(:cat :sigma (:* :sigma) (:* :sigma) :epsilon))
(:cat :sigma (:* :sigma) :epsilon)
clojure-rte.rte-core=> (clojure-rte.rte-core/canonicalize-pattern-once '(:cat :sigma (:* :sigma) :epsilon))
(:cat :sigma (:* :sigma))
clojure-rte.rte-core=> (clojure-rte.rte-core/canonicalize-pattern-once '(:cat :sigma (:* :sigma)))
(:cat :sigma (:* :sigma))
clojure-rte.rte-core=>
I have seen the last expression before OOM occurred be at least 5 different expressions, and all of them I tried in a fresh REPL finished quickly. That behavior isn't unique to one input expression.
Filling up memory before calling canonicalize-pattern would not explain why it seems to go into infinite loop though
It is creating longer and longer lists, doubling in size, when it goes OOM.
When the same expression does not go OOM it does not call itself with those expressions that double in size
I do not know the reason for the different behavior in different circumstances, but I noticed you are using binding, and often lazy sequences. Are you aware that those often combine unpredictably?
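A sketch of the trap alluded to here: a lazy seq created inside a binding but realized after it exits sees the root value, not the bound one. (The var and values are illustrative.)

```clojure
(def ^:dynamic *mode* :outer)

(def lazy-result
  (binding [*mode* :inner]
    (map (fn [_] *mode*) [1])))  ; lazy: nothing realized yet

(first lazy-result)              ; => :outer (!) -- realized after the binding

(def eager-result
  (binding [*mode* :inner]
    (doall (map (fn [_] *mode*) [1]))))  ; forced inside the binding

(first eager-result)             ; => :inner -- the expected value
```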
The definition of type-equivalent? might not be relevant for the behavior of canonicalize-pattern, but it seems to be written in a way that suggests you know that subtype? is not a pure function. Why is it written that way?
Not sure if you prefer these things not to be pure functions of their inputs, but trying to make them so should make their behavior more predictable.
I have a strong suspicion that the cause-and-effect relationship here is NOT "high memory usage causes canonicalize-pattern to behave badly", but instead "`canonicalize-pattern` calling itself with list lengths that double, indefinitely, leads to high memory usage".
I can write a trivial function that calls another function with a list of length 1, then 2, then 4, then 8, etc., and you would easily reason "don't do that, you will run out of memory". Determine why canonicalize-pattern sometimes does that, and prevent it, and you will almost certainly solve the OOM problem.
What do you mean by subtype? is not a pure function? Do you mean the fact that it binds the *subtype?-default* function? I probably could refactor that away now that I understand better the problem I was originally trying to solve. The issue is that sometimes it cannot be determined whether a subtype relationship holds. the 3rd argument specifies what to do, whether to return true, or false, or raise an exception. The caller of subtype? must specify which action to take.
Off the top of my head I can't think of any reason `canonicalize-pattern` would be calling into `type-equivalent?` or `subtype?`
yes, if I can identify why `canonicalize-pattern` is doubling the length of its argument on recursive calls, that would indeed sound like an error.
Yes, there are indeed scenarios where dynamic variables do not play well with lazy sequences, and vice versa. I've tried to unlazify the lazy sequences, but I'm sure I've missed some of them. Clojure tries really hard to make sequences lazy.
`(doall <expr>)` is one general-purpose way to force any `<expr>` that returns a lazy sequence to realize all of its elements eagerly, without having to define separate eager versions of functions like `filter`, `map`, etc.
Regarding my comment about `subtype?`, look at your definition of `type-equivalent?`. It calls `subtype?` twice with the same parameters, once with `delay` wrapped around it, once without, and then compares the return values of the two. If `subtype?` were pure, there doesn't seem to be any point to doing something like that. Did you write `type-equivalent?` believing that `subtype?` does not always return the same value given the same arguments?
The :post condition expression in your function `subtype?` will always be logical true, because the expression is a set literal, and anything that is neither `nil` nor `false` is logical true in Clojure, so that :post condition will never assert, ever.
Closer to what you probably intended would be `:post [(#{true false :dont-know} %)]`, but that is also probably not what you want, because looking up `false` in a set like that returns the found element, `false`, and that would cause an assert exception for failing the post-condition.
Likely what you actually want there, if you ever want it to catch returning a value other than `true`, `false`, or `:dont-know`, is `:post [(contains? #{true false :dont-know} %)]`
That is unlikely related to your OOM issue -- just something I noticed while looking around.
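A small sketch of the three variants discussed above (the names `f-bad`, `f-set`, and `f-good` are illustrative, not from the project):

```clojure
(defn f-bad [x]
  {:post [#{true false :dont-know}]}   ; the set itself is truthy: never asserts
  x)

(defn f-set [x]
  {:post [(#{true false :dont-know} %)]} ; looking up false returns false
  x)

(defn f-good [x]
  {:post [(contains? #{true false :dont-know} %)]}
  x)

(f-bad :garbage)    ; passes, even though we wanted it to assert
(f-good false)      ; ok
(f-good :dont-know) ; ok
;; (f-set false)     throws AssertionError even though false is allowed
;; (f-good :garbage) throws AssertionError, as desired
```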
When I look at the :post function, maybe I don't understand the semantics of :post. What I'd like to do is assert that the value returned from `subtype?` is explicitly `true`, `false`, or `:dont-know`. I think I may be confused about using sets as membership tests. I'll change the post function to:
(fn [v] (member v '(true false :dont-know)))
I already have a `member` function in my utils library, defined as follows. Perhaps I should replace the final call to `some` with a `(loop ... recur)` which checks equivalence until it finds a match? I suspect that would be faster than rebuilding a singleton set and then checking set membership many times. I suspect a small `(loop ... recur)` would compile very efficiently, right?
(defn member
  "Determines whether the given target is an element of the given sequence."
  [target items]
  (boolean (cond (nil? target) (some nil? items)
                 (false? target) (some false? items)
                 :else (some #{target} items))))
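As a side note on why `member` special-cases `nil` and `false`: a set used as a predicate with `some` cannot "find" them, because the lookup itself returns a falsey value:

```clojure
;; Sets as predicates cannot detect nil or false, because the lookup
;; returns the element itself (or nil), which is logical false:
(some #{false} [true false]) ; => nil, even though false is present
(some #{nil} [nil 1])        ; => nil
;; hence the explicit predicates:
(some false? [true false])   ; => true
(some nil? [nil 1])          ; => true
```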
@andy.fingerhut You commented: It calls `subtype?` twice with the same parameters, once with `delay` wrapped around it, once without, and then compares the return values of the two.
Thanks for finding that. I believe that is a bug. It's great to have a second set of eyes look at code. It should call `subtype?` within the `delay` with the arguments reversed. I.e., two types are equivalent if each is a subtype of the other. But don't check the second inclusion if the first is known to be false, because such a call may be compute intensive and unnecessary. Looks like I'm missing something in my unit tests. :thinking_face:
The semantics of `type-equivalent?` are: if either of `s1` or `s2` is `false`, then return `false` (the types are not equivalent). If both `s1` and `s2` are `true`, then return `true`. Otherwise call the given `default` function and return its return value if it returns.
@andy.fingerhut WRT your comment: >>> `(doall <expr>)` is one general purpose way to force any `<expr>` that returns a lazy sequence, to realize all of its elements eagerly, without having to define separate eager versions of functions like `filter`, `map`, etc.
I don't completely follow. My eager versions of filter, map, etc. simply call `doall` as you suggest. Are you suggesting that it's better just to inline the call to `doall`? As a second point, I don't think `doall` really forces all lazy sequences, rather only the top-level one. For example, if I have a lazy sequence of lazy sequences, then as I understand it, `doall` will give me a non-lazy sequence of lazy sequences. Unless I misunderstand, if I want to use dynamic variables, then I have to go everywhere in my code which is producing a lazy sequence and somehow force it with `doall`.
You are correct that `doall` forces the top-level sequence, not nested ones.
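A hypothetical demonstration of that (all names invented): the outer `doall` realizes only the outer sequence, and a small recursive helper can force every level:

```clojure
;; A side-effecting counter lets us observe when inner elements are computed.
(def realized-count (atom 0))

(def nested
  (doall (map (fn [n]
                (map (fn [i]
                       (swap! realized-count inc)  ; runs on realization
                       (* n i))
                     [1 2 3]))
              [10 20])))

(def count-before @realized-count)  ; => 0 : inner seqs created but unrealized

;; One way to force every level recursively:
(defn force-deep [x]
  (if (seq? x)
    (doall (map force-deep x))
    x))

(def forced (force-deep nested))
(def count-after @realized-count)   ; => 6 : all inner elements computed
```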
Set literals are constructed only once by the compiler's generated code, if they contain only constants, I believe. I would expect set containment to be faster than either `member` or `some` or an explicit loop, since sets use hash maps to check for membership and thus do not iterate over all elements, but for a 3-element set I doubt you will notice much difference in the context of your application.
I've done a few experiments putting debug prints in a few places here and there trying to determine why the code sometimes creates lists that get twice as long, but I don't have any good clues yet. I doubt I will spend much more time on it. I suspect there is some kind of mutable data structure being used somewhere, but that is just a guess without evidence.
I may have found the root cause of the problem: your implementation and use of `ldiff` relies on `identical?` for equality testing of two lists, but Clojure's sequences are not Common Lisp sequences of cons cells.
Lazy sequences consist of objects that can be mutated in place, but depending upon the operations you do on them, all you should really ever count on is =-value-equality, or you are asking for subtle problems, IMO.
Clojure `=` does have a short-cut quick test: if the two given objects are `identical?`, it quickly returns `true`.
Thus your attempted use of `first-repeat`, `ldiff`, and `concat` to either remove one element from a list, or return the same list, can actually return a longer list than given.
Here is a proposed fix that avoids the use of `ldiff`, instead using a function `dedupe-by-f` that I wrote by making small changes to Clojure's built-in `dedupe` function: https://github.com/jafingerhut/clojure-rte/commit/6ee771559e344ed612be0b20cd2ba7bbc46dec79
I have run -main with 500 random tests multiple times with no OOM exception, with only those changes.
At least starting from your commit 50683709b7fa29ea7b53fe607a7376a7b1c32bb2, not your latest code. But it probably applies just as well to your latest version.
In general, I would think three or four times, very carefully, before ever relying on `identical?` in Clojure. There might be cases in Java interop where you need to know whether two JVM objects are the same object in memory, but in Clojure about the only time I recall seeing it used to good effect is to create a unique "sentinel" object that is guaranteed not to be `identical?` with any other existing object, and then using `identical?` to check whether that sentinel object was returned, as a "not found" kind of situation.
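For illustration, a minimal sketch of that sentinel idiom (the names `not-found` and `lookup` are hypothetical): a fresh `Object` is `identical?` to nothing else, so it can signal "not found" even when `nil` or `false` are legitimate stored values.

```clojure
(def ^:private not-found (Object.))

(defn lookup
  "Like get, but distinguishes a stored nil from a missing key."
  [m k]
  (let [v (get m k not-found)]
    (if (identical? v not-found)
      :absent
      [:found v])))

(lookup {:a nil} :a) ; => [:found nil]
(lookup {:a nil} :b) ; => :absent
```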
I have not done anything to examine the code in your file bdd.clj other than to search all of your source files to look for other occurrences of `identical?`. They might be perfectly safe as you use them there (not easy to tell from a quick glance), or the same danger of bugs might be lurking there, too.
@andy.fingerhut Re: Thus your attempted use of `first-repeat`, `ldiff`, and `concat` to either remove one element from a list, or return the same list, can actually return a longer list than given.
This is really interesting. I don't get exactly the same results as you, but your suggestion seems to improve the situation. I don't completely understand the issue with `ldiff`. I'm curious whether you might be able to give an example of where it fails? (more below...) However, when I replaced the call to `identical?` with a call to `=` (as you suggested), one of my OOM errors went away, but there are other tests which still exhibit the OOM error.
However, w.r.t. the following code which uses `ldiff`: when I also replaced `concat` with `concat-eagerly`, other OOM errors in my test suite seemed to go away, but when I run the tests again and again, they still occur.
(let [ptr (first-repeat operands (fn [a b]
                                   (and (= a b)
                                        (*? a))))]
  (if (empty? ptr)
    false
    (let [prefix (cl/ldiff operands ptr)]
      (cons :cat (concat-eagerly prefix (rest ptr))))))
More about `ldiff`: the purpose of `ldiff`, as you may already know, is that if you've already identified a tail of a list which satisfies some predicate, you want to copy the leading portion of the list. In the case of cons cells, you can retrace the list doing pointer comparisons. Your claim is that in the case of lazy lists, these pointer comparisons won't work. Is this because the tail might be some sort of lazy object, and evaluating it modifies the sequence, replacing the tail with a different object which is no longer `identical?` to the previous tail?
I'd love to see an example.
does backquote create a lazy sequence?
I've updated my `member` function to work on non-lists
(defn member
  "Determines whether the given target is an element of the given sequence."
  [target items]
  (boolean (cond (empty? items) false
                 (nil? target) (some nil? items)
                 (false? target) (some false? items)
                 :else (reduce (fn [acc item]
                                 (if (= item target)
                                   (reduced true)
                                   acc))
                               false items))))
arguably, I probably only really need the call to `reduce`, not the calls to `boolean`, `cond`, `some`, `nil?`, `false?`, and `empty?`
I have a different suggested change that does not use `ldiff` at all, but rather removes an element from a list with a completely different approach. You could try that to see if you still get OOM errors. It's the change I linked earlier, the one that uses a new function `dedupe-by-f`.
I do not yet have a short example that shows `ldiff` returning a surprising value because it uses `identical?`, but it definitely, very repeatably, did in the context of running `-main`
Turns out I was able to find a small example of `ldiff` behaving not as desired with `identical?`:
;; This commit SHA is in Jim Newton's clojure-rte repository
;; and is the one just before he made a change to the ldiff function
;; to use = instead of identical?
;; git checkout 302628d04af207d4969396533456376b4c80e263
(require '[clojure-rte.cl-compat :as cl])
(require '[clojure-rte.util :as util])

(def l1 '[(:* :a) (:* :b) (:* :b) (:* :b) (:* :a)])

(defn rm-first-repetition [coll pred]
  (let [ptr (util/first-repeat coll pred)]
    (if (empty? ptr)
      false
      (let [prefix (cl/ldiff coll ptr)]
        (println "prefix=" prefix)
        (concat prefix (rest ptr))))))

(rm-first-repetition l1 =)
;; Running the sequence of expressions above in a fresh REPL, I see:
;; prefix= [(:* :a) (:* :b) (:* :b) (:* :b) (:* :a)]
;; ((:* :a) (:* :b) (:* :b) (:* :b) (:* :a) (:* :b) (:* :b) (:* :a))
The issue with using `identical?` there is not only because of lazy sequences -- it can probably occur if the collection you are dealing with is anything except a list of `Cons` cells. You can have lists of `Cons` cells in Clojure, but they tend to occur only if you know you are explicitly constructing them in your code, and sequences produced by core functions like `filter`, `map`, etc. typically do not consist of them.
With the latest version of your clojure-rte repository (commit fde964831c9907e7b87f791afeb43f18686d5f56, which is after you modified `ldiff` to use `=`, plus other changes), macOS 10.14.6, Oracle JDK 1.8.0_192, I can do `lein repl`, `(require 'clojure-rte.rte-tester)`, then `(clojure-rte.rte-tester/test-oom)` 20 times in a row with no OOM occurring, even if I change the project.clj file to use `-Xmx64m` instead of the `-Xmx1g` you have there now.
Ahhh, is the problem that `util/first-repeat` is assuming its input is a list, and in that case returns a cons cell, but if its input is a vector, it will return a copy of a tail of the vector?
Can I do the computation more easily? What I want to do is ask whether there are two consecutive elements of a sequence which obey a given binary predicate. If so, remove exactly one of them; if not, return false.
This is what my code currently does. In the code, `operands` is the tail of a sequence which has already been verified to start with `:cat`; that's why I cons `:cat` back on at the end.
(let [ptr (first-repeat operands (fn [a b]
                                   (and (= a b)
                                        (*? a))))]
  (if (empty? ptr)
    false
    (let [prefix (cl/ldiff operands ptr)]
      (cons :cat (concat-eagerly prefix (rest ptr))))))
Here's what I'm trying. I think it could probably be done more cryptically but more efficiently computation-wise with a call to reduce/reduced
(defn remove-first-duplicate [test seq]
  (loop [seq seq
         head ()]
    (cond (empty? seq)
          false

          (empty? (rest seq))
          false

          (test (first seq) (second seq))
          [(reverse head) (rest seq)]

          :else
          (recur (rest seq)
                 (cons (first seq) head)))))
I sent a link with a proposed change earlier that does not use `ldiff` at all, but instead a new function `dedupe-by-f` that is a modified version of Clojure's built-in `dedupe`. Here is the link to that proposed change again: https://github.com/jafingerhut/clojure-rte/commit/6ee771559e344ed612be0b20cd2ba7bbc46dec79
There are more straightforward ways to write `dedupe-by-f` that are easier to understand for me and most Clojure readers than the way `dedupe` is implemented (using transducer machinery).
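For example, a more direct hypothetical version using plain `reduce` -- a sketch, not the implementation in the linked commit -- that drops every element repeating its predecessor according to `f`:

```clojure
(defn dedupe-by-f
  "Removes each element x for which (f previous-kept-element x) is true."
  [f coll]
  (reduce (fn [acc x]
            (if (and (seq acc) (f (peek acc) x))
              acc              ; adjacent duplicate: skip it
              (conj acc x)))
          []
          coll))

(dedupe-by-f = [1 1 2 2 2 3 1 1]) ; => [1 2 3 1]
```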
Your `remove-first-duplicate` is one way. Most Clojure programmers do not typically use the accumulate-and-reverse technique, because Clojure vectors make it efficient to append things at the end, but it looks perfectly correct, except I think you should return not `[(reverse head) (rest seq)]` but `(concat (reverse head) (rest seq))`
Interactively writing small test cases for new functions can help quickly catch things like that, versus trying to debug them in the context of the entire application
Yes. I'll write some test cases for remove-first-duplicate before I try to refactor it.
It is simple enough that 2 or 3 small test cases should give good confidence it is working
Question about appending to the end: is appending to the end of a vector n times a linear operation or an n^2 operation in Clojure? Because `cons`-ing to the beginning and reversing is definitely a linear operation, right?
The problem, I think, in my previous implementation wasn't really a problem of testing the small function. In fact I explicitly wrote the function to work on lists, but used it on non-lists in the context of `ldiff`.
@andy.fingerhut. Many thanks for all your help. That's kind of you.
Like Common Lisp, `cons`-ing onto the beginning of an accumulator and reversing is definitely linear. `conj`-ing to the end of a vector is O(1) 31/32 of the time, and O(log_32 N) 1/32 of the time, so O(log_32 N) overall, but often called "effectively constant time" in a loose sense.
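The two accumulation styles, side by side (helper names invented for illustration); both are effectively linear in the number of elements:

```clojure
;; cons onto the front of a list, then reverse: O(n) + O(n)
(defn acc-reverse [xs]
  (reverse (reduce (fn [acc x] (cons x acc)) () xs)))

;; conj onto the end of a vector: effectively O(1) per step
(defn acc-conj [xs]
  (reduce conj [] xs))

(acc-reverse (range 5)) ; => (0 1 2 3 4)
(acc-conj (range 5))    ; => [0 1 2 3 4]
```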
I was surprised by the behavior of `ldiff` once I found out it was the cause; it was years of Clojure experience that led me to suspect `identical?` as the likely source of trouble, with the end result being the summarized statement that `identical?` is almost never what you want to use.
One of the issues with a language like Clojure, which shows so much clear design influence from Common Lisp, is the differences it (intentionally, by design) has that can trip up someone with deep Common Lisp knowledge.
we learn by doing things the wrong way. we=mankind. it's painful, and some don't survive.
I probably spent a bit too long figuring out the `ldiff` thing -- I can get a bit obsessed sometimes when a problem gets me wondering what is going on.
The long part wasn't figuring out why `ldiff` and how it was used might cause the problem -- it was narrowing the problem down to that part among the many lines of code
`ldiff` is a rarely used CL function, but when it is useful, it is quite useful.
On the other hand, I'm still wondering whether the dynamic variables are causing problems with the lazy sequences.
Most of the uses of `binding` that I looked at seemed to be binding the same values that the dynamic Vars already had before the `binding`. Those should not cause problems. It is when `binding` binds a different value that trouble can arise.
But I did not exhaustively check every use of `binding` that way.
If you don't mind passing extra explicit parameters around, instead of using dynamic vars, it can be annoying in the extra parameter 'plumbing', but lets you use lazy sequences without worrying about that interaction.
I do not know how common it is relative to other profiling tools, but I have heard good reviews of YourKit Java Profiler: https://www.yourkit.com/java/profiler/features/
I do not believe it has anything Clojure-specific built into it, so you will need a bit of practice in figuring out how to parse the names of JVM classes created by the Clojure compiler.
They have reduced pricing options (and maybe free?) for open source and educational developers
@vlaaad, it looks like it's better to put this dependency in `~/.lein/profiles.clj` rather than in the project's project.clj, right? Because it is only supported on Linux and Mac, not Windows, having the dependency in my project would prevent any Windows user from loading the project. Is that correct?
clojure-rte.rte-core=> (require '[clj-async-profiler.core :as prof])
Execution error (FileNotFoundException) at clojure-rte.rte-core/eval13090 (form-init3039314401959532405.clj:1).
Could not locate clj_async_profiler/core__init.class, clj_async_profiler/core.clj or clj_async_profiler/core.cljc on classpath. Please check that namespaces with dashes use underscores in the Clojure file name.
See https://github.com/clojure-goes-fast/clj-async-profiler/issues/19 for help.
Yeah, I would say all dev-only stuff belongs in user profiles outside of the repo. You use your tools, I use my tools; we don't have to dump them all into the project we work on.
A few days ago I asked about an OutOfMemory Metaspace exception and what could be the cause of the issue. I was pointed in the general direction that the code could be using `eval`. After a few days of research, I believe I know what is going on, but I am not quite certain if I am right. To give some context, we have an API that takes in inputs and runs them through a model. The model does some calculations and returns its results. These models are written in a DSL, but at the end of the day it's just Clojure code (a bunch of macros). The models are not in the same project as the API; they are in their own project. When we deploy, we package up the src/models directory into a `.tar.gz` and upload it to an S3 bucket. In the Jenkins job, that tar file is downloaded, unpacked, and moved into the /resource directory. We create a war file and it's uploaded to S3. When a request comes in, it contains inputs + a model-name. The API looks at the model-name, then looks up the file (a .clj file), `slurp`s the contents, and loads it into the process via `load-string`. I downloaded the war file and examined it, and noticed the API's code consists of .class files in the `WEB-INF/classes/` directory, but the models' code is still .clj files. Does this sound like something that would essentially act like `eval`, but on a much greater scale, causing the metaspace memory to get full and barf?
Another question: if I call `load-string` on the same Clojure file twice, does that cause two different classes to be created, even though the file is the same?
I've been following the figwheel docs (https://figwheel.org/docs/npm.html) for using NPM modules, and it says I should write my `dev.cljs.edn` as follows:
{:main demo.core
 :target :bundle
 :bundle-cmd {:none ["npx" "webpack" "--mode=development" :output-to "-o" :final-output-to]}}
When I build dev `clj -A:dev` I see that `:output-to` and `:final-output-to` are replaced with these values:
[Figwheel] Bundling: npx webpack --mode=development target/public/cljs-out/dev/main.js -o target/public/cljs-out/dev/main_bundle.js
This seems weird to me because my `index.html` looks for `dev-main.js` (using hypen instead of slash)
<script src="cljs-out/dev-main.js"></script>
Does anyone know why this discrepancy is happening?
Is the model of async-profiler that I have to run something which finishes cleanly in order to profile it? One problem I have is that some of my randomly generated test cases occasionally run out of memory and abort. If I run this code within `(prof/profile ...)`, will the profiler show me anything?
If I understand correctly, clj-async-profiler is for profiling function/method calls that end. I believe that YourKit lets you attach to a running JVM at any time and start collecting performance data from it on both memory and run-time, and can do so while the code is running, even if it eventually ends up crashing, running out of memory, or into an infinite loop.
I benefitted from YourKit's license for open source projects. The flamegraph functionality was very helpful in figuring out which parts of my Clojure project were slow.
yeah, hyphens get destroyed sometimes for file names; you may need to use an underscore. Someone more knowledgeable might have a better answer
i don't know exactly why but it is somewhat common and can be a head scratcher if you've not seen it before, could be what's going on there
Clojure does not attempt to remember what the previous contents of a file were, or to check whether they have changed or stayed the same.
When you `load` code containing `defn` forms, new classes are created.
`load-string` and `load-file` are similar in that they effectively call `eval` on every top-level form of the loaded string/file.
If, in the context of your application, you know that some string you call `load-string` on is identical to one you did `load-string` on earlier, or if you know a file in the file system has not changed, you could perhaps detect that and avoid redoing `load-string` on the contents of that file.
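A hypothetical sketch of that idea (`load-model!` and `loaded-hashes` are invented names, not from the project): skip `load-string` when a file's contents are unchanged, keyed by a hash of the source text. This is only safe under the caveats discussed next about files that mutate state after loading.

```clojure
(def ^:private loaded-hashes (atom {}))

(defn load-model!
  "Runs load-string on path's contents only if they changed since the
  last successful load. Returns :loaded when it actually loaded."
  [path]
  (let [src (slurp path)
        h   (hash src)]
    (when (not= h (get @loaded-hashes path))
      (load-string src)
      (swap! loaded-hashes assoc path h)
      :loaded)))
```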
Depending upon what you allow in those files, avoiding `load-string` on a file's contents might cause problems.
For example, if a file contained top-level forms like `(def foo 1)` and later might do `alter-var-root` on `foo` to change its root binding, then the final value of `foo` can be something other than 1. Doing `load-string` again on that file will repeat the evaluation of `(def foo 1)`, changing `foo`'s value back to 1.
There could also be top-level forms like `(def my-atom (atom 1))`, and later `swap!` calls on `my-atom` that change the value stored within that atom. Later doing a `load-string` again on the contents of that file, even if it has not changed, will redefine `my-atom` to point at a freshly created atom that contains 1 again.
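That state-reset hazard can be seen in a tiny sketch (`reload-demo-counter` is a made-up name):

```clojure
(load-string "(def reload-demo-counter (atom 1))")
(def a1 @(resolve 'reload-demo-counter))
(swap! a1 inc)                                      ; the atom now holds 2

(load-string "(def reload-demo-counter (atom 1))")  ; same source, reloaded
(def a2 @(resolve 'reload-demo-counter))

@a2                 ; => 1     : state silently reset
(identical? a1 a2)  ; => false : it is a brand-new atom
```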
If you for some reason know that these kinds of things never happen, that doing `load-string` repeatedly on a file should end up in the same state, and that running the code won't mutate any of that state, then you can safely skip doing `load-string` on the same file more than once.
I do not know of any static checker you could run on such a file that would be able to tell you whether a file does such mutations, or not. Doing so in general is not a thing that one can write a program to solve (it is as hard as the halting problem).