When would I use unsynchronized-mutable over volatile-mutable in a deftype?
the only reason I can imagine is if you know the code is single-threaded in all cases and you need the performance boost of not propagating the value to other threads
I can't imagine that coming up much in normal clojure code
see "Java Concurrency In Practice" - best source I know of for what these terms imply design wise with the jvm
the doc string says basically "if you don't know what these mean don't use them", for good reason
... Fields can be qualified
with the metadata :volatile-mutable true or :unsynchronized-mutable
true, at which point (set! afield aval) will be supported in method
bodies. Note well that mutable fields are extremely difficult to use
correctly, and are present only to facilitate the building of higher
level constructs, such as Clojure's reference types, in Clojure
itself. They are for experts only - if the semantics and
implications of :volatile-mutable or :unsynchronized-mutable are not
immediately apparent to you, you should not be using them.
Yeah I find that doc insulting tbh. I’m plenty experienced enough to burn myself and recover if I have a good explanation to fall back on
"I have a marvelous explanation of these features but this docstring is too narrow to contain them"
I’ve been writing Clojure professionally for years but only touched Java concurrency primitives a handful of times. Understanding how they work in the context of Clojure’s higher level abstractions would be lovely
Anyway I’ll just litter my code with volatile-mutable to replace my CLJS mutable notations, and hopefully someone will open a PR and explain why I shouldn’t, if it’s wrong
Explaining volatile, even metaphorically, is tricky: it pulls in the Java memory model, and then you need a metaphor for how the order of reads and writes to memory is determined in a concurrent program
Maybe, like, imagine you have a set of all the reads and writes a concurrent program does to some field. For the reads and writes that happen on the same thread you can establish an order easily: just the order in which they occur in your program
For reads and writes across threads, you can't establish an order, any interleaving, any concurrent execution could happen
Volatile fields (and I encourage corrections) can be thought of as saying there is a total order of those reads and writes. It doesn't say what the order is, but there is some order
So one place they are used is inside transducers that hold mutable state: if you use that transducer in a go channel, where the state is passed between threads, it ensures that when one thread manipulates the state it sees all the changes that the previous thread made
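For concreteness, a sketch of that pattern: a dedupe-style stateful transducer holding its state in a volatile! (my-dedupe is a made-up name; clojure.core's own dedupe does essentially this):

```clojure
;; A stateful transducer: the state lives in a volatile so that when a
;; go block migrates between threads, the next thread sees the previous
;; thread's writes. A volatile (not an atom) is enough here because
;; transducer steps are never run concurrently, only on varying threads.
(defn my-dedupe []
  (fn [rf]
    (let [prev (volatile! ::none)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [p @prev]
           (vreset! prev input)
           (if (= p input)
             result
             (rf result input))))))))

(into [] (my-dedupe) [1 1 2 2 3 1])
;; => [1 2 3 1]
```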
I’m probably wrong, but I was under the impression you could explain volatile in terms of informing the low-level bits that two threads might be reading/writing a variable in memory, and as such avoiding generating code that keeps said variable in a CPU cache, where another thread, running on another core, would not be able to read the latest value.
In other words, always read/write from RAM.
That is an explanation of the observed behavior, but I don't believe that is the actual mechanics
The mechanics of the generated code depends on the memory model of the cpu architecture (I believe x86's is pretty loose say compared to arm's)
Maybe I have that reversed
http://gee.cs.oswego.edu/dl/jmm/cookbook.html is a decent place to start digging deeper
with volatiles all threads will see writes (non volatile don't without some other synchronization that creates a happens-before constraint)
with volatiles, reads and writes can't be atomic though - you can't read and then write assuming that the value hasn't changed
Which is problematic when people start sprinkling vswap! around thinking it's like swap! but better
I would not recommend using :unsynchronized-mutable unless you have an external synchronization mechanism (like you are using locking around all calls to the field)
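As a sketch of what that external synchronization can look like (ICounter, Counter, bump! and value are all made-up names):

```clojure
;; :unsynchronized-mutable gives you a plain Java field: no visibility
;; or ordering guarantees at all. Taking the same monitor around every
;; read and write of the field restores both.
(defprotocol ICounter
  (bump! [c])
  (value [c]))

(deftype Counter [^:unsynchronized-mutable n]
  ICounter
  (bump! [this] (locking this (set! n (inc n))))
  (value [this] (locking this n)))

(let [c (Counter. 0)]
  (dotimes [_ 5] (bump! c))
  (value c))
;; => 5
```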
(vswap! is not technically something you can use with a volatile mutable field, but Clojure now exposes volatile two ways, and that is the other)
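To make the non-atomicity concrete, a sketch using the boxed volatile! / vswap! API (not a deftype field, but the same caveat applies):

```clojure
;; vswap! is read-then-write, not compare-and-swap: two threads can both
;; read 0 and both write 1, losing an increment. swap! on an atom never
;; loses one, because it retries via CAS.
(def v (volatile! 0))
(def a (atom 0))

(run! deref
      (doall (for [_ (range 8)]
               (future (dotimes [_ 10000]
                         (vswap! v inc)
                         (swap! a inc))))))

@a ;; => 80000, always
@v ;; <= 80000, and usually less under contention
```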
there really is too much to say in that docstring other than "here be dragons, you're outside normal Clojure semantics". the real answer is to read JCIP and/or the JMM
It doesn’t have to read from RAM. It just has to reflect the value written by the other thread. The cache coherence protocol might ensure that without forcing CPU caches to be written back to main memory
I appreciate the explanation from both of you 🙂
I think a slightly edited version of what you wrote above:
> with volatiles all threads will see writes (non volatile don’t without some other synchronization that creates a happens-before constraint)
> with volatiles, reads and writes can’t be atomic though - you can’t read and then write assuming that the value hasn’t changed
would be a pretty good summary.
I read both JCIP and the JMM a long time ago, and just remembering basically that has served me for pretty much all uses of volatile since then - I’ve forgotten all the details about the mechanics.
We’re having an issue with our production web app that has me stumped. It’s a Clojure app built with http-kit as the web server and connections to external Datomic and MongoDB databases. Occasionally the whole server will just hang for a minute or two, and the logs show that the last operation before the hang was to dispatch a potentially long-running Mongo query. I can understand how that might lock up the thread that was dispatching the query, but how can it be locking up the whole server so that it doesn’t even respond to the heartbeat query from our monitoring system? It’s not even making more than a ripple on the CPU usage, according to monitoring on the host.
I’m talking with the IT folks about installing some additional JMX based monitoring (New Relic) on the server, but I’m not sure what I should be looking for in the output.
From the description, could be either a GC pause or that the thread blocked on the query is holding a lock that others are waiting on. For the former, I’d turn on the JVM GC logging - it’s designed to be run under load, and you can shunt the logs off to their own file. If the latter or something else, getting a thread dump (kill -3 on the pid, or use jstack) will tell you what threads are doing when it locks up and what monitors are held
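For reference, roughly the invocations meant here (flag syntax shown is JDK 9+ unified logging; adjust for your JDK version):

```shell
# GC logging shunted off to its own rotating files
java -Xlog:gc*:file=gc.log:time,uptime:filecount=5,filesize=10m -jar app.jar

# thread dump, option 1: signal the JVM; the dump goes to the JVM's stdout
kill -3 <pid>

# thread dump, option 2: jstack prints to your terminal (-l includes lock info)
jstack -l <pid>
```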
Ok, thanks much. I hadn’t thought much about the possibility of a shared lock, so this plus the GC logging gives us something to pursue. This issue has happened 3 times that I know of since July, so data gathering is slow, but now we know where to spread the nets.
http-kit is built on core.async, so if you're locking up the core.async thread pool (e.g. doing I/O inside of a go), that could cause problems too. I'm not sure how http-kit feels about I/O inside of handler functions.
core.async now has a system property you can set to catch use of blocking async ops in go blocks. Might be worth turning that on in dev (not in prod!) to see if that’s happening
See doc at top of http://clojure.github.io/core.async/
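If I’m reading that doc right, the flag is a plain JVM system property, e.g.:

```shell
# throws at runtime if a blocking op (<!!, >!!, alts!!, etc.) is used
# inside a go block; for dev/test only, not production
clojure -J-Dclojure.core.async.go-checking=true -M:dev

# or for a plain JVM invocation
java -Dclojure.core.async.go-checking=true -jar app.jar
```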
We’re not using core.async in the app itself, not sure if that matters.
Thanks for the link, I’ll definitely check that out
Easy thing to check out at least
Oo, yes, this is going to come in very handy
@dominicm What makes you think http-kit is based on core.async?
The haziness of memory 😁
https://github.com/http-kit/http-kit/blob/059deac93b1662077d52e1eb81bef9db8c89d746/project.clj#L12
That's got to be a lie, right 😁
No, http-kit is a lightweight web server in Clojure + some Java, without any deps
admittedly that sounded weird to me too :)
@manutter51 are you using the async part of httpkit?
http://http-kit.github.io/http-kit/org.httpkit.server.html#var-as-channel Hmm, I'd always thought the channels in http-kit were core.async channels. I guess I'm just crazy :)
@manutter51 Maybe this issue will have some relevant info. Not sure: https://github.com/http-kit/http-kit/issues/345
Is the request or response body very large as in, would it be problematic to hold it in memory at once? This can also be a problem with http-kit
We’re not doing anything fancy with http-kit, just
(http/start
  (-> env
      (assoc :handler #'handler/app)
      (assoc :max-body max-body)
      (assoc :threads server-threads)
      (update :io-threads #(or % (* 2 (.availableProcessors (Runtime/getRuntime)))))
      (update :port #(or port %))))
Hmm, that’s actually calling start from luminus.http-server, which is basically calling http-kit/run-server, wrapped inside a try/catch
Yeah, looks like the response could be pretty large sometimes. :thinking_face:
http-kit doesn't have an optimal scenario for this. I have made notes about that here: https://github.com/borkdude/babashka/wiki/HTTP-client-and-server-considerations#http-kit-holds-entire-requestresponses-in-memory I might switch to jetty in the future for babashka because of this. Although for babashka scripts it may not be a dealbreaker, I don't feel comfortable about it.
Also see: https://github.com/http-kit/http-kit/issues/90 Again, not sure if this is your issue, but it could be.
This is really helpful, thanks much
Wait, I take back what I said about a large response — the code I’m looking at pulls potentially a lot of data out of Mongo, but then it batches it out to offline processing (via future) for a later download. The http response itself is basically empty.
ok well, that's not it then.
what version of http-kit are you using?
2.5.0
ok. not sure what the issue is and if it's related to http-kit then, sorry
np, I appreciate the input and I learned some stuff, so all good
What mongo client do you use?
It's probably worth getting metrics in place for JVM memory stats (heap, memory use, etc)
we’re at monger 3.5.0
sounds kind of like https://github.com/michaelklishin/monger/issues/166
That doesn’t seem to be our issue, it’s happening for us when we’re querying for data.
@manutter51 You won't get a lot of insight with New Relic on http-kit I'm afraid. We were using http-kit in production and we ended up switching (back) to Jetty because that has official support in New Relic and you get a lot more information.
We worked with New Relic quite a bit to try to get http-kit supported but in the end they just considered it too niche to expend any effort. We tried configuring transaction recognition and wrote some middleware to help with transaction boundary recognition, but in the end we just gave up.
All our apps are on Jetty now, except one built on Netty and we don't get a lot of the core web transaction metrics from New Relic on that either -- but we have a lot of custom metrics that we added ourselves, via New Relic's (now obsolete) metrics plugin library.
@seancorfield out of curiosity, do you use ring-jetty or pedestal or ...?
ring-jetty
-- we try to keep things as "stock" as possible so we get the best out of New Relic.
@seancorfield Wow, that's good to know, thanks for sharing that. I'll look into switching our app back to ring-jetty
+1 newrelic and ring jetty also worked well for us, never saw a reason to pursue alternative http servers. even if you're using websockets there's good support for that in the sunng87 jetty adapter
We went with Netty for our websocket app because our frontend is JS and uses http://Socket.IO so we needed to support that on the backend and it's "easy" with netty-socketio (and we use that with Netty directly via Java interop).
anyone worked through Eric Normand's http://PurelyFunctional.tv series and can maybe shed a little light on a problem I'm working on? I'm on the video "LispCast - Intermediate Property-Based Testing with test.check - 3 Strategies for properties: generate the output", and towards the end there are two functions, lines and words, that both fail on the input ["" ""] because the naive implementations of these functions use the built-in str/join and split functions, which lose the empty string values. Eric gives some hints that we might want to slightly modify the generator in the defspec, and that the solutions for the functions themselves would be recursive. I suppose I could just not allow the generator to produce empty strings, but I don't think that is the intention. My current line of thinking is to replace the empty strings with some kind of placeholder, but I don't really have any way to verify that this is the correct approach, as I'm still trying to wrap my head around what it means to write a proper property-based test. Any guidance or insights would be greatly appreciated!
Hi. I have a record that implements this interface:
public interface MyAsyncService extends Function<Object, Future<Result>>
[...]
(defrecord MyService
  MyAsyncService
  ...)
but when I try to pass it to a Java method that expects Function I get a
java.lang.ClassCastException: class MyService cannot be cast to class java.util.function.Function
Is this not something that can be done? Do I have to implement Function separately?
are you sure Function in the interface refers to java.util.function.Function ?
and are you sure the source for the interface definition you are looking at matches the bytecode you are running (was the bytecode compiled from the same version, was the source changed and not recompiled, etc)
@hiredman Yep, just confirmed on both sides. But your second point is a good clue, I'll double check that. Thanks
@hiredman rebuilt everything cleanly and it's all good. Many thanks!
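(For what it’s worth, a defrecord really can implement java.util.function.Function directly once the classpath is consistent; a minimal standalone sketch with made-up names:)

```clojure
;; defrecord can implement Java interfaces in place; Function's single
;; abstract method `apply` is written like any other method body
(defrecord Doubler []
  java.util.function.Function
  (apply [_ x] (* 2 x)))

;; usable anywhere a java.util.function.Function is expected
(.apply ^java.util.function.Function (->Doubler) 21)
;; => 42
```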
I've been through that course but I don't recall the details of that example. When I get time later today I'll pull that lesson back up and see if I can provide guidance based on that.
@seancorfield sounds great and looking forward to it. I'm really digging property-based testing so I'm hoping to learn as much as I can.
The three courses are really good. Some of the Advanced PBT stuff is pretty mind-blowing.
I have Advanced PBT queued up next.
I haven't seen any updates on spec.alpha2 / alpha.spec in a while. Is it still under active development? Trying to decide whether to embrace spec1, wait for spec2's release, or start using spec2 now despite its unreleased status
Spec 2 is still being worked on in design mode. Rich is thinking about function specs and how those should work (differently to what's currently in Spec 1 / Spec 2 probably).
Spec 2 is definitely not ready to be used: it is quite buggy.
gotcha, good to know.
What is the best way to filter a vector based on the properties of multiple events? For example, take all the a and b for which f(a,b) holds true
You’d need to filter the Cartesian product
Yeah, compute the Cartesian product first I guess
If you want to see an example there’s a lambda island screencast of the solution to the first advent of code problem which can benefit from this approach :)
Right now I'm doing some sort of weird reduce
:troll: Ahah that's where the question originated
This does not look cool, but it does the job.
spec 2 is still under active design. I would recommend using spec 1 for now.
thanks. It might be too soon to ask, but are there plans for some kind of compatibility shim or tool to ease the spec1 -> spec2 transition?
too early, but we will presumably have a guide, a tool, compatibility layer, something
that's what I figured. Looking forward to getting my hands on spec 2 whenever it's ready!
OK, just re-watched that... The "recursive" part is there already: the generators for lines (and for words) are built on top of lower-level generators (not strictly recursive, except in the sense that the generators call other generators). The main issue he leaves open to the reader is whether the generated output is what you would expect from processing the input, and the empty lines (or words) case is quite a tricky one to think about. Consider this behavior of clojure.string/split-lines:
user=> (str/split-lines "\n\n")
[]
user=> (str/split-lines "\n \n")
["" " "]
user=> (str/split-lines "\n \n\n")
["" " "]
user=> (str/split-lines "\n \n\n ")
["" " " "" " "]
Is that really what you want from a lines function? If so, how do you formalize that such that you can correctly generate that type of output? And then, how would you turn that output into the appropriate input? I think the real problem here is that splitting lines is "lossy", by which I mean that there are multiple inputs that can generate the same output (see the middle two items).
(a less charitable way to couch that is that he picked a really hard problem, ran into difficulty trying to specify the behavior, and punted on it 🙂)
But the good part about this -- and he says this in the lesson -- is that trying to generatively test the function immediately uncovered several bizarre corner cases that are very hard to reason about! @rttomlinson
I think the part I'm getting caught up on is determining what I actually want the lines function to do. Presumably I don't want it to be lossy, since I'd lose the inverse property, and this would also make it equivalent to the existing str/split-lines function.
If lines is not lossy, then I suppose I'm checking before \n to determine if I should add an empty string?
hmm, okay. I think I'll try that out and see where I get. Thanks for running through that, @seancorfield. Hopefully I'll have a satisfactory answer soon.
Yup, and that is exactly the hard part: should lines (or words) be lossy? Since words will throw away whitespace between items, it is definitely lossy: "foo bar" and "foo  bar" must both yield ["foo" "bar"]
And then the question is how you convert from the generated output to an appropriate input, since one output should really generate a random set of inputs that should all map back to that one output.
And at that point, you'll be generating inputs from seed output values (and now you're layering another set of generators on top of the two-or-more you already have).
This is not perfect, but anyway:
(for [x (butlast report)
      y (rest report)
      :when (= 2020 (+ x y))]
  (* x y))
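A variant of that comprehension using index pairs, so an entry is never paired with itself and each pair is considered only once (report here is the sample from the 2020 day-1 puzzle statement):

```clojure
;; sample input from Advent of Code 2020, day 1
(def report [1721 979 366 299 675 1456])

;; j always starts past i, so we visit each distinct pair exactly once
(for [i (range (count report))
      j (range (inc i) (count report))
      :let [x (report i)
            y (report j)]
      :when (= 2020 (+ x y))]
  (* x y))
;; => (514579)
```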
Is there a function or macro to get the file and line number when called?
(defn my-func []
  (let [{:keys [file line]} (file-name-line-number)]
    (println file line)
    (do-something important)))
(defmacro info [x]
  (let [f *file*
        md (meta &form)]
    (assoc md :file f)))

(info "hi")
=> {:line 10, :column 8, :file "/path/to/x.clj"}