Is there a way with agents to await-any? Like say I want to wait until any one of the agents has finished its queued-up tasks, not until all of them have?
Send a task to all the agents that delivers to a promise
You could copy the current implementation of await, except initialize the CountDownLatch count to 1 instead of (count agents)
A promise is a CountDownLatch(1) internally
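Something like this, maybe, as a rough sketch of that promise trick (untested):
```clojure
(defn await-any
  "Blocks until any one of the agents has run all actions queued
  before this call; returns the first such agent."
  [& agents]
  (let [p (promise)]
    (doseq [a agents]
      ;; the marker action only runs after the agent's earlier queue drains
      (send a (fn [state] (deliver p a) state)))
    @p))
```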
Ya, I was thinking that, but hoping there was maybe a core function I was not seeing
I think copying the await impl might be better; if I remember correctly, there's a trick with agents to only count the tasks queued up to the point of calling await, and also to have them play nice with STM
Maybe I'm wrong though and it's as simple as what you're saying hiredman
I'm just starting to look at agents more seriously, I feel they're a bit of an underdog. Like the more I look into them, the more I feel they can be used pretty simply to model a lot of concurrent, async and parallel scenarios, and you can send-via a custom executor as well, so they give you good control.
I was also thinking, if you do something like:
```clojure
(defn send-msg [actor-fn message] ...)
(defmulti actor-foo ...)
```
You can actually use them for actor modeling too. Just create an agent and send it a send-msg partial with the actor, where send-msg calls the actor-fn with the message and the agent state, and where the actor-fn is just the implementation of the actor.

Another question I had though is: how are agents scheduled on the Executor? Say I pass them a single-threaded executor. Will the first agent run all its tasks until its queue is empty before moving to the other agent? Or will they round-robin?
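Roughly the shape I have in mind (a sketch only; send-msg and actor-foo are illustrative names, the state lives in the agent):
```clojure
(defn send-msg [actor-fn message state]
  ;; the actor-fn computes the actor's next state from the message
  (actor-fn state message))

(defmulti actor-foo (fn [_state message] (:type message)))
(defmethod actor-foo :ping [state _message]
  (update state :pings (fnil inc 0)))

(def my-actor (agent {}))

;; "sending a message" is just dispatching an agent action
(send my-actor (partial send-msg actor-foo {:type :ping}))
```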
Re: actors -- I thought the whole point of that model was that they were internally stateful, so I'm not sure how passing an actor-fn would model that?
Oh, you'd encapsulate the state inside the actor-fn? And pass the same (stateful) function repeatedly to the agent?
my POST request to the reitit routes isn't working even though the same GET request is working
Here's the code:
```clojure
(defn home-routes []
  [""
   {:middleware [middleware/wrap-csrf
                 middleware/wrap-formats]}
   ["/" home-page]
   ["/api"
    ["/sign-in"
     (fn [req]
       (prn "sending response")
       (r/response "Okay"))]]])
```
and the request is made using http-xhrio like so:
```clojure
(reg-event-fx
 :sign-in
 (fn [coeffects [_ params]]
   {:db (assoc (:db coeffects) :signed-in true)
    :http-xhrio (http-post "/api/sign-in"
                           {:id "id"}
                           []
                           []
                           #_(fn [] (js/console.log "success"))
                           #_(fn [] (js/console.log "failure")))}))
```
```clojure
(defn http-post [uri params on-success on-failure]
  {:method :post
   :uri (str "http://localhost:3000" uri)
   :params params
   :on-success on-success
   :on-failure on-failure
   :response-format (edn/edn-response-format)
   :format (edn/edn-request-format)})
```
How to fix this error?
@ps That would be a lot more readable if you used triple backticks around code. But you might consider asking in #reitit
also it would help to know anything about the error
Hum, I didn't think of that. I guess this could work, but no, I was thinking of just using the agent itself to hold the state. The actor-fn is basically the polymorphic (on the message) dispatching function for the actor. So like, actors have a "mailbox" where they are sent messages, which they process one after the other synchronously (in the order the messages arrive). Each actor has internal state maintained between messages. And each actor runs independently from the others and can only be coordinated through the messages exchanged between them (handshakes and the like).
And the message passing of actors is async, so when you send a message, you get no response, the response must come as a message they might send back to you.
So I feel agent could be used pretty simply to build this model.
But agents don't have "message handling" as part of them. So the actor-fn would be the message-handling-fn for some given actor.
I like the idea of embedding the state in the function though, that's actually what Erlang does. Maybe the associated agent state can store the function for the actor, and as messages are sent to the agent, they are passed to that function which returns a new function.
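A sketch of that state-in-the-function variant (untested, names illustrative):
```clojure
;; the agent's state is the actor's next behavior function
(defn counter [n]
  (fn [message]
    (case message
      :inc   (counter (inc n))
      :print (do (println n) (counter n)))))

(def my-counter (agent (counter 0)))

;; each send passes one message to the current behavior,
;; which returns the behavior for the next message
(send my-counter (fn [behavior] (behavior :inc)))
(send my-counter (fn [behavior] (behavior :print))) ; prints 1
```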
there's no error raised.
The request is just not made
Is `:sign-in` dispatched for sure? What about modifying it to `(reg-event-fx :sign-in (fn [_ _] (println "dispatched")))` to check?
yes it's dispatched for sure
this is evident because the same request when done as a get request works
Asking again, since last time I asked it got mixed into a wall of text. Anyone know how agents are scheduled on the Executor? Say I pass them a single-threaded executor. Will the first agent run all its tasks until its queue is empty before moving to the other agent? Or will they round-robin? Or something else?
The Clojure reference says:
> The actions of all Agents get interleaved amongst threads in a thread pool
So I'm guessing this means round-robin in some way? It's not super clear
@didibus pardon my ignorance, in what context would you pass them an executor?
execute uses a final static executor owned by the agent class
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Agent.java#L87
oh.. that's in class Action not Agent
dispatch? https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Agent.java#L234
You can use send-via, which takes a custom Executor. But also, just with normal send, it uses a fixed-size thread pool, so I'm also wondering in this case how the actions are scheduled on those
```
(send-via executor a f & args)

Dispatch an action to an agent. Returns the agent immediately.
Subsequently, in a thread supplied by executor, the state of the agent
will be set to the value of:

(apply action-fn state-of-agent args)
```
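For example, a minimal usage sketch:
```clojure
(import 'java.util.concurrent.Executors)

(def my-executor (Executors/newFixedThreadPool 4))
(def a (agent 0))

;; this action runs on a thread from my-executor, not the default pool
(send-via my-executor a inc)
```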
right - that file is not so large - each send is submitted to the executor, it decides how to handle the pending ones
the executor owns a queue
agents are not scheduled, actions (sends) are scheduled. there is no coordination across agents.
So the Executor handles scheduling them?
right, that's what an executor is
I see, so most likely they are scheduled in arrival order
"arrival order" is concurrent of course
That means each agent doesn't get an equal share of the ThreadPool
yeah, that's not a thing
Would have been nice 🙂
there are executors that can prioritize submitted tasks right?
I don't know if the send carries anything that the executor could use to impose agent fairness though
what would fairness even mean though
Hum.. I don't know, but also it would need to somehow know which submitted task comes from which agent, if it wanted to, say, round-robin them, and I don't know if it has that information
from an executor pov, an action is just a thunk to invoke
it's opaque
Like if I send 10 actions to Agent1 followed by 10 actions to Agent2, and say Agent1's first action takes 10 minutes: when it's done, it doesn't move to Agent1's second action, but instead gives Agent2 a chance
yeah, that's not how it works
Basically it would have made the end of an action a yield point, where execution moves to another agent if there is one
why would you be using an executor that doesn't expand or provide multiple threads?
the OS CPU scheduler will mix running tasks
all of the things you seem to want are not what agents intend to provide. maybe you should make your own thing if you want those features
The default Agent Thread Pool is fixed size for example. So if you want to prioritize responsiveness, that could have been nice
then use send-off instead of send?
it's fixed size because it only serves computational tasks
you only have so many cpus that can be doing things
I don't have a fixed goal though, I was kind of just exploring.
send-off expands and can do io things
@didibus I think most of what you are concerned with is in the domain of the OS kernel
Ya, but say I compute values and display them in real time, like some gauge. Each in their own agent, but Agent1 keeps hoarding the pool, so the second gauge's response times are hurt
Also, Loom I think will have an Executor that schedules tasks on fibers, so I was wondering if then we could use these with agents.
sure
@didibus I don't know if this matters, but I would think of it as "spamming the executor queue" or "blocking executor threads" more than "hoarding the queue", both of these are mitigated by using a thread pool that can expand and an OS that is smart about CPU scheduling
if you need fine-grained allocation of processing power and predictable latency, you shouldn't even be using a language with a gc
(depending on how low those latencies really need to be and how bad the worst cases are allowed to spike...)
I see what you mean, but my thinking was that maybe we could use an executor that is backed by fibers. In which case, you wouldn't want agent2 to be waiting for agent1 to be done to kick off your IO, as that would limit your concurrency gains
Hum... or I guess maybe it doesn't matter here, since the fiber itself would yield quickly... hum
I mean, ForkJoinPool is designed for exactly stuff like this so you could use that now (but I kinda doubt you'd actually need it)
@didibus waiting on IO actually works great with normal threads, the OS will wake the thread up with a signal when data is ready (or when it's ready for more data) and is smart enough not to check back in until then (unless some other event interrupts your thread, which is usually something like a timeout)
But not with say 1 million IO calls, then you need to multiplex or something
the only disadvantage is the extra bookkeeping data size threads need beyond what fibers provide
Or you run out of memory for all those threads
one thread can wait on N io events, nio allows this doesn't it?
Anyways, I haven't fully wrapped my head around the implications of this, but somehow it seems like it'd be useful (or just cool?) if agents' actions round-robinned their use of the ThreadPool with one another.
with core.async you can implement logic for choosing between pending things more directly
but even then usually what you really want is a big list of the things you care about, and you get woken up for the first one that's ready
manifold also has a lot of features for things like throttling and buffering which are effectively ways of managing contention over processing power
Doesn't core.async round-robin go-tasks on the ThreadPool?
Or is it similarly in arrival order?
no actually it tends to check if the channel it just woke up is immediately ready for more work (to reduce coordination overhead in typical usage patterns)
it doesn't wake up go blocks, it serves channels which carry the state machine from your go block as a callback
Hum, that's interesting, I guess cache locality could also be improved with this somewhat, so maybe even for agents, there's benefit to process a few actions for the same agent before moving on to the next
there's actually a bug (or was, maybe it's fixed) in cljs.core.async where there's only one thread available and the logic that checks if the same channel is ready to do more work leads to go blocks that read and write the same channel to hard loop and steal the CPU from all the other go blocks
I know what Erlang does: it actually counts how many "tasks" an actor has been executing, and after it reaches some threshold it'll give execution to another one. That's how it guarantees soft real time
@didibus but anyway, I think the questions you are asking / experiments you might want to try are more in the domain of queues and threads and don't really match up with what agents were designed for
That said, Erlang also kind of makes it that no tasks can be that long, because each iteration in a loop is one task, so if an actor loops 100 times that counts as 100 tasks
erlang isn't good at throughput, it's good at reliability though
and with a decision like that I can kind of see why
if you want a great talk on the insanity that underlies some of the JVM concurrency stuff like ForkJoin, I was at this one and it blew my mind: https://chariotsolutions.com/screencast/phillyete-screencast-7-doug-lea-engineering-concurrent-library-components/
like setting threads to doing nonsense work just so they won't go to sleep
Thanks, I'll have a look. ForkJoin is still something I've yet to explore.
WorkStealing... 🙂
Ya, but I don't know if the scheduler is the issue; I mean, depending on the configured threshold, it might not yield any more than the OS will yield your thread.
I think it's the message passing and no global state that adds a lot of overhead to Erlang
now I'm imagining it modeling the behavior of that coworker who always grabs the easiest jira tickets at the beginning of the sprint
Currently reading through the rant/explanation of EatWhatYouKill and why jetty uses that instead of work stealing
Yielding in Erlang I believe is actually much faster than a thread context switch
moral of the story continues to be for 99% of things just worry about the semantic structure first and performance second
and leave it to very angry and bitter people to worry about the 1%
taken on its own, out of context this is a political tweet
Another question about agents 🙂 If you send, send-off and send-via actions to the same agent, is the order of execution still guaranteed?
agents all get events "in order"
so if you are producing events from thread T1
T1: A, B
then agents will get the events in order
if you produce events from two threads
T1: A T2: B
Ok, so if I send-off something that takes 1 second, then immediately send something that takes 1ms, the 1 second action will still first run and complete before the 1ms action?
then the agents will get events in order w.r.t the threads
I want to say yes
Like I guess I'm wondering if that's true across Executors, so if you mix send, send-off and send-via
well, lets look at the source code for this
```java
volatile Object state;
AtomicReference<ActionQueue> aq = new AtomicReference<ActionQueue>(ActionQueue.EMPTY);

volatile Keyword errorMode = CONTINUE;
volatile IFn errorHandler = null;

final private static AtomicLong sendThreadPoolCounter = new AtomicLong(0);
final private static AtomicLong sendOffThreadPoolCounter = new AtomicLong(0);

volatile public static ExecutorService pooledExecutor =
    Executors.newFixedThreadPool(2 + Runtime.getRuntime().availableProcessors(),
        createThreadFactory("clojure-agent-send-pool-%d", sendThreadPoolCounter));

volatile public static ExecutorService soloExecutor = Executors.newCachedThreadPool(
    createThreadFactory("clojure-agent-send-off-pool-%d", sendOffThreadPoolCounter));

final static ThreadLocal<IPersistentVector> nested = new ThreadLocal<IPersistentVector>();
```
first, gross
second, we can see that there is a single queue maintained, but two epoch counters
so lets look at where those counters are used
only for making the thread names
```clojure
user=> (do (send-off a (fn [_] (Thread/sleep 1000) (println "send-off")))
           (send a (fn [_] (println "send"))))
#object[clojure.lang.Agent 0x59429fac {:status :ready, :val nil}]
user=> send-off
send
```
two different threads, it waited for the long running one before the short running one could start
Ok, so agents have their own queue. So that means they somehow wait before submitting the next action from their queue to the executor
(where "order" is ambiguous for concurrent sends from multiple threads)
I don't know why i prefer detective work to either testing it or asking someone who knows
https://clojure.org/reference/agents says this
both send and send-off call .dispatch
"Actions dispatched to an agent from another single agent or thread will occur in the order they were sent"
By ambiguous you mean due to the threads racing to complete the send invocation?
dispatch calls dispatchAction
yes
but from any single thread's perspective, sends to an agent will occur in order
Ok, ya, the reference said that, but it wasn't clear whether it only applied to sends from the same dispatch function, like whether send and send-off would be ordered separately, or whether it holds across whichever you use
dispatch action either enqueues the action on something called a "LockingTransaction", or on an agent itself
or on something just called nested
the ordering is enforced by the agent queue, not the executor
final static ThreadLocal<IPersistentVector> nested = new ThreadLocal<IPersistentVector>();
which has stuff
agents are aware of the STM
and vice-versa
sends that occur during a ref transaction are held and only sent once the transaction succeeds (so after any retries)
and sends that occur during an agent send are held and only sent after the agent update completes
the first part is actually an incredibly subtle and helpful tool, to let you do a transaction that does IO, by putting the IO in a send-off agent action
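i.e. something like this (a sketch; account-ref and debit are hypothetical):
```clojure
(def log-agent (agent nil))

(dosync
 (alter account-ref debit 100)
 ;; this send-off is held by the transaction and only dispatched
 ;; once it commits, so retries never duplicate the IO
 (send-off log-agent
           (fn [_] (spit "audit.log" "debited 100\n" :append true))))
```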
Do you know if while STM send actions are held, non STM send actions can go through?
there is no such defined thing
"non STM send actions" is not a thing
I think he means "things not sent in a transaction"
you can't do that
I found the old Clojure Programming book (the bird one) had very clear explanation of agents, etc
and i don't think a transaction can happen over multiple threads
transactions should only invoke pure functions on immutable refs
I havenβt used agents for years, but I remember the book made them very clear when I read it and tried using them
b/c they're pure (no side effects), b/c immutable, the changes can be thrown away
if you do anything else, bad things may happen and it's on you :)
Like if inside a transaction you send to agent1, it hasn't committed yet, and in another thread outside a transaction we send the agent an action. Which action will happen first? Assuming the STM transaction did send it first but didn't commit yet. Is everything held back in the agent?
so basically that logic is about throwing out stuff enqueued from failed transactions?
@didibus there is no ordering defined there
@didibus well that's kinda non-deterministic, isn't it? 🙂 Depends
@emccue there is no stuff to throw out - the send doesn't occur until it succeeds
Ok I see. So the agent isn't like "blocked" waiting for the STM to commit or rollback. It's just that the sent actions from the STM will happen only when it commits
yeah but send was called with values - right?
no
send is a function to invoke on the agent's state
like inc
like with all stateful constructs in clojure
it may be a function that closes over state produced in the transaction though
@didibus one real-world analogy about agents I had back in the day, and I believe it's still valid: think of an agent like it's a "secret" 🕵️ agent that does work for you, but that agent can only do one thing at a time, and only once it finishes with the current task (aka updates its state) can it proceed to the next one; I think it's as simple as that
yep
```clojure
(def a (agent 0))
(def r (ref 0))

(send a inc)

(dosync
 (send a inc)
 (alter r inc))
```
this is starting to feel totally orthogonal to @didibus's questions
but is that an "okay" thing to do in a dosync?
this is an incredible and subtle tool though - you can "send" a database commit from inside a ref transaction, and it only occurs when the transaction succeeds
the agent's "state" can just be nil and be ignored
sure
Let's talk about send-off instead, since that's the cool thing about agents and STM.
You'll send-off some IO within your STM transaction.
Or at least that's what I understood was the "cool" part?
so thats what i meant by "does the stuff enqueued get thrown out"
The transaction itself is keeping a record of what it needs to enqueue when it finishes
yes
and if it fails it will "throw out" sends and see what the new run of the transaction will do - right?
right, every time I've used agents it's been because I wanted a thing that acted async, that wouldn't retry, that used immutable data to do some side effect
right
so in retrospect it's now clear how useful they'd be in transactions
the only other reason I've ever used them was to really build a simulation (specifically simulation testing apps)
Hum... Or let me try again. Say I have an STM transaction that runs and sends (fn [_] 10) to agent1, and then literally just goes into an infinite loop, because why not. So it never commits. Now say I have another thread that uses an agent as a counter of seconds, and I render the agent count to the screen. This other thread loops: it sends inc to the agent, waits 1 second, and repeats. Once the STM sends the action, will the agent stop executing the inc actions until the STM commits or rolls back?
if the STM is in an infinite loop there is no send
why would the action get sent before the commit?
the send only happens after commit
Ok ok, that is the answer I was looking for.
Just weird that it's the agent that holds onto them and not the ref in the implementation
it's the ref transaction
not the agent or the ref
I might be confused with the implementation. Have to look at it some more
What is the polite, project priority respecting way to work towards the clojure source code being more legible?
you mean clojure itself?
I don't know, just wanted to be sure it wasn't hehe.
yeah
I dunno
like, i know no one is complaining about it on the yearly surveys
Like a patch that just refactors things to be more readable?
@emccue maybe a starting point would be a compromise of learning how whitesmith style works - it's not a totally accidental layout
no, i don't mean the indentation and stuff
there is no chance we would accept such a patch
thats all whatever
i mean - without running commentary from alex or outside blog posts or a book, i would have had no clue what was going on in the source code with that bit that touched the STM
I would be far more interested in improving docs than changing code
thats what i mean
the clojure reference book by reborg is excellent in this regard
(if i follow what you're asking for)
just the huge doc comments and variable name changes would be a good start
final static ThreadLocal<IPersistentVector> nested = new ThreadLocal<IPersistentVector>();
I mean, the Clojure team has all contributed to the content of Programming Clojure if you want longer-form writing than the clojure web site
oh you mean external documentation
@emccue that's the nature of most software projects, isn't it; show me any even moderate size library/framework that's easy to understand for a newcomer; there's this saying "writing code is easier than reading it" and I think it's very true
@raspasov sure
It would be cool if the Clojure code base was in literate style for educational purposes. But I guess I'd find any open source project in a literate style cool for learning purposes hehe
I have a hard time understanding most of what's in React Native, and I've been using it since 2016
@raspasov a version of that that I'm particularly fond of: "inventing goes downhill, learning goes uphill"
compare clojure's PersistentVector class
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java
there's an implicit narrative of how things work and what they mean, that's tied to the process of making it
the chance that I'm going to spend time modifying the source code of Clojure for explicit purpose of making it easier for external devs to read the code is honestly pretty low.
to the documentation for vavr's bitmappedtrie
https://github.com/vavr-io/vavr/blob/master/src/main/java/io/vavr/collection/BitMappedTrie.java
if working on it for some other reason and that makes sense to enhance, then maybe
now that my job is writing clojure and i'm not just a total schmuck, i would be open to dedicating my time to it
but i am hesitant to do anything for fear that my time would be wasted with a "won't do" on patches
I had an idea to keep a clojure fork with up to date changes but annotated with commentary, maybe do that?
well, I think it's unlikely that anyone that's not on the core team would write docs that Rich would approve of
I think you're better off with a fork where you do that, and then that could be a nice project people can use to learn it. There's still considerable effort from the reviewer; any refactor can introduce subtle behavior changes so...
I used to have a fork of core.async that was basically like that for my own sanity
Ya that would be a cool project
@didibus thats still external docs though
just far less useful
True, but the code base doesn't change that often. So it'd be easy to be like "I don't understand, let me check the annotated fork"
> well, I think it's unlikely that anyone that's not on the core team would write docs that Rich would approve of
Is that a challenge or a statement on his personality?
it's a recognition of the quantity and strength of his opinions :)
🙂
as someone that has lived in that crucible for 8 years
The man wrote his own language, I think wanting things in his own peculiar way is probably a little bit part of his personality. And that's also a good thing to have for a designer, it means originality and vision too I'd say.
Most of the docs in core are really good; they hit just the right balance between conciseness and exhaustiveness (IMO)
@raspasov Yeah, but you still program clojure
there is survivorship bias in that
@emccue hahah sure
Good point.
I find Javadoc is pretty good as well, and is in a similar style
the people who i showed clojure to who weren't totally turned away by parentheses really didn't like the docs
Hmmm.
Python docs are like a tutorial for beginners. And sometimes I find that's like too much.
and personally I still have issues with them that i have trouble putting into words
But the fact that http://clojuredocs.org is indispensable might be proof that something is lacking in the current docs?
we have talked about enhancing the docs of the main Java interfaces and I think that's probably the area of greatest interest, in particular understanding how they underlie the clojure core library
Well, I find http://Clojuredocs.org an excellent extension; it provides examples and helpful snippets; you shouldn't have that in a doc string;
like clojure.lang.Counted
<-> counted?
etc
I tried to do a pass of that in Clojure Applied - there's a diagram in ch 2 I spent a lot of time on
but the question is always - do you want the team to be writing docs or making more things? :)
@raspasov there are literally test tools designed to run examples given in documentation
@emccue I'd agree clojure.set is not the strongest candidate for perfection; http://clojuredocs.org really saves the day there
Remember that Clojure functions often do a lot though; by nature, they are like mini DSLs, and can often take various options, arities and even different types for the same param that can alter behavior.
More things 🙂 Though a beginner or newcomer might say more docs
https://clojuredocs.org/clojure.set/rename-keys just a quick look through gets me going towards solving a problem
yeah but look at the examples
clojuredocs is great
Which in turn means it's harder to document. So generally I learn by repl exploration, source reading, doc and clojuredocs notes and examples
the docstrings on clojure api functions will never be able to say everything there is to say
I vote we strip all the docblocks from java.util.concurrent for Rich
I do wish that higher order function parameters were more clearly described in the doc-strings though
combining those with other sources of information (examples, explainers, source, etc.) is an excellent idea and clojuredocs does it well
the source code in juc is truly world class with some of the best comments I've ever read
I always forget if the function passed to reduce takes the accumulator first or second, and it's kind of hidden within the doc string
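(For the record, the accumulator comes first:)
```clojure
;; reduce calls (f acc element): accumulator first, element second
(reduce (fn [acc x] (conj acc x)) [] [1 2 3]) ;=> [1 2 3]
```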
gold standard
I have spent an evening or two in my life with glass of scotch reading it
clojure source needs glasses of scotch for the other reason
Back to agents though. So after all this convo, if I understand correctly, the agent won't actually just submit tasks to the executor as they are submitted to the agent. It'll wait for the prior submitted task to be done before submitting the next one, and keep actions in its own queue until then. That means agent actions will be more fairly distributed over the executor pool than I first thought. Which is great.
> but the question is always - do you want the team to be writing docs or making more things? :)
I think the real answer is - the team should be working on what they want to work on
@didibus yes, a task X1 submitted to an agent A1 cannot "jump the queue" ahead of the tasks already queued for A1; only once those are completed does task X1 get its turn;
otherwise they wouldn't work on anything
it helps me to visualize the individual elements of the last arg colliding with the function and accumulated value
so of course the new element would be the rightmost arg :D
similarly, < and > act as line graphs of the expected magnitude of the args they take
Agents have an unbounded queue, no back pressure; Erlang has similar issues with unbounded queues
I was just rewatching ztellman's "Everything Must Flow" Clojure/West talk
I only try to use (< a b c) and visualize it as a question like: "Are a, b, c all increasing?"
Consistently only use > or < across a project
Which is much ado about queues and backpressure
If possible
@hiredman that is a good talk
it's worth noting that in the Erlang context queues are a distributed thing, maybe across multiple machines
I think having both is useful, for example if my goal is to ensure I haven't reached a limit as a value goes up, I definitely use <
the analogy in the JVM would be akka's distributed actors
on the other hand, if I am checking a lower limit as something goes down, > is a natural choice
Fair
at least in english, it's more likely that the first term in a pair of nouns is the one under scrutiny
Clojure's agents are a local thing
Either way, the lack of backpressure is problematic
"is an apple better than an orange"? - you are asking a question about an apple
to use a minecraft metaphor, you can't fill a hole higher than the top with the dirt you dug out of it
so OOM on task queues local to a machine feels like it means the machine legitimately has too many tasks to do and not enough time to do them
Yes, which is why you need back pressure mechanisms to reject work
Nothing scales forever, and no scaling is instantaneous
@hiredman as an alternative to agents, one can mix core.async primitives, transducers, etc. to achieve backpressure; that's a whole long discussion in itself; but I'd personally use that over agents in most cases, I agree
I see the queues in front of agents more as the mechanism for making sure things go in logical order
you can put a regular java bounded queue in front of the logic that actually calls send
and use the normal mechanisms for backpressure there
if you have distributed actors, that single queue in front is all there is
@emccue true, or a core.async channel, that's a good point
so things like backpressure need to be solved there
If you have to involve your own workqueues it kind of undermines the case for agents
When an agent sends to another agent, does it also go through this queue? How will the new workqueues interact with the STM and pending sends in agent actions, etc.?
Huh?
That's a good point as well, it gets more complex; almost depends on the specific use case; in any case, the problem of "too much work scheduled" should be addressed one way or another (in a well designed system that's expected to be under heavy load);
I'm talking about their underlying Executor. Like imagine they run via a single thread:

Agent1: A11, A12
Agent2: A21, A22
ExecutorThread: A11

So we submitted two actions to two agents, and we did so synchronously in the order A11, A12, A21, A22. The question I have now is how the agents are going to schedule those actions onto the ExecutorThread. From what I understand of everyone's answers before: Agent1 will submit A11 first and then wait for it to be done. In the meantime, Agent2 will submit A21 and wait for it to be done. Now, if say A11 is done first, then Agent1 will submit A12, and then when A21 is done, Agent2 will submit A22. But if A21 was done first, Agent2 would submit A22 first.
What is the reason to have both drop and nthrest? Is it just the different order of arguments?
In Clojure collections are seqable, so you can use sequence functions on them as well. Which is why your examples all work.
Also, range here is a special case:
```clojure
(def foo (nthrest (range) 5))
(type foo)
;;=> cljs.core/Range

(def foo (drop 5 (range)))
(type foo)
;;=> cljs.core/LazySeq
```
So nthrest will eagerly return a new range, whereas drop will lazily wrap the existing range in a LazySeq, and only once you take from it will it "drop" elements
Try this instead:
```clojure
(def inf
  (iterate
   (fn [a]
     (println a)
     a)
   0))

(def foo (drop 5 inf))
;; nothing gets printed, because drop is lazy

(def foo (nthrest inf 5))
;; prints 0 five times, because nthrest is eager
```
Oh right, so nthrest is somewhat lazy in that it doesn't force more elements than it needs, while drop is even more lazy in that nothing is realized at all until you ask for it.
Ya, nthrest will be eager up to nth
Yea, in Erlang you put an actor in between which acts as a bounded queue
Cause in theory, you should always assume that the message you are sending is over a network to an actor on some other node
But ya, this is one aspect where CSP is nicer, though it's harder to distribute CSP processes across nodes.
Thanks I'll have a watch, don't remember seeing that one
I think drop is lazy and nthrest is eager?
That's why the order of arguments differs as well: drop is a sequence function, so the collection is passed last, whereas nthrest is a collection function, so the collection is passed first
Well, "too much" depends on a lot. The load balancer itself can be configured to drop load, so if you've load tested, and configured things accordingly, you might not need backpressure inside your application service, you already know your fleet can handle max load, and that the LB will drop additional load
This seems to work fine:
```clojure
(take 10 (nthrest (range) 2))
```
Also this:
```clojure
(first (nthrest (map pr (range)) 2))
;; 012nil

(first (drop 2 (map pr (range))))
;; 012nil
```
I just find agents kind of fascinating. Actors and CSP have been formally studied for their capabilities, but I feel agents are unique to Clojure, cooked up in Rich Hickey's brain, and I'd be curious to see someone formally study their capabilities. What can and can't you model with them? They're an intriguing concurrency construct.
@didibus yes; this discussion often tends to cross the boundary between the practical/engineering and the theoretical... There are many ways you can solve the problem, I agree; a purist (I am generally not one of those) would respond to this by saying that it's "kicking the can down the road"; ideally all levels (load balancer, application, etc.) would handle backpressure, because even if you've load tested, performance is probably not entirely deterministic
Ya, in theory you can have certain payloads that take different paths than in your load test, and possibly that causes something to be overloaded, but like you said, in practice it's often damn good enough 🙂
The reality is that the majority of applications never get enough usage for all of this to matter... but when it does - it really does... (and I've dealt with cases where load is a problem and you are literally at a loss where the problem originates - it becomes a guessing game)
In my case it went something like "add more servers to the load balancer + increase MySQL capacity till everything works + enough 'smart' memcache" (this was not a Clojure system, it was before I even knew what Clojure was)
Well, I wouldn't be surprised if 90%+ of services out in the wild don't perform periodic load testing and scaling exercises either
They just don't need to.
Ya, and if they get a surprise load increase, they'll neither have the LB drop the load they can't handle, nor have any backpressure handling anywhere. Which is fair; your business teams normally only let you put time behind these operational requirements after they've learned the hard way
Some lessons you can only learn the hard way 🙂, no amount of senior engineers "warning" and "I told you so" cuts it, you've got to feel the pain to realize
Yea, in any case, the simpler the system, the easier it is to reason about; I think here the general Clojure philosophy of keeping things truly simple is worth everything.
I guess to @hiredman's point: if he was referring to core.async or other such things which handle backpressure by design, I can see that. If you've got something that is as convenient as agents or actors to model concurrency, and it gives you backpressure handling just as easily, you could say it's all-around better
Which I know CSP does in some ways.
I still find agents intriguing. I don't know, nobody uses them, but I wonder if part of that is because there's no theory or history around them the way CSP and actors have.
At least core.async has all the primitives to detect when things are getting overloaded; channels have buffers, you can blockingly put on a channel (and have that blocking put have a timeout, etc.). When your puts start timing out -> clear signal you've reached backpressure
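For example, a minimal sketch of that timeout-on-put signal (submit! is an illustrative name):
```clojure
(require '[clojure.core.async :as a])

(def work (a/chan 100)) ; the bounded buffer is the backpressure boundary

(defn submit! [job]
  ;; blocking put that gives up after 50ms
  (a/alt!!
    [[work job]]   :accepted
    (a/timeout 50) :backpressure))
```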
I guess the fundamental problem with an agent from a backpressure perspective is that you can just "send" to it - there's no clear way to infer how big the queue is and whether it's growing, shrinking, etc.; perhaps there is some hacky JVM way, but there's no API for it; so no matter how many backpressure mechanisms you put in front of a non-backpressure-capable construct, once you have to forward your work to the non-backpressure service, all bets are off; the fact that you put a blocking queue BEFORE it doesn't really tell you much about how the next thing will behave... unless there's some heuristic mechanism to determine that behaviour... but it gets... complex 🙂
Ya, but it begs the question: could you have a backpressure-send, for example? One that, say, counts up to 30 sent tasks and on the next call awaits them before sending another
Or could you have a bounded-agent that is similar but with a bounded queue that you can pass in a limit, probably, etc.
So I still feel you could easily build on them
You probably could... but it's just a different API;
Because both send and send-off return immediately
You definitely could write something else that is effectively "bounded" agents with blocking "sends"
Once that upper bound is reached
Ya, I'm imagining it'd be a new send function that is possibly blocking
Or if using a bounded agent, then send and send-off would still return immediately, but they'd return an error saying you reached the max and need to try again later
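A rough sketch of the blocking variant, using a Semaphore as the bound (bounded-sender is a hypothetical name, untested):
```clojure
(import 'java.util.concurrent.Semaphore)

(defn bounded-sender
  "Returns a send-like fn that blocks once `limit` actions are in flight."
  [limit]
  (let [permits (Semaphore. limit)]
    (fn [a f & args]
      (.acquire permits) ; blocks the caller while the bound is reached
      (apply send a
             (fn [state & as]
               (try (apply f state as)
                    (finally (.release permits))))
             args))))

;; usage: (def bsend (bounded-sender 30)) then (bsend my-agent inc)
```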
Yea, but then again, you can recreate most of that functionality in core.async pretty easily
Have a global channel with a transducer + an atom; the transducer just updates the state of the atom
Basically agents...
But also with backpressure capability
A little experiment I put together... https://gist.github.com/raspasov/e0081d6de5f61648311c3a598e1f8941
Has anyone else had this happening with the clj REPL? Notice how after pressing return, the previous instance of the prompt (`user=>`) disappears. It happens in both iTerm and http://Terminal.app. I thought my customised ~/.inputrc file might be affecting rlwrap, but moving that file aside made no difference. The clojure command works as expected, so this should be connected with rlwrap or the clj script itself.
Ah - just found this: https://github.com/hanslub42/rlwrap/issues/108
Adding the following to ~/.inputrc is a workaround:
```
$if clojure
set enable-bracketed-paste off
$endif
```
Looks like rlwrap hasn't had a release in almost 4 years. It might be some time before a working-by-default rlwrap is in package managers.

For my 10% day at work, I want to set up a basic GraphQL API using pedestal and lacinia-pedestal, but before I start creating from scratch, is there a tools-deps based template for such a thing around? I couldn't find one, but my google-fu may be poor. There's a lein template, but I've become accustomed to deps.edn lately and would like to stick with that.
Hey, are expired values garbage collected when using a ttl-cache from core.cache? It seems that the cache still holds the expired values in a way, so I want to make sure I am understanding this right, as I see there is also soft-cache-factory. So I was going to compose the two and do something like:
```clojure
(-> {}
    (c/soft-cache-factory)
    (c/ttl-cache-factory {:ttl 1000}))
```
to create the cache, and I want to make sure that's the right way of doing it.

Can you add a question and this info at https://ask.clojure.org so others can find it in the future?
If there's some way to set this on the command line we could bake it into the clj script
Ok will do
https://ask.clojure.org/index.php/10025/have-prompts-previous-lines-command-started-disappearing
thx!
The clj script could export INPUTRC=/path/to/some/file/you/provide, but that has the potential to cause worse annoyance than the actual bug. e.g. my existing ~/.inputrc file content would then no longer get picked up and I'd no longer have working word-movement (alt-left/right) for my keyboard.
yeah, don't want to do that, but rlwrap has a bunch of command line options, not sure if any of those cover it
this has been bugging me so much! thanks for asking this question. it made me doubt if i had ever seen the "proper" behavior @duncan540
Yeah, the way I normally use rlwrap+clojure involves the prompt getting coloured green, so it was a bit easier to spot by the scrollback history suddenly lacking any colour
I hope an expert weighs in, but my vague recollection from using the lib / reading the code a while back is that the ttl-cache evicts, but only when you do lookups (maybe insertions?)
also the printed version of the cache can be misleading (even regarding what elements are present)
@thomas.ormezzano core.cache expects the cache to be stored in an atom, and when expired values are evicted by an operation, they will be GC'd -- but if you are just looking up values and not doing any additions (or explicit removals), you are just working with an immutable value.
In other words, expired values are only evicted when a new value of the cache is created inside the atom -- swapping out the old value (which still contains expired items) for the new value (which has evicted them).
core.cache is a bit counter-intuitive to use (because it's an "immutable cache value"), which is why there is core.cache.wrapped, which has the same API but automatically wraps the cache in an atom for you.
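For example, a small sketch with the wrapped API (expensive-fetch is a hypothetical function):
```clojure
(require '[clojure.core.cache.wrapped :as cw])

;; the factory returns an atom holding the cache; entries expire after 1s
(def c (cw/ttl-cache-factory {} :ttl 1000))

;; computes and caches the value on a miss, swapping the atom,
;; which is also when expired entries get evicted
(cw/lookup-or-miss c :k (fn [_] (expensive-fetch)))
```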
Ya, but I'm talking about the user interface.
Agents give users a different set of capabilities; sometimes they make things easier than core.async. But it all depends
And it's nice that agents let you choose your own executor as well
But they don't let you choose your queue, which is the issue with backpressure
I might try to add an await-any and a bounded-agent if I have a chance. I think with that, it opens up some nice new use cases for them
I have a sorted-map and I want to replace the values with the reductions of a function over the values, like this:
```clojure
(let [sm (into (sorted-map) (map vector (range 100) (repeatedly #(rand-int 10))))
      xs (vals sm)
      new-xs (reductions + xs)]
  (reduce conj sm (map vector (keys sm) new-xs)))
```
Is there a more efficient way to do this? (I guess assoc might be better than conj, and calling the map pair constructor would be better than vector; wondering more if there's another way to approach this though)
You want every key in the map to have the same associated value?
no, i want to replace the vals with (reductions + vals)
oh, sorry, I see that now.
I wouldn't be surprised if there are constant-factor time improvements to the code you show above, but nothing that is going to be O(n) faster or anything like that.
gotcha, cool
Sorted maps do tend to be an order of magnitude or so slower than non-sorted maps, because of binary branching rather than usually much higher branching factors in non-sorted ones.
But if a sorted map is a requirement for other reasons, then yeah, I would recommend doing a little micro-benchmark of your code above versus an approach that does assoc on each key once, to see which of those two is faster, or whether they are about the same. I don't know of any sorted maps/sets that have transient implementations that might speed this up, neither built into Clojure nor in a 3rd party lib. If this is 90%+ of the run time in an inner loop of your system, then you may want to consider mutable data structures from a Java lib instead.
interesting, yeah I'll have to see if I can get away without the sorted map
@jjttjj surely something with (into sm (<some transducer here>) ...)
If the time isn't a significant factor in your application, I wouldn't worry about it, as mutability leads to other concerns of application correctness, quite often. Measuring performance, and determining which parts are actually contributing to any perceived slowness, is always a good idea before ripping and replacing data structures.
it's easy to forget that the first arg of into doesn't need to be empty, and most (reduce conj ...) constructs should be into calls
Yeah I guess I was mainly wondering because it felt a little awkward, I don't have an immediate need to improve performance but thought I could be forgetting some core function or something.
@noisesmith oh yeah good call on into
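e.g. something like (an untested sketch of the same idea):
```clojure
(into sm (map vector (keys sm) (reductions + (vals sm))))
```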
It might be faster if you make the map transient, and then use update over the values?
@didibus that is what into is
As an aside, I was trying out the Neanderthal library for double-precision floating point operations on vectors/matrices, etc., and the author has written a nice blog article and Clojure namespace benchmarking about 7 different ways to do a vector dot product. They only differ in constant factors, not big-Oh differences, but I was a little surprised to see the differences were about 250-to-1 for slowest vs. fastest. I wouldn't expect that large a difference for the variations we are talking about above, though. Probably at most 4-to-1 (as a semi-educated guess).
Oh, ok if into already does that under the hood. Wouldn't into be doing assoc into a new transient map though? So it is slightly different than a transient update in an existing one?
@didibus I might have misunderstood you, but the desired result isn't replacing the content under the keys, but deriving new ones to add to it
into uses transients if the into-ed data structure has a transient implementation, yes. Clojure's built-in sorted maps and sets do not have a transient implementation.
Oh, ok I misunderstood the problem
transients aren't useful for long term values, they are useful for patterns of building up results in a pipeline
I thought they wanted the values of the existing keys replaced by a new value
They do, IIRC.
If sorted-maps don't have transients though, it doesn't matter
@andy.fingerhut oh good point - I overlooked the fact that literally every key was guaranteed to be replaced
but anyway, into does the right thing regardless, and uses transients when possible
Well, every key will remain in the final result from the original map, but most or all will be assoc'd with a new value.
right
But I am curious about the difference between:
```clojure
(persistent!
 (reduce (fn [m [k v]] (assoc! m k (process v))) (transient existing-map) existing-map))

;; and

(persistent!
 (reduce (fn [m [k v]] (assoc! m k (process v))) (transient {}) existing-map))
```
more GC work in the 1st
since iirc you need to copy the whole tree for the existing map to make it transient
What APM product has really good support for clojure?
New Relic and Datadog seem to have some clojure wrappers but they're not great
I think you are mistaken
@emccue The first update or assoc won't copy the whole tree, but since here we know we are changing the value assoc'd to every key, it will have by the time it is done.
yeah, creating the transient is just a boolean field flipped, it's the persistent! call that you pay for
iirc
The first update or assoc on a transient tree-based data structure will only alloc new nodes for the path to that leaf.
that is also incorrect
"In fact, it mostly is the source data structure, and highlights the first feature of transients - creating one is O(1)."
"... you can create a persistent data structure by calling persistent! on the transient. This operation is also O(1)."
At least for the vector version of persistent/transients, which I have grokked fully at this point from looking at it too long, you can think of transients as just as "slow" as persistent operations for the very first change, because it allocs just as many nodes as the persistent operation.
There's also https://github.com/metrics-clojure/metrics-clojure which wraps dropwizard
And I just found this: https://www.instana.com/supported-technologies/clojure-monitoring/
The advantage of transients is that all of the newly allocated nodes for the transient are "owned" by the transient, and can be mutated if they are updated again by a later operation on the same transient, before you call persistent!
But I'd go with Riemann :man-shrugging:
The calls to persistent! and transient really are O(1), guaranteed, worst case.
Okay then I'll say the real thing you pay for is in developer understanding
because i did not know most of that
If you are going to update a significant number of elements in a collection all at once in your code, I can't think of any way using transients could be slower. They can be faster if you do enough operations that "touch" the same internal tree nodes of the data structure multiple times.
If you are doing only one operation each time you do transient then persistent!, you will pay O(1) on those conversions that will be slower, but I'm pretty sure the docs have some mention that they are intended for batches of updates.
So transients are useful if you modify the same element more than once?
We do not use any "Clojure wrappers" -- I didn't even know there were Clojure wrappers for New Relic.
Transients can speed up that case, yes.
because for that sequence of operations, persistents would do path copying in the tree for both 'modify' operations to the same element, but transients won't.
Transients also reduce memory allocation if multiple operations share significant numbers of nodes in the tree paths they are updating. The case you mention of same element more than once is the best "overlap", because the tree path being touched is exactly the same one as the transient touched before.
Any intuition into what else would "touch the same path"? Things like elements that are nearby?
I admit, I googled wrt to datadog apm and immediately came across one, but it has had no updates in two years
In Clojure-land, no updates in two years might mean "works great and doesn't need updating"
the thing I always want to see in APM support for an app is the ability to have custom spans + well supported at tracing libraries out of the box
But as another example, if you create a transient, and the first update walks 4 nodes in a tree to get to the element to be updated, it will allocate 4 new nodes, just as the persistent operation would. If the second operation on the transient was at a leaf that had 3 nodes in common on the path from root to its leaf, and a 4th node that was not shared, the second operation on the transient would (at least usually) mutate the 3 common nodes in place, without allocating new ones, and only allocate a new node for the 4th node that wasn't in common with the first path.
For vectors, indexes that are close to the same as each other.
@didibus I admit, I'm a newbie in this space again.
Rust, golang and python on the teams I previously ran
For hash maps, keys that have hash values close to each other (but that isn't something you typically want to care about in an application)
Especially with wrapper libs. Most likely New Relic doesn't break their APIs all the time. So the wrapper just uses the New Relic APIs; why would it need to change all the time? It's New Relic itself that gets updated
Same for hash sets as hash maps.
the last time I fiddled with clojure was in 2011
Yeah, the docs are good, if only they were more widely read
That's fair. But it's true, wrapper libs often don't get many updates, yet they mostly still work fine. The only reason to update a wrapper lib is if the underlying lib added new features, or if they broke their existing APIs.
If you are doing any sequence of N operations on some persistent collection all together in time, I think the worst that could happen for doing transients instead of persistent operations are the two O(1) time conversions for persistent! and transient ops. The transient ops will in the worst case allocate as much mem as the persistent ops would, but should never be more, and will in many cases be less.
I'm not sure, but depending upon the reader, they might be left wondering how it works under the hood before they actually believe it 🙂
Makes sense, thanks
For a bunch of conj's growing a vector, transients allocate amazingly less memory.
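i.e. the classic pattern:
```clojure
;; same result as (reduce conj [] (range 1000)), with far less allocation
(persistent!
 (reduce conj! (transient []) (range 1000)))
```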
Riemann will have the best Clojure support of anything, since it is itself implemented in Clojure. It's free and open-source as well. That said, if you want something with, like, better docs, UIs, etc., New Relic or any other Java-based solution should be pretty easy to integrate into your code base. If I were you, I'd quickly glance at some of those wrapper libs you saw; you'll be amazed how small their code is, and how simple what they do probably is.
Right, and I guess that's where into gets most of its speedup as well
yes, by taking advantage of transients whenever the target data structure implements them.
Riemann is pretty cool though 🙂
But out of transparency, I actually never used it or New Relic, etc. My company has its own internal APM that we use.
This is my (old) blog post, if you decide to investigate New Relic: https://corfield.org/blog/2013/05/01/instrumenting-clojure-for-new-relic-monitoring/
(so at this point we've been using New Relic in production for about 7 1/2 years)
A nice thing about Riemann, I don't know if New Relic is similar, is that you can use it to monitor everything: JVM, AWS services, other programming languages, hosts, logs, exceptions, application-level metrics, etc. Since, fundamentally, it's just a stream processor. You send events to it, it runs some custom filter and route function on them, and then publishes the result to wherever you want: ElasticSearch, Graphite, PagerDuty, etc.
I find it's the open-source Clojure project with the most potential to "blow up", kinda like how Scala had Spark blow up, for example.
The "pull metrics" aspect is not that interesting, storing the data, making queryable and have a decent UX is why people pay for New Relic (or DataDog), rather than hosting their own Graphite+Grafana server - I'm talking about business of course.
Well, anybody could start a SaaS startup offering a hosted Riemann if they wanted. That'd be neat actually
But you can do all of that with Riemann, but ya, you need to host it yourself. Maybe there's an AWS CloudFormation for it or something though? Or at least someone could probably do one
Like this: https://github.com/threatgrid/kiries
Gives you an ERK stack out of the box: Elastic, Riemann, Kibana
Also, I don't know about New Relic, but Riemann does real-time monitoring
Like if all you have setup is Riemann, and your app and hosts send metrics to it (you can just use one of the existing riemann host and JVM clients if you want). Then you get a dashboard of the real-time metrics from Riemann itself, and you can query its live running index
Hi guys! I don't know if this is the right section to write this, but I'd like to learn Clojure. I'm a Java developer and I'm going to try a functional language. I had to choose between Elixir and Clojure, and I think that as a Java dev it could make sense to try Clojure. I will start by reading "Clojure for the Brave and True"; is it a good book to start? And what about web development (frameworks, libraries...) and the future of Clojure in the industry? Thank you very much
@mircoporetti You'll do better asking for recommendations etc in the #beginners channel -- that's where people have opted in to helping new Clojurians in depth.
And then you can configure it to "monitor" these real-time metrics, and on some condition have it send something to somewhere, like say sent to slack, pagerduty, email, etc.
Correct me if I'm wrong, but Riemann only handles collection and short-term buffering of metrics; can it do long-term storage and integrate with Grafana?
And if you want historical, you can forward events (or filtered events, or pre-aggregated events) to some store like ElasticSearch, Influx DB, Graphite, etc.
Sorry, I will do it. Thank you
Well, you'd forward your Riemann events to ElasticSearch, and then use Grafana to visualize it
The nice thing about using a commercial service like New Relic is that a) pretty much everything you've mentioned is already available out of the box b) it requires no infrastructure of your own to support it c) you get professional support from New Relic tech folks (and they're really helpful and really good).
Right, which means you cannot do what New Relic offers with just Riemann; there's a whole stack needed behind it.
@seancorfield are you using any of the new offering from NewRelic? They've recently moved off the dreaded per-host pricing
Another possible (commercial) service to look at is Elastic Cloud which offers a full ElasticSearch/Kibana stack with log ingestion etc etc -- which you could use with Riemann I expect.
Well, ya. I mean, it's not a managed, hosted solution, so obviously it can't provide that. But it's pretty simple to use it to put together your own hosted solution.
Yes, Elastic has a Riemann plugin that can replace logstash
oh, that's neat
https://www.elastic.co/guide/en/logstash/current/plugins-outputs-riemann.html
@lukaszkorecki I don't know what we're doing from a price/plan p.o.v. since I don't deal with that side of the house -- I'm a "user" of the New Relic features and, as a developer, an integrator so all of our systems send metrics to it, and are properly monitored.
Hum... nah sorry, that's logstash to Riemann lol
Was going to say 🙂
I guess you'd just have Riemann forward things to the ElasticSearch provided to you by your Elastic Cloud hosting
@seancorfield ah makes sense, we're still a small shop so everyone does a bit of everything. Including cost control
We're small, but I'm lucky enough to be spared some of the day-to-day business minutia. Our front end team is also heavily into New Relic (but they also use Google Analytics and FullStory and...).
It's just:
```clojure
(def elastic
  (elasticsearch {:es-endpoint "http://localhost:9200"
                  :es-index "riemann"
                  :index-suffix "-yyyy.MM.dd"
                  :type "event"}))

(streams
 elastic)
```
You can also pay for a hosted Graphite, and have Riemann forward to Graphite: https://riemann.io/howto.html#forward-to-graphite