In the java world there seems to be a trend with a lot of buzz around reactive extension and implementations like rxjava/reactor. I have a proprietary and fairly large java api which uses this "reactive pattern" and I would like to write a clojure version of it. For context, the java api is really just a wrapper around a web service api and all actual actions result in http calls over the wire. As an example, we could have code on this shape (in this case using the jvm CompletableFuture class, but the shape applies to rx as well):
var pipe = userService.getUserByKey("some-user-key")
.thenCombineAsync(
userService.getUserGroupByKey("some-group-key"),
(user, group) ->
userService.assignUserToGroup(user, group))
.thenComposeAsync(this::checkUserForFraud)
.exceptionally(throwable -> {
<http://logger.info|logger.info>("log some error", throwable)
})
pipe.get()
where the last get
would trigger the execution of the "asynchronous chain of operations" defined above (the get is contrived, just here to indicate that the pipe is only executed if somebody asks for the result).
How would one go about translating this pattern into an idiomatic clojure api?
Use core.async pipelines? Use plain core.async? Transducers? Plain old fp? Use something else entirely? Java interop with the completable future (yeah I'm not going there but figured I'd throw it into the competition)? Pros, cons?
In a perfect world I would like to retain the ability to declaratively define chains of operations as in the above, but if the downsides drown you, living without it might be an option.@ben.sless thanks for the pointer, this is useful. Will add it to my list of things to consider. Though I am starting to lean towards the downsides with the reactive pattern weighing more than the upsides, i.e. perhaps a clojure api would be best served staying away from this pattern.
I share the feeling, however, until we get project loom (unless you want to run experimental JDK in production) your options are: - synchronous code - buy in to async world which colors all your code - buy into streaming abstraction (such as core async pipelines) which will dictate your entire architecture If all of this happens in a web handler perhaps you could try interceptors as well. I also saw you mentioned synchronous db access due to jdbc. You can try to square that circle with core async, or alternatively give the vertx client a try. I saw metosin used it in porsas but haven't tried it myself. (http://github.com/metosin/porsas)
The sync db comment was not from me. In my particular case for this particular api it would be making calls to rest apis to assemble a response to a client http request
We can for the sake of this example assume only single valued things in the pipe, i.e. not reactive streams like Flux
et al
Looking at the code above, I'm struggling to understand how you benefit from async operations, if you're calling get
straight after?
get is contrived
OK, so I'd probably just reach for future
and deref
at first, but core.async
for coordinating results might be useful in larger examples.
so for example in micronaut which is a web service framework, you can return a reactive type (Flux or Mono, but the shapes look similar to the above) from a controller method and this lets the web server use the request thread for something else, have the lengthy operation run on some thread pool and get notified when the long running operation is done
ok
I get the sneaking feeling that we would somehow drop to a lower level of abstraction with future and friends vs letβs say project reactor
We have some macros that wrap CompletableFuture https://github.com/worldsingles/commons/blob/master/src/ws/clojure/extensions.clj#L124-L150 but I don't think we're using them at work right now.
(those exist because the callback version of future
never got added to the language)
my somewhat uninformed read is that the whole point of CompletableFuture and later the reactive extensions of which reactor and rxjava are implementations is the composability, i.e. we're not just saying here's a future value but defining a chain of operations which can be passed around
I think that unless you're genuinely getting performance benefits from async code, you're just making stuff more complex.
yeah I'm leaning in that direction as well
The whole point is to juggle N tasks between M threads where N >> M
my exposure to the api is not large enough for me to say with certainty but the code looks horrible and what it is doing is really not rocket science
If you don't need that performance, don't do it
well this would be living in a cloud micro service environment so not tying a thread to each request might be relevant
If you mapped the above example onto futures and derefs, you'd get something like:
(check-for-fraud @(assign-user-to-group @(get-user-by-key "some-user-key") @(get-user-group-by-key "some-group-key")))
(assuming all those functions returned future
values)I find it hard to believe it's really worth running such low-level tasks on threads...
and error handling?
Just wrap it in try
/`catch` π
When you have a reactive framework, the temptation is to write everything in that style which often doesn't make sense, IMO.
Some things definitely benefit from async execution but the granularity of that is important.
There are kinda two things going on here 1 - Doubt at you needing to do the reactive way 2 - How to do the reactive way anyways
right and I totally hear you with the doubt part, I have a healthy dose of it myself so could be it's where this ends up.
But to get the options clear, going with 2 you would look at future
as a building block with possible some async channels and coordination thrown in?
Given that I/O is going to be synchronous, you're going to either be blocked on that or use up threads to try to get them async. But are two simple "get" calls (presumably to a DB? or maybe an API, I guess) going to be so slow that using two threads to run them and then joining the results is faster than just running them sequentially?
well, that is one way
you can also just use the CompletableFuture api like the code already does
and just interop
right, yeah, that is an option...perhaps one I was hoping to avoid
(or some syntactic sugar to make it easier to read)
and there are two issues with doing that
1. The unavoidable async-ness of the code
2. The syntax
2 is eminently solvable
1 isn't
what is really important to note regardless is that your database access is going to be synchronous
no matter what you do
(thanks jdbc)
so if most of what your app does is talk to a db, there is reason to believe that the actual performance benefits of "async-ifying" small tasks like that are gonna be pretty small
in this case the api-to-be wraps a rest service and the java side decided to build it based on CompletableFuture
so the scenario would be that there is a user request to a cloud server, the code we are talking about sits on that server and is used to talk to another web service over http. Scale could potentially be quite large so performance might come into the picture.
I'm simplifying but that's essentially the scenario
so most of what the app would do is talk to other web services and then serve a http result
Hitting a REST API for every "low-level" operation could be slow enough to warrant async/futures but that's what our obsession with microservices has led us to π
and it is now that we all say a prayer for the swift arrival of project loom amen
fibers?
Even with Loom, you're still going to have code that is full of threads and derefs. They're just cheaper.
I mean, yeah but in the context of a web service you can just thread-per-request and then the core logic wont need it
guess loom talks about tco as well but alas they seem to want to do most everything else first, tco last (totally beside the point here, I would just love to see tco)
anywho, thats probably not going to come before economic collapse from exponential global warming
so you are stuck with futures/completable futures
slash core.async channels
so let's assume we are, how does the web server situation look in clojure? can you tell say ring to handle io things on a separate pool of threads somehow? How would you actually do that part or would you have to code that yourself?
(and apologies, I have little production exposure with clojure)
would you even run a clojure web server at scale?
oh i have no real world experience with clojure, i'm just a vocal idiot - but the "simplest" way is just to run a Jetty server
and your core logic works with the ring request map
and returns a ring response map
FWIW Ring supports async handlers
right I was wondering about the "return a reactive type" or "return a future" and have the server realize what to do with it
@lukaszkorecki ah ok, yeah that is what I was fishing for
I know in pedestal (which is a few more layers of kool aid deep) you can have handlers return a core.async channel
https://www.booleanknot.com/blog/2016/07/15/asynchronous-ring.html
and I guess in this context a core.async channel would be analogous (Rich will kick me in the nuts) to say a reactor Flux
the ring system seems slightly different, but fundamentally similar - you call a callback with the value at some point in the computation
but! We run a service doing some DB calls and processing every single user request at 15ms 95p at 130 req/s. Not sure if that's a scale
The async model in Ring is kind of hard to use and is sort of "fake". But running Jetty in production -- via the default Ring adapter -- scales pretty well without needing fancy async stuff.
βοΈ this
well that is nice news, I have some mileage with jetty from previous lives
Personally, I'm still dubious about core.async in general - maybe the code we write just doesn't need it or something, but our IO heavy services just use good old thread pools (with sugar provided by claypoole).
assuming your user requests can be either paritioned predictably or do not have any node affinity, you can always scale horizontally, not sure about the perameters here yet
in that case it really is a "where do you place value" problem
lets say you can horizontally scale, but it costs the company 5x as much than if you wrote it high performance
yeah, like I said, at this point not sure if horizontal scaling will be simple or not
is living with worse code worth your life energy
: )
@mbjarland At work, almost all our services and apps are stateless and we run three instances of most of them. I wouldn't say we are "high scale" but we have thousands of concurrent users 24x7. And that's on Jetty, without much multi-threaded assistance.
when in a year or 2 you get loom for free
(or more, eternal pessimism)
I'm old enough to value my life energy
and all your async work is now useless
@seancorfield very useful information, thank you
(assuming requests do just delegate to other servers, that would probably be the case)
also that model seems alluringly simple and we would love to not complicate things if not necessary
I remember a relevant talk, uno momento
Old ops mantra is to have two of everything, no matter how performant ;-) Your super scalable is not so scalable if it has to go down when deploying new code
I was thinking of the cognitect aws api as an example of a idiomatic api
have not looked at it much but I have to say I love the data only approach...and no completable future in sight
also very repl-self-documenting which is nice
An anecdote about concurrency: years ago, when we were early on our Clojure path, we built a process that scanned our DB for updates, ran searches against a (proprietary) search engine, and then produced HTML emails. It worked nicely but we were curious about how much volume we could run through the system so I turned a few map
calls into pmap
calls... and crashed the search engine because it couldn't handle the volume. We ended up standing up two more search engine instances, just for this process. When we started to analyze the effectiveness of sending (by that point) millions of emails a day, we figured out that we got better "bang for our buck" by sending about an order of magnitude fewer emails and being more targeted about the audience (and the content) -- and at that level, we didn't need the pmap
's (and, ultimately, we didn't need those extra search engine instances). Sometimes, "scale" is not what you really need π
exactly; from a technical perspective it is cool and interesting, but from a business perspective it is not always what you're after - it is key to get your engineer to think towards the business focus
I don't buy the "boring code" argument. This problem is not essential to reactive/functional programming, it's purely about syntax, and it's only an issue in languages with poor metaprogramming support. The infamous snippet shown around 9:33 could be written like this :
(require '[missionary.core :as m])
(m/ap
(let [user (m/? (find-user-by-name ws name))
cart (m/? (m/aggregate conj (load-cart user)))
uuid (m/? (pay (transduce (map get-price) + cart)))]
(m/? (send-email (m/?= (m/enumerate cart)) uuid))))
It's arguably as much readable and maintainable as the "boring" version, concurrent email sending is made explicit, and it's still fully non-blocking.yeah this is the one^
ok I have an interrupt, thank you everybody, this was enlightening. The clojure community is really one of the many reasons to love this language
Thanks!
@emccue That talk looks interesting. Added to my "Talks to Watch" collection π
oh and I will watch the above, thanks @emccue
The only reason I pass as halfway competent is that I watch a bunch of talks
This one is by the person who wrote Reactive programming with RxJava
(I had to skim to find the reason I thought it had some authority)
https://github.com/funcool/promesa aims to work with completable futures idiomatically in Clojure. I think it does the job rather well
doesn't solve the code being async-colored
That is a great talk -- thanks for the pointer @emccue!
Late to the party, but we use manifold for async and it works well. We use yada/aleph for our service APIs, so it was a natural fit. https://github.com/aleph-io/manifold/
Any chance you can share the container model? I.e. are you running this inside of say jetty as well or are you running an aleph server which is world facing?
For our public APIs, the aleph endpoints are fronted by a pretty standard loadbalancer setup (with some machinery to handle dynamically-allocated instances). Aleph is built on Netty, so Iβm not sure how youβd use it with Jetty anyway. Our internal services also use Aleph, but those endpoints get routed to directly by the service mesh.