Unpopular opinion: after working with Kafka for some time, I'm having trouble thinking of it as anything other than a horrible piece of software. Most of the time, it's compared against RabbitMQ or other message brokers. Then you start to use it, and it's not a message broker: there's no retry or DLX mechanism, nor individual commit of messages (making retries really hard). Its consumer is also not thread-safe, so you have to poll and process messages in a single thread... Except that this also does not work, because if your consumer is slow, Kafka drops you (even if it's still connected), so you have to poll from time to time; but well, it's not thread-safe, so you have to poll on one thread, send messages to another thread, coordinate things to ack (commit) on the "poll thread", and handle states like "pause/unpause". It also leaves it to clients to handle other clients connecting or disconnecting, defaulting to a very poor implementation of "wait for everyone to finish"... Am I missing some magic that makes Kafka so popular? If someone knows, please tell me...
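For anyone who hasn't hit this, the hand-off described above can be sketched roughly like this. It is a simulation only: `FakeConsumer` and `run` are hypothetical names, not a real Kafka client API, and the sketch assumes one in-flight message at a time. The point is just the shape: one thread owns the consumer and keeps polling (while paused) so it is not considered stalled, and commits happen on that same thread.

```python
import queue
import threading


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer; NOT a real client API."""

    def __init__(self, records):
        self._records = list(records)  # (offset, value) pairs
        self._pos = 0
        self.paused = False
        self.committed = None
        self.polls = 0

    def poll(self):
        # A paused consumer is still polled (liveness) but yields no records.
        self.polls += 1
        if self.paused or self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def commit(self, offset):
        self.committed = offset


def run(consumer, handle, total):
    """Poll on this thread, process on a worker, commit back on this thread."""
    work = queue.Queue(maxsize=1)  # poll thread -> worker
    done = queue.Queue()           # worker -> poll thread

    def worker():
        while True:
            record = work.get()
            if record is None:     # shutdown signal
                return
            offset, value = record
            handle(value)          # the slow part
            done.put(offset)

    thread = threading.Thread(target=worker)
    thread.start()
    finished = 0
    while finished < total:
        record = consumer.poll()
        if record is not None:
            work.put(record)
            consumer.pause()       # stop fetching, keep polling
        try:
            offset = done.get(timeout=0.01)
        except queue.Empty:
            continue               # worker still busy: poll again
        consumer.commit(offset + 1)  # commit = next offset to read
        consumer.resume()
        finished += 1
    work.put(None)
    thread.join()
```

Only the polling thread ever touches the consumer; the worker only sees the two queues. A real client would also have to react to partitions being revoked mid-flight, which this sketch ignores.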
@mauricio.szabo I have been knee deep in Kafka for a few weeks looking at rebalancing behavior, and understanding how the timeouts compose with each other is complex
The full lifecycle of consumer groups is never described anywhere
But you can find bits and pieces by reading source code, KIPs and config docs
my understanding was that a consumer group was a mapping from a topic to a known offset, but what you say makes me think I'm missing something
oh, it also maps clients to partitions within that group - that is complex
I never used kafka in a way that relied on that aspect
A consumer group is a set of offsets into topic partitions, but the consumers that constitute a group are dynamic and change during a failure or deploy. How that works is tricky but critical to grok
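A toy model of that split may help: the offsets belong to the group, while the partition-to-member assignment is recomputed whenever membership changes. This is a sketch under loud assumptions: `assign` is a hypothetical round-robin helper, not Kafka's actual assignor, and the `orders` topic and offsets are made up.

```python
def assign(partitions, members):
    """Round-robin partitions over the current members (illustrative only)."""
    members = sorted(members)
    if not members:
        return {}
    assignment = {m: [] for m in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


# Offsets are keyed by (topic, partition) and owned by the group, not a member.
group_offsets = {("orders", 0): 42, ("orders", 1): 17, ("orders", 2): 5}

before = assign(group_offsets, {"c1", "c2"})
after = assign(group_offsets, {"c2"})  # c1 crashed or was redeployed

# Whoever now owns a partition resumes from the group's committed offset.
resume_from = {p: group_offsets[p] for p in after["c2"]}
```

After `c1` leaves, `c2` owns all three partitions and picks up each one at the offset the group last committed, regardless of which member committed it.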
there are a lot of features in kafka that seem like attractive nuisances - the high level description sounds like a good thing to have for your app, but the way it maps to the system of behaviors in kafka turns them into tarpits (or maybe nobody I've worked on kafka apps with is using the features quite right)
someone more qualified than me should write "kafka: the good parts"
Yes, exactly - everything is incredibly low-level, and all the complexities that other systems did solve are delegated to clients. AFAIK there's no official library that handles all these edge cases, lifecycles, etc...
for example, I had an app that worked great with single partitions per topic and programmatically created topics taking their place, we had a long term plan to work with partitions if we were bottlenecked on throughput but that didn't happen
or maybe I'm remembering it slightly wrong because it was more rewarding working on my own multiplex, rather than trying to figure out the docs and config and missing source code around the one built in
Well, currently I'm facing a problem: I have slow consumers (they need to call a slow API for each message) and Kafka does not like slow consumers at all... I tried multiple libraries, and most (if not all) suffer from not implementing something that Kafka wants... And I also have no idea how people work with Kafka in single-threaded envs like Node, or ones that have a GIL like Ruby or Python (considering that you need to keep polling in the background)
my understanding is that confirming a read isn't meant to be a high level "my app state is that this message is done with", it's a low level "I'm letting the broker know that my offset can move forward", with a limited time to reply. this is not just bad design on kafka's part (though they could maybe abstract it better): if you didn't have these sorts of timeout limits, it provably could not provide the correctness guarantees that it pretends to offer. this is annoying to deal with, but still easier than doing distributed consensus by hand yourself
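One consequence of a commit being a single offset per partition rather than a per-message ack: if messages finish out of order, you can only commit up to the first gap. A minimal sketch, assuming the usual convention that the committed offset is the next offset to read; `committable_offset` is a hypothetical helper name.

```python
def committable_offset(committed, done):
    """Return the offset that is safe to commit.

    `committed` is the next offset the group would read; `done` is the set
    of offsets whose processing has finished, in any order.
    """
    nxt = committed
    while nxt in done:  # advance only across a contiguous run
        nxt += 1
    return nxt
```

For example, with `committed = 5` and offsets 5, 6 and 8 finished, the result is 7: offset 8 is done, but committing past 7 would silently skip it on restart.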
It's not just commit. In fact, commit is the least of the problems. It's the low-level API, the consumer that isn't thread-safe even though you need to use threads, the pause/unpause dance, the API for healthchecks where you need to handle the gain and loss of partitions...
When you combine all of these, and the lack of a higher-level API, it's incredibly awkward to work with, and it seems like Kafka was sold like a Ferrari but in the end is more like a "here are all the pieces, assemble it yourself, and the instruction manual is fragmented across multiple places, sites, and forums"
I mean, they use C extensions and run it outside the gil?
I've heard a lot of good things about https://pulsar.apache.org/, and the architecture and usability seems far improved from surface level inspection. E.g. actual separation of storage and serving concerns, pull based messaging support so no more long polling, stateless broker and ease of replication, etc. Definitely less of a community compared to Kafka, but I'm wondering if anyone has had real experience and can chime in about whether or not it's as good of a straight upgrade as a lot of the material suggests it to be
It's already incredibly hard to coordinate everything in a high-level language. Using C extensions to run outside the GIL and getting it right seems close to impossible
There is a GraalVM Polyglot implementation for Python which allows you to interop from Java/Clojure to Python. Even in that implementation (implemented on the GraalVM, which is a JVM) they use a GIL, because otherwise they could not support C extensions.
Especially because Kafka seems to think that all this manual coordination is a feature that you want to use and customize to your own liking
I really want to try pulsar. I had good experiences with rabbitmq and SQS (sqs only for smaller number of messages).
It is really off-topic, but I wanted to thank the Clojure community. The past two years have been wonderful thanks to you all. My first kid was born last Sunday, and I wanted to honor Clojure and name him Rich or Richard, but I got a strong veto from my wife, so I went with the next best thing: Loan-ISaac Pham (whose initials will forever be LISP). :)
Congrats David!!
https://www.youtube.com/watch?t=296&v=UNucRn2uKlU&feature=youtu.be 4m56 that windows progress bar
> 12. How important have each of these aspects of Clojure, ClojureScript, or ClojureCLR been to you and your projects? and in the answers "ease of development": together, this question and answer sound like the old communist joke: Write an essay answering the following two questions: Who is your role model? Why Stalin?
I'll be taking note of any negative responses and you will be punished
with a JIRA ticket?
Ouch!
Alex knows what I'm referring to probably ;)
seriously though, like many of the questions/answers we have retained them over many years to get longitudinal data and I think this dates back to the very early years when Chas was running it
https://cemerick.com/blog/2011/07/11/results-of-the-2011-state-of-clojure-survey.html - you can see this in the "What have been the biggest wins for you in using Clojure?" question
https://clojurians.slack.com/archives/C03RZGPG3/p1613484780385100 I wish.
that's all relative, I don't have the same experience, never had
I think preferences can be updated, changed
:)
And you must realize how strong a signal it is to everyone who feels differently. It makes it an either-or question, no room for subtlety. I either already like it and say it's important, or I can say that it's not important. Seems weird that one has to reason about why a survey question needs to cover all possible answers.
There is enough room in the survey for free text. If one looks at this as non-benevolent propaganda, you can probably find what you're looking for.
well, as it says at the top, the only required questions are the first 2, you can skip it
Not going quite as far as "propaganda"; it reminded me of the joke, not the oppression. I can totally agree that ease of development is part of the clojure experience, but it's not an overall feature and some things are missing that were not important in 2015.
So if you asked me several years ago if ease of development is something that I gained by adopting clojurescript, I would've said yes because there is much less incidental complexity, but today ease of development means competition with typescript and vscode, that is, that language's sole purpose is "ease of development", while you can do everything they do, I am not sure it's worth the effort. But using typescript it's "easy", so obviously, even though I don't like typescript, I can't say that clojurescript is anywhere close to easy to use compared to that ecosystem.
also, wizards saying that using spells is easy doesn't count.
(abracadabra problem)
;=> "Solved"
I'm expecting this discussion to turn into an argument about simple vs easy now
who I am to argue anyway
I would take simplicity of development over easy any time of the day.
exactly, but it's not really an either or, that's why i dared to mention it
True, it's not either / or, but it can be a priority thing.
@ashnur > wizards saying that using spells is easy rhickey is on record comparing clojure to a cello vs. a piano (a tool that requires much more work to find fluency in, in exchange for a kind of simplicity of mastery)
framing ts as easy and cljs as simple is kinda priming and begging the question here i think. also not how i would frame it
@noisesmith that makes only sense if we forget that clojure is software, its "shape" is much more malleable
of course not all musical problems call for cello (but they don't all call for piano either). I like this comparison because (hey this is "#off-topic" right?) I see a similarity between the way type systems aid and constrict development and the way fixed pitch instruments like piano aid and constrict music
@ashnur in terms of resulting behavior sure, but in terms of development flow and experience, languages are not equal or equivalent
otherwise we could all save ourselves a lot of hassle and write everything in the one objectively perfect language
so, Clojure?
sorry, I apologize for on-topic comments in #off-topic
The "What is on-topic for #off-topic?" debate is a classic as well. I personally love it.
who said they are equal? I only said that it doesn't make sense to make this harsh distinction, since the way we use programming languages is completely different than how we use musical instruments. Obviously there are differences, otherwise we wouldn't use different names to them... if they are equal, there is no way to tell them apart, so using different names would be only by personal preference...
Oh, sorry, I misquoted that, that wasn't on purpose.
I didn't mean the distinction as harsh, but rather as extremely important. when I'm turning my ideas into running processes I want the set of abstraction tools that best fits the domain.
or perhaps we are talking about different domains - I was talking about the domain of using the programming language not the domain of the program written (though there is something to be said for making those align)
I removed the message as it didn't add much anyway.
that debate is definitely on-topic for off-topic
makes sense, it was originally phrased like that but then it morphed
everyone here (> 80%) loves the experience working with clojure, but I think this also means that those who don't quite do the same things the same way or need to adapt to other constraints as well, are basically filtered out before they (we) can have a voice that is anywhere close to authentic
the reason I use clojure is that it fits the way I want to work, if I want strong static types I switch to ocaml, for low level stuff I'm loving zig lately
the discussion with static typing is different, typescript is not about static typing because they don't have that at runtime
my only argument is that simple and easy are not mutually exclusive. One is about opportunity cost and the other about maintenance and development costs. They can't even be paid by the same currency, just to drive home that analogy.
static typing is by definition not runtime
if you say so
I never said simple and easy were mutually exclusive, and I'm confused by the idea that "static" could mean something other than "during a build step"
I mean, OCaml or Haskell don't have run time type checks (unlike js and the jvm which do)
and I could be confused, but my understanding was this distinction was what static typing referred to
what are you referring to as "authentic" above?
just the dictionary definition
"credible", "convincing"
oh, so people won't be listened to / won't be taken seriously because they want some behavior or experience clojure doesn't offer?
are you trying to annoy me with that reading of what I said?
I am talking about a delicate shift in perspective in a complex domain. Please don't try to reduce it to some simplistic linear narrative.
no, I'm trying to understand you, instead of being annoyed
Do you know what the difference between complex and complicated is?
I still am not sure what shift of perspective or complex domain you are talking about - I thought I understood but now I'm not sure
no, please explain
pm?
sure - btw based on google complex vs. complicated looks like the sort of thing I'd use systems language to address (first order vs. second order systems)
yes
btw I was confused by the usage of "authentic" above because to me authenticity is about veracity of some type, and I wasn't able to map what would be real vs. fake in that context
I want to be able to dynamically register subscriptions on a stream of incoming messages. Incoming messages are assumed to be a flat map, and subscriptions can match either exact values on the keys of the incoming message, or a set of possible values.
;;; subscriptions
{1 {:x :a        ;; :x must be :a, :y must be :b
    :y :b}
 2 {:x #{:c :d}  ;; :x can be :c or :d
    :y :b}
 3 {:y :b}
 4 {:y :b :d 2}}
;;incoming messages
{:x :c, :y :b} => matches 2, 3
{:y :b} => matches 3
{:x :a} => matches nothing
Is there a term for doing this kind of matching and/or a way to keep the subscriptions organized so that I can add more subscriptions while keeping the computation to go from message->subscriptions optimal? I could figure out a way to do this but I have a feeling this might be a thing already that I just don't know the name of. This seems related to and probably a subset of rules engines or datalog/EAV stores, but I'm wondering if there's something more basic or specific to just keeping a tree of ANDs and ORs
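To pin down the semantics in the examples above, here is a naive baseline that checks every subscription against every message. It is sketched in Python rather than Clojure, with keywords written as strings; `matches` and `matching_ids` are hypothetical names, and it assumes that a key a subscription constrains must be present in the message while extra message keys are ignored.

```python
def matches(subscription, message):
    """True if every constrained key is present with an allowed value."""
    for key, wanted in subscription.items():
        if key not in message:
            return False
        value = message[key]
        if isinstance(wanted, (set, frozenset)):
            if value not in wanted:  # set = any one of these values
                return False
        elif value != wanted:        # scalar = exactly this value
            return False
    return True


SUBSCRIPTIONS = {
    1: {"x": "a", "y": "b"},
    2: {"x": {"c", "d"}, "y": "b"},
    3: {"y": "b"},
    4: {"y": "b", "d": 2},
}


def matching_ids(subscriptions, message):
    """Linear scan: cost grows with the number of subscriptions."""
    return sorted(i for i, s in subscriptions.items() if matches(s, message))
```

This reproduces the three results listed above, but every message pays for every subscription, which is the cost the question is trying to avoid.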
there's also similarity to graph style execution (eg. what make(1) and plumatic/plumbing do), except those are single dispatch (deep and not wide) and it seems like you want full traversal
this is also the dominant pattern in audio synthesis: you make a graph of which nodes process the result of which other nodes, you start with your output node, and walk up the graph to find all the things to calculate, then propagate the data back down the graph
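A rough sketch of that pull-style evaluation: start at the output node, walk up to collect only the nodes it depends on, then compute them dependencies-first. The graph shape, node names and `evaluate` are all made up for illustration, and numbers stand in for audio buffers.

```python
def evaluate(graph, output, inputs):
    """graph: node -> (function, [dependency names]); inputs: leaf values."""
    order, seen = [], set()

    def visit(node):
        if node in seen or node in inputs:
            return
        seen.add(node)
        _, deps = graph[node]
        for dep in deps:
            visit(dep)
        order.append(node)  # post-order: dependencies come first

    visit(output)           # walk up from the output node
    values = dict(inputs)
    for node in order:      # propagate data back down
        fn, deps = graph[node]
        values[node] = fn(*(values[d] for d in deps))
    return values[output], order


GRAPH = {
    "osc": (lambda freq: freq * 2, ["freq"]),
    "gain": (lambda osc, level: osc * level, ["osc", "level"]),
    "out": (lambda gain: gain + 1, ["gain"]),
    "unused": (lambda freq: freq - 1, ["freq"]),  # never reached from "out"
}

result, order = evaluate(GRAPH, "out", {"freq": 220, "level": 0.5})
```

Here `unused` is never computed because nothing on the path to `out` depends on it.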
rete is sort of like a database in reverse. in a database you have a bunch of data and you want to organize it for fast lookup when you are given a query, with rete you have a bunch of rules and you want to organize them for fast matching when you are given data
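In the same "organize the rules, then look up by data" spirit, but much simpler than Rete, the subscriptions from the question could be kept in an inverted index from (key, value) pairs to subscription ids, with a per-message hit count. This kind of matching is often described as content-based publish/subscribe. The sketch below is Python with keywords as strings; `SubscriptionIndex` is a hypothetical name, values are assumed hashable, and it is not how clara-rules or any other library works.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Inverted index: (key, value) -> ids of subscriptions accepting it."""

    def __init__(self):
        self._by_pair = defaultdict(set)
        self._size = {}  # subscription id -> number of constrained keys

    def add(self, sub_id, subscription):
        # Note: an empty subscription (no constraints) is never reported.
        self._size[sub_id] = len(subscription)
        for key, wanted in subscription.items():
            values = wanted if isinstance(wanted, (set, frozenset)) else {wanted}
            for value in values:
                self._by_pair[(key, value)].add(sub_id)

    def match(self, message):
        hits = defaultdict(int)
        for pair in message.items():
            for sub_id in self._by_pair.get(pair, ()):
                hits[sub_id] += 1  # one satisfied key for this subscription
        # A subscription matches when all of its keys were satisfied.
        return sorted(s for s, n in hits.items() if n == self._size[s])


index = SubscriptionIndex()
index.add(1, {"x": "a", "y": "b"})
index.add(2, {"x": {"c", "d"}, "y": "b"})
index.add(3, {"y": "b"})
index.add(4, {"y": "b", "d": 2})
```

Adding a subscription only touches its own (key, value) entries, and matching only visits subscriptions that share at least one pair with the message, instead of scanning all of them.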
cool
http://www.clara-rules.org/ might be something to look at
Daaaang now I'm looking for an excuse to use that... I guess it'd be useful for core.logic stuff?