i'm trying to use kafka's command line utils to check the offsets for a topic that onyx is reading from
but i can't get it to return any results, is onyx 0.12 using consumer groups to track offsets?
or am i just missing something obvious?
@chrisblom It does not use Kafka's offset tracking, no. It commits offsets with the checkpoint object to S3.
You can use the resume points API to grab what's in the checkpoint for a job and find the offset.
thanks
No problem -- let me know if you get stuck, happy to help further.
So for my understanding, onyx-kafka performs manual offset control?
hello all!
@eelke Yep.
out of curiosity, how does Onyx relate to or use the “exactly-once” semantics in Kafka 0.11? https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
We need to control the time of checkpointing and restoration, and also colocate the offset with the rest of the checkpoint for atomicity.
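A rough sketch of why colocating the offset with the checkpoint matters. This is not Onyx's implementation or API: `InMemoryStore`, `checkpoint`, `restore`, and `run` are hypothetical names, and a dict stands in for S3. It assumes a single object write is all-or-nothing.

```python
import json


class InMemoryStore:
    """Hypothetical stand-in for a durable object store such as S3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        # Assumption: one put either fully succeeds or does not happen.
        self._objects[key] = body

    def get(self, key):
        return self._objects.get(key)


def checkpoint(store, job_id, offset, state):
    # The offset and the aggregation state are written as ONE object,
    # so a restore can never observe one without the other.
    store.put(job_id, json.dumps({"offset": offset, "state": state}))


def restore(store, job_id):
    body = store.get(job_id)
    if body is None:
        return 0, 0  # no checkpoint yet: start of log, empty aggregate
    snapshot = json.loads(body)
    return snapshot["offset"], snapshot["state"]


def run(log, store, job_id, checkpoint_every, crash_at=None):
    """Sum the log from the last checkpoint, optionally crashing midway."""
    offset, total = restore(store, job_id)
    while offset < len(log):
        if crash_at is not None and offset == crash_at:
            raise RuntimeError("simulated crash")
        total += log[offset]
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint(store, job_id, offset, total)
    return total
```

If the job crashes between checkpoints, the next run rewinds both the offset and the aggregate to the same snapshot, so replayed messages are not double counted. Committing the offset to Kafka separately from the state would leave a window where the two disagree.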
@schmee Onyx supports exactly once aggregations, irrespective of whether you're using Kafka 0.11 or below.
Kafka's "exactly once" supports transactions across topics. In Onyx (and likewise in Flink, Spark, etc.), exactly once refers to how data is aggregated.
Bit of an overloaded term tbh
ok, so Onyx does not leverage that feature in Kafka in any way?
Not yet, no. It's on our road map to add. The idea is that Onyx would be able to ingest data from Kafka, and also commit data to another topic transactionally.
sweet, thanks for the info 👍
Sure thing!
Onyx 0.12 has officially been released 🎊
Change notes in here: https://github.com/onyx-platform/onyx/blob/0.12.x/changes.md#0120
Lots of good stuff, a few breaking changes to prepare for the future. These will settle down now.
New onyx/task type: :reduce.
This is a big deal as it allows you to create tasks that do not emit their transformed segments downstream. Great for tasks that use :trigger/emit.