onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
2019-03-03T14:33:25.006200Z

Hi everyone, I am new to Onyx. What is the appropriate way to synchronize multiple streams in Onyx? (I have multiple streams of data segments whose ids may be similar across streams, and I want to emit a segment that is the merge of the latest received version of all related segments from the different streams, rather than the original segments themselves.)

2019-03-03T14:58:33.006500Z

that's actually easier asked than done 🙂

2019-03-03T14:58:45.006900Z

what you need is basically multi-stream deduplication, right?

2019-03-03T14:59:10.007400Z

what I found works best in those situations is to use a "staging area" that all the streams write to (e.g. S3)

2019-03-03T14:59:21.007700Z

then periodically read from this staging area and apply de-duplication
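
A minimal sketch of that periodic pass, in plain Python rather than Onyx. Everything here is an assumption for illustration: the staging area is modelled as an in-memory list of `(stream_name, segment)` pairs in arrival order, `merge_staged` is a hypothetical helper, and fields are merged in sorted stream-name order so that conflicts resolve deterministically.

```python
from typing import Dict, Iterable, List, Tuple

Segment = Dict[str, object]


def merge_staged(staged: Iterable[Tuple[str, Segment]]) -> List[Segment]:
    """Collapse staged segments into one merged segment per id.

    `staged` holds (stream_name, segment) pairs in arrival order; every
    segment is assumed to carry an "id" key.
    """
    # id -> stream name -> latest segment seen from that stream
    latest: Dict[object, Dict[str, Segment]] = {}
    for stream, segment in staged:
        # A later segment from the same stream replaces the earlier one.
        latest.setdefault(segment["id"], {})[stream] = segment

    merged: List[Segment] = []
    for per_stream in latest.values():
        out: Segment = {}
        # Assumption: on conflicting keys, the stream that sorts last wins.
        for stream in sorted(per_stream):
            out.update(per_stream[stream])
        merged.append(out)
    return merged


staged = [
    ("a", {"id": 1, "x": 1}),
    ("b", {"id": 1, "y": 2}),
    ("a", {"id": 1, "x": 3}),  # newer version from stream "a"
    ("b", {"id": 2, "y": 9}),
]
result = merge_staged(staged)
```

With real storage, the list would be replaced by whatever the periodic job reads back from the staging area; the grouping and merge logic would stay the same.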

2019-03-03T15:01:28.008300Z

alternatively, an approach that might work for you is to keep track of "seen" ids
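
One way the "seen ids" idea could look, again as a plain-Python sketch and not Onyx API. The `LatestById` class and its `on_segment` method are hypothetical names; the state is kept in memory and grows without bound, so a real job would need some eviction or windowing policy. On conflicting keys, the most recently received segment is assumed to win.

```python
from typing import Dict

Segment = Dict[str, object]


class LatestById:
    """Tracks the latest segment per (id, stream) and emits the merge."""

    def __init__(self) -> None:
        # id -> stream name -> latest segment, ordered by last arrival
        self._state: Dict[object, Dict[str, Segment]] = {}

    def on_segment(self, stream: str, segment: Segment) -> Segment:
        per_stream = self._state.setdefault(segment["id"], {})
        # Remove and re-insert so this stream becomes the most recent entry.
        per_stream.pop(stream, None)
        per_stream[stream] = segment

        merged: Segment = {}
        # Oldest arrival first, so newer segments override older fields.
        for seg in per_stream.values():
            merged.update(seg)
        return merged


tracker = LatestById()
first = tracker.on_segment("a", {"id": 1, "x": 1})
second = tracker.on_segment("b", {"id": 1, "x": 5, "y": 2})
third = tracker.on_segment("a", {"id": 1, "x": 3})
```

Unlike the staging-area approach, this emits a merged segment on every arrival instead of in periodic batches.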