Hi everyone, I am new to Onyx. I wanted to know what the appropriate way is to synchronize multiple streams in Onyx. (I have multiple streams of data segments whose ids may be the same across streams, and I want to emit a segment that is the merge of the latest received version of all related segments from the different streams, rather than the original segments themselves.)
that's actually easier asked than done 🙂
what you need is basically multi-stream deduplication, right?
what i found works best in those situations is to use a "staging area" that both streams write to (e.g. s3)
then periodically read from this staging area and apply de-duplication
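a rough sketch of the staging-area idea, in plain Python rather than Onyx code. everything here is made up for illustration: the `StagingArea` class, its `write` and `flush` methods, and the assumption that each segment is a dict with an `"id"` key. an in-memory list stands in for s3:

```python
from collections import defaultdict


class StagingArea:
    """In-memory stand-in for a shared staging store (hypothetical; e.g. S3 in practice)."""

    def __init__(self):
        self._records = []  # (stream, segment) pairs in arrival order

    def write(self, stream, segment):
        # Every stream appends here instead of emitting downstream directly.
        self._records.append((stream, segment))

    def flush(self):
        """Periodic pass: keep the latest segment per (id, stream), then merge per id."""
        latest = defaultdict(dict)  # id -> {stream: latest segment}
        for stream, segment in self._records:
            # Later arrivals from the same stream overwrite earlier ones.
            latest[segment["id"]][stream] = segment
        self._records.clear()

        merged = []
        for seg_id, by_stream in latest.items():
            out = {"id": seg_id}
            # Merge in a fixed stream order so the result is deterministic.
            for stream in sorted(by_stream):
                out.update(by_stream[stream])
            merged.append(out)
        return merged
```

you would call `flush()` on a timer. note that this sketch forgets everything after each flush, so "latest" only means "latest within one flush interval"; if fields collide across streams, the stream that sorts last wins.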
alternatively, an approach that might work for you is to keep track of "seen" ids
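one possible reading of the "seen ids" idea, again as plain Python and not Onyx API. the `SeenIds` class and `on_segment` method are hypothetical names, and segments are assumed to be dicts with an `"id"` key. it remembers the latest segment per stream for every id it has seen and returns the merged view on each arrival:

```python
class SeenIds:
    """Tracks, for each seen id, the latest segment received from each stream."""

    def __init__(self):
        self._seen = {}  # id -> {stream: latest segment}

    def on_segment(self, stream, segment):
        by_stream = self._seen.setdefault(segment["id"], {})
        by_stream[stream] = segment  # replace the older version from this stream

        # Merge the latest versions from all streams, in a fixed stream order.
        merged = {}
        for name in sorted(by_stream):
            merged.update(by_stream[name])
        return merged
```

the state grows with the number of distinct ids, so a real version would probably need some expiry or eviction policy.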