onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
gklijs 2018-05-11T06:48:32.000290Z

I just read the intro. We actually just had a discussion about this at work. Is there any chance of using it in a non-cloud environment?

dbernal 2018-05-11T12:39:28.000427Z

@lucasbradstreet thanks so much! That makes a lot of sense

manderson 2018-05-11T13:47:45.000458Z

šŸ‘ Congrats on releasing Pyrostore. Looks awesome! It's the exact tool to fit the architecture I've been working on for the last couple of years. Wish I had it when I started šŸ™‚ Keep up the great work!

šŸ‘ 2
michaeldrogalis 2018-05-11T14:43:12.000607Z

Thanks @manderson!

nrako 2018-05-11T16:05:37.000203Z

For Pyrostore, I am trying to understand a bit more. I see this on the blog:

Pyrostore's consumer reads records directly out of cloud storage, and it's intelligent enough to cross its reads back into Kafka when records are not yet available in the archive.
I don't think I follow what "cross its reads back into Kafka" means. Is Pyrostore intended to be a Kafka-history-stream alongside Kafka, to be used by select consumers? Or would you envision all consumers reading from "cloud storage" as a scalable stream with cost-effective, scalable storage around Kafka? Both/and?

michaeldrogalis 2018-05-11T16:51:37.000108Z

@nrako Both, sort of. The archive in cloud storage will always be a little behind what's actually in Kafka because, physics. Consumers can choose a policy for which storage they read from when the records they want exist in both.

michaeldrogalis 2018-05-11T16:52:02.000349Z

It lets you trade off read scalability (better against the cloud), latency (better against Kafka), availability (probably better against the cloud), etc.
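
As a purely hypothetical illustration of that routing idea (this is not Pyrostore's actual API), a read path could prefer the archive and cross back into Kafka for offsets the archive has not caught up to yet:

```clojure
;; Hypothetical sketch only -- not Pyrostore's API. The archive and the Kafka
;; tail are modelled as in-memory maps of offset -> record to show the routing.
(defn read-record
  "Prefer the cloud archive; cross back into Kafka when the requested offset
   hasn't reached the archive yet."
  [{:keys [archive kafka]} offset]
  (if (contains? archive offset)
    {:source :archive :record (get archive offset)}   ; scalable, cheap reads
    {:source :kafka   :record (get kafka offset)}))   ; low-latency tail reads

;; Offsets 0-2 are already archived; 3-4 so far exist only in Kafka.
(def topic-view {:archive {0 "a" 1 "b" 2 "c"}
                 :kafka   {3 "d" 4 "e"}})

(read-record topic-view 1) ;=> {:source :archive, :record "b"}
(read-record topic-view 4) ;=> {:source :kafka, :record "e"}
```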

nrako 2018-05-11T17:07:49.000357Z

I see. Thanks for the note. So Pyrostore proposes to be an infinite, cost-effective replication of the Kafka stream, and where Pyrostore consumers subscribe (whether to Kafka itself or to the archive) is a configurable implementation detail.

lucasbradstreet 2018-05-11T17:11:50.000604Z

That's a pretty good summary, yes.

nrako 2018-05-11T17:27:02.000351Z

Sounds great. Just now reading Designing Data-Intensive Applications and thinking through what an implementation would look like. An infinite Kafka stream seems critical. Thanks for the feedback...

michaeldrogalis 2018-05-11T17:50:57.000299Z

Great book 🙂

eoliphant 2018-05-11T18:04:07.000787Z

Hi, I have a quick conceptual question. I've played around with Onyx for some simple use cases. I'm now looking into implementing something along the lines of calderwood's commander pattern, and just discovered there's already an Onyx example 🙂 My question is more around 'units of deployment' with Onyx. Let's say I take your commander example, it's my 'accounts' processor, all good. Now I want to add a 'customers' processor to the mix, keeping its state in its own Datomic db. Is it simply a matter of a similar project that I jar up and point to the same ZooKeeper/Kafka/etc.?

lucasbradstreet 2018-05-11T18:08:07.000439Z

@eoliphant Onyx is pretty flexible in this respect. The main thing is that the jar that is started for a given tenancy contains all of the code necessary to run the jobs for that tenancy.

lucasbradstreet 2018-05-11T18:08:53.000276Z

@eoliphant so you could have two separate jars on two separate tenancies, each from a project that runs its own code. Or you could have a jar that is able to run code for both, on the same tenancy.

lucasbradstreet 2018-05-11T18:09:26.000633Z

Or lastly you could have a jar that can run code for both on separate tenancies, which gives you some more scheduling / node isolation.
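
A rough sketch of the separate-tenancies option, assuming recent Onyx peer-config keys (older releases used :onyx/id rather than :onyx/tenancy-id); the ZooKeeper address, scheduler, and messaging settings below are placeholders:

```clojure
;; Rough sketch, not a drop-in config: two services share ZooKeeper/Kafka but
;; run on separate tenancies, so their peers and jobs are isolated from each other.
(def base-peer-config
  {:zookeeper/address        "zk:2181"
   :onyx.peer/job-scheduler  :onyx.job-scheduler/balanced
   :onyx.messaging/impl      :aeron
   :onyx.messaging/bind-addr "localhost"
   :onyx.messaging/peer-port 40200})

(def accounts-peer-config  (assoc base-peer-config :onyx/tenancy-id "accounts"))
(def customers-peer-config (assoc base-peer-config :onyx/tenancy-id "customers"))

;; Each service's jar starts peers against its own config and submits its own job:
;; (def peer-group (onyx.api/start-peer-group accounts-peer-config))
;; (def peers      (onyx.api/start-peers 6 peer-group))
;; (onyx.api/submit-job accounts-peer-config accounts-job)
```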

eoliphant 2018-05-11T18:36:49.000710Z

Ok, that helps. In my case these guys are basically microservice/command processors, so yeah, they have all the code they need for the commands they handle and for processing events they may be interested in. So conceptually they should be relatively independent. Based on what you're saying, it sounds like, for me, each service/processor should be in its own tenancy.

lucasbradstreet 2018-05-11T18:37:52.000456Z

Sounds right. It'll be easier to schedule as you can just add more nodes to a tenancy as you wanna scale up.

eoliphant 2018-05-11T18:38:03.000298Z

So beyond that, say these guys are Dockerized, etc. I'd just run 1 to n copies for reliability, etc.?

eoliphant 2018-05-11T18:38:11.000168Z

gotcha

lucasbradstreet 2018-05-11T18:38:46.000674Z

Yeah, you can add more peers than you need so the job will continue running as nodes fail
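
A minimal sketch of that shape, reusing the hypothetical accounts-peer-config above: every Docker replica of a service runs the same launcher, and starting more virtual peers across replicas than the job strictly needs keeps the job running when a node fails.

```clojure
(require '[onyx.api])

;; Hypothetical launcher: each container for the "accounts" service starts its
;; own peer group and its share of virtual peers. If the job needs only 3 peers,
;; two replicas starting 3 each leave headroom to survive a node failure.
(defn start-node!
  [peer-config n-peers]
  (let [peer-group (onyx.api/start-peer-group peer-config)]
    {:peer-group peer-group
     :peers      (onyx.api/start-peers n-peers peer-group)}))

;; e.g. (start-node! accounts-peer-config 3) on each of two nodes.
```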