onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
lucasbradstreet 2018-04-29T01:10:38.000063Z

ZooKeeper is running on localhost:2181?

sparkofreason 2018-04-29T01:12:02.000014Z

Yes

lucasbradstreet 2018-04-29T01:13:00.000075Z

If you telnet to it and type “ruok” does it respond? Looks like a pretty clear connection issue

sparkofreason 2018-04-29T01:16:14.000043Z

It responds.

lucasbradstreet 2018-04-29T01:19:39.000043Z

K. That’s pretty odd then. Does it ever work?

sparkofreason 2018-04-29T01:20:53.000043Z

It always happens after a certain amount of work is sent through Onyx. Since I'm working locally, ZK is used for checkpoints. Perhaps it's related to that?

lucasbradstreet 2018-04-29T01:22:51.000076Z

Yeah, that would make sense, especially if you’re using Onyx windows.

lucasbradstreet 2018-04-29T01:23:21.000009Z

There’s a 1MB limit on the ZK znode size, and also if you’re not using onyx.api/gc-checkpoints then you will keep adding data to ZK

lucasbradstreet 2018-04-29T01:25:09.000062Z

There’s a reason the word “insanely” is in the name of the config option that overrides it. I don’t really have a great way to prevent people from blowing their feet off there.

sparkofreason 2018-04-29T01:25:41.000007Z

Any recommendations for other checkpoint storage to use for local dev?

lucasbradstreet 2018-04-29T01:27:13.000076Z

Others have had luck with a local S3 server written in go. I forget what its name was.

lucasbradstreet 2018-04-29T01:27:25.000009Z

This one I think? https://github.com/minio/minio

lucasbradstreet 2018-04-29T01:28:20.000068Z

Maybe we should prevent the ZK override and just push people towards minio

lucasbradstreet 2018-04-29T01:28:50.000011Z

You’ll still want to use gc-checkpoints though, because the checkpoints will keep growing otherwise
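For reference, a minimal sketch of what a periodic checkpoint GC call could look like. The `peer-client-config`, `tenancy-id`, and `job-id` arguments are placeholders supplied by the caller, and the three-argument form of `onyx.api/gc-checkpoints` is assumed from the 0.12-era API rather than confirmed in this conversation.

```clojure
(require '[onyx.api])

(defn gc-job-checkpoints!
  "Runs Onyx's checkpoint garbage collector for a single job.
  Assumes the (gc-checkpoints peer-client-config tenancy-id job-id) arity;
  check the onyx.api docstring for the version you run."
  [peer-client-config tenancy-id job-id]
  (onyx.api/gc-checkpoints peer-client-config tenancy-id job-id))
```

Calling a helper like this on a schedule keeps old checkpoints from accumulating, whichever storage backend holds them.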

lucasbradstreet 2018-04-29T01:30:28.000108Z

You can override the s3 endpoint setting to point to the local minio endpoint http://www.onyxplatform.org/docs/cheat-sheet/latest/#peer-config/:onyx.peer/storage.s3.endpoint
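A rough sketch of the peer-config entries this points at, assuming a minio server listening on its default local port. The bucket name, region, and endpoint address are placeholders, and apart from `:onyx.peer/storage.s3.endpoint` (linked above) the key names are recalled from the Onyx cheat sheet, so verify them there.

```clojure
;; Peer-config fragment that stores checkpoints in an S3-compatible server
;; instead of ZooKeeper. All values are illustrative assumptions.
(def checkpoint-storage-config
  {:onyx.peer/storage :s3
   :onyx.peer/storage.s3.bucket "onyx-checkpoints"          ; hypothetical bucket; create it in minio first
   :onyx.peer/storage.s3.region "us-east-1"                 ; assumed region
   :onyx.peer/storage.s3.endpoint "http://127.0.0.1:9000"}) ; assumed local minio address

;; Merge the fragment into an existing peer-config map:
;; (merge peer-config checkpoint-storage-config)
```

The peers still need credentials that minio accepts; how those are supplied depends on how S3 authentication is configured for the peers.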

sparkofreason 2018-04-29T01:30:43.000009Z

I'll give it a try. I am running checkpoint GC, but I do have a lot of state being held, so it seems likely I'm drowning ZK.

lucasbradstreet 2018-04-29T01:31:06.000108Z

Mmm

lucasbradstreet 2018-04-29T01:45:24.000001Z

Please let me know if it works out. Thanks!

sparkofreason 2018-04-29T01:47:22.000068Z

Will do. Thanks for your help.

sparkofreason 2018-04-29T03:25:27.000014Z

@lucasbradstreet minio worked great. Many thanks, that solved a huge headache.

lucasbradstreet 2018-04-29T03:31:30.000014Z

Awesome. I might just disable the ZK windowing override and force people to use minio instead.

👍 1
2018-04-29T09:38:38.000068Z

i have run into various problems in dev with storing ABS state in zookeeper as well. took me a while to realize the cause of my crashing ZK nodes

2018-04-29T09:40:38.000029Z

btw i'm reading the onyx-plugin template, and it's still a bit unclear to me what the difference between synced? and completed? is. semantically, i would say that completed? means our internal buffer is empty, and synced? means that the remote database/whatever has actually written everything to disk? still, it's a bit unclear to me what i should do where

2018-04-29T14:24:46.000013Z

in other news: i've just pushed the first version of a Google Cloud Pub/Sub input and output plugin live, available here -- https://github.com/solatis/onyx-google-pubsub would love it if someone gave it a test ride as well, and provided some feedback / reviewed the code. it's modeled after the amazon SQS plugin.

Aleh Atsman 2018-04-29T14:47:35.000052Z

Can someone please explain to me how redeployment works in onyx? Should I just build a new version of the uberjar and restart all my nodes one by one? Should I do some work manually, like stopping tasks etc?

2018-04-29T15:18:45.000071Z

@aleh_atsman you most likely want to use ABS snapshots and recover state of your jobs

2018-04-29T15:19:42.000069Z

^ you can do that using resume points

Aleh Atsman 2018-04-29T15:24:41.000194Z

thx i see, so i need to change my job and add a resume-point to it before submitting it

lucasbradstreet 2018-04-29T18:31:55.000009Z

@aleh_atsman yes, you can set up a basic structure when you build the job to only add the resume point if one exists

lucasbradstreet 2018-04-29T18:32:15.000109Z

Though for certain jobs you should probably just fail there because it’s probably an error if it can’t find one on a redeploy
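A sketch of the pattern described in the last two messages, built on `onyx.api/job-snapshot-coordinates` and `onyx.api/build-resume-point`. The helper name and the `require-snapshot?` flag are hypothetical, and the code assumes `job-snapshot-coordinates` returns nil when the old job has no snapshot.

```clojure
(require '[onyx.api])

(defn with-resume-point
  "Returns new-job with a :resume-point built from the old job's latest
  snapshot coordinates. When no coordinates are found it throws if
  require-snapshot? is true, otherwise it returns new-job unchanged.
  Assumption: job-snapshot-coordinates yields nil when nothing is stored."
  [peer-client-config tenancy-id old-job-id new-job require-snapshot?]
  (let [coordinates (onyx.api/job-snapshot-coordinates
                     peer-client-config tenancy-id old-job-id)]
    (cond
      ;; A snapshot exists: resume the new job from it.
      coordinates
      (assoc new-job :resume-point
             (onyx.api/build-resume-point new-job coordinates))

      ;; Redeploy of a stateful job with no snapshot: treat it as an error.
      require-snapshot?
      (throw (ex-info "No snapshot found for job on redeploy"
                      {:tenancy-id tenancy-id :job-id old-job-id}))

      ;; First deploy, or state is optional: submit the job as is.
      :else new-job)))

;; Usage:
;; (onyx.api/submit-job peer-client-config
;;   (with-resume-point peer-client-config tenancy-id old-job-id new-job true))
```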

lucasbradstreet 2018-04-29T18:32:29.000066Z

I might bang up an onyx-examples sample soon.

joelsanchez 2018-04-29T18:43:19.000103Z

I have a few "paths" in my workflow which don't depend on each other, but I need them to arrive in a certain order at the final step. Do I need to have just one "path" in order to get this ordering?

joelsanchez 2018-04-29T18:44:17.000105Z

customers need to be transacted before addresses, for example

lucasbradstreet 2018-04-29T18:47:31.000049Z

That is certainly one way (probably the simplest way) to go. Another way would be to collect messages in a window and flush them as the dependencies are sorted out. I wouldn’t recommend this unless you have other reasons to do so though.

lucasbradstreet 2018-04-29T18:47:56.000027Z

e.g. it’s really hard to handle the dependencies in other ways.

lucasbradstreet 2018-04-29T18:48:31.000058Z

You can scale things out in a linear workflow like the above one by using :onyx/group-by-fn or :onyx/group-by-key, which will make sure all of your messages with a certain key end up on the same peer
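A hedged sketch of a catalog entry that groups by key. The task name, function, key, and peer count are hypothetical; the flux policy and fixed peer count reflect what grouping tasks require as far as recalled from the Onyx docs, so check the cheat sheet for your version.

```clojure
;; Catalog entry for a grouped task. Every segment carrying the same
;; :customer-id value is routed to the same peer, so per-customer ordering
;; holds while the task runs on several peers.
(def transact-task
  {:onyx/name :transact-entities     ; hypothetical task name
   :onyx/fn :my.app.tasks/transact   ; hypothetical function
   :onyx/type :function
   :onyx/group-by-key :customer-id   ; hypothetical grouping key
   :onyx/flux-policy :kill           ; assumed: grouping tasks need a flux policy
   :onyx/n-peers 2                   ; assumed: fixed peer count for the grouped task
   :onyx/batch-size 20})
```

Using `:onyx/group-by-fn` instead lets the grouping value be computed from the segment by a function rather than read from a single key.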

joelsanchez 2018-04-29T18:49:16.000093Z

yes, I was worrying about scalability. I'll follow your advice 🙂

lucasbradstreet 2018-04-29T19:01:43.000070Z

Don’t be afraid to perform multiple operations in one task though. It’s better not to have too many tasks doing small things. Better to have slightly fatter tasks that you scale out in a parallel way

jqmtor 2018-04-29T19:03:55.000059Z

Maybe the docstrings in the protocol definitions help with understanding both functions a little better? https://github.com/onyx-platform/onyx/blob/0.12.x/src/onyx/plugin/protocols.clj#L28-L33

jqmtor 2018-04-29T19:04:16.000046Z

@lucasbradstreet let me know if you have other recommendations for me. I hope I am not being too annoying with these requests 😛

lucasbradstreet 2018-04-29T19:04:56.000084Z

s/annoying/awesome/

🙂 1
lucasbradstreet 2018-04-29T19:05:29.000013Z

I’ll have a look for a next task. One thing I’ve been wanting to do for a while is include onyx-examples for :onyx/type :reduce, as well as a resume point example.

jqmtor 2018-04-29T19:08:51.000033Z

that sounds good! I'll gladly look into that if you want.

lucasbradstreet 2018-04-29T19:10:08.000082Z

Great! Let’s start with onyx/type :reduce

lucasbradstreet 2018-04-29T19:10:18.000016Z

One sec.

lucasbradstreet 2018-04-29T19:11:15.000015Z

The first type I’m thinking is where you have a reduce task as a terminal node (rather than an output task) https://github.com/onyx-platform/onyx/blob/0.12.x/test/onyx/windowing/reduce_test.clj

lucasbradstreet 2018-04-29T19:11:37.000038Z

The idea there is that you don’t have a plugin set, and you collect into a window and then use something like trigger/sync to output

lucasbradstreet 2018-04-29T19:12:00.000109Z

If we could have an example like that in the style of https://github.com/onyx-platform/onyx-examples, that’d be great

lucasbradstreet 2018-04-29T19:12:20.000127Z

Then I think we can link to it from the reduce docs too

lucasbradstreet 2018-04-29T19:12:45.000104Z

I have another type of onyx/type :reduce example for you, but I’ll wait until this one is done before describing it.

lucasbradstreet 2018-04-29T19:14:12.000098Z

I provide that reduce-test as an example of what’s going on, but you may want to change the example in ways that explain what’s going on better.

jqmtor 2018-04-29T19:18:37.000023Z

Thanks for the explanation! 👍

lucasbradstreet 2018-04-29T19:23:42.000090Z

No worries. Will be around if you have any more questions.

joelsanchez 2018-04-29T20:43:19.000048Z

onyx is fantastic, yesterday I knew basically nothing and today I'm importing data from Prestashop into my Datomic-backed app at the speed of light

💯 7