flambo

mtijerina 2016-11-14T03:46:19.000020Z

I'm fairly new to Flambo and have been working locally in a REPL as well as submitting jobs to YARN with spark-submit. Recently, I've been trying to work in a REPL that uses cluster resources by running spark-submit with a main class of either clojure.main or one that starts an nREPL server.

mtijerina 2016-11-14T03:46:43.000021Z

However, I've been running into some issues when doing this. If I define anything in the REPL that is not part of the uberjar or data in an RDD and then try to use it in a Flambo serializable function, it cannot be resolved during execution. I think this comes down to how things are serialized: the REPL-defined values do not exist on the worker nodes. Does anyone have any feedback or suggestions?