flambo

sorenmacbeth 2016-11-15T20:55:24.000022Z

you should be able to define and use flambo operations in a REPL connected to a cluster

sorenmacbeth 2016-11-15T20:55:30.000023Z

I do this all the time

sorenmacbeth 2016-11-15T20:55:43.000024Z

@mtijerina ^

sorenmacbeth 2016-11-15T20:56:32.000025Z

one key is that if you define it in a REPL, you need to use anonymous f/fn's for your operations

sorenmacbeth 2016-11-15T20:56:43.000026Z

those are serializable

sorenmacbeth 2016-11-15T20:57:39.000027Z

if you def or defn in the user namespace, those won't exist on the cluster worker nodes when you try to execute them

sorenmacbeth 2016-11-15T20:57:55.000028Z

so you can do stuff like:

sorenmacbeth 2016-11-15T20:57:56.000029Z

(def whatever (f/text-file sc "s3n://something"))
(def res (-> whatever (f/flat-map (f/fn [x] ...)) (f/reduce (f/fn [x y] (merge x y))) f/collect))
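To make the def/defn point concrete, here is a minimal sketch of the two cases. It assumes `sc` is a Spark context already created in the REPL session; `words`, `word-length`, and `lengths` are hypothetical names used only for illustration. The commented-out form would probably still work against a local master, because there the REPL and the executor share one JVM.

```clojure
(require '[flambo.api :as f])

;; Assumption: `sc` is a Spark context already created in this REPL session.
(def words (f/parallelize sc ["spark" "flambo" "repl"]))

;; `word-length` is a hypothetical helper that exists only in the REPL's
;; user namespace; nothing in the uberjar on the workers defines it.
(defn word-length [w] (count w))

;; Expected to fail on a real cluster, since the task refers to user/word-length:
;; (-> words (f/map (f/fn [w] (word-length w))) f/collect)

;; The logic lives inside the f/fn body, so it is serialized with the task.
(def lengths
  (-> words
      (f/map (f/fn [w] (count w)))
      f/collect))
```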

sorenmacbeth 2016-11-15T21:02:06.000033Z

something I do as well to make working in a repl on a cluster easier, is I make a namespace in my uberjar that has a bunch of commonly used functions, such as date/time stuff

sorenmacbeth 2016-11-15T21:02:24.000034Z

in my REPL, I switch into that namespace and work there

sorenmacbeth 2016-11-15T21:02:46.000035Z

that way, those common functions do exist on the worker nodes
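A rough sketch of that setup, under stated assumptions: `myapp.repl-utils` and `day-of` are hypothetical names, the namespace is compiled into the uberjar that the cluster runs, `sc` is an existing Spark context, and each input line is assumed to start with an ISO-8601 timestamp.

```clojure
;; File src/myapp/repl_utils.clj, built into the uberjar shipped to the workers.
(ns myapp.repl-utils
  (:require [flambo.api :as f]))

(defn day-of
  "Returns the date part (first 10 characters) of an ISO-8601 timestamp string."
  [timestamp]
  (subs timestamp 0 10))

;; At the REPL connected to the cluster: load the namespace and switch into it.
(require 'myapp.repl-utils)
(in-ns 'myapp.repl-utils)

;; `day-of` resolves on the workers too, because the same namespace is in the jar.
(def days
  (-> (f/text-file sc "s3n://something")
      (f/map (f/fn [line] (day-of line)))
      f/collect))
```

Helpers added to this namespace only at the REPL, without rebuilding the jar, would run into the same problem as definitions in the user namespace.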