you should be able to define and use flambo in a REPL connected to a cluster
I do this all the time
one key is that if you define things in a REPL, you need to use anonymous f/fn's for your operations
those are serializable
if you def or defn in the user namespace, those won't exist on the cluster worker nodes when you try to execute them
so you can do stuff like:
(def whatever (f/text-file sc "s3n://something"))
(def res (-> whatever (f/flat-map (f/fn [x] ...)) (f/reduce (f/fn [x y] (merge x y))) f/collect))
something else I do to make working in a REPL on a cluster easier: I make a namespace in my uberjar that has a bunch of commonly used functions, such as date/time stuff
in my REPL, I switch into that namespace and work there
that way, those common functions do exist on the worker nodes
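a rough sketch of what that shared namespace could look like. The namespace name myjob.common and the helpers parse-day and day-of-week are made up for illustration, not anything from flambo. The flambo part is left as comments since it needs a live SparkContext (sc) and the uberjar on the workers:

```clojure
;; Hypothetical utility namespace that gets packaged into the uberjar,
;; so the same vars exist on the driver REPL and on the worker nodes.
(ns myjob.common
  (:import [java.time LocalDate]
           [java.time.format DateTimeFormatter]))

(defn parse-day
  "Parse an ISO-8601 date string such as \"2015-03-01\" into a LocalDate."
  [s]
  (LocalDate/parse s DateTimeFormatter/ISO_LOCAL_DATE))

(defn day-of-week
  "Return the day of the week for an ISO date string, as an upper-case string."
  [s]
  (str (.getDayOfWeek (parse-day s))))

;; In the REPL on the cluster (assumes flambo is on the classpath and `sc` exists):
;; (require 'myjob.common)   ; load it first so in-ns doesn't create an empty ns
;; (in-ns 'myjob.common)
;; (require '[flambo.api :as f])
;; (-> (f/text-file sc "s3n://something")
;;     (f/map (f/fn [line] (day-of-week line)))
;;     f/collect)
```

the operations themselves are still anonymous f/fn's; only the helpers they call come from the namespace that ships in the jar.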