powderkeg

cgrand 2017-03-09T10:35:03.273019Z

@viesti can’t imagine how this Iterable/Iterator regression was unavoidable. Breaking stuff for fun

cgrand 2017-03-09T10:35:10.273602Z

(and employment)

viesti 2017-03-09T10:45:19.326469Z

resisting the urge to say something about Scala in general :)

viesti 2017-03-09T10:58:20.398010Z

another thing that I haven't made clear to myself is Spark runtime version vs version linked into the app

cgrand 2017-03-09T10:58:43.400119Z

?

viesti 2017-03-09T10:59:33.404672Z

can an app linked with 2.1.0 run on a cluster running 1.5.0, for example?

cgrand 2017-03-09T11:00:19.408906Z

depends on when classes are resolved

cgrand 2017-03-09T11:00:32.410351Z

Working on that right now

cgrand 2017-03-09T11:02:13.419686Z

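(defmacro ^:private compile-cond [& choices]
  ;; Expands to the expr of the first choice whose test evals truthy at
  ;; macroexpansion time; x is a sentinel meaning "no test passed".
  (let [x (Object.)
        expr
        (reduce (fn [acc [test expr]]
                  (if (eval test) (reduced expr) acc))
          x (partition 2 choices))]
    (when (= x expr)
      (throw (ex-info "No valid choice." {:form &form})))
    expr))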

cgrand 2017-03-09T11:03:07.424596Z

with that you could ship an app oblivious to Spark version as long as powderkeg is not aot compiled

cgrand 2017-03-09T11:03:28.426565Z

else keg would hardcode the spark used during aot
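
A minimal sketch, not from the thread, of how such a compile-time switch might be used. It assumes a parenthesis-corrected compile-cond (restated so the snippet stands alone), a hypothetical helper class-available?, and placeholder keywords where version-specific code would go. org.apache.spark.sql.SparkSession serves as the probe because it was introduced in Spark 2.0.

```clojure
;; Restated so this snippet stands alone.
(defmacro compile-cond
  "Expands to the expr paired with the first test that evals truthy
  at macroexpansion time; throws if no test passes."
  [& choices]
  (let [sentinel (Object.)
        expr (reduce (fn [acc [test expr]]
                       (if (eval test) (reduced expr) acc))
                     sentinel
                     (partition 2 choices))]
    (when (identical? sentinel expr)
      (throw (ex-info "No valid choice." {:form &form})))
    expr))

;; Hypothetical helper: true when the named class can be loaded.
(defn class-available? [class-name]
  (try
    (Class/forName class-name)
    true
    (catch ClassNotFoundException _ false)))

;; The keywords are placeholders for real version-specific code.
(def spark-flavor
  (compile-cond
    (class-available? "org.apache.spark.sql.SparkSession") :spark2
    (class-available? "org.apache.spark.SparkContext")     :spark1
    :else                                                  :no-spark))
```

Because the tests run during macroexpansion, the chosen branch is fixed whenever the namespace is compiled, which is why AOT compilation would pin the Spark version present at build time.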

cgrand 2017-03-09T11:47:08.646166Z

Having to ship powderkeg builds for both Scala 2.10 and 2.11 is no fun, and neither is asking users to add chill

cgrand 2017-03-09T11:47:13.646456Z

any idea?

viesti 2017-03-09T12:00:36.712704Z

Scala binary compatibility :picard-facepalm:

viesti 2017-03-09T12:02:12.721500Z

flambo seems to support only 2.10

viesti 2017-03-09T12:03:32.728833Z

http://spark.apache.org/downloads.html says: Note: Starting version 2.0, Spark is built with Scala 2.11 by default. Scala 2.10 users should download the Spark source package and build with Scala 2.10 support.

cgrand 2017-03-09T12:07:58.750101Z

raaaaah

cgrand 2017-03-09T12:08:43.753767Z

can we detect scala version at runtime?

viesti 2017-03-09T12:08:54.754524Z

flambo seems to have 0.8.0 for spark 2.x and 0.7.2 for spark 1.x

viesti 2017-03-09T12:08:56.754667Z

😄

viesti 2017-03-09T12:09:18.756428Z

guessing that they just dropped with 0.7.2 🙂

viesti 2017-03-09T12:09:36.757836Z

hmm

cgrand 2017-03-09T12:09:41.758218Z

(I’m thinking about shading chill twice and using the right one)

viesti 2017-03-09T12:10:47.763312Z

hmm, is it even possible to load chill conditionally?

viesti 2017-03-09T12:16:30.790899Z

user=> (import 'scala.util.Properties)
scala.util.Properties
user=> (scala.util.Properties/versionString)
"version 2.11.8"

viesti 2017-03-09T12:17:03.793520Z

found from http://www.scala-lang.org/old/node/7532

viesti 2017-03-09T12:18:02.798617Z

@cgrand it seems to be the way to detect Scala runtime version: http://stackoverflow.com/a/6968014
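
A minimal sketch, not from the thread, of turning that version string into a Scala binary version such as "2.11", which is what decides whether the 2.10 or 2.11 artifacts apply. The function name scala-binary-version is hypothetical, and the inputs are literal strings so the snippet runs without scala-library on the classpath.

```clojure
(defn scala-binary-version
  "Extracts the major.minor part from a string such as
  \"version 2.11.8\"; returns nil when no version number is found."
  [version-string]
  (re-find #"\d+\.\d+" version-string))

;; With scala-library on the classpath one would instead call:
;; (scala-binary-version (scala.util.Properties/versionString))
(scala-binary-version "version 2.11.8") ; => "2.11"
(scala-binary-version "version 2.10.4") ; => "2.10"
```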

viesti 2017-03-09T12:19:31.806060Z

on current powderkeg:

user=> (scala.util.Properties/versionString)
"version 2.10.4"

cgrand 2017-03-09T12:21:58.818087Z

in fact chill is a dep of spark itself so I can remove it

viesti 2017-03-09T12:23:08.823974Z

ah, neat, was already thinking of classloader magic http://stackoverflow.com/questions/11759414/java-how-to-load-different-versions-of-the-same-class

viesti 2017-03-09T12:24:15.829292Z

this would be quite neat actually, to be able to select the Spark version as a user

viesti 2017-03-09T12:24:59.832962Z

going to make a snack for the kids now

cgrand 2017-03-09T12:25:34.836033Z

I’m 1h ahead so lunch is long overdue 🙂

viesti 2017-03-09T12:39:51.910329Z

apropos, saw this related to DataSet/DataFrame http://spark.apache.org/docs/latest/sql-programming-guide.html#creating-datasets

cgrand 2017-03-09T13:22:55.149248Z

yeah that’s what I used

viesti 2017-03-09T13:25:57.168956Z

have to learn to read more carefully 🙂

cgrand 2017-03-09T13:27:55.181259Z

I’m not happy with the fact that I lose schema metadata too often and can’t always reconstruct a spec for the resulting dataset

viesti 2017-03-09T14:29:25.658258Z

yup, but seems promising for taking over DataFrames/Dataset/MLlib 🙂

cgrand 2017-03-09T15:12:37.083935Z

I have quickly looked at Travis CI documentation and it can spawn containers

viesti 2017-03-09T15:30:47.271223Z

haven't used Travis myself, but have heard good things about it

viesti 2017-03-09T15:34:35.311099Z

https://docs.travis-ci.com/user/docker/ and https://circleci.com/docs/1.0/docker/ look similar at start :) (enabling docker service)

viesti 2017-03-09T15:35:22.318892Z

CircleCI autodetects Clojure projects and runs lein test, Travis might do the same

viesti 2017-03-09T15:36:11.327242Z

is it better to run tests against a container than in local mode?

viesti 2017-03-09T15:36:54.334693Z

answering myself: could test against 1.x and 2.x clusters that way

cgrand 2017-03-09T15:43:13.399972Z

yes it’s definitely better because local mode shares the VM and most classloaders, so it hides bugs

cgrand 2017-03-09T15:44:14.411630Z

PoCing transducers on spark took me one day (never touched spark before) in local mode

cgrand 2017-03-09T15:44:55.419960Z

Everything else was figuring out how to have it run on a cluster.

viesti 2017-03-09T15:47:41.450964Z

yup

viesti 2017-03-09T15:48:33.459991Z

hmm so we could use this https://github.com/gettyimages/docker-spark

cgrand 2017-03-09T16:00:16.590906Z

@powderkeg I merged spark2 and spark1.5 code in https://github.com/HCADatalab/powderkeg/tree/spark2, I had local networking issues today which prevented me from testing. Please try on your own

cgore 2017-03-09T18:44:16.319637Z

I’m getting the following on the spark2 branch, with Spark 2.1.0 running: CompilerException java.lang.ClassNotFoundException: com.twitter.chill.java.RegexSerializer, compiling:(carbonite/serializer.clj:1:1)

cgore 2017-03-09T18:50:12.378019Z

And a bit worse after a lein clean

cgrand 2017-03-09T19:04:07.519669Z

lein with-profile +spark2 repl

cgore 2017-03-09T20:14:02.210027Z

oops

cgore 2017-03-09T20:14:08.211001Z

yeah, that helps 😄

cgore 2017-03-09T20:14:21.213010Z

Now I get this error, further along:

cgrand 2017-03-09T20:16:50.235674Z

Ok it looks like I botched the macro…

cgore 2017-03-09T23:08:45.717096Z

@gene

cgore 2017-03-09T23:36:05.886698Z

That looks like it’s working for me now.