flambo

jrotenberg 2016-05-20T19:31:55.000045Z

@ccann: how are you creating the schema?

jrotenberg 2016-05-20T19:37:46.000046Z

seems like the dataframe api is way better for this on both ends

jrotenberg 2016-05-20T19:38:20.000047Z

like

jrotenberg 2016-05-20T19:38:23.000048Z

way better

ccann 2016-05-20T19:51:03.000049Z

(:import [org.apache.spark.sql.types DataTypes Metadata StructField StructType])

(defn my-schema
  []
  ;; StructField's constructor takes (name, data-type, nullable?, metadata);
  ;; (Metadata/empty) supplies the no-op metadata the interop call requires.
  (let [coll [(StructField. "id" DataTypes/FloatType true (Metadata/empty))
              (StructField. "field_a" DataTypes/FloatType true (Metadata/empty))
              (StructField. "field_b" DataTypes/StringType true (Metadata/empty))]
        fields (into-array StructField coll)]
    (StructType. fields)))

ccann 2016-05-20T19:51:52.000052Z

^ e.g. @jrotenberg
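
[editor's note: a slightly lighter sketch of the same schema, using Spark's DataTypes factory helpers, which default the Metadata argument so there is one less interop class to deal with; this variant is an assumption, not from the chat]

;; Same three-field schema via DataTypes/createStructField and
;; DataTypes/createStructType; a Clojure vector satisfies the
;; java.util.List parameter that createStructType expects.
(defn my-schema
  []
  (DataTypes/createStructType
    [(DataTypes/createStructField "id" DataTypes/FloatType true)
     (DataTypes/createStructField "field_a" DataTypes/FloatType true)
     (DataTypes/createStructField "field_b" DataTypes/StringType true)]))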

jrotenberg 2016-05-20T19:57:05.000054Z

cool

jrotenberg 2016-05-20T19:57:27.000055Z

i think i figured out a (really hacky) way to create it dynamically

jrotenberg 2016-05-20T19:58:23.000056Z

using the json reading stuff

jrotenberg 2016-05-20T19:58:26.000057Z

not pretty

jrotenberg 2016-05-20T19:58:27.000058Z

but then
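
[editor's note: presumably something along these lines; a minimal sketch of the JSON schema-inference trick, where the fn name, the sql-ctx binding, and the sample path are all assumptions, not from the chat]

;; Let Spark's JSON reader scan sample records and infer the types,
;; then pull the resulting StructType back out. sql-ctx is an assumed
;; SQLContext constructed elsewhere from the Spark context.
(defn inferred-schema
  [sql-ctx sample-path]
  (-> sql-ctx .read (.json sample-path) .schema))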

ccann 2016-05-20T20:03:45.000059Z

ah nice

ccann 2016-05-20T20:04:22.000060Z

working with spark dataframes from clojure has been a nightmare, for what it’s worth 🙂

ccann 2016-05-20T20:04:36.000061Z

I’m being dramatic, but it’s not very pleasant

jrotenberg 2016-05-20T20:17:30.000062Z

yeah

jrotenberg 2016-05-20T20:18:03.000063Z

basically all of our code that gets touched by anyone else is in scala right now

jrotenberg 2016-05-20T20:18:16.000064Z

so there are other nightmares to be had

jrotenberg 2016-05-20T20:18:36.000065Z

a nightmare for every season

jrotenberg 2016-05-20T20:20:30.000066Z

but i just inherited a codebase from a guy who is bailing to do something more interesting

jrotenberg 2016-05-20T20:20:35.000067Z

lucky bastard

sorenmacbeth 2016-05-20T23:36:09.000068Z

@ccann: nightmare because it requires a ton of java interop?