@lmergen great. Having others step in to fill the gap would be fantastic. Plus my hours answering questions are limited to work hours now.
hello! trying to start working with onyx-sql, and as an exercise, copy data from one table to another
(def id-column :id)
(def table :generated_data)
(def copy-table :generated_data_onyx)
;; schema of both tables is equal
(def catalog
[{:onyx/name :partition-keys
:onyx/plugin :onyx.plugin.sql/partition-keys
:onyx/type :input
:onyx/medium :sql
:sql/classname (:classname config)
:sql/subprotocol (:subprotocol config)
:sql/subname (:host config)
:sql/db-name (:database config)
:sql/user (:user config)
:sql/password (:password config)
:sql/table table
:sql/id id-column
:sql/rows-per-segment 1000
:sql/columns [:*]
:onyx/batch-size batch-size
:onyx/max-peers 1
:onyx/doc "Partitions a range of primary keys into subranges"
:sql/lower-bound 0
:sql/upper-bound 100000}
{:onyx/name :read-rows
;; :onyx/tenancy-ident :onyx.plugin.sql/read-rows
:onyx/fn :onyx.plugin.sql/read-rows
:onyx/type :function
:sql/classname (:classname config)
:sql/subprotocol (:subprotocol config)
:sql/subname (:host config)
:sql/db-name (:database config)
:sql/user (:user config)
:sql/password (:password config)
:sql/table table
:sql/id id-column
:onyx/batch-size batch-size
:onyx/doc "Reads rows of a SQL table bounded by a key range"}
{:onyx/name :identity
:onyx/fn :sql-data.core/rows
:onyx/type :function
:onyx/batch-size batch-size
:onyx/batch-timeout batch-timeout
:onyx/doc "identity"}
{:onyx/name :write-rows
:onyx/plugin :onyx.plugin.sql/write-rows
:onyx/type :output
:onyx/medium :sql
:sql/classname (:classname config)
:sql/subprotocol (:subprotocol config)
:sql/subname (:host config)
:sql/db-name (:database config)
:sql/user (:user config)
:sql/password (:password config)
:sql/table copy-table
:sql/copy? false
;; :sql/copy-fields [:first :second :third]
:onyx/batch-size batch-size
:onyx/doc "Writes segments from the :rows keys to the SQL database"}
])
I have this exception in logs:
clojure.lang.ExceptionInfo: Wrong number of args (1) passed to: sql/read-rows
offending-segment: {:id 1, :name "name", :price 100, :created_date #inst "2016-04-22T21:00:00.000000000-00:00", :description "description", :in_stock true}
offending-task: :read-rows
original-exception: :clojure.lang.ArityException
clojure.lang.ExceptionInfo: Handling uncaught exception thrown inside task lifecycle :lifecycle/apply-fn. Killing the job. -> Exception type: clojure.lang.ExceptionInfo. Exception message: Wrong number of args (1) passed to: sql/read-rows
job-id: #uuid "5fa02f30-675f-8c9c-8e5d-fd27609f2207"
metadata: {:job-id #uuid "5fa02f30-675f-8c9c-8e5d-fd27609f2207", :job-hash "2e8adc49564869d2ca4536a0b155de9411e5c55d78b576a4cd13411e444aaa"}
offending-segment: {:id 1, :name "name", :price 100, :created_date #inst "2016-04-22T21:00:00.000000000-00:00", :description "description", :in_stock true}
offending-task: :read-rows
original-exception: :clojure.lang.ArityException
peer-id: #uuid "cb52b552-5c86-04ed-a94f-f874dfd46aca"
task-name: :read-rows
Which args should I provide?
@rustam.gilaztdinov that's a very weird error, which version of onyx are you using?
are you sure your version of onyx is compatible with the version of the plugin ?
[org.onyxplatform/onyx "0.13.3-alpha4"]
[org.onyxplatform/onyx-sql "0.13.3.0-alpha4"]
hmm
wait a minute, this doesn't make sense, the docs are not good
It's probably a missing lifecycle causing an argument not to be injected into an onyx/fn?
but I don't have any args, just identity
no, the docs are not correct
you're not supposed to call read-rows anymore since I did that refactoring for 0.10
https://github.com/onyx-platform/onyx-sql/blob/master/src/onyx/plugin/sql.clj#L116
SqlPartitioner now calls read-rows itself
@rustam.gilaztdinov as a matter of debugging, could you do something for me? instead of this catalog entry:
{:onyx/name :read-rows
;; :onyx/tenancy-ident :onyx.plugin.sql/read-rows
:onyx/fn :onyx.plugin.sql/read-rows
:onyx/type :function
replace the :onyx/fn
with another function, and (println ...)
what's the output? It should be {:id 1, :name "name", :price 100, :created_date #inst "2016-04-22T21:00:00.000000000-00:00", :description "description", :in_stock true}
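for example, a rough sketch of such a pass-through debug function (the sql-data.debug namespace and the spy-segment name are placeholders I'm making up here, not anything from the plugin):
(ns sql-data.debug)

;; one-argument onyx/fn: prints the incoming segment and passes it through unchanged
(defn spy-segment
  [segment]
  (println "segment:" segment)
  segment)
;; then point the catalog entry at it: :onyx/fn :sql-data.debug/spy-segment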
docs need to be updated I think 🙂
here you can see the tests also do not use read-rows anymore: https://github.com/onyx-platform/onyx-sql/blob/master/test/onyx/plugin/input_test.clj#L43
yes, I removed :read-rows
from catalog and workflow and this works!
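for reference, a rough sketch of what the trimmed workflow presumably looks like now, using the task names from the catalog above (your actual edges may differ):
;; :read-rows removed: the partitioning input task feeds :identity directly,
;; which feeds the SQL output task
(def workflow
  [[:partition-keys :identity]
   [:identity :write-rows]])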
🙂
actually, another question 🙂 Batch size is equal to 10. After I submit the job, only one batch is processed, then in the log I have this exception:
java.lang.NullPointerException:
clojure.lang.ExceptionInfo: Handling uncaught exception thrown inside task lifecycle :lifecycle/read-batch. Killing the job. -> Exception type: java.lang.NullPointerException. Exception message: null
job-id: #uuid "7bacb4ad-e7c8-1292-8814-b9e6a9183a7f"
metadata: {:job-id #uuid "7bacb4ad-e7c8-1292-8814-b9e6a9183a7f", :job-hash "b167932da4fa7ff4b4cff26b921fcf0576cc010fb9fbf5c607022507b3b9f6d"}
peer-id: #uuid "76fd87b1-64af-254e-1568-96a986a649b7"
task-name: :partition-keys
If I change the batch size, I still have this exception and only one batch is processed.
Does that mean I should add a :lifecycle/after-batch function or handle the exception?
Sorry, I'm pretty new to Onyx, any tips will be super helpful
perhaps there is an SQL error somewhere? can you look at your database logs to see which queries are being sent?
NPE doesn't look too good
to be perfectly honest, I don't think the sql plugin is used that much -- I mostly use it as an output plugin, and I think I'm one of the few people actually using it 🙂
so the chances are that you might run into some corner cases
If I change the batch size to 20, all works well. So there's no NPE from the data; the NPE comes with the next batch.
ok, I wouldn't be able to tell right now. It's most likely a bug in the sql plugin
Oh :(
That's unfortunate, I'm a data analyst, and I'm thinking about writing complex transformations on SQL data :( We have a huge Postgres database, and this is why I picked Onyx. Clojure is so great to work with data, and in combination with Onyx it promises so much. Which kind of workflow do you suggest -- produce the data to Kafka, and work with the Kafka plugin and the SQL plugin for output?
it completely depends upon what you want to do with the data. I would say that Onyx shines more in production workloads; for exploratory data analysis you want to keep things in a relational database or data warehouse.
in production, you could use Onyx to implement your actual algorithms, for example indeed sourcing from Kafka and writing into PostgreSQL
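as a very rough sketch of that shape (task names are placeholders: :read-kafka would be an onyx-kafka input entry, :transform whatever algorithm you implement, and :write-rows an onyx-sql output entry like the one above):
(def workflow
  [[:read-kafka :transform]
   [:transform :write-rows]])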
but it all completely depends on what you plan on doing
and it needs to be properly tailored towards your use case
There is no one size fits all solution here :)
Agree :) but when the data is large, yet not big enough for Hadoop and Spark -- ETL on data of this size on a single machine is slow. I think this is totally an Onyx case.