data-science

Data science, data analysis, and machine learning in Clojure. See https://scicloj.github.io/pages/chat_streams/ for additional discussions.
2020-12-27T23:02:02.131100Z

Does anyone know whether the data frames in Geni (Spark) are typed or untyped?

2020-12-28T08:19:40.133600Z

Does it have an impact when you use datasets? Do you feel the burden of types in comparison to handling a collection of open Clojure maps?

2020-12-28T08:19:55.134200Z

Thanks for your answer and the library!

Anthony Khong 2020-12-28T08:52:58.134400Z

> Do you feel the burden of types in comparison to handling a collection of open Clojure maps?

Not really. To me it still feels like a dynamic language (or library, in this case), because the type checking all happens at runtime. But, just like Clojure, it’s strongly typed, so you get type errors at runtime. Also, I wouldn’t compare it to handling Clojure maps; Geni is for a different use case. If your data is small enough, using a collection of maps is probably better, because the reader of your code doesn’t have to learn Spark. But once you’re dealing with millions or billions of rows, you’d want to use Spark or a similar library.
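To illustrate the "dynamic but strongly typed" point on the plain Clojure side, here is a minimal sketch using only core Clojure (no Spark or Geni involved). The `rows` data and the `total-price` helper are made up for illustration; they are not from Geni or from the discussion above.

```clojure
;; Plain Clojure only; `rows` and `total-price` are hypothetical names.
;; A small dataset as a collection of open maps: rows may carry different keys.
(def rows
  [{:name "a" :price 10}
   {:name "b" :price 20 :note "an extra key is fine"}
   {:name "c" :price "30"}]) ; a string slipped in where a number was expected

;; Nothing complains when the data is defined. The mismatch only surfaces
;; at runtime, when the addition reaches the string.
(defn total-price [rows]
  (reduce + (map :price rows)))

(try
  (total-price rows)
  (catch ClassCastException e
    (println "type error at runtime:" (.getMessage e))))

;; Keeping only the rows whose :price is a number avoids the error.
(total-price (filter (comp number? :price) rows))
;; => 30
```

For data this small, the collection of maps is all that is needed; the same kind of runtime type error is what one would expect to see, at a much larger scale, when a Spark column has an unexpected type.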

2020-12-28T22:22:45.134800Z

Thanks a lot for your explanations!

👍 1