data-science

Data science, data analysis, and machine learning in Clojure. See https://scicloj.github.io/pages/chat_streams/ for additional discussions.
2020-02-20T20:54:23.097Z

Has anyone tried this library for automatically converting C headers to Java JNA types? https://github.com/nativelibs4java/JNAerator

2020-02-20T20:56:11.098600Z

I would like to use this library https://github.com/techascent/tech.jna but I don’t understand a single word of Java Types haha

lumpy 2020-02-20T21:13:52.099800Z

anyone have a good tutorial for a netflix score estimator/matrix completion in clojure for a beginner ML person?

val_waeselynck 2020-02-20T22:17:17.099900Z

Woah, I wouldn't expect to find a lot of beginner-friendly ML examples in Clojure 🙂 Why not look for that in, say, the Python ecosystem, make sure you get the algorithms and concepts straight, and convert to Clojure next?

lumpy 2020-02-20T22:20:46.100100Z

I wanted the foundation to be more about how a clojure dev would think about it

lumpy 2020-02-20T22:22:07.100300Z

I've tried using python tutorials as a starting point and translating them, but their preference to use libraries that abstract a lot of the details and leave you with just a few lines of code that only work in a python environment

lumpy 2020-02-20T22:22:51.100600Z

makes it hard to translate

val_waeselynck 2020-02-20T22:32:45.100800Z

Well, there's a reason ML engineers tend to leave the work to high-level libs - scientific algorithms are usually HARD to get right. It's much more challenging to implement them correctly than, say, the usual 'business logic' feature of an enterprise app. Leaving aside the challenges of performance and numerical stability, we usually don't know very well from theory when and how fast they should converge, nor the performance they should achieve. These algorithms are usually defined in terms of matrix operations. So a Clojure dev would think about it in these terms: 1) find a lib that can perform such matrix operations reliably, 2) get the data into the format accepted by this lib, 3) make the necessary calls to the library, and 4) feed the result to the downstream business code.

val_waeselynck 2020-02-20T22:34:50.101100Z

The parts most specific to Clojure are related to the logistics of the data upstream and downstream of the 'scientific lib' part.

val_waeselynck 2020-02-20T22:36:12.101300Z

Now maybe what you should do is study the algorithm abstractly to understand the required matrix operations, then find a matrix manipulation lib in Clojure and implement the algorithm.
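
To make "study the algorithm abstractly" a bit more concrete, here is a minimal sketch of one common approach to matrix completion: low-rank factorization fitted with stochastic gradient descent. It is an illustration, not something proposed in the conversation. The toy ratings, the function names (`factorize`, `predict`, `rmse`), and the hyperparameters (`rank`, `lr`, `reg`, `epochs`) are all assumptions chosen for the example. It uses only the Python standard library so that every dot product is visible and easy to port to a Clojure matrix lib.

```python
import math
import random

# Toy ratings matrix (rows = users, columns = movies).
# None marks a rating we do not know and would like to estimate.
RATINGS = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]


def predict(P, Q, user, item):
    """Estimated rating = dot product of the user's and the item's factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def factorize(ratings, rank=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] ~ ratings[u][i]
    on the observed entries only (hypothetical defaults, not tuned)."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    observed = [
        (u, i, r)
        for u, row in enumerate(ratings)
        for i, r in enumerate(row)
        if r is not None
    ]
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings."""
    errors = [
        (r - predict(P, Q, u, i)) ** 2
        for u, row in enumerate(ratings)
        for i, r in enumerate(row)
        if r is not None
    ]
    return math.sqrt(sum(errors) / len(errors))


P, Q = factorize(RATINGS)
train_rmse = rmse(RATINGS, P, Q)
estimate = predict(P, Q, 0, 2)  # user 0 never rated movie 2
print(f"training RMSE: {train_rmse:.3f}")
print(f"estimated rating for user 0, movie 2: {estimate:.2f}")
```

The whole algorithm reduces to dot products and element-wise updates on two small matrices, which is the part a matrix manipulation lib would take over. A low training error only shows the factors fit the known ratings; judging the estimates for missing entries would need held-out data.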

val_waeselynck 2020-02-20T22:37:24.101500Z

Another approach would be: implement the scientific algorithm in a Python worker, and connect it to your larger system using queues / dbs / storage / RPC calls (libpython-clj might help!)
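
A rough sketch of the worker idea, under loud assumptions: the in-process `queue.Queue` objects below only stand in for whatever real queue, database, or RPC layer connects the two systems, and `score`, `WEIGHTS`, and the JSON message shape are hypothetical. The point is the contract: the worker reads language-neutral JSON jobs and writes JSON results, so the Clojure side only has to produce and consume plain data.

```python
import json
import queue
import threading

# Hypothetical model parameters; a real worker would load a trained model.
WEIGHTS = [0.5, 0.25]


def score(features):
    """Stand-in for the 'scientific' computation: a simple weighted sum."""
    return sum(f * w for f, w in zip(features, WEIGHTS))


def worker(jobs, results):
    """Consume JSON jobs until a None sentinel arrives; emit JSON results."""
    while True:
        message = jobs.get()
        if message is None:  # sentinel: no more work
            break
        job = json.loads(message)
        reply = {"id": job["id"], "score": score(job["features"])}
        results.put(json.dumps(reply))


# In-process queues used here in place of a real message broker.
jobs = queue.Queue()
results = queue.Queue()

thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()

# The upstream system (for example a Clojure service) would enqueue these.
jobs.put(json.dumps({"id": 1, "features": [1.0, 2.0]}))
jobs.put(json.dumps({"id": 2, "features": [0.5, 0.5]}))
jobs.put(None)
thread.join()

# The downstream system reads the results back as plain data.
scores = {}
while not results.empty():
    reply = json.loads(results.get())
    scores[reply["id"]] = reply["score"]
print(scores)
```

Keeping the messages as plain JSON means either side can be replaced without touching the other. libpython-clj takes a different route by embedding Python in the JVM process, which avoids the serialization step.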