@blueberry : have you seen situations where bad calls in JCuda crash the entire JVM?
I'm running into this issue a lot recently, and it's suboptimal as it forces me to wait and start a new REPL, which really reduces my ability to experiment quickly
you can prevent this by using ClojureCUDA, which offers safe functions.
(JCublas2/cublasSetPointerMode cublas-handle cublasPointerMode/CUBLAS_POINTER_MODE_DEVICE) <-- lacking that one line caused cublasSdot to crash
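For context, the same pitfall exists in the raw cuBLAS C API that JCuda wraps: cublasSdot writes its scalar result through the pointer you pass, and interprets that pointer according to the current pointer mode (host by default). A minimal sketch in CUDA C, assuming x, y, and d_result are already-allocated device pointers:

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>

// Sketch: dot product with the result kept on the device.
// x, y, and d_result are device pointers; n is the vector length.
void device_dot(cublasHandle_t handle, int n,
                const float *x, const float *y, float *d_result) {
    // The default mode is CUBLAS_POINTER_MODE_HOST, under which cuBLAS
    // would write the result through d_result as if it were a host
    // pointer -- an invalid host write that can take the whole process
    // (and, via JCuda, the JVM) down with it.
    cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_DEVICE);
    cublasSdot(handle, n, x, 1, y, 1, d_result);
}
```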
@blueberry : ClojureCUDA doesn't wrap cuBLAS/cuDNN yet, does it?
I need both of those libraries
Neanderthal has a cuBLAS engine
@blueberry: oh wow, was this just added in 0.11 ? http://dragan.rocks/articles/17/CUDA-and-cuBLAS-GPU-matrices-in-Clojure
that's right
@blueberry: do CUDA streams play well with cuBLAS?
if so, how do you handle it in Neanderthal? do you break up big matrices into smaller pieces that you can put into different CUDA streams?
@blueberry : the only exp I found was : https://github.com/uncomplicate/neanderthal/blob/3aeeaa9554fcaefd12799a4d2090067abe7d98dc/src/clojure/uncomplicate/neanderthal/math.clj#L70
is there any support for doing elementwise exp on the gpu ?
or, with Neanderthal, do I have to write a custom CUDA kernel for that?
that is a trivial kernel to write in clojurecuda.
i didn't want to clutter neanderthal with a bunch of kernels for every possible simple mathematical function.
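For anyone curious, the kind of kernel meant here really is only a few lines of CUDA C. This is a generic sketch of an elementwise exp, not Neanderthal's actual code:

```cuda
#include <math.h>

// Elementwise exp over a float array: one thread per element.
extern "C" __global__ void vec_exp(const int n, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = expf(x[i]);
    }
}
```

A kernel string like this can be compiled at runtime and launched with a grid covering n elements from ClojureCUDA, which is what makes "write your own trivial kernel" a reasonable answer here rather than bloating Neanderthal with one wrapper per math function.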