I don't need cuDNN performance, but I need reasonable performance 🙂
I had a brief look yesterday evening and got ClojureCL running with the examples from OpenCL in Action, so I will try to start from there, using convolution kernels that I find on the web.