@blueberry: When a BLAS method has a separate output argument, can I assume there is no overlap in memory between the output and input matrices?
@ericlavigne: can you be a bit more specific and give me an example?
Void axpy(Number alpha, Block x, Block y): the product of alpha and x is added to y (y = alpha*x + y). Do x and y sometimes share the same buffer, so that changes to y might change x while calculations on x are still being performed?
@ericlavigne: whether that would work depends on the implementation. ATLAS (obviously) does not allow this. It is the calling code's responsibility and cannot be checked automatically, so you should assume that x and y must not overlap.
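For illustration, here is a minimal Java sketch of what can go wrong when x and y overlap. The class name `AxpyAliasing` and the array-plus-offset signature are hypothetical and are not the `Block` API discussed above; the loop is a plain reference implementation of y = alpha*x + y, not how ATLAS computes it.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of any BLAS binding.
final class AxpyAliasing {

    // Computes y <- alpha * x + y over n elements, reading x from
    // xBuf[xOff..xOff+n) and updating yBuf[yOff..yOff+n) in place.
    static void axpy(int n, double alpha, double[] xBuf, int xOff,
                     double[] yBuf, int yOff) {
        for (int i = 0; i < n; i++) {
            yBuf[yOff + i] += alpha * xBuf[xOff + i];
        }
    }

    public static void main(String[] args) {
        // Separate buffers: the result is the expected element-wise sum.
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 3, 4, 5};
        axpy(4, 1.0, x, 0, y, 0);
        System.out.println(Arrays.toString(y)); // [3.0, 5.0, 7.0, 9.0]

        // Same values, but x and y are shifted views of one buffer:
        // each write to y changes an element of x that is read next.
        double[] shared = {1, 2, 3, 4, 5};
        axpy(4, 1.0, shared, 0, shared, 1);
        System.out.println(Arrays.toString(shared)); // [1.0, 3.0, 6.0, 10.0, 15.0]
    }
}
```

In the second call the last four elements come out as 3, 6, 10, 15 instead of 3, 5, 7, 9, because every write to y overwrites an x element before it has been read. An implementation that reorders or vectorizes the loop could give yet another answer, which is why the safe assumption is that the caller keeps x and y disjoint.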
@ericlavigne: as a side note, isn't it easier to compile ATLAS for your machine and OS (I assume it is Windows) than to code all the operations yourself? OK, axpy and friends are easy, but mv and mm are much trickier.
The easiest solution is just to wait until my computer is repaired. I would have liked a Java implementation to exist already, though, and will create one for the next person. It doesn't look too difficult yet, and I'm learning a lot. 😊