uncomplicate

sophiago 2017-08-11T15:44:24.620185Z

Hi. I have some questions about transitioning a project to use Neanderthal. Mainly, it seems like it doesn't currently support sparse vectors? That's a big deal for me because the library I'd like to use it for is all sparse, using nested n-tuples where the first element is the coefficient, i.e. [[2 1 1] [3 1 0] [5 0 1] [7 0 0]] == 2xy + 3x + 5y + 7. I have functions to convert to sparse form, but I mainly use that for Taylor series, whereas with multivariate polynomials I'd have to generate matrices. But obviously my own arithmetic functions with vanilla Clojure vectors are becoming quite a bottleneck.
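
To make the encoding concrete: each term is [coefficient, x-exponent, y-exponent]. A minimal sketch in plain Java (names and the evaluation helper are illustrative, not part of the library being discussed):

```java
public class SparsePoly {
    // Evaluate a polynomial stored as sparse terms [coefficient, xExp, yExp],
    // e.g. {{2,1,1},{3,1,0},{5,0,1},{7,0,0}} encodes 2xy + 3x + 5y + 7.
    static double eval(long[][] terms, double x, double y) {
        double sum = 0.0;
        for (long[] t : terms) {
            sum += t[0] * Math.pow(x, t[1]) * Math.pow(y, t[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[][] p = {{2, 1, 1}, {3, 1, 0}, {5, 0, 1}, {7, 0, 0}};
        System.out.println(eval(p, 1.0, 2.0)); // 2*1*2 + 3*1 + 5*2 + 7 = 24.0
    }
}
```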

2017-08-11T15:49:45.796634Z

@sophiago You're right; it doesn't currently support sparse vectors.

sophiago 2017-08-11T15:50:41.826931Z

Hmm. So I'd probably need to test whether I get a benefit from converting them back and forth. Multiplying vectors right now is quite slow for me.

2017-08-11T15:51:23.850187Z

Multiplying vectors? Are you referring to the dot product?

2017-08-11T15:51:45.862059Z

Converting them would be even slower.

sophiago 2017-08-11T15:51:46.862410Z

No.

sophiago 2017-08-11T15:52:25.883632Z

I'm just referring to multiplying two vectors to get a third.

2017-08-11T15:52:40.891998Z

Elementwise?

sophiago 2017-08-11T15:52:47.895566Z

Yup

sophiago 2017-08-11T15:53:05.906050Z

Do you have vector-wise addition, subtraction, and division as well? I only see mm and scal which could be helpful. I'm assuming the mutable versions are faster?

2017-08-11T15:54:06.939755Z

Yes. Not only that, but arbitrary primitive functions too. There are articles on the Neanderthal website that cover that in detail.

2017-08-11T15:55:37.990473Z

Although when you do that, you're not really using linear algebra operations; you're only using vectors/matrices as glorified primitive arrays. It might help you, but it's a naive way of doing these things and might not give you the speed you're hoping for.
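
The "glorified primitive arrays" point can be made concrete: an elementwise vector product, with or without a library, bottoms out to a tight primitive loop like this (illustrative Java sketch, not library code):

```java
public class Elementwise {
    // Elementwise product c[i] = a[i] * b[i]. This is all an elementwise
    // "vector multiply" does, so delegating it to a linear algebra library
    // buys no algorithmic win over a well-written primitive loop.
    static double[] mul(double[] a, double[] b) {
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] * b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        double[] r = mul(new double[]{1, 2, 3}, new double[]{4, 5, 6});
        System.out.println(java.util.Arrays.toString(r)); // [4.0, 10.0, 18.0]
    }
}
```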

sophiago 2017-08-11T15:55:58.002404Z

Hmm, ok. My next question was going to be about speeding up scalar math (almost all longs, some ratios...not sure if that would be an issue vs. doubles) by overriding the functions with yours.

sophiago 2017-08-11T15:57:07.040452Z

magnitude from your math library might also be useful to me, except I assume it only works on scalars? In that case I can at least use sqrt vs. Java math I guess.

2017-08-11T15:57:27.051165Z

I'm not sure what exactly you're trying to do, but the best way is to try and see. I'm skeptical, though.

sophiago 2017-08-11T15:58:04.071824Z

I have pretty good documentation of where I'm at so far. It might be easier for you to take a glance than for me to keep explaining it: https://github.com/Sophia-Gold/Madhava-v2

2017-08-11T15:58:10.075151Z

If you want speed, you have to work with primitive floats/doubles and that's it...

sophiago 2017-08-11T15:59:13.110343Z

Yup, I'm fine switching to doubles. They mostly only show up in integration and division anyway and it makes sense for performance.

sophiago 2017-08-11T15:59:57.135243Z

But I'd like to see how much using your library is even possible for me now anyway. I wonder if I could hack on it to add sparse vectors like how mine are formatted and, if so, how difficult that would be?

2017-08-11T16:01:51.201855Z

Depends on the results you want. In my opinion, it is relatively easy (but not trivial) to create some implementation. It is completely different to create something that is competitive with best available tools.

2017-08-11T16:02:46.232736Z

But the best way is to try and see where you're at and then decide how to proceed...

sophiago 2017-08-11T16:02:49.234187Z

Right. I have little experience with MKL or BLAS and the whole point is to have their raw speed.

2017-08-11T16:03:24.253063Z

Not only that. The point is also to have access to their algorithms, of which there are many.

2017-08-11T16:03:47.265735Z

Reimplementing that in (slow) Java/Clojure code is also a huge task.

sophiago 2017-08-11T16:04:31.289029Z

I think that for me I wouldn't be using a ton of them. It's just that basic arithmetic of multivariate polynomials encoded as sparse Clojure vectors is horribly slow.
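
For reference, the bottleneck operation under discussion, multiplying two polynomials in the sparse term encoding, is a cross-product of terms with like-term collection. A hypothetical Java sketch (the map-based accumulator is one possible approach, not the library's):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SparseMul {
    // Multiply two sparse polynomials: multiply every pair of terms
    // (coefficients multiply, exponents add), combining like terms
    // keyed by their (xExp, yExp) pair.
    static Map<List<Long>, Long> mul(long[][] a, long[][] b) {
        Map<List<Long>, Long> acc = new LinkedHashMap<>();
        for (long[] s : a) {
            for (long[] t : b) {
                acc.merge(List.of(s[1] + t[1], s[2] + t[2]), s[0] * t[0], Long::sum);
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        long[][] xPlus1 = {{1, 1, 0}, {1, 0, 0}};  // x + 1
        // (x+1)^2 = x^2 + 2x + 1
        System.out.println(mul(xPlus1, xPlus1));
    }
}
```

Note the quadratic number of term pairs; this is why dense or sorted-term representations are often preferred for polynomial products.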

2017-08-11T16:05:57.336314Z

Why don't you try this, then: use matrices, and use dense matrices (ge). If your data is not really, really sparse, that won't waste much space, but it can give you a decent speedup.

2017-08-11T16:06:41.360925Z

And, obviously, use operations on matrices, do not loop over collections of vectors.
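In the dense view being suggested here, entry [i][j] holds the coefficient of x^i * y^j, so 2xy + 3x + 5y + 7 becomes {{7, 5}, {3, 2}}. Polynomial addition is then plain matrix addition, and a polynomial product is a 2D convolution of the coefficient grids. An illustrative Java sketch of the product (my own formulation, not from the conversation):

```java
public class DensePoly {
    // Multiplying polynomials stored as dense coefficient grids is a 2D
    // convolution: each product of entries a[i][j] * b[k][l] contributes
    // to the coefficient of x^(i+k) * y^(j+l).
    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[a.length + b.length - 1][a[0].length + b[0].length - 1];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                for (int k = 0; k < b.length; k++)
                    for (int l = 0; l < b[0].length; l++)
                        c[i + k][j + l] += a[i][j] * b[k][l];
        return c;
    }

    public static void main(String[] args) {
        double[][] xPlus1 = {{1}, {1}};  // 1 + x (rows index powers of x)
        double[][] yPlus1 = {{1, 1}};    // 1 + y (columns index powers of y)
        // (x+1)(y+1) = 1 + x + y + xy: every coefficient is 1
        System.out.println(java.util.Arrays.deepToString(mul(xPlus1, yPlus1)));
    }
}
```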

sophiago 2017-08-11T16:06:53.367341Z

I'd still need to convert back and forth because of the differentiation and vector calculus functions.

2017-08-11T16:07:22.382689Z

But those operations can also be done on matrices, right?

2017-08-11T16:07:34.389210Z

Why keep all of this in vectors at all?

2017-08-11T16:08:47.427312Z

Remember: data movement is typically slower than computation. Whatever you might gain with whatever cool method, you'll lose in these conversions.

sophiago 2017-08-11T16:09:36.453435Z

I have to generate and transform many partials. Using dense matrices vs. sparse vectors makes the size grow exponentially. And they're still all functions I have to write by hand... and they're already fine-tuned for sparse form.

sophiago 2017-08-11T16:10:00.465873Z

I'm actually wondering if you think Vectorz makes sense, considering how small most of my vectors are, and it says it works with sparse form?

sophiago 2017-08-11T16:10:56.495311Z

Part of what would factor into that decision is I'm unsure how Neanderthal works for scalar math, i.e. if I can just drop in the functions as substitutes for the Clojure ones in my existing code.

2017-08-11T16:12:11.533736Z

I don't know. From this, I don't see why you need any linear algebra library at all. They are useful when you can describe parts of your operations with linear algebra operations. If you can't, then you are using them as fancy arrays, which will bring more complexity for only a small, superficial benefit, if all...

2017-08-11T16:12:59.558709Z

if any, I wanted to say...

sophiago 2017-08-11T16:13:43.581218Z

I don't need any linear algebra. I would like two things: 1. Faster scalar math in all my functions, 2. Fast arithmetic of sparse vectors as I have them encoded above.

2017-08-11T16:14:57.620640Z

Both of those things are fastest when called with plain old primitive Clojure functions. Neanderthal can't help you with that, and I doubt any other library can.

sophiago 2017-08-11T16:15:36.641128Z

Damn, ok. Yeah I looked up Vectorz's sparse implementation and it really doesn't seem like what I want.

2017-08-11T16:15:38.642580Z

The point of Neanderthal is to avoid working with scalars.

2017-08-11T16:16:30.669407Z

and to avoid writing your loops

sophiago 2017-08-11T16:16:32.670446Z

Well, mainly scalar operations inside vectors to gain a tiny bit of speed. But my primary concern is just stuff like multiplication and addition of vectors being a huge bottleneck right now.

2017-08-11T16:17:26.698823Z

The thing is that you need really large vectors to see a speedup over plain Clojure functions.

sophiago 2017-08-11T16:17:32.702210Z

Like everything I have listed under arithmetic in the README is extremely slow compared to everything else, but used in many of the vector calculus functions.

2017-08-11T16:17:45.708416Z

It seems your problem is in the algorithm itself, not the implementation.

sophiago 2017-08-11T16:18:54.744187Z

Well, working with polynomials is necessarily going to be slow...especially when you get to things like division. I assumed I could just farm it out to FORTRAN at this point, though.

2017-08-11T16:20:32.795483Z

It will be equally slow. Calling native functions with scalar arguments is not (much) faster than calling primitive Java functions.

2017-08-11T16:21:54.838521Z

since primitive Java methods are already very, very fast. A nanosecond or four, but not more. That's what you get in C.

sophiago 2017-08-11T16:22:20.851832Z

Well, I'm not even using Java arrays right now. Maybe that would be somewhere to start.

2017-08-11T16:23:12.879068Z

Could be.

sophiago 2017-08-11T16:24:25.917196Z

It's hard to integrate with the rest of the code, though. I may want to consider just taking the bottleneck arithmetic functions and writing custom Java implementations of them.

sophiago 2017-08-11T16:25:21.946618Z

Oddly, I don't get any speedup using vector-of.

2017-08-11T16:35:02.246487Z

It's not odd to me, and I'd bet you won't see any improvement from writing arithmetic functions in Java. Clojure functions are already at the speed of Java functions if written properly. For example, Neanderthal is completely implemented in Clojure...

2017-08-11T16:36:38.294109Z

The issue in your code is, it seems to me, completely unrelated to the specific language and platform; it lies in the choice of data structures and memory access, and thus, fundamentally, in the algorithms themselves.

sophiago 2017-08-11T16:40:01.395337Z

I think I could probably refactor the arithmetic functions to use transients and possibly find some other optimizations. It seems like what I want is the custom tuple implementations Zach Tellman wrote years ago, but the problem is they require protocols for everything, so I think that's why they were abandoned.

sophiago 2017-08-11T16:43:40.504416Z

Anyway, I did assume I could find a native library that would be much faster so didn't tune everything in the arithmetic.clj file much. I should revisit that. But to be clear, are you saying I should reconsider using nested Clojure vectors? Because I don't see much of an alternative other than custom tuples or something from Java, maybe LinkedLists.

2017-08-11T17:04:24.141598Z

All these things are quite slow compared to typical high-speed computation code. If you're chasing that kind of speed, none of those will help you much. I would first hit the papers dealing with implementations of the things you need, to see what is possible and what people actually use.

2017-08-11T17:05:31.175392Z

Bottom line: no library in this area can help you if you do not know specifically what you want to do.

sophiago 2017-08-11T17:10:52.334897Z

I do know exactly what I want to do. I assumed sparse BLAS level one included these functions, but now that I'm looking it seems I was wrong. Apache Commons Math does, but I'm not sure it's worth switching to vs. just optimizing my Clojure code further.

2017-08-11T17:16:51.513558Z

They may contain those functions, but it's questionable whether they'd be faster than what you'd write with primitive Clojure functions. That's especially true with sparse vectors: they're applicable when your data is really sparse. And Level 1 functions are more convenience functions to support working with more demanding stuff than a way to speed up code.

sophiago 2017-08-11T17:30:33.935226Z

I mean, I did just look... Level 1 really doesn't have as much in it as I assumed. I think the smartest thing is to just focus on refactoring my Clojure functions as you suggested. They were essentially written in one pass, without the thought I put into the rest of the library.

sophiago 2017-08-11T17:30:45.941473Z

Anyway, thanks for your input 🙂

2017-08-11T18:02:23.928039Z

You're welcome. I hope that was of some help.