specter

Latest version: 1.1.3
jrheard 2017-12-18T00:00:09.000210Z

i’m sure there’s a better and faster way to express this, will stare at it until something comes to mind 😄

jrheard 2017-12-18T00:12:33.000068Z

hell, sometimes the performance seems noticeably slower, sometimes it seems precisely the same or even faster

jrheard 2017-12-18T00:20:02.000063Z

perf is actually pretty acceptable when run through advanced compliation, nice

jrheard 2017-12-18T00:31:55.000087Z

benchmarking stuff now; as you’d expect, that code runs pretty fast (35ms in 1000000 runs) when x1 = x2, but about 50x slower (1400 ms in 1000000 runs) when x1 != x2

jrheard 2017-12-18T00:33:02.000006Z

so the codepath i need to focus on optimizing is

(for [x (if (< x1 x2)
          (range x1 (inc x2))
          (reverse (range x2 (inc x1))))]
  (-> data
      (nth x)
      (nth y1)))
. hm - i’m actually not sure how to do this in a more performant way - the ranges and reverses don’t seem to be the issue afaict, it’s just the bit where we have to use nth to index into each column and then call nth on that column to find our value

nathanmarz 2017-12-18T01:53:41.000097Z

@jrheard the select* codepath should call next-fn on each grid cell, rather than calling next-fn on a sequence of cells

nathanmarz 2017-12-18T01:53:47.000143Z

that will be a lot more performant

jrheard 2017-12-18T01:53:51.000121Z

o nice, i’ll try that

nathanmarz 2017-12-18T01:54:25.000028Z

be sure to read the spec for select*: https://github.com/nathanmarz/specter/blob/master/src/clj/com/rpl/specter/protocols.cljc#L6

jrheard 2017-12-18T01:55:25.000146Z

will do, thanks!

jrheard 2017-12-18T03:58:32.000137Z

hm, that may have caused a speedup but not a drastic one

jrheard 2017-12-18T03:58:47.000116Z

ended up with this:

(defnav
  grid-values
  [x1 y1 x2 y2]
  (select* [this structure next-fn]
           (assert (or (= x1 x2)
                       (= y1 y2)))

           (if (not (cell-is-on-grid x1 y1))
             ; If your starting cell isn't on the grid, you get nothing.
             []

             (let [x2 (bound-between x2 0 (dec GRID-WIDTH))
                   y2 (bound-between y2 0 (dec GRID-HEIGHT))]

               (if (= x1 x2)
                 (let [column (nth structure x1)]
                   (doseq [value (if (< y1 y2)
                                 (subvec column y1 (inc y2))
                                 (reverse (subvec column y2 (inc y1))))]
                     (next-fn value)))

                 (doseq [x (if (< x1 x2)
                           (range x1 (inc x2))
                           (reverse (range x2 (inc x1))))]
                   (next-fn
                     (-> structure
                         (nth x)
                         (nth y1))))))))

jrheard 2017-12-18T03:58:58.000192Z

i noticed i had to use doseq rather than for - presumably next-fn is side-effecty?

jrheard 2017-12-18T03:59:02.000235Z

nbd, just interesting

jrheard 2017-12-18T04:00:44.000064Z

the only other perf thing i can think of is: rather than having grid be a 2d array, have it be a 1d array that’s primarily accessed via some function like (get-cell grid x y), so most consumers don’t treat it like a 1d array, but certain places like select* can treat it like a 1d array and perhaps get a speedup

jrheard 2017-12-18T04:00:51.000046Z

which, i think i just have to try and see what happens, and benchmark before and after

nathanmarz 2017-12-18T04:02:04.000072Z

doseq is not very fast

nathanmarz 2017-12-18T04:02:07.000018Z

neither is reverse

jrheard 2017-12-18T04:02:11.000058Z

is there anything less drastic that you’d recommend i try instead, or is this

(doseq [x (if (< x1 x2)
           (range x1 (inc x2))
           (reverse (range x2 (inc x1))))]
   (next-fn
     (-> structure
         (nth x)
         (nth y1))))
unimproveable?

jrheard 2017-12-18T04:02:34.000171Z

i’ve benchmarked the reverse codepath and it doesn’t seem noticeably slower, it seems to really be the two-nths situation afaict

nathanmarz 2017-12-18T04:02:49.000043Z

also, you're making two nth calls per element, when you could do one nth call per row

jrheard 2017-12-18T04:02:50.000091Z

reverse is always getting a list of at most five items fwiw

jrheard 2017-12-18T04:02:58.000049Z

hm

jrheard 2017-12-18T04:03:22.000012Z

i see how i can do one nth call in the case where i’m just doing down the Y axis - eg going from x=4,y=4 to x=4,y=8, and getting all those values

jrheard 2017-12-18T04:03:38.000136Z

but i don’t see how i can do it if i’m going down the x axis, eg x=4,y=4 to x=8, y=4

jrheard 2017-12-18T04:04:38.000013Z

i extremely agree with you that the two nth calls seem to be the culprit - is there some obvious better way of doing things that i’m not seeing?

nathanmarz 2017-12-18T04:05:01.000004Z

calling range is also materializing an additional sequence that isn't necessary

nathanmarz 2017-12-18T04:05:29.000193Z

the fastest way to do this is probably with just loop

jrheard 2017-12-18T04:06:30.000037Z

(fwiw, in my second-to-most-recent snippet, there’s an if with two paths - the (= x1 x2) path is pretty fast, and it still has the range and reverse)

jrheard 2017-12-18T04:06:37.000132Z

loop’s a great idea! i forgot it existed 🙂

jrheard 2017-12-18T04:06:42.000093Z

i’ll try loop in the morning, thanks nathan!

jrheard 2017-12-18T17:34:48.000416Z

started benchmarking things (just sloppily using cljs’s built-in simple-benchmark, nothing fancy). i have a pick-move function that i’m calling 50 times, which causes this navigator codepath to be executed about a jillion times. when i call next-fn a single time on the whole sequence, i see timings of about 9300ms for my benchmark. when i call next-fn multiple times, once per each item of the sequence, i see the same timings. this is probably because the sequence is always going to be at most five items long - i imagine that if calling next-fn once per item tends to give a speedup, perhaps the speedup is more obvious if you’re doing it on a ton of items as opposed to five? (i should also note that i’m not using advanced compilation while gathering this data)

jrheard 2017-12-18T17:35:20.000229Z

i switched my slow, two-`nth`-calls doseq with range and reverse to a handrolled loop, and that brought me down from 9300ms to 7200ms, which is a good start!

jrheard 2017-12-18T17:35:57.000290Z

i still wish i could figure out a way to not have to do two nth calls - the only one i can think of is to switch from having a 2d vector to having a 1d vector, and measuring the before-and-after

jrheard 2017-12-18T17:36:36.000103Z

at this point i’m mainly digging into this just to satisfy my curiosity, perf’s acceptable for my toy app but i’d like to see if the 1d vector approach causes a huge speed boost, none at all, or somewhere inbetween; i’ll post the results later today when i’m done, and then i will stop spamming this channel 🙂

jrheard 2017-12-18T18:06:16.000224Z

huh, basically zero speedup! my base benchmark timings are different for this one because i used a different board/hand combination to test the 2d vs 1d vector implementations; the 2d vector implementation had a benchmark time of 3100ms-3300ms, and so did the 1d vector implementation (presumably because vectors are implemented under the hood as trees, and so a 1d vector doesn’t guarantee you a contiguous block of memory)

jrheard 2017-12-18T18:08:13.000631Z

so i guess this is as good as it’s gonna get; my thing still spends most of its time making those two nth calls, and i don’t see how that can be avoided if you’ve got a 2d vector of values, the first dimension holds the columns, and you’re trying to select all the values from eg x=4,y=4 to x=8,y=4. it seems to me that you have to call nth two times in that situation, unless i’m missing something

jrheard 2017-12-18T18:08:49.000272Z

but anyway this has been very interesting and educational for me but no longer has much to do with specter. my app’s specter-related perf seems to be as good as it’s going to get for this query, so again, thanks for all the help and for building this great tool!

nathanmarz 2017-12-18T19:10:15.000376Z

yea, for up and down it's two nth calls per element, but for left and right it's less

jrheard 2017-12-18T19:10:54.000368Z

right exactly

jrheard 2017-12-18T19:11:12.000024Z

the one-`nth`-call-and-then-`subvec` dimension is super fast, the two-`nth`-calls-per-element one is 50x slower, and it seems like that’s just a fact of life

jrheard 2017-12-18T19:11:14.000575Z

oh well!