Hi, I have a simple kd-tree / nn-search implementation https://github.com/RedPenguin101/kd-tree/blob/main/src/kdtree/kdtree.clj, and I'm confused about some of the behaviour.
(time (map search big-range))
;; 0.36ms
(time (count (map search big-range)))
;; ~30000ms
I'd assume that the reason the first was so quick was something to do with laziness - but when I run it, it does output the result back to my REPL (in much less than 30 seconds), so it must actually be doing the work. Can someone explain what is going on here?
I'd also appreciate any help on how I could speed it up. 30 seconds is not significantly faster than a brute force approach, and any feedback generally on the code.The time you see in the first case is only the time needed to build the lazy sequence, without realizing it. Something like (time (doall (map search big-range)))
should report the time you're interested in
Ah I see, thank you! Unfortunately that means the algorithm is just slow 😞
Not necessarily 🙂 For one thing I'm not sure if you intended to build the tree anew in every search call:
(def tree (kd-tree big-points 0))
(def search #(nn-search tree % md))
will cut those 30secs in half approximately.Apart from that, I'd try profiling the code -- e.g. with https://github.com/ptaoussanis/tufte -- and see where that time is spent.
Just from a quick glance: is it possible that undraw
as well as its use could be optimized to cause fewer/shallower recursive calculations?
Oh wow, no I definitely didn't intend to rebuild the tree on every call. I thought deffing search like this
(def search #(nn-search (kd-tree big-points 0) % md))
would build the tree only once, is that not the case?
Well look at that, down to 13 seconds. Awesome, thanks! I'll give the profiler a go too
Yes, you'll want something like def
or let
inside your definition of search
, otherwise that form will be evaulated on every call
The profiler worked great! Turns out nearly all the run time was in the distance calculation md, which was being called 2m times. Changing that to unchecked math changed it from 13s to < 1 second 😄
Glad it helped!