code-reviews

2017-01-18T14:40:53.000823Z

Not sure if this is allowed, but any pointers on http://codereview.stackexchange.com/questions/152944/clojure-flatten-a-nested-map would be great 🙂

sveri 2017-01-18T15:41:34.000825Z

@poooogles Does the input map always have the same structure?

sveri 2017-01-18T15:41:49.000826Z

or can it be arbitrarily recursive?

sveri 2017-01-18T15:42:56.000827Z

Running that function from SO with the provided example with criterium gives an execution mean time of 2.947716 µs

2017-01-18T15:57:27.000829Z

@sveri Arbitrary I'm afraid. I've got some criterium tests on the real structures which are much bigger that are nowhere near as fast.

2017-01-18T15:57:45.000830Z

FAIL in (test-normalise-performance) (idbevt_test.clj:51)
Check performance of our normalisation.
expected: (< (first (:sample-mean result)) target-time)
  actual: (not (< 5.225852301972124E-4 2.5E-5))

2017-01-18T16:02:29.000831Z

x86_64 Mac OS X 10.9.5 4 cpu(s)
Java HotSpot(TM) 64-Bit Server VM 25.51-b03
Runtime arguments: -Dfile.encoding=UTF-8 -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -XX:-OmitStackTraceInFastThrow -Dclojure.compile.path=/Users/spegler/Documents/Clojure/clj-utils/target/base+system+user+dev/classes -Dclj-utils.version=0.3.3 -Dclojure.debug=false
Evaluation count : 91320 in 60 samples of 1522 calls.
      Execution time sample mean : 810.484728 µs
             Execution time mean : 811.323195 µs
Execution time sample std-deviation : 132.183916 µs
    Execution time std-deviation : 134.245365 µs
   Execution time lower quantile : 628.892823 µs ( 2.5%)
   Execution time upper quantile : 1.069945 ms (97.5%)
                   Overhead used : 12.416037 ns

2017-01-18T16:03:03.000832Z

is with the "real" data, I just posted the most simple solution I could as I assumed optimisations would scale.

sveri 2017-01-18T16:05:07.000833Z

My first assumption would be that its the recursion that causes this. Would be interesting to find out how deeply nested they are in the most common case and setup test cases for these

sveri 2017-01-18T16:05:19.000834Z

You will need them later anyway for comparison if you improve the performance

sveri 2017-01-18T16:05:23.000835Z

@poooogles

2017-01-18T16:11:13.000836Z

I'll see if I can post a gist with some better examples.

sveri 2017-01-18T19:59:12.000838Z

@poooogles Hm, this brings me to 21,695200 µs which is an order of magnitude more than before but still an order less than your ones.

sveri 2017-01-18T20:27:25.000839Z

Apart from that, is there maybe a way to tackle this by redesigning the data? Maybe if you know the structure of the data in advance you could pick out the keys?

sveri 2017-01-18T20:27:56.000840Z

Or when you build the map, build it in a different way? Or keep track of the map building process and cache the paths to the keys you are interested in?