clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
2019-07-16T22:01:59.436800Z

I'm using Clojure/Java 1.10.1 right now, and I might be seeing some performance degradation when using Clojure sets as map keys or set elements: it seems much slower than using integers as keys/elements. The sets being used as keys/elements have (identical? x y) = true whenever they are equal, so I would expect (= x y) to return true as fast as (identical? x y) does. Does this ring a bell for anyone familiar with the implementation? I will poke around a bit more to see if I can find something, and report here.

2019-07-16T22:07:19.437400Z

Hmm. Maybe because APersistentSet has an equiv() implementation that does not check for identity using == as a fast path?

2019-07-16T22:08:57.437800Z

Huh, I guess I assumed that would be done there.

2019-07-16T22:09:47.438300Z

APersistentMap equiv() also has no such if-identical-return-true-quickly fast path.

2019-07-16T22:14:06.438600Z

APersistentVector equiv() does have such a fast path
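For readers following along, here is a minimal Java sketch of the kind of fast path being discussed. It is not Clojure's actual source: the class name `EquivSketch` and both method names are hypothetical, and plain `java.util.Set` stands in for Clojure's persistent sets. It only illustrates why an element-by-element comparison stays O(n) when both arguments are the same object, and how a leading `==` check avoids that work.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not copied from clojure.lang.
public class EquivSketch {

    // Element-by-element comparison with no identity shortcut.
    // Even when a == b, every element is looked up once.
    public static boolean equivNoFastPath(Set<?> a, Object b) {
        if (!(b instanceof Set)) {
            return false;
        }
        Set<?> other = (Set<?>) b;
        if (other.size() != a.size()) {
            return false;
        }
        for (Object x : other) {
            if (!a.contains(x)) {
                return false;
            }
        }
        return true;
    }

    // Same comparison, but a reference-equality check comes first,
    // so comparing an object with itself returns immediately.
    public static boolean equivWithFastPath(Set<?> a, Object b) {
        if (a == b) {
            return true;
        }
        return equivNoFastPath(a, b);
    }

    public static void main(String[] args) {
        Set<Long> s = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> t = new HashSet<>(Arrays.asList(1L, 2L));
        System.out.println(equivNoFastPath(s, s));   // walks all elements
        System.out.println(equivWithFastPath(s, s)); // returns on the == check
        System.out.println(equivWithFastPath(s, t)); // sizes differ
    }
}
```

Both methods return the same answers; the only difference is how much work is done when the two arguments are the same object.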

alexmiller 2019-07-16T22:34:34.439600Z

nothing has changed with any of this recently afaik - what do you mean by "degradation"?

alexmiller 2019-07-16T22:35:12.440300Z

can you share what you're actually testing? Util.equiv(Object, Object) has the identity check

2019-07-16T22:53:32.442Z

I'm doing some more investigation to firm up my claims here, or likely figure out that I'm imagining things. I have some code that I think runs quite quickly when I have lots of sets and maps containing longs as set elements / map keys, but I think the same code is noticeably slower when using IPersistentSets of longs as set elements / map keys.

2019-07-16T22:55:48.443300Z

there were some mailing list posts from many years ago about performance of small collections as keys in maps, I believe inside instaparse

2019-07-16T22:57:37.445200Z

and I don't entirely recall, but I think the gist of it was something like, computing the hash codes for small collections up front would perform better, but clojure generally computes it on demand and caches it

2019-07-16T22:59:15.445700Z

that may predate the hasheq stuff even, so maybe too old

2019-07-16T23:01:16.446600Z

ah, I mis-remembered the whole thing: https://archive.clojure.org/design-wiki/display/design/Better%2Bhashing.html

2019-07-16T23:02:45.447Z

(and of course last edited by @andy.fingerhut)

2019-07-16T23:03:38.447400Z

Yep, I recall those events. Fun perf debugging! 🙂

2019-07-16T23:04:06.447800Z

I believe this is different than that.

ghadi 2019-07-16T23:12:21.448800Z

I wonder how SipHash and other more recent non-cryptographic hashes perform

ghadi 2019-07-16T23:13:20.449400Z

Somewhat relatedly https://openjdk.java.net/jeps/8201462