So over in #clojure, I’ve been discussing with myself why (sort - [1 3123 6])
work predicably as long as -
doesn’t return a number larger or smaller than an integer.
It basically comes down to this line here:
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L61
Would a ticket for making this work for the case where n
is bigger/smaller than an int be worth it?
Here’s the code that started my investigation:
user=> (sort - [23413461236412630 10000000000000000 1 0 -1])
(-1 0 1 23413461236412630 10000000000000000)
@slipset I'm not sure there's a reasonable way to "fix" that; -
is just not a good comparator function
the Comparator
interface defines the return value as an int
, so I'm not sure what else you would do there
public int compare(Object obj1, Object obj2):
is the interface for comparable.
to make that sort
work you'd have to expand sort
so that it takes some broader definition of a comparator
I doubt there's a compelling use case for that kind of change; I'm getting a headache just trying to think about what (sort - ...)
is supposed to do
I think it's supposed to be the same as (sort ...)
?
@gfredericks thing is all fns in Clojure implement Comparator.
@slipset sure, but in general a function could be even less appropriate for use as a Comparator than -
is
e.g., hash-map
also implements Comparator
would you want there to be some way of using hash-map
with sort
also? where do you draw the line?
I don't think functions implement Comparator because they're expected to all be useful comparators, but just so that when you use a jvm interface that requires a comparator you don't have to reify
it, you can just use an appropriate function
all functions are also Runnable
, even though only functions with 0-arg arities can have .run
called without an exception
The thing is that passing str as a comparator throws, whereas passing - sometimes give the wrong results.
Are you concerned that somebody might be tempted to use -?
I was :)
Was it a situation where you were programmatically picking a comparator and you used - as the comparator for normal sorting order?
yes
Of course I should have reached for compare
, but -
holds the contracts when applied to integers.
the arg contracts, maybe; not the return contract, apparently 🙂
there's also (comparator <)
I think that -
over Integers returns an int?
yeah but over larger numbers it can return larger numbers, as in this case
actually I think you could overflow ints using -
exactly
(- Integer/MIN_VALUE Integer/MAX_VALUE)
or just (- Integer/MIN_VALUE 1)
I agree it's an easy mistake to make; the only idea I can think of would be a check in the .compare
impl that throws an exception if the return value can't be losslessly cast to an int
I'm not sure if that would be prohibitive performance-wise though
You could do that, or you could check if the Number that was returned was less than, equal to, or greater than 0
and then return -1,0,1 you mean?
Yes
But java.lang.Number
doesn’t have the required functions it seems, so you’d have to figure out the type and coerce it correctly and so on. Probably not worth it 🙂
yeah, that's not terrible actually; it's just subject to
A) performance
B) could it possibly break anything relying on the current behavior
an alternative that's a bit easier on B is to only change the return value if it's out of range; if it's less than Integer/MIN_VALUE
, you return Integer/MIN_VALUE
, and similarly on the positive side
what if you called .longValue
I wonder if that can ever flip the sign of a number the way .intValue
does
.doubleValue
could be used also
apparently .longValue
can
user=> (.longValue -139783912983471927356912387921386334)
6181708512281126050
this is promising w.r.t. .doubleValue
user=> (.doubleValue -139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##-Inf
user=> (.doubleValue 139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##Inf
notably the methods on Number
mention "truncation" for every method except .doubleValue
and .floatValue
for those they only mention "rounding"
ratio & bigdec seem to work right also
so it seems like it doesn't have any logical problems; so performance would be the big thing
Could probably use clojure.lang.Numbers
functions for this
I guess if you get a NaN
you should throw an exception
actually (.intValue ##NaN)
returns 0
apparently 😄
I wonder if that's in the floating point spec or just something the jvm authors made up
Funny how deep the rabbit holes become once you start digging…
@slipset I would suggest reading the Clojure guide article I wrote on comparators. It explicitly warns against using -
in any way for a comparator function, because the Java return type is a 32-bit int
. There are very special cases where it is safe, but fewer than the cases where it is a subtle bug waiting to happen: https://clojure.org/guides/comparators
Thanks!
If you want descending numeric order for a sort or sorted collection, #(compare %2 %1)
is about the safest thing I know
And avoids the Clojure/JVM possible performance issue of using >
where it calls >
twice if the two arguments are equal.
which is not an issue if you have none or a small fraction of equal elements in the things to be sorted.
Nice writeup. Seems like you’ve covered most of what I’ve learned today 🙂
Please suggest additions if you learn new things 🙂
It probably wouldn't hurt to mention ##NaN's in that article somewhere.
Will do. I understand and accept why most things are as they are wrt to comparators, the only thing I guess I don’t fully accept is that we silently get wrong results sometimes when using -
as a comparator.
You mean, why does Clojure call .intValue
and not do any kind of bounds checking?
yes
But I guess that falls into the same category as the discussion we’ve had multiple times regarding how the various set functions should work for non-valid inputs.
I guess the answer is that it works for correct programs 🙂
and for uncorrect programs, it’s undefined.
For all I know there is a 10 year old Clojure Google group message covering that topic, but I haven't gone looking for one.
When you have the source code and an open source license, you don't have to fully accept anything as it is. That said, I fully deeply understand that having a locally modified version is often not the right thing.
The other benefit of OSS is that I could read the code and figure out why it behaved like this.
Definitely. It doesn't always answer "why?", but at least does answer "what?"
javascript detected
?