clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
slipset 2019-04-12T12:49:50.001700Z

So over in #clojure, I’ve been discussing with myself why (sort - [1 3123 6]) works predictably as long as - doesn’t return a number larger or smaller than an int can hold.

slipset 2019-04-12T12:50:22.002Z

It basically comes down to this line here:

slipset 2019-04-12T12:51:40.003100Z

Would a ticket for making this work for the case where n is bigger/smaller than an int be worth it?

slipset 2019-04-12T12:52:27.003600Z

Here’s the code that started my investigation:

slipset 2019-04-12T12:52:32.003800Z

user=> (sort - [23413461236412630 10000000000000000 1 0 -1])
(-1 0 1 23413461236412630 10000000000000000)

2019-04-12T13:00:25.004500Z

@slipset I'm not sure there's a reasonable way to "fix" that; - is just not a good comparator function

2019-04-12T13:00:53.005100Z

the Comparator interface defines the return value as an int, so I'm not sure what else you would do there

dpsutton 2019-04-12T13:01:02.005500Z

public int compare(Object obj1, Object obj2) is the method signature on the Comparator interface.
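
A minimal sketch of what the int return type means for the example above, assuming (as is said later in this thread) that a fn's numeric result is narrowed with .intValue and no bounds check. The helper name as-int is hypothetical.

```clojure
;; Hypothetical helper: narrow any Number the way Number.intValue does.
(defn as-int [^Number n]
  (.intValue n))

(def a 23413461236412630)
(def b 10000000000000000)

;; a is larger than b, so the difference is a positive long ...
(pos? (- a b))               ;; true

;; ... but only the low 32 bits survive the narrowing, and here they read as negative.
(neg? (as-int (- a b)))      ;; true

;; Calling - through the Comparator interface therefore reports a < b.
(let [^java.util.Comparator c -]
  (neg? (.compare c a b)))   ;; true
```

That matches the sort output earlier in the log, where 23413461236412630 ends up before 10000000000000000.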

2019-04-12T13:01:20.005800Z

to make that sort work you'd have to expand sort so that it takes some broader definition of a comparator

2019-04-12T13:01:48.006500Z

I doubt there's a compelling use case for that kind of change; I'm getting a headache just trying to think about what (sort - ...) is supposed to do

2019-04-12T13:03:11.007600Z

I think it's supposed to be the same as (sort ...)?

slipset 2019-04-12T13:04:27.008400Z

@gfredericks thing is all fns in Clojure implement Comparator.

2019-04-12T13:05:07.009200Z

@slipset sure, but in general a function could be even less appropriate for use as a Comparator than - is

2019-04-12T13:05:14.009600Z

e.g., hash-map also implements Comparator

2019-04-12T13:05:29.010400Z

would you want there to be some way of using hash-map with sort also? where do you draw the line?

2019-04-12T13:06:03.011500Z

I don't think functions implement Comparator because they're expected to all be useful comparators, but just so that when you use a jvm interface that requires a comparator you don't have to reify it, you can just use an appropriate function

2019-04-12T13:06:51.012300Z

all functions are also Runnable, even though only functions with 0-arg arities can have .run called without an exception
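
A small sketch of the point being made here: fns satisfy the JVM interfaces, but that alone does not make them usable comparators. The helper name compare-with is hypothetical.

```clojure
;; Every Clojure fn is an instance of these JVM interfaces ...
(instance? java.util.Comparator hash-map)   ;; true
(instance? Runnable str)                    ;; true

;; Hypothetical helper: call a fn through the Comparator interface.
(defn compare-with [^java.util.Comparator c a b]
  (.compare c a b))

;; ... but str returns a String, which is neither a Number nor a Boolean,
;; so using it as a comparator fails with a cast error.
(try
  (compare-with str 1 2)
  (catch ClassCastException _
    :not-a-comparator))                     ;; :not-a-comparator
```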

slipset 2019-04-12T13:11:23.014100Z

The thing is that passing str as a comparator throws, whereas passing - sometimes gives the wrong results.

2019-04-12T13:12:16.014700Z

Are you concerned that somebody might be tempted to use -?

slipset 2019-04-12T13:12:42.014900Z

I was :)

2019-04-12T13:14:57.016200Z

Was it a situation where you were programmatically picking a comparator and you used - as the comparator for normal sorting order?

slipset 2019-04-12T13:29:35.016400Z

yes

slipset 2019-04-12T13:30:42.017700Z

Of course I should have reached for compare, but - holds the contracts when applied to integers.

2019-04-12T13:31:26.018Z

the arg contracts, maybe; not the return contract, apparently 🙂

2019-04-12T13:32:13.018600Z

there's also (comparator <)
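
For comparison, a short sketch of the alternatives mentioned so far, applied to the same numbers as the original example. The var name xs is just for illustration.

```clojure
(def xs [23413461236412630 10000000000000000 1 0 -1])

;; compare returns a small int, so nothing is lost when it is narrowed.
(sort compare xs)          ;; (-1 0 1 10000000000000000 23413461236412630)

;; comparator builds a -1/0/1 fn out of a boolean predicate.
(sort (comparator <) xs)   ;; same order

;; A boolean predicate can also be passed directly.
(sort < xs)                ;; same order
```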

slipset 2019-04-12T13:32:14.018800Z

I think that - over Integers returns an int?

2019-04-12T13:32:36.019100Z

yeah but over larger numbers it can return larger numbers, as in this case

2019-04-12T13:32:55.019900Z

actually I think you could overflow ints using -

slipset 2019-04-12T13:32:59.020100Z

exactly

2019-04-12T13:33:06.020400Z

(- Integer/MIN_VALUE Integer/MAX_VALUE)

2019-04-12T13:33:19.020700Z

or just (- Integer/MIN_VALUE 1)
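
A sketch of why this case goes wrong, assuming the .intValue narrowing discussed in this thread. The subtraction itself does not overflow, because Clojure does the arithmetic on longs; the damage happens when the long result is squeezed into an int. The helper name compare-with is hypothetical.

```clojure
;; Hypothetical helper: call a fn through the Comparator interface.
(defn compare-with [^java.util.Comparator c a b]
  (.compare c a b))

;; The difference is a long that no longer fits in an int.
(- Integer/MIN_VALUE 1)                       ;; -2147483649

;; Narrowed to 32 bits it wraps around to a positive value,
;; so - used as a comparator claims Integer/MIN_VALUE is greater than 1.
(pos? (compare-with - Integer/MIN_VALUE 1))   ;; true
```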

2019-04-12T13:34:53.022100Z

I agree it's an easy mistake to make; the only idea I can think of would be a check in the .compare impl that throws an exception if the return value can't be losslessly cast to an int

2019-04-12T13:35:06.022500Z

I'm not sure if that would be prohibitive performance-wise though

slipset 2019-04-12T13:35:33.023Z

You could do that, or you could check if the Number that was returned was less than, equal to, or greater than 0

2019-04-12T13:35:56.023300Z

and then return -1,0,1 you mean?

slipset 2019-04-12T13:36:24.023500Z

Yes

slipset 2019-04-12T13:37:12.024900Z

But java.lang.Number doesn’t have the required functions it seems, so you’d have to figure out the type and coerce it correctly and so on. Probably not worth it 🙂

2019-04-12T13:37:31.025400Z

yeah, that's not terrible actually; it's just subject to A) performance and B) whether it could break anything relying on the current behavior. An alternative that's a bit easier on B is to only change the return value if it's out of range: if it's less than Integer/MIN_VALUE, you return Integer/MIN_VALUE, and similarly on the positive side

2019-04-12T13:38:32.026200Z

what if you called .longValue

2019-04-12T13:38:42.026500Z

I wonder if that can ever flip the sign of a number the way .intValue does

2019-04-12T13:38:52.026700Z

.doubleValue could be used also

2019-04-12T13:39:55.027Z

apparently .longValue can

user=> (.longValue -139783912983471927356912387921386334)
6181708512281126050

2019-04-12T13:40:50.027300Z

this is promising w.r.t. .doubleValue

user=> (.doubleValue -139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##-Inf
user=> (.doubleValue 139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##Inf

2019-04-12T13:41:28.027800Z

notably the methods on Number mention "truncation" for every method except .doubleValue and .floatValue

2019-04-12T13:41:36.028100Z

for those they only mention "rounding"

2019-04-12T13:43:38.028700Z

ratio & bigdec seem to work right also
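
The observations above can be reproduced with a sketch like the following. The helper names long-value and double-value are hypothetical; the large literal is the one from the .longValue example earlier in the log.

```clojure
;; Read by Clojure as an arbitrary-precision integer.
(def big-neg -139783912983471927356912387921386334)

;; Hypothetical helpers around the java.lang.Number methods.
(defn long-value   [^Number n] (.longValue n))
(defn double-value [^Number n] (.doubleValue n))

(pos? (long-value big-neg))     ;; true: truncation flipped the sign
(neg? (double-value big-neg))   ;; true: rounding kept the sign

;; Ratios and BigDecimals keep their sign as well.
(neg? (double-value -1/3))      ;; true
(pos? (double-value 2.5M))      ;; true
```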

2019-04-12T13:44:22.029500Z

so it seems like it doesn't have any logical problems; so performance would be the big thing

slipset 2019-04-12T13:44:35.029900Z

Could probably use clojure.lang.Numbers functions for this
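
A user-level sketch of the "return -1, 0 or 1" idea, written as a wrapper rather than as a change to Clojure itself. The name sign-comparator is hypothetical; neg? and pos? are used because they already dispatch through clojure.lang.Numbers for every numeric type.

```clojure
;; Hypothetical wrapper: turn a Number-returning fn into one that returns -1, 0 or 1.
(defn sign-comparator [f]
  (fn [a b]
    (let [r (f a b)]
      (cond
        (neg? r) -1
        (pos? r) 1
        :else    0))))   ;; note: a NaN result also lands here

(sort (sign-comparator -) [23413461236412630 10000000000000000 1 0 -1])
;; (-1 0 1 10000000000000000 23413461236412630)
```

This only addresses the narrowing of the result; the wrapped fn can still fail on its own terms, for instance if the subtraction overflows a long.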

2019-04-12T13:44:45.030200Z

I guess if you get a NaN you should throw an exception

2019-04-12T13:45:22.030600Z

actually (.intValue ##NaN) returns 0 apparently 😄

2019-04-12T13:45:59.030900Z

I wonder if that's in the floating point spec or just something the jvm authors made up
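
For reference, the Java Language Specification's rules for narrowing a floating-point value to an int cover this: NaN becomes 0, and values outside the int range, including the infinities, saturate to the nearest int extreme. A quick sketch (the helper name int-value is hypothetical):

```clojure
;; Hypothetical helper around Number.intValue.
(defn int-value [^Number n]
  (.intValue n))

(int-value Double/NaN)                 ;; 0
(int-value Double/POSITIVE_INFINITY)   ;; 2147483647
(int-value Double/NEGATIVE_INFINITY)   ;; -2147483648
(int-value 1e300)                      ;; 2147483647
```

So a comparator fn that returns NaN is silently treated as "equal".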

slipset 2019-04-12T13:46:23.031300Z

Funny how deep the rabbit holes become once you start digging…

2019-04-12T14:10:44.032800Z

@slipset I would suggest reading the Clojure guide article I wrote on comparators. It explicitly warns against using - in any way for a comparator function, because the Java return type is a 32-bit int. There are very special cases where it is safe, but fewer than the cases where it is a subtle bug waiting to happen: https://clojure.org/guides/comparators

slipset 2019-04-12T14:11:13.033Z

Thanks!

2019-04-12T14:14:41.033800Z

If you want descending numeric order for a sort or sorted collection, #(compare %2 %1) is about the safest thing I know

2019-04-12T14:15:36.034500Z

It also avoids a possible performance issue on Clojure/JVM with using > as the comparator, where > is called twice if the two arguments are equal.

2019-04-12T14:15:58.034900Z

which is not an issue if you have none or a small fraction of equal elements in the things to be sorted.
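
A sketch of both points, assuming the behaviour described above for boolean-returning fns used as comparators. The names calls, counted-gt and compare-with are hypothetical.

```clojure
;; Descending order by swapping the arguments to compare.
(sort #(compare %2 %1) [3 1 2 2])   ;; (3 2 2 1)

;; Count how often a boolean predicate is invoked through .compare.
(def calls (atom 0))

(defn counted-gt [a b]
  (swap! calls inc)
  (> a b))

(defn compare-with [^java.util.Comparator c a b]
  (.compare c a b))

(compare-with counted-gt 2 1)       ;; -1
@calls                              ;; 1

(reset! calls 0)
(compare-with counted-gt 2 2)       ;; 0
@calls                              ;; 2, the predicate ran again with the arguments swapped
```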

slipset 2019-04-12T14:17:25.035600Z

Nice writeup. Seems like you’ve covered most of what I’ve learned today 🙂

2019-04-12T14:17:47.035900Z

Please suggest additions if you learn new things 🙂

2019-04-12T14:18:48.037400Z

It probably wouldn't hurt to mention ##NaN's in that article somewhere.

slipset 2019-04-12T14:19:08.037800Z

Will do. I understand and accept why most things are as they are wrt comparators; the only thing I guess I don’t fully accept is that we silently get wrong results sometimes when using - as a comparator.

2019-04-12T14:19:44.038800Z

You mean, why does Clojure call .intValue and not do any kind of bounds checking?

slipset 2019-04-12T14:19:51.039Z

yes

slipset 2019-04-12T14:19:59.039300Z

But I guess that falls into the same category as the discussion we’ve had multiple times regarding how the various set functions should work for non-valid inputs.

slipset 2019-04-12T14:20:16.039900Z

I guess the answer is that it works for correct programs 🙂

slipset 2019-04-12T14:20:27.040200Z

and for incorrect programs, it’s undefined.

2019-04-12T14:20:45.040700Z

For all I know there is a 10 year old Clojure Google group message covering that topic, but I haven't gone looking for one.

2019-04-12T14:24:35.041800Z

When you have the source code and an open source license, you don't have to fully accept anything as it is. That said, I fully deeply understand that having a locally modified version is often not the right thing.

slipset 2019-04-12T14:28:23.043100Z

The other benefit of OSS is that I could read the code and figure out why it behaved like this.

2019-04-12T14:29:05.043600Z

Definitely. It doesn't always answer "why?", but at least does answer "what?"

souenzzo 2019-04-12T21:12:40.044500Z

javascript detected

2019-04-12T22:07:36.044700Z

?