clojure

New to Clojure? Try the #beginners channel. Official docs: https://clojure.org/ Searchable message archives: https://clojurians-log.clojureverse.org/
henrik42 2021-04-03T09:00:24.236700Z

Hi! Working my way through Java interop and method overload. I have this Java class:

package pets;
import java.util.Arrays;
public class Pets {
    public interface Pet { }
    public static class Dog implements Pet { }
    public static class Cat implements Pet { }
    public static String isA(Pet p) { return "pet"; }
    public static String isA(Dog d) { return "dog"; }
    public static String isA(Cat c) { return "cat"; }
    public static void main(String[] args) {
        System.out.println(isA(new Dog()));
        System.out.println(isA(new Cat()));
        for (Pet x : Arrays.asList(new Dog(), new Cat()))
            System.out.println(isA(x));
    }
}
and I get this as expected: overload-selection via type Pet at compile-time.
dog
cat
pet
pet
In Clojure we can do better:
(import (pets Pets Pets$Dog Pets$Cat))
(println (Pets/isA (Pets$Dog.)))
(println (Pets/isA (Pets$Cat.)))
(doseq [x [(Pets$Dog.) (Pets$Cat.)]]
  (println (Pets/isA x)))
We get this: dynamic overload-selection at runtime.
dog
cat
dog
cat
But when I add this to the Java class:
public static String isA(Object x) { return "thing"; }
I get this:
dog
cat
thing
thing
I'm sure this question must have come up many times: can someone briefly explain why this is so or point to some documentation? I assume that the Clojure compiler makes this happen and I tried to use partial and apply to somehow enforce the dynamic overload-selection but with no success. Using (println (Pets/isA ^pets.Pets$Pet x)) gives me the same as for Java but that's not what I'm looking for.

2021-04-03T14:29:11.238100Z

I think your doseq is equivalent to:

2021-04-03T14:29:12.238300Z

for (Object x : Arrays.asList(new Dog(), new Cat())) System.out.println(isA(x));

2021-04-03T14:31:01.239900Z

and I think (that's two thinks in a row) the Clojure compiler only tries to match on interfaces if it can't find a direct match - your isA(Object x) can be called without any more inspection of type hierarchies so it goes for that

2021-04-03T14:33:32.240600Z

maybe post this on the Clojure FAQ site - it's an Alex Miller-ish type of question

2021-04-03T14:49:27.240800Z

@henrikheine When you provide isA(Object x) in the Java code, the Clojure compiler will insert a call to that method at compile time. If you leave out isA(Object x), the compiler cannot find a matching method at compile time and will instead insert a reflective call that looks up a matching isA method at run time (you can check this by setting (set! *warn-on-reflection* true)). You could invoke the Reflector to find and call the most specific match:

(doseq [x [(Pets$Dog.) (Pets$Cat.)]] 
 (println (Pets/isA x))
 (println "REF: " (clojure.lang.Reflector/invokeStaticMethod Pets "isA" ^objects (into-array [x]))))
;;=> thing
;;=> REF: dog
;;=> thing
;;=> REF: cat
Because I'm not sure of the definition of 'most specific match' in all cases, I would recommend using instance? checks and calling the method you want with type hints such as (Pets/isA ^pets.Pets$Cat x), as you would with casting in Java code.

👍 1
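The compile-time vs. run-time selection described above can be reproduced without the Pets class. This is only a sketch using a stand-in: java.lang.String has the public static overloads valueOf(Object) and valueOf(char[]), which play the roles of isA(Object) and isA(Cat). The function names unhinted, hinted and reflective are made up for illustration.

```clojure
;; Stand-in for Pets/isA (an assumption for illustration): String/valueOf
;; has same-arity static overloads for Object and char[].

(defn unhinted [x]
  ;; x carries no type hint, so the compiler treats it as Object and
  ;; binds the call to valueOf(Object) at compile time.
  (String/valueOf x))

(defn hinted [x]
  ;; The ^chars hint lets the compiler bind to valueOf(char[]) instead.
  (String/valueOf ^chars x))

(defn reflective [x]
  ;; The Reflector picks the overload at run time from the actual class of x.
  (clojure.lang.Reflector/invokeStaticMethod String "valueOf" (object-array [x])))

(def cs (char-array "abc"))

(println (unhinted cs))   ; Object.toString of the array, something like [C@1b2c3d
(println (hinted cs))     ; abc
(println (reflective cs)) ; abc
```

Only the unhinted call ends up in the Object overload, which mirrors the "thing" output above.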
henrik42 2021-04-05T08:34:52.299300Z

Thx. I understand that the compiler opts for isA(Object) when it's there but still wonder why. Is it for performance? It bugs me to have something that breaks when adding a static method. Using instance? would force me to know all types in advance, no? So what I'm looking for is a solution that uses reflection to select the 'most specific' overloaded method like you said. I would probably go the way the Java compiler does it. I would use reflection to find the match and then memoize/dispatch on class. Does that make sense?

henrik42 2021-04-05T09:52:13.302700Z

Of course the idea is not new: https://groups.google.com/g/clojure-dev/c/X3-CkPrSLM0

henrik42 2021-04-05T10:34:41.303Z

@thegeez I just realized that your solution does just what I asked for. Great. Thank you. The code can still break when adding more overloaded methods to Pets but that's due to the ambiguity and will break the Java compile as well.

nilern 2021-04-05T17:31:27.322600Z

I always do (set! *warn-on-reflection* true) and add type hints because calling methods reflectively is slow and going through the Reflector makes it even slower. You could use the double dispatch pattern to get non-reflective dynamic dispatch for the argument (see e.g. https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L166-L216). Protocols make it more powerful than in Java but it still is quite a bit of boilerplate.

henrik42 2021-04-06T06:05:19.059100Z

In my case I have to deal with static methods and in that case I cannot use a protocol. Yes, reflection makes it slower but I first want to make it correct. And the way I see it the compiler does produce a reflective invocation when isA(Object) is missing, which I think is correct, and then suddenly ignores all types once I add isA(Object), which I would argue is wrong albeit performant. So using the Reflector makes it correct for me. So how can it be made fast? At load/compile-time the compiler knows about all overloaded methods isA(<type>). So it could produce a memoized-cond-on-type-dispatch to non-reflective invocations. That should then be correct and fast.
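A rough sketch of the "find the match reflectively, then cache per class" idea, done in user code rather than in the compiler. Everything here is hypothetical (dispatching-static-fn and its helpers are not part of Clojure), and it makes simplifying assumptions: one non-nil argument, primitive-typed overloads are ignored, and ambiguous overloads are not detected. The call itself still goes through Method.invoke; only the lookup is cached. It is shown on String/valueOf as a stand-in for Pets/isA.

```clojure
(import '(java.lang.reflect Method Modifier))

(defn- param-type ^Class [^Method m]
  ;; Type of the single parameter of m.
  (aget (.getParameterTypes m) 0))

(defn- applicable-methods [^Class klass method-name ^Class arg-class]
  ;; Public static one-argument methods whose parameter accepts arg-class.
  (filter (fn [^Method m]
            (and (= method-name (.getName m))
                 (Modifier/isStatic (.getModifiers m))
                 (= 1 (.getParameterCount m))
                 (.isAssignableFrom (param-type m) arg-class)))
          (.getMethods klass)))

(defn- most-specific [methods]
  ;; Keep the method whose parameter type is the narrowest seen so far.
  (when (seq methods)
    (reduce (fn [best m]
              (if (.isAssignableFrom (param-type best) (param-type m)) m best))
            methods)))

(defn dispatching-static-fn [klass method-name]
  ;; Returns a one-argument fn that resolves the overload once per
  ;; argument class and caches the resulting Method.
  (let [cache (atom {})]
    (fn [arg]
      (let [c (class arg)
            m (or (get @cache c)
                  (let [found (most-specific (applicable-methods klass method-name c))]
                    (when-not found
                      (throw (IllegalArgumentException.
                               (str "No applicable " method-name " for " c))))
                    (swap! cache assoc c found)
                    found))]
        (.invoke ^Method m nil (object-array [arg]))))))

(def value-of (dispatching-static-fn String "valueOf"))

(println (value-of (char-array "abc"))) ; abc (valueOf(char[]) wins over valueOf(Object))
(println (value-of 42))                 ; 42  (only valueOf(Object) applies to a Long)
```

With the Pets class on the classpath, (dispatching-static-fn Pets "isA") would be the analogous call.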

nilern 2021-04-06T08:16:56.062900Z

You can still use a protocol for static method arguments.

nilern 2021-04-06T08:19:38.063100Z

(defprotocol PetIsA
  (pet-isa? [pet]))

(extend-protocol PetIsA
  Pets$Dog
  (pet-isa? [dog] (Pets/isA dog)))

nilern 2021-04-06T08:25:13.063300Z

Incidentally I am reading Effective Java and it says to avoid overloads with the same arity. It is especially confusing if the arguments are in subtype relation.

nilern 2021-04-06T08:31:08.063900Z

The compiler is not going to help you. But if you really want you can use the Reflector or the underlying Java reflection in your own dispatch-generating macro. But I would change the confusing overloads if possible or else use the protocol trick. Or both.

henrik42 2021-04-06T20:43:42.122700Z

Thanks a lot. Cool. I didn't know about the protocol trick. And yes, one should not overload with same arity. Usually people use their IDE to tell them which implementation is invoked ... 😆

nilern 2021-04-06T20:46:11.123200Z

I learned the protocol variation from those Jsonista sources but double dispatch is a longstanding OO pattern, especially as Visitor

mokr 2021-04-03T15:32:22.247Z

Hi, in an effort to improve my coding it would be nice to read some best practices on the use of keywords (plain, namespaced and auto-namespaced), considering topics like:
• Destructuring (:person/name easily collides with :pet/name, :street/name, …)
• Store and retrieve from DB (typically a string-based name)
• Interop with JavaScript/JSON
• Records (no ns support)
• Specs (namespaced keys are automatically checked)
• Library code and collisions
• Domain data vs other data
Maybe it is too much to hope for a resource that takes all of this into account?
To elaborate: While I don’t enforce any hard rules, I tend to use namespaced keywords (:person/name) for domain data and auto-namespaced keys for “private” data (e.g. when storing in re-frame’s app-db, I store under ::/people to indicate that this key should only be retrieved by the namespace that stored it). I’m not a big fan of the (:require [my.app.foo :as foo]) (::foo/something some-map) syntax as it’s a bit noisy for my taste with all those :: prefixing everything. But I also find myself breaking especially the domain data “rule” when JS/DB interop is involved, and worst case I get an inconsistent codebase with more than one key for a given piece of data.
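For reference, a small sketch of the destructuring collision and the DB/JSON boundary mentioned above. The map contents and the unqualify helper are invented for illustration; this is one possible way to handle it, not an established best practice.

```clojure
;; Hypothetical domain data with three keys that all end in "name".
(def visit {:person/name "Ada" :pet/name "Rex" :street/name "Main St"})

;; Binding each qualified key to its own local avoids the clash.
(def summary
  (let [{person-name :person/name
         pet-name    :pet/name} visit]
    (str person-name " walks " pet-name)))

;; :person/keys destructures by namespace; the local is plain `name`,
;; so only one of the three "name" keys can be taken this way at once.
(def just-person
  (let [{:person/keys [name]} visit]
    name))

;; At a JSON or DB boundary the namespace is often dropped. `name` here is
;; clojure.core/name, which returns the unqualified part of a keyword.
(defn unqualify [m]
  (into {} (map (fn [[k v]] [(name k) v])) m))

(println summary)                                     ; Ada walks Rex
(println just-person)                                 ; Ada
(println (unqualify {:person/name "Ada" :person/age 36}))
```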

marciol 2021-04-05T19:54:51.359800Z

Hi @mokr, someone started this same question days ago: https://app.slack.com/client/T03RZGPFR/C03S1KBA2/thread/C03S1KBA2-1616759029.465400

flowthing 2021-04-03T15:48:05.248500Z

There’s some discussion on that topic here: https://ask.clojure.org/index.php/10380/when-to-use-simple-qualified-keywords Doesn’t have an answer for all your questions, though.

mokr 2021-04-03T15:53:35.248800Z

Thanks, @flowthing, that’s a nice resource to add to my list.

2021-04-03T16:32:47.249Z

There's also this other in progress discussion: https://clojureverse.org/t/dont-quite-understand-rules-for-namespacing-keywords/7434?u=jjttjj

mokr 2021-04-03T17:17:00.249300Z

That was a nice discussion I’ll keep an eye on. I really struggle with accepting namespace-qualified keywords for domain data as a good idea. To me that feels like it introduces some unattractive coupling between namespaces that need to work on domain data and a noisy syntax. A kind of “belt and suspenders” reaction to the fear of key conflicts. Completely ignoring data lifespan, localisation and the likelihood of conflicts in a given use case. But, very experienced developers are proponents of it so I get the feeling I’m missing something bigger here. Apart from the obvious that it won’t ever collide. Possibly we just don’t agree on the trade-offs involved though… 🙂

Célio 2021-04-03T20:41:21.254Z

Hi all. I thought I knew how laziness works in functions like map and filter but I was just caught by surprise with this code:

(->> [nil "hello" nil nil]
     (map #(do (println "processing " %) %))
     (filter (comp not nil?))
     first)
When eval’d in the REPL it outputs this:
processing  nil
processing  hello
processing  nil
processing  nil
"hello"
I was expecting it to print only the first and second elements, but instead it printed all of them. What’s going on here? Also, any tips on how to make it stop processing elements after the first element returned by filter?

Célio 2021-04-03T20:48:19.254500Z

Hah! The lazy seqs produced by map and filter are chunked.

2021-04-03T20:58:32.254600Z

The lazy seq produced by calling seq on a vector is chunked

2021-04-03T20:59:04.254700Z

map and filter return chunked seqs if given one

👍 2
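A small sketch of the chunking effect, plus one possible way to stop after the first match. realized-count and first-some are hypothetical helper names; the reduced-based early exit is just one option, not something suggested in the thread.

```clojure
(defn realized-count
  "Returns [first-non-nil number-of-elements-mapped] for coll."
  [coll]
  (let [n      (atom 0)
        result (->> coll
                    (map (fn [x] (swap! n inc) x))
                    (filter some?)
                    first)]
    [result @n]))

;; The seq of a vector is chunked, so map processes the whole (small) chunk.
(println (realized-count [nil "hello" nil nil]))  ; [hello 4]

;; A list's seq is not chunked, so elements are processed one at a time.
(println (realized-count '(nil "hello" nil nil))) ; [hello 2]

(defn first-some
  "Applies f to items of coll and returns the first non-nil result,
  stopping the reduction there regardless of chunking."
  [f coll]
  (transduce (comp (map f) (filter some?))
             (completing (fn [_ x] (reduced x)))
             nil
             coll))

(def calls (atom 0))
(println (first-some (fn [x] (swap! calls inc) x) [nil "hello" nil nil])) ; hello
(println @calls)                                                          ; 2
```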
pavlosmelissinos 2021-04-03T21:02:53.255Z

not really what you asked but fyi you could use some? instead of (comp not nil?) and (some identity) instead of

(filter (comp not nil?))
     first

seancorfield 2021-04-03T21:18:49.255600Z

(some some?) -- identity will not "match" false but (comp not nil?) will.

👍 1
seancorfield 2021-04-03T21:19:02.255800Z

Also (comp not f) == (complement f)

👍 1
pavlosmelissinos 2021-04-03T22:36:31.256400Z

Right, missed the false case, thanks! (some some?) returns true/false instead of the actual item though, which I don't think is the desired behaviour here (on the other hand there's probably no "desired behaviour", since this is just an example anyway)

seancorfield 2021-04-03T22:41:20.256800Z

Ah, good point. You can't actually use some if you want false as a match.

👍 1
seancorfield 2021-04-03T22:42:02.257Z

(since some only returns "logical true" values)

seancorfield 2021-04-03T22:42:59.257300Z

So we're both wrong, for different reasons 🙂

🤲 1
seancorfield 2021-04-03T22:44:49.257600Z

(keep identity coll) will at least return false so (filter some?) and (keep identity) are the same I believe.

❤️ 1
2021-04-03T23:54:16.257900Z

I think namespaced keywords "appear" strongly encouraged but are not

2021-04-03T23:54:43.258200Z

They're a useful feature to have when you need it, but most of the time you'll be fine with unqualified keywords

2021-04-03T23:56:59.258400Z

Like I answered in that latest ClojureVerse question. If you're going to collocate keywords with a risk of collision, then they are a good way to avoid that, such as Spec's use of them or Datomic. They also allow collocated keywords to still be grouped, which is what Datomic does.

2021-04-03T23:57:42.258600Z

And they're nice too if you want contextual lineage.

2021-04-03T23:58:40.258800Z

But those are the less common use cases. In practice you will most often just have keywords inside maps that represent entities, or as a way to club function inputs/outputs together, and unqualified keywords are the norm