Hi! I'm working my way through Java interop and method overloading. I have this Java class:
package pets;

import java.util.Arrays;

public class Pets {
    public interface Pet { }
    public static class Dog implements Pet { }
    public static class Cat implements Pet { }

    public static String isA(Pet p) { return "pet"; }
    public static String isA(Dog d) { return "dog"; }
    public static String isA(Cat c) { return "cat"; }

    public static void main(String[] args) {
        System.out.println(isA(new Dog()));
        System.out.println(isA(new Cat()));
        for (Pet x : Arrays.asList(new Dog(), new Cat()))
            System.out.println(isA(x));
    }
}
and I get this, as expected: overload selection via the static type Pet at compile time.
dog
cat
pet
pet
In Clojure we can do better:
(import (pets Pets Pets$Dog Pets$Cat))
(println (Pets/isA (Pets$Dog.)))
(println (Pets/isA (Pets$Cat.)))
(doseq [x [(Pets$Dog.) (Pets$Cat.)]]
  (println (Pets/isA x)))
We get this: dynamic overload-selection at runtime.
dog
cat
dog
cat
But when I add this to the Java class:
public static String isA(Object x) { return "thing"; }
I get this:
dog
cat
thing
thing
I'm sure this question must have come up many times: can someone briefly explain why this is so, or point to some documentation? I assume that the Clojure compiler makes this happen. I tried to use partial and apply to somehow force the dynamic overload selection, but with no success. Using (println (Pets/isA ^pets.Pets$Pet x)) gives me the same result as Java, but that's not what I'm looking for.

I think your doseq is equivalent to:
for (Object x : Arrays.asList(new Dog(), new Cat())) System.out.println(isA(x));
and I think (that's two thinks in a row) the Clojure compiler only tries to match on interfaces if it can't find a direct match: your isA(Object x) can be called without any further inspection of type hierarchies, so it goes for that.
Maybe post this on the Clojure FAQ site - it's an Alex Miller-ish type of question.
@henrikheine When you provide isA(Object x) in the Java code, the Clojure compiler will insert a call to that method at compile time. If you leave out isA(Object x), then at compile time the compiler cannot find a matching method and will insert a reflective call that looks up a matching isA method at runtime (you can check this by setting (set! *warn-on-reflection* true)). You could invoke the Reflector to find and call the most specific match:
(doseq [x [(Pets$Dog.) (Pets$Cat.)]]
  (println (Pets/isA x))
  (println "REF: " (clojure.lang.Reflector/invokeStaticMethod Pets "isA" ^objects (into-array [x]))))
;;=> thing
;;=> REF: dog
;;=> thing
;;=> REF: cat
Because I'm not sure of the definition of 'most specific match' in all cases, I would recommend using instance? checks and calling the method you want with type hints such as (Pets/isA ^pets.Pets$Cat x), as you would with a cast in Java code.

Thx. I understand that the compiler opts for isA(Object) when it's there, but I still wonder why. Is it for performance? It bugs me to have something that breaks when a static method is added. Using instance? would force me to know all types in advance, no?
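To make the instance?-plus-type-hint suggestion concrete with something self-contained, here is a sketch using String/valueOf (overloaded on Object and char[]) as a stand-in for the Pets overloads; stringify is a made-up name:

```clojure
;; Without a hint the compiler would pick valueOf(Object) for an untyped
;; argument; an instance? check plus a type hint selects the specific
;; overload, just like a cast in Java.
(defn stringify [x]
  (if (instance? (Class/forName "[C") x)  ; is x a char[]?
    (String/valueOf ^chars x)             ; compiles to valueOf(char[])
    (String/valueOf ^Object x)))          ; compiles to valueOf(Object)

(stringify (char-array "hi"))  ;; => "hi"
(stringify 42)                 ;; => "42"
```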
So what I'm looking for is a solution that uses reflection to select the 'most specific' overloaded method, like you said. I would probably do it the way the Java compiler does: use reflection to find the match, then memoize/dispatch on class. Does that make sense?
Of course the idea is not new: https://groups.google.com/g/clojure-dev/c/X3-CkPrSLM0
@thegeez I just realized that your solution does just what I asked for. Great, thank you. The code can still break when more overloaded methods are added to Pets, but that's due to the ambiguity and would break the Java compile as well.
I always do (set! *warn-on-reflection* true) and add type hints, because calling methods reflectively is slow, and going through the Reflector makes it even slower. You could use the double dispatch pattern to get non-reflective dynamic dispatch on the argument (see e.g. https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L166-L216). Protocols make it more powerful than in Java, but it is still quite a bit of boilerplate.
In my case I have to deal with static methods, and for those I cannot use a protocol directly. Yes, reflection makes it slower, but I first want to make it correct. The way I see it, the compiler produces a reflective invocation when isA(Object) is missing, which I think is correct, and then suddenly ignores all types once I add isA(Object), which I would argue is wrong, albeit performant. So using the Reflector makes it correct for me. So how can it be made fast? At load/compile time the compiler knows about all the overloaded methods isA(<type>). So it could produce a memoized cond-on-type dispatch to non-reflective invocations. That should then be correct and fast.
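A rough, self-contained sketch of that memoize-on-class idea, using java.util.Arrays/toString (overloaded per array type) as a stand-in for Pets/isA. find-static-method and invoke-static are made-up names, and "most specific match" is simplified here to an exact match on the argument's class with a Reflector fallback:

```clojure
(def find-static-method
  ;; cache the reflective lookup per (class, method name, argument class);
  ;; returns nil when no overload exists for exactly this argument class
  (memoize
    (fn [^Class target ^String nm ^Class arg-class]
      (try
        (.getMethod target nm (into-array Class [arg-class]))
        (catch NoSuchMethodException _ nil)))))

(defn invoke-static [^Class target ^String nm x]
  (if-let [^java.lang.reflect.Method m (find-static-method target nm (class x))]
    (.invoke m nil (object-array [x]))
    ;; no exact overload: fall back to the Reflector's hierarchy search
    (clojure.lang.Reflector/invokeStaticMethod target nm (object-array [x]))))

(invoke-static java.util.Arrays "toString" (int-array [1 2 3]))  ;; => "[1, 2, 3]"
(invoke-static java.util.Arrays "toString" (long-array [4 5]))   ;; => "[4, 5]"
```

The memoize means the reflective method lookup happens once per argument class; subsequent calls only pay for Method.invoke.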
You can still use a protocol for static method arguments.
(defprotocol PetIsA
  (pet-isa? [pet]))

(extend-protocol PetIsA
  Pets$Dog
  ;; hint the argument so the compiler selects isA(Dog), not isA(Object)
  (pet-isa? [^Pets$Dog dog] (Pets/isA dog))
  Pets$Cat
  (pet-isa? [^Pets$Cat cat] (Pets/isA cat)))
Incidentally, I am reading Effective Java and it says to avoid overloads with the same arity. It is especially confusing if the arguments are in a subtype relation.
The compiler is not going to help you. But if you really want you can use the Reflector or the underlying Java reflection in your own dispatch-generating macro. But I would change the confusing overloads if possible or else use the protocol trick. Or both.
Thanks a lot. Cool, I didn't know about the protocol trick. And yes, one should not overload with the same arity. Usually people use their IDE to tell them which implementation is invoked ...
I learned the protocol variation from those jsonista sources, but double dispatch is a longstanding OO pattern, especially in the Visitor pattern.
Hi, in an effort to improve my coding it would be nice to read some best practices on the use of keywords (plain, namespaced, and auto-namespaced), considering topics like:
• Destructuring (:person/name easily collides with :pet/name, :street/name, …)
• Store and retrieve from a DB (typically a string-based name)
• Interop with JavaScript/JSON
• Records (no ns support)
• Specs (namespaced keys are automatically checked)
• Library code and collisions
• Domain data vs other data
Maybe it is too much to hope for a resource that takes all of this into account? To elaborate: while I don't enforce any hard rules, I tend to use namespaced keywords (:person/name) for domain data and auto-namespaced keys for "private" data (e.g. when storing in re-frame's app-db, I store under ::people to indicate that this key should only be retrieved by the namespace that stored it). I'm not a big fan of the (:require [my.app.foo :as foo]) (::foo/something some-map) syntax, as it's a bit noisy for my taste with all those :: prefixing everything. But I also find myself breaking especially the domain-data "rule" when JS/DB interop is involved, and in the worst case I get an inconsistent codebase with more than one key for a given piece of data.
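For reference, a sketch of the three keyword forms under discussion (my.app.foo is a made-up namespace):

```clojure
;; made-up namespace, just to give :: something to resolve against
(ns my.app.foo)

:name           ;; plain keyword, no namespace
:person/name    ;; qualified keyword (the person namespace need not exist)
::name          ;; auto-namespaced: reads as :my.app.foo/name

;; :: also resolves namespace aliases:
(require '[clojure.string :as str])
::str/join      ;; reads as :clojure.string/join

(= ::name :my.app.foo/name)  ;; => true
```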
Hi @mokr, someone asked this same question a few days ago: https://app.slack.com/client/T03RZGPFR/C03S1KBA2/thread/C03S1KBA2-1616759029.465400
There's some discussion on that topic here: https://ask.clojure.org/index.php/10380/when-to-use-simple-qualified-keywords Doesn't have an answer for all your questions, though.
Thanks, @flowthing, that's a nice resource to add to my list.
There's also this other in progress discussion: https://clojureverse.org/t/dont-quite-understand-rules-for-namespacing-keywords/7434?u=jjttjj
That was a nice discussion, I'll keep an eye on it. I really struggle with accepting namespace-qualified keywords for domain data as a good idea. To me that feels like it introduces some unattractive coupling between the namespaces that need to work on domain data, plus a noisy syntax. A kind of "belt and suspenders" reaction to the fear of key conflicts, completely ignoring data lifespan, localisation, and the likelihood of conflicts in a given use case. But very experienced developers are proponents of it, so I get the feeling I'm missing something bigger here, apart from the obvious fact that it won't ever collide. Possibly we just don't agree on the trade-offs involved, though…
Hi all. I thought I knew how laziness works in functions like map and filter, but I was just caught by surprise by this code:
(->> [nil "hello" nil nil]
     (map #(do (println "processing " %) %))
     (filter (comp not nil?))
     first)
When eval'd in the REPL it outputs this:
processing nil
processing hello
processing nil
processing nil
"hello"
I was expecting it to print only the first and second elements, but instead it printed all of them. What's going on here? Also, any tips on how to make it stop processing elements after the first element returned by filter?

Hah! The lazy seqs produced by map and filter are chunked.
The lazy seq produced by calling seq on a vector is chunked
map and filter return chunked seqs if given one
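A self-contained way to see the chunking, plus the common unchunk workaround (not part of clojure.core) that forces one-at-a-time realization:

```clojure
(def seen (atom []))

;; a vector's seq is chunked: asking for the first mapped element
;; realizes the whole chunk (here, all four items)
(first (map #(do (swap! seen conj %) %) [1 2 3 4]))
@seen  ;; => [1 2 3 4]

;; rebuild the seq one cons cell at a time to defeat chunking
(defn unchunk [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))

(reset! seen [])
(first (map #(do (swap! seen conj %) %) (unchunk [1 2 3 4])))
@seen  ;; => [1]
```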
Not really what you asked, but FYI: you could use some? instead of (comp not nil?), and (some identity) instead of (filter (comp not nil?)) followed by first.

(some some?) -- identity will not "match" false but (comp not nil?) will.

Also (comp not f) == (complement f)
Right, missed the false case, thanks!
(some some?) returns true (or nil) instead of the actual item though, which I don't think is the desired behaviour here (on the other hand there's probably no "desired behaviour", since this is just an example anyway)
Ah, good point. You can't actually use some if you want false as a match (since some only returns "logical true" values). So we're both wrong, for different reasons.
(keep identity coll) will at least return false, so (filter some?) and (keep identity) are the same, I believe.
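A quick REPL check of that equivalence, including the false case:

```clojure
(def xs [1 nil false "hello" nil])

(filter some? xs)   ;; => (1 false "hello")
(keep identity xs)  ;; => (1 false "hello")

;; some drops false, so it cannot be used to find a false element:
(some identity xs)           ;; => 1
(some identity [nil false])  ;; => nil
```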
I think namespaced keywords "appear" strongly encouraged, but are not.
They're a useful feature to have when you need it, but most of the time you'll be fine with unqualified keywords
Like I answered in that latest ClojureVerse question: if you're going to collocate keywords with a risk of collision, then namespaced keywords are a good way to avoid that, such as in Spec's use of them, or Datomic's. They also allow collocated keywords to still be grouped, which is what Datomic does.
And they're nice too if you want contextual lineage.
But those are the less common use cases. In practice you will most often just have keywords inside maps that represent entities, or as a way to group function inputs/outputs together, and unqualified keywords are the norm.