Not sure this is the proper channel, and maybe it should go on ask.clojure later, but it seems like apply and transducers have some friction between them. Ideally, I would not want to hold on to the resulting sequence at all, not even chunks, just pull items out of an iterator then invoke, but I don't see how that's possible given the current implementation.
Could you explain more about what the problem is? An example would be nice. I'm not sure there is a direct connection between apply and transducers.
I've seen this pattern more than once, of (apply f (map g coll)).
Why should I want to hold on to the original collection? Why allocate it at all?
I think this generalizes to transducers, too.
Particularly to the case of Eduction which seems to serve exactly the purpose of avoiding allocation
@ben.sless Interesting point. You really just want to have f see g applied to all its args, no need to allocate anything. Perhaps (trans-fn f g) or so could work ;)
Sort of. I went through an iterator:
(defn- consume
  [^java.util.Iterator it]
  (loop [coll []]
    (if (.hasNext it)
      (recur (conj coll (.next it)))
      coll)))
(defmacro ^:private -apply-to-it
  ([f it]
   `(-apply-to-it ~f ~it [] 0 20))
  ([f it args depth max-depth]
   (if (= depth max-depth)
     `(if (.hasNext ~it)
        (apply ~f ~@args (consume ~it))
        (~f ~@args))
     (let [g (gensym)]
       `(if (.hasNext ~it)
          (let [~g (.next ~it)]
            (-apply-to-it ~f ~it ~(conj args g) ~(inc depth) ~max-depth))
          (~f ~@args))))))
(defn apply-to-it
  "Like apply but does no intermediate allocations, consumes its argument
  as an iterable."
  [f ^Iterable it]
  (let [^java.util.Iterator it (.iterator it)]
    (-apply-to-it f it)))
bit yucky, though
(defrecord TransFn [f arg-fn]
  clojure.lang.IFn
  (invoke [_] (f))
  (invoke [_ a1] (f (arg-fn a1)))
  (invoke [_ a1 a2] (f (arg-fn a1) (arg-fn a2))))
((->TransFn + inc) 1 2) ;;=> 5
That's when you know how many args you're getting, doesn't help with the restFn case
true
That's not apply, that is reduce
https://clojurians.slack.com/archives/C06E3HYPR/p1623918162066800?thread_ts=1623912490.066500&cid=C06E3HYPR not sure I understand the "hold onto the collection" part
right, apply doesn't force or hold on to anything
user=> (apply (fn [x y & _] (+ x y)) (range))
1
the complaint here is that a lot of functions used with apply are really binary associative operations with a varargs case added and handled via reduce, and that internal fold is not exposed so you can't fuse other operations into it (transducers)
which is, of course, a bad complaint
it conflates apply and reduce, which are not at all the same thing
and if you want the fold over the binary associative operation exposed, then just don't use the varargs case, do your own reduce
which is what the code above basically does, it is a reduce over an iterator, annoyingly called apply, and of course reduce already has a fast path for iterators
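A minimal sketch of the apply versus reduce distinction drawn above. The data in `xs` is made up for illustration. Both forms give the same sum for +, but only the second exposes the fold so that a transducer can be fused into it:

```clojure
;; Illustrative data only; `xs` does not come from the discussion.
(def xs [1 2 3 4])

;; apply: the (map inc xs) seq is built first, then handed to + as varargs.
(def via-apply (apply + (map inc xs)))

;; Explicit fold over the binary case of +, with the map step fused in
;; as a transducer, so no intermediate seq is passed around.
(def via-transduce (transduce (map inc) + xs))
```

This only helps when the function really is a binary operation with a varargs convenience arity; it says nothing about arbitrary functions of unknown arity.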
fruit of the poison tree, the root of the poison tree being conflating reduce and apply
Thank you for assuming the wrong use case. I know to pick reduce when it's appropriate.
The case I had the misfortune to come across is exactly that of unknown functions which can take any number of arguments
You could argue the code is bad and you'd be right, but please don't make assumptions
I might not understand what you are saying about apply here - how does apply hold onto a collection?
also I'm not seeing the connection between apply and transducers here at all
yes, there are no transducers at all in the examples given
Generally, if we look at (apply g (map f coll)) it would be nice if I could (apply g (->Eduction (map f) coll)) without allocating an intermediate sequence, instead pull the elements directly out of the iterable. The way map is implemented it wouldn't be possible, but with Eduction it should be
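A small sketch of the situation described above, using the public `eduction` function rather than the `->Eduction` constructor (the data is illustrative). As far as I understand, apply calls seq on its last argument, so the eduction is still turned into a seq before the function is invoked, even though the eduction itself can be consumed directly through its iterator:

```clojure
;; Illustrative eduction; nothing is computed until it is consumed.
(def ed (eduction (map inc) [1 2 3]))

;; Works today, but apply goes through (seq ed) before invoking +.
(def total (apply + ed))

;; Pulling the elements straight out of the iterator, without asking for a seq.
(def pulled
  (let [it (.iterator ^Iterable ed)]
    (loop [acc []]
      (if (.hasNext it)
        (recur (conj acc (.next it)))
        acc))))
```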
that still isn't about transducers, that is about iterators, and about replacing Clojure varargs, which pass the collected varargs as a seq, with an iterator instead
it still bakes in assumptions about the usage of varargs inside the functions being called, the assumption being the function unrolls arguments, and doesn't just do something with the args as a seq
as far as I know most functions used in this system that way are not varargs functions, it's just that at runtime their arity is unknown (it's a badly written interpreter)
Question about select-keys before I spend time writing this up on http://ask.clojure.org: I have a custom type that behaves like a hash map (it’s an extension to APersistentMap that allows keys to be strings or keywords and also case-insensitive, for non-Clojure language interop reasons). When I call select-keys on it, I get a regular Clojure hash map whose keys are the “original” keys from my custom hash map (in this case, they’re uppercase strings). That makes the result pretty useless in the code that follows, and it’s because select-keys explicitly uses {}. If select-keys instead used (or (empty map) {}), or some optimized form of that, then it would preserve the underlying custom hash map type (which would be super-convenient).
Is such a proposed change likely to be considered? (I can understand the answer of “no” here on the grounds that this is an edge case that almost no one is going to run into, and I have a workaround: don’t use select-keys on this custom type 🙂 )
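A rough sketch of what the proposed behavior could look like as a user-level function. `select-keys-preserving` is a hypothetical name, not part of clojure.core, and a sorted map stands in for the custom map type, which is not shown in the discussion:

```clojure
(defn select-keys-preserving
  "Hypothetical variant of select-keys that starts from (or (empty m) {})
  instead of {}, so the result keeps the type of the source map."
  [m keyseq]
  (reduce (fn [ret k]
            (if-let [entry (find m k)]
              (conj ret entry)
              ret))
          (or (empty m) {})
          keyseq))

;; A sorted map stands in for the custom map type here.
(def source (sorted-map :b 2 :a 1 :c 3))

(def plain     (select-keys source [:c :a]))             ; regular map
(def preserved (select-keys-preserving source [:c :a]))  ; still a sorted map
```

This relies on `empty` returning something usable for the source type, which is an assumption that may not hold for every custom map.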
I think there might be a ticket about this
Oh, sorry, I should have looked before I “leaped”…
I would worry that this would be a breaking change for the cases where someone is using select-keys specifically to lose the special map type-ness of the source
An interesting take — and a valid concern, yes.
was closed as won't fix
Probably why I couldn’t find it on ask. Fair enough. I’ll tackle this a different way then.
https://groups.google.com/forum/#!topic/clojure/l_V1N1nRF-c - long discussion on ml
Thanks. It turns out my custom hash map type doesn’t implement IObj so that would be another breakage when using (empty my-map) in select-keys. So it’s clearly a terrible idea! 😐
I am actively using select-keys to get around the partition thing
Was a conclusion ever reached on this issue? Is the mistake here reifying a map facade over a mutable connection that is then combined with resource-managing reducers? Or is it inherent to transducers and reducibles?
I’m asking because I’d like to offer similar affordances to next.jdbc but also avoid recreating this issue in a similar API I’m building; one that works over RDF data sources, not JDBC ones.
What do you mean by "this issue" cause we covered quite a bit of ground 😊
There was clear consensus that my proposed change to select-keys was a bad idea - for several reasons in the end.
And my discussion was nothing to do with next.jdbc by the way.
Apologies, I meant this (from the next.jdbc docs):
> Note: you need to be careful when using stateful transducers, such as partition-by, when reducing over the result of plan. Since plan returns an IReduceInit, the resource management (around the ResultSet) only applies to the reduce operation: many stateful transducers have a completing function that will access elements of the result sequence -- and this will usually fail after the reduction has cleaned up the resources. This is an inherent problem with stateful transducers over resource-managing reductions with no good solution.
Which is what I thought @hiredman was referring to here: https://clojurians.slack.com/archives/C06E3HYPR/p1623971622083700?thread_ts=1623971154.083300&cid=C06E3HYPR
Yes, he was referring to that, but that wasn't what my discussion in #clojure-dev was about -- and his comment was just pointing out a use case that would actually be broken by my proposal for select-keys.
As for your question, I don't think there's anything more to add over what is in the next.jdbc docs: "This is an inherent problem with stateful transducers over resource-managing reductions with no good solution."
“the partition thing”?
the thing where if you transduce over a plan, and the transduce does a partition-by, the final partition gets reduced after plan has closed everything
Ah, gotcha. And right now the select-keys approach lets you extract columns without realizing the row into a hash map in full, and you still get a hash map back. Yes, makes my change even more of a bad idea 🙂
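A self-contained sketch of "the partition thing", under loud assumptions: `fake-plan` is a hypothetical stand-in for a resource-managing reducible and is NOT next.jdbc's plan. Its rows can only be read while the reduction is running. The final partition of partition-by is flushed in the completion step, after the reduction has finished, so reading it fails unless each row was copied into a real map first:

```clojure
(defn fake-plan
  "Hypothetical resource-managing reducible (not next.jdbc). Rows are
  lookup-only facades that throw once the reduction has finished."
  [rows]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (let [open? (atom true)
            wrap  (fn [row]
                    (reify clojure.lang.ILookup
                      (valAt [_ k]
                        (if @open?
                          (get row k)
                          (throw (IllegalStateException. "resource closed"))))
                      (valAt [this k _not-found] (.valAt this k))))]
        (try
          (reduce (fn [acc row] (f acc (wrap row))) init rows)
          (finally (reset! open? false)))))))

(def rows [{:g 1 :id 1} {:g 1 :id 2} {:g 2 :id 3}])

;; Group rows by :g, then read :id from every row in each group.
(def xf (comp (partition-by :g) (map #(mapv :id %))))

;; The last group is flushed after the "resource" is closed, so this throws.
(def failing
  (try
    (transduce xf conj [] (fake-plan rows))
    (catch IllegalStateException _ :closed)))

;; Copying the needed columns into a real map while the row is still live
;; avoids the problem.
(def copying
  (transduce (comp (map (fn [row] {:g (:g row) :id (:id row)})) xf)
             conj [] (fake-plan rows)))
```

The copy step plays the same role as the select-keys approach mentioned above: it extracts the needed columns into an ordinary map before the stateful transducer holds on to the row.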