clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
Ben Sless 2021-06-17T06:48:10.066500Z

Not sure this is the proper channel, and maybe it should go on ask.clojure later, but it seems like apply and transducers have some friction between them. Ideally, I would not want to hold on to the resulting sequence at all, not even chunks, just pull items out of an iterator then invoke, but I don't see how that's possible given the current implementation.

2021-06-17T08:00:42.066600Z

Could you explain more what is the problem? Some example would be nice. I'm not sure there is a direct connection between apply and transducers

Ben Sless 2021-06-17T08:22:42.066800Z

I've seen the pattern (apply f (map g coll)) more than once. Why should I want to hold on to the original collection? Why allocate it at all? I think this generalizes to transducers, too.

Ben Sless 2021-06-17T08:52:58.067Z

Particularly to the case of Eduction which seems to serve exactly the purpose of avoiding allocation

borkdude 2021-06-17T09:15:07.067200Z

@ben.sless Interesting point. You really just want to have f see g applied to all its args, no need to allocate anything. Perhaps (trans-fn f g) or so could work ;)

Ben Sless 2021-06-17T09:18:06.067400Z

Sort of. I went through an iterator:

(defn- consume
  [^java.util.Iterator it]
  (loop [coll []]
    (if (.hasNext it)
      (recur (conj coll (.next it)))
      coll)))

(defmacro ^:private -apply-to-it
  ([f it]
   `(-apply-to-it ~f ~it [] 0 20))
  ([f it args depth max-depth]
   (if (= depth max-depth)
     `(if (.hasNext ~it)
        (apply ~f ~@args (consume ~it))
        (~f ~@args))
     (let [g (gensym)]
       `(if (.hasNext ~it)
          (let [~g (.next ~it)]
            (-apply-to-it ~f ~it ~(conj args g) ~(inc depth) ~max-depth))
          (~f ~@args))))))

(defn apply-to-it
  "Like apply but does no intermediate allocations, consumes its argument
  as an iterable."
  [f ^Iterable it]
  (let [^java.util.Iterator it (.iterator it)]
    (-apply-to-it f it)))
bit yucky, though

borkdude 2021-06-17T09:23:45.067600Z

(defrecord TransFn [f arg-fn]
  clojure.lang.IFn
  (invoke [_] (f))
  (invoke [_ a1] (f (arg-fn a1)))
  (invoke [_ a1 a2] (f (arg-fn a1) (arg-fn a2))))

((->TransFn + inc) 1 2) ;;=> 5

Ben Sless 2021-06-17T09:41:58.067800Z

That's when you know how many args you're getting, doesn't help with the restFn case

borkdude 2021-06-17T09:48:46.068Z

true

2021-06-17T15:49:18.068700Z

That's not apple, that is reduce

2021-06-17T15:49:26.068900Z

apply

ghadi 2021-06-17T15:58:59.069100Z

https://clojurians.slack.com/archives/C06E3HYPR/p1623918162066800?thread_ts=1623912490.066500&cid=C06E3HYPR not sure I understand the "hold onto the collection" part

2021-06-17T16:36:04.069400Z

right, apply doesn't force or hold on to anything

user=> (apply (fn [x y & _] (+ x y)) (range))
1

2021-06-17T16:49:08.069800Z

the complaint here is that a lot of functions used with apply are really binary associative operations with a varargs case added and handled via reduce, and that internal fold is not exposed, so you can't fuse other operations into it (transducers)

2021-06-17T16:49:24.070Z

which is, of course, a bad complaint

2021-06-17T16:49:42.070200Z

it conflates apply and reduce, which are not at all the same thing

2021-06-17T16:50:09.070400Z

and if you want the fold over the binary associative operation exposed, then just don't use the varargs case, do your own reduce

2021-06-17T16:50:42.070600Z

which is what the code above basically does, it is a reduce over an iterator, annoyingly called apply, and of course reduce already has a fast path for iterators

2021-06-17T16:51:37.070800Z

fruit of the poison tree, the root of the poison tree being conflating reduce and apply

Ben Sless 2021-06-17T16:53:41.071Z

Thank you for assuming the wrong use case. I know to pick reduce when it's appropriate.

Ben Sless 2021-06-17T16:54:20.071200Z

The case I had the misfortune to come across is exactly that of unknown functions which can take any number of arguments

Ben Sless 2021-06-17T16:55:00.071400Z

You could argue the code is bad and you'd be right, but please don't make assumptions

2021-06-17T16:58:36.071600Z

I might not understand what you are saying about apply here - how does apply hold onto a collection?

2021-06-17T16:58:53.071800Z

also I'm not seeing the connection between apply and transducers here at all

2021-06-17T17:00:29.072100Z

yes, there are no transducers at all in the examples given

Ben Sless 2021-06-17T17:06:41.072300Z

Generally, if we look at (apply g (map f coll)) it would be nice if I could (apply g (->Eduction (map f) coll)) without allocating an intermediate sequence, instead pull the elements directly out of the iterable. The way map is implemented it wouldn't be possible, but with Eduction it should be
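
A minimal sketch of the point above, assuming nothing beyond clojure.core (the name `xs` is illustrative and not from the thread). It shows that an eduction is an Iterable that can be consumed element by element, while apply still goes through a seq of its argument:

```clojure
;; `xs` is a hypothetical example value, not code from the discussion.
(def xs (eduction (map inc) [1 2 3]))

;; An eduction implements Iterable, so elements can be pulled one at a time
;; without building the whole result first.
(instance? Iterable xs)              ;; => true
(let [it (.iterator ^Iterable xs)]
  (.next it))                        ;; => 2

;; apply accepts the eduction, but it calls seq on its last argument first,
;; so an intermediate sequence is still created.
(apply + xs)                         ;; => 9
```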

2021-06-17T17:13:04.072500Z

that still isn't about transducers, that is about iterators, and about having Clojure varargs pass the collected arguments as an iterator instead of as a seq

2021-06-17T17:17:02.072700Z

it still bakes in assumptions about the usage of varargs inside the functions being called, the assumption being the function unrolls arguments, and doesn't just do something with the args as a seq

Ben Sless 2021-06-17T17:22:05.072900Z

as far as I know most functions used in this system that way are not varargs functions, it's just that at runtime their arity is unknown (it's a badly written interpreter)

seancorfield 2021-06-17T22:39:51.079200Z

Question about select-keys before I spend time writing this up on http://ask.clojure.org — I have a custom type that behaves like a hash map (it’s an extension to APersistentMap that allows keys to be strings or keywords and also case-insensitive — for non-Clojure language interop reasons). When I call select-keys on it, I get a regular Clojure hash map whose keys are the “original” keys from my custom hash map (in this case, they’re uppercase strings) and that makes the result pretty useless in code that follows and it’s because select-keys explicitly uses {}. If select-keys instead used (or (empty map) {}) — or some optimized form of that — then it would preserve the underlying custom hash map type (which would be super-convenient). Is such a proposed change likely to be considered? (I can understand the answer of “no” here on the grounds that this is an edge case that almost no one is going to run into — and I have a workaround: don’t use select-keys on this custom type 🙂 )
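
A rough sketch of what the proposed behaviour could look like as a user-level helper. The name `select-keys-preserving` is hypothetical and not part of clojure.core, a sorted map stands in for the custom map type, and metadata handling is left out:

```clojure
;; Hypothetical helper: like select-keys, but it starts from (empty m)
;; instead of {}, so the result keeps the source map's type.
(defn select-keys-preserving
  [m ks]
  (reduce (fn [acc k]
            (if-let [e (find m k)]
              (conj acc e)      ; add the found map entry to the result
              acc))             ; skip keys that are not present
          (or (empty m) {})     ; fall back to {} when m is nil
          ks))

;; A sorted map stands in for the custom map type here.
(def picked (select-keys-preserving (sorted-map :a 1 :b 2 :c 3) [:a :b]))

(sorted? picked)                                  ;; => true
(sorted? (select-keys (sorted-map :a 1) [:a]))    ;; => false
```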

alexmiller 2021-06-17T22:41:43.079500Z

I think there might be a ticket about this

seancorfield 2021-06-17T22:42:45.079900Z

Oh, sorry, I should have looked before I “leaped”…

alexmiller 2021-06-17T22:43:18.080500Z

I would worry that this would be a breaking change for the cases where someone is using select-keys specifically to lose the special map type-ness of the source

seancorfield 2021-06-17T22:45:01.080900Z

An interesting take — and a valid concern, yes.

alexmiller 2021-06-17T22:45:02.081Z

https://clojure.atlassian.net/browse/CLJ-1287

alexmiller 2021-06-17T22:45:06.081200Z

was closed as won't fix

seancorfield 2021-06-17T22:46:14.081700Z

Probably why I couldn’t find it on ask. Fair enough. I’ll tackle this a different way then.

alexmiller 2021-06-17T22:47:20.081900Z

https://groups.google.com/forum/#!topic/clojure/l_V1N1nRF-c - long discussion on ml

seancorfield 2021-06-17T22:58:59.082900Z

Thanks. It turns out my custom hash map type doesn’t implement IObj so that would be another breakage when using (empty my-map) in select-keys. So it’s clearly a terrible idea! 😐

2021-06-17T23:05:54.083300Z

I am actively using select-keys to get around the partition thing

2021-06-24T06:31:42.101200Z

Was a conclusion ever reached on this issue? Is the mistake here reifying a map facade over a mutable connection that is then combined with resource-managing reducers? Or is it inherent to transducers and reducibles?

2021-06-24T06:34:35.101400Z

I’m asking because I’d like to offer similar affordances to next.jdbc but also avoid recreating this issue in a similar API I’m building; one that works over RDF data sources, not JDBC ones.

seancorfield 2021-06-24T06:46:06.101600Z

What do you mean by "this issue" cause we covered quite a bit of ground 😊

seancorfield 2021-06-24T06:49:29.101800Z

There was clear consensus that my proposed change to select-keys was a bad idea - for several reasons in the end.

seancorfield 2021-06-24T06:50:02.102Z

And my discussion was nothing to do with next.jdbc by the way.

2021-06-24T10:33:44.103500Z

Apologies, I meant this (from the next.jdbc docs):

> Note: you need to be careful when using stateful transducers, such as partition-by, when reducing over the result of plan. Since plan returns an IReduceInit, the resource management (around the ResultSet) only applies to the reduce operation: many stateful transducers have a completing function that will access elements of the result sequence -- and this will usually fail after the reduction has cleaned up the resources. This is an inherent problem with stateful transducers over resource-managing reductions with no good solution.

Which is what I thought @hiredman was referring to here: https://clojurians.slack.com/archives/C06E3HYPR/p1623971622083700?thread_ts=1623971154.083300&cid=C06E3HYPR

seancorfield 2021-06-24T15:40:30.106700Z

Yes, he was referring to that, but that wasn't what my discussion in #clojure-dev was about -- and his comment was just pointing out a use case that would actually be broken by my proposal for select-keys.

seancorfield 2021-06-24T15:41:56.107100Z

As for your question, I don't think there's anything more to add over what is in the next.jdbc docs: "This is an inherent problem with stateful transducers over resource-managing reductions with no good solution."

seancorfield 2021-06-17T23:07:02.083400Z

“the partition thing”?

2021-06-17T23:13:42.083700Z

the thing where, if you transduce over a plan and the transducer does a partition-by, the final partition gets reduced after plan has closed everything
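
A small sketch of that behaviour, using a made-up reducible in place of next.jdbc's plan. `fake-plan` and the `:closed` marker are assumptions for illustration only; with a real ResultSet the late access would typically throw rather than return a marker:

```clojure
;; Hypothetical stand-in for plan: each row is handed out as a thunk that
;; is only valid while the "resource" is open.
(defn fake-plan [rows]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (let [open? (atom true)]
        (try
          (reduce (fn [acc r]
                    (f acc (fn [] (if @open? r :closed))))
                  init
                  rows)
          ;; The resource is released as soon as the reduction returns.
          (finally (reset! open? false)))))))

;; partition-by emits its final partition from the completing step, which
;; transduce calls only after the reduce above has finished and closed.
(def result
  (transduce (comp (partition-by (fn [row] (:k (row))))
                   (map (fn [part] (mapv (fn [row] (row)) part))))
             conj
             []
             (fake-plan [{:k 1} {:k 1} {:k 2}])))

result ;; => [[{:k 1} {:k 1}] [:closed]]
```

The first partition is realized while the resource is still open; the last one is only flushed during completion, after the cleanup has run.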

seancorfield 2021-06-17T23:15:13.083900Z

Ah, gotcha. And right now the select-keys approach lets you extract columns without realizing the row into a hash map in full — and you still get a hash map back. Yes, makes my change even more of a bad idea 🙂