Hi. It's not really a release, as it's only addressable by git coordinates right now, but i wanted some feedback on whether this is a general thing people might want. If you ever have a sequence operation that ends with (->> ... (sort-by f) (take N)), you pay the price of realizing the entire collection. Sorting kinda naturally falls outside of transduction contexts. This uses a bounded min-heap to accumulate the largest N seen so far, enabling (transduce xf (keep-heap 100 compare-fn) reducible-coll). It was motivated by querying a db, scoring in Clojure, and only wanting to keep the top 100 results sorted by score. The heap lets you take N sorted results without keeping all of the elements in memory to sort them. https://github.com/dpsutton/heap-keep
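For anyone skimming the thread, here is a rough sketch of the bounded min-heap idea described above. It is not the code from heap-keep: the name top-n-rf is hypothetical, it assumes n is at least 1, and it leans on java.util.PriorityQueue for the heap. The smallest kept item sits at the head of the heap, so each new item is either dropped or evicts the head, and memory stays bounded by n.

```clojure
(import '(java.util PriorityQueue Comparator))

(defn top-n-rf
  "Hypothetical reducing function (not the heap-keep implementation).
  Keeps the n largest items according to cmp in a bounded min-heap and
  returns them as a vector, largest first. Assumes n >= 1."
  [n ^Comparator cmp]
  (fn
    ;; init: an empty min-heap ordered by cmp
    ([] (PriorityQueue. (int n) cmp))
    ;; completion: sort the (at most n) survivors, largest first
    ([^PriorityQueue heap]
     (vec (sort (fn [a b] (.compare cmp b a)) heap)))
    ;; step: fill up to n items, then only admit items larger than
    ;; the current minimum, evicting that minimum
    ([^PriorityQueue heap x]
     (cond
       (< (.size heap) n)
       (.add heap x)

       (pos? (.compare cmp x (.peek heap)))
       (do (.poll heap)
           (.add heap x)))
     heap)))

;; Usage: top 3 squares, without sorting all ten of them.
(transduce (map #(* % %)) (top-n-rf 3 compare) (range 10))
;; => [81 64 49]
```

Each step costs O(log n) at worst, and only the final n items are sorted, compared with sorting the whole realized collection in the (sort-by f) (take N) version.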
This is cool - thanks for sharing. There is an x/sort-by in https://github.com/cgrand/xforms (that just builds up a complete collection). Have you considered possibly submitting keep-heap as an additional transducer to xforms (vs releasing a new lib)?
i hadn't. as it's only a single function, i wasn't sure where it would be most useful or if many people would even find it useful
and yeah i looked at the x/sort-by but it realizes the whole collection, which is the primary evil i'm trying to avoid