[meander/epsilon "0.0.402"]
Thanks!
So I have meandered on over to tech.ml.dataset for processing column-wise CSV data. Sorely missing the clarity of Meander patterns.
I’m sorry. 🙂
As someone with a 10,000-foot view and little idea of the internals, I’m wondering whether there is room for a new ns in Meander to handle large column data, or something like a Meander-csv that could leverage the codebase inside of tech.ml.dataset.
Hi! Just so I can understand the desire here I’ll attempt to rephrase as:
I wish meander sequence patterns like (!xs ...) used transducers instead of memory variables.
^^ is this accurate?
i.e.: The issue is that very long sequences don’t fit in memory? Or is it a different problem?
(defn unarchived' [stories]
  (remove (fn [{:keys [archived completed]}]
            (and archived (not completed)))
          stories))
(def unarchived
  (s/rewrite
    ((m/or {:archived true
            :completed false}
           !stories) ...)
    ;;>
    (!stories ...)))
^^ for a really big CSV !stories needs to be a sequence, not an array.
Conversely, when do we need an array, not a sequence?
I’m still new to all of this so having some trouble keeping up. Though I am willing to dive in and contribute to this problem with a bit of guidance :)
(keep
  (fn [value]
    (me/rewrite value
      {:archived false, :completed true :as ?it}
      ?it))
  '({:archived true, :completed true}
    {:archived false, :completed true}
    {:archived true, :completed false}
    {:archived false, :completed false}))
;; =>
({:archived false, :completed true})
would be decent.
This also works
(me/rewrites '({:archived true, :completed true}
               {:archived false, :completed true}
               {:archived true, :completed false}
               {:archived false, :completed false})
  (me/scan {:archived false, :completed true :as ?it})
  ?it)
;; =>
({:archived false, :completed true})
but rewrites doesn’t support cata, FYI.
Oh, good thinking.
Does that help with the original question of “Meander to handle large column data”?
It can. It just depends on what you are using. If you use a single … in a pattern, Meander has to apply pattern matching to everything in the collection in question. If you can rephrase the pattern in such a way that search becomes applicable, it’s nice to go that way.
Yea, all of the interesting transformations, and where Meander has value for me, come when I use …
I’m hoping that zeta will have the groundwork for that, being able to deal with bytes, etc., and then building more sophisticated matching on top.
Cool. At present I’m dealing with two CSVs. One with 1.5M rows and another with 300k. The dataset library handles it extremely well. It’s just a little more low-level than I would like to be working at, having seen how nice things can be :)
The goal (for me) is to be able to achieve low level performance but from the comfort of a high level. I believe we can get there via the right pattern matching primitives and pattern aliases (`defsyntax`). So, what I’m saying is, this case is motivating for me too. 🙂