I’ll get back to this in a bit. Going through some other issues but being moved to data preprocessing soon, so will be able to use Meander to its full potential.
I have a collection of maps where one of the keys in the maps is a timestamp. I am pattern matching over these maps using scan to filter for particular maps. I ultimately want to select only the most recent map. Something like this:
(m/search [{:ts 0 :m "hello" :j :k} {:ts 1 :m "goodbye" :j :k} {:ts 0 :n "joe" :j :k} {:ts 1 :n "sally" :j :k}]
  (m/and (m/most-recent-by ?ts1 (m/scan {:ts ?ts1 :m ?m :j ?v}))
         (m/most-recent-by ?ts2 (m/scan {:ts ?ts2 :n ?n :j ?v})))
  [?m ?n]) => (["goodbye" "sally"])
I'm sure this will require a custom strategy, but I'm lost on how to get started with that.
I'll have to think about that one a little. I have a way in mind I think will work, but will have to find the free time to try it out.
Something like
(m/and (m/scan {:time ?t :as ?m})
       (m/separated {:time (m/and ?t1 (m/guard (< ?t1 ?t)))}
                    ?m
                    {:time (m/and ?t2 (m/guard (< ?t2 ?t)))}))
might work.
fyi - The code at the top of this thread doesn't compile. I'm getting "Unable to resolve symbol: ?t in this context" from the guard clauses.
Here's my full test case:
(m/search [{:time 1 :msg "hello"} {:time 2 :msg "goodbye"}]
  (m/and (m/scan {:time ?t :as ?m})
         (m/separated {:time (m/and ?t1 (m/guard (<= ?t1 ?t)))}
                      #_?m
                      #_{:time (m/and ?t2 (m/guard (< ?t2 ?t)))}))
  ?m)
I was thinking about another approach: Is it safe to "exit" meander by using m/app and perform the reduction in Clojure-land?
The thing that concerns me is that I believe that ends up assuming too much about Meander's runtime order of operations. In this case, I think it assumes that the m/scan is executed once and won't be revisited. In this trivial example it is, but I don't know about more complex situations.
That code not working definitely seems like a bug. Will see if I can find the time to track it down.
I'm not 100% sure what you mean by scan only being executed once.
I still consider the error you are getting a bug, but it can be fixed by moving the guards. That said, that code does not do what you want.
You should be safe to use m/app for something like this. I did work on a reduction example using cata, but it is probably more than you want.
I do think this is a good use case that we should make simpler.
I think part of my problem is that I don't have a good conceptual model for meander. I think of it as kind of a query system or as a simplistic constraint logic system (I know I'm wrong but I don't have a good feel for how wrong I am).
Neither query nor constraint systems would guarantee the order in which the scan would run or even how many times it would run. Since the result of m/app find-latest-data relies on processing the entire set of data, I got concerned.
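For reference, a minimal sketch of what a Clojure-land reduction like this could look like. The name find-latest-data comes from the message above; its body here is an assumption, as is the :ts key borrowed from the first example in the thread. Since it is a pure function of the whole collection, it returns the same result however many times it is called.

```clojure
;; Hypothetical helper: the body is an assumption, not code from the thread.
(defn find-latest-data
  "Returns the map with the largest :ts in maps, or nil when maps is empty."
  [maps]
  (when (seq maps)
    (apply max-key :ts maps)))

(find-latest-data [{:ts 0 :m "hello"} {:ts 1 :m "goodbye"}])
;; => {:ts 1, :m "goodbye"}
```

A function like this is what would be handed to m/app, with the pattern after it matching against the single map it returns.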
I'll play around to see if I can get your example to compile by moving around the guards. Thanks a bunch for your help
Sorry about this bug. guard kind of sucks in this regard because it depends on variables being bound and, to your point, Meander doesn’t have a declared evaluation model, which results in bugs like this one. This is going to change in zeta, where the model will always have the semantics left to right, top to bottom.
Even still, I’m looking at this again and am realizing it's the wrong approach anyway.
Without seeing a bit more of what sort of transformation you are doing in context, I can't be sure. But I would suggest trying things out, seeing how they work, and if you find something you are not expecting, we can definitely look at it. meander is kind of like a query system and kind of like a constraint system. But for the most part, you shouldn't be worried about ordering and things like that.
But this will be slow.
That could work. But I feel like it would be rather slow, right? I think the optimal solution would basically be a reduction. But also it seems that they want more than one if they are tied for minimal.
Yeah. It will be slow for sure.
But if the number of items is small it won’t be too big a deal.
We need max etc.
Yeah definitely. I think reducing with cata and building up the min elements would make sense.
Slow isn't much of an issue initially. Under our initial rollout, multiple versions of maps are a theoretical possibility but not likely. We'll see more versions of maps over the course of weeks.
btw, I'm curious about "slow" - Is this issue about algorithmic complexity or object generation overhead? I ask because this pattern is matching against data that is pulled from a firestore database. So, meander only needs to be faster than that IO
Yeah I was meaning algorithmic complexity. You can do the operation in linear time if you do a reduction.
Or n log n time if you just sort your collection.
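As a rough illustration of the linear-time option, here is a sketch of a single-pass reduction in plain Clojure (not Meander). The function name and the :ts key are assumptions for the example. It keeps every map tied for the newest timestamp, since ties came up earlier in the thread.

```clojure
;; Hypothetical helper: one pass over maps, keeping all maps tied for newest :ts.
(defn latest-by-reduce
  [maps]
  (reduce (fn [best m]
            ;; Compare the incoming timestamp with the best one seen so far.
            (let [c (if (empty? best)
                      1
                      (compare (:ts m) (:ts (first best))))]
              (cond
                (pos? c)  [m]            ; strictly newer: start a new result
                (zero? c) (conj best m)  ; tie: keep both
                :else     best)))        ; older: ignore
          []
          maps))

(latest-by-reduce [{:ts 0 :m "hello"} {:ts 1 :m "goodbye"} {:ts 1 :m "later"}])
;; => [{:ts 1, :m "goodbye"} {:ts 1, :m "later"}]
```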
Sorting the collection would be easy, of course. How would meander know to take advantage of an early exit from its looping?
Or, is that part of the enhancement you were talking about?
If the array was sorted you would just write a match that would take from the front until something changed. There would be no way for us to know that in general for that sort of match, but you wouldn't have to write a match like that because you know it is sorted.
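The "take from the front until something changed" idea, sketched in plain Clojure rather than as a Meander match. The function name and the :ts key are assumptions, and the input is assumed to be sorted newest-first already.

```clojure
;; Hypothetical helper: assumes sorted-maps is already ordered newest-first.
(defn take-latest-run
  "Returns the leading maps that share the first map's :ts."
  [sorted-maps]
  (let [top (:ts (first sorted-maps))]
    (vec (take-while #(= top (:ts %)) sorted-maps))))

;; sort-by with > orders by :ts descending before taking the leading run.
(take-latest-run (sort-by :ts > [{:ts 0 :n "joe"} {:ts 1 :n "sally"} {:ts 1 :n "sam"}]))
;; => [{:ts 1, :n "sally"} {:ts 1, :n "sam"}]
```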
This "time-based" join is part of a larger pattern. Of course, I could break up each join into its own match if necessary.
It would be awesome to have something conceptually like clojure's reduce functionality with its "reduced" feature
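For comparison, this is how the early exit looks with plain Clojure's reduce and reduced, outside Meander. It is only a sketch: the function name and the :ts key are assumptions, and the input is assumed to be sorted newest-first.

```clojure
;; Hypothetical helper: stops the reduction as soon as the timestamp changes.
(defn latest-run-reduced
  [sorted-maps]
  (reduce (fn [acc m]
            (if (or (empty? acc)
                    (= (:ts m) (:ts (peek acc))))
              (conj acc m)     ; still in the leading run
              (reduced acc)))  ; timestamp changed: exit early
          []
          sorted-maps))

(latest-run-reduced [{:ts 2 :m "b"} {:ts 2 :m "c"} {:ts 1 :m "a"}])
;; => [{:ts 2, :m "b"} {:ts 2, :m "c"}]
```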
https://cljdoc.org/d/meander/epsilon/0.0.378/doc/cookbook#recursion-reduction-and-aggregation
Oh, that's awesome. I didn't really understand the problem that cata was solving until that example
how do recursive patterns work?
We can talk about recursive patterns. But that is failing because your pattern says that enabled should be true and you want that to repeat. If instead you wanted to filter for only the true ones you can use gather. (on my phone so can't give an example right now)
ah i didn't see gather, i'll check the source
(e/gather {:enabled? true :thingid !things} ...)
works, thanks!
Essentially, gather is just
(seqable (or <pattern> _))
which is shorthand for
(or [(or <pattern> _) ...] ((or <pattern> _) ...))
and is primarily useful in instances like these.