@grounded_sage you could try using m/separated
[_ {:class "cal_date_show"}
 ;; Pattern match on the rest of the vector (with `&`)
 ;; using `separated`
 & (m/separated
    [_ {:class "cal_date_date"} ?date]
    [_ {:class "cal_date_time"} ?time]
    [_ {:class "calinfo"} ?info])]
m/separated expands to _ ... p1 . _ ... p2 . _ ... (see docstring for more info).
@grounded_sage It looks like you are saying there are no elements between the various divs? If that is true, then the direct pattern should match. But I know libraries like hickory will keep whitespace nodes (even if they don't semantically matter) and that makes you have to do those ugly things.
@jimmy yeah, there are no elements in between, as that is the exact pattern. Given what hickory does, it may not be that suitable for web scraping then.
I'm confused. I was just talking about why you need the _ ... patterns. That is because hickory throws in extra nodes that aren't needed. You can still match those things with meander. It is just true that you will need to deal with them because they are in your data. Is there something else that is tripping you up?
Are you getting back extra results or something you didn't expect?
(Also this is a topic I should write a blog post on).
@grounded_sage Put together a gist of what I think you are doing so maybe we can help more. https://gist.github.com/jimmyhmiller/e9e710bce6ee60171560e1052ce49539 Let me know if I got anything wrong.
I'll take a look tomorrow, I'm caught up learning core.async atm. Yeah, once I wrap my head around Meander I will give a talk at the meetup here in Berlin. I want more people to know about this lib.
FYI, we've come to rely on a pattern that looks like (reduce #(%2 %1) input (m/search input PATTERN (fn [input] do some modification))).
it seems pretty general
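For anyone reading along, here is a rough sketch of the shape of that reduce, in plain Clojure so it runs without Meander. The names negative-fixes and apply-fixes are made up for illustration; negative-fixes only stands in for the (m/search input PATTERN (fn [input] ...)) part, on the assumption that each match yields one function from input to modified input. It is not taken from the gist.

```clojure
;; Plain-Clojure sketch; no Meander required.
;; `negative-fixes` is a hypothetical stand-in for
;; (m/search input PATTERN (fn [input] ...)): it returns one
;; update function per "match" found in the input.
(defn negative-fixes
  [input]
  (for [[k v] input
        :when (and (number? v) (neg? v))]
    ;; Each match yields a function from input to modified input.
    (fn [m] (assoc m k 0))))

(defn apply-fixes
  "Threads `input` through every update function, in order."
  [input]
  ;; #(%2 %1) calls each update function on the accumulated value.
  (reduce #(%2 %1) input (negative-fixes input)))

(apply-fixes {:a -1 :b 2 :c -3})
;; => {:a 0, :b 2, :c 0}
```

The matches are all computed against the original input, and the resulting functions are then applied one after another, so each modification sees the result of the previous ones.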
That's actually a pretty intriguing shape, @eraserhd. Any chance you can share some examples?
there's two here: https://gist.github.com/eraserhd/2697af2eab84f7366fcc2c6c4bdc06bc
they might not be the greatest examples