meander

All things about https://github.com/noprompt/meander. Need help and no one responded? Feel free to ping @U5K8NTHEZ
noprompt 2020-01-30T07:00:47.147900Z

@grounded_sage you could try using m/separated

[_ {:class "cal_date_show"}
 ;; Pattern match on the rest of the vector (with `&`)
 ;; using `separated`
 & (m/separated
    [_ {:class "cal_date_date"} ?date]
    [_ {:class "cal_date_time"} ?time]
    [_ {:class "calinfo"} ?info])]
m/separated expands to _ ... p1 . _ ... p2 . _ ... (see docstring for more info).

jimmy 2020-01-30T14:56:40.149400Z

@grounded_sage It looks like you are saying there are no elements between the various divs? If that is true, then the direct pattern should match. But I know libraries like hickory will keep whitespace nodes (even if they don't semantically matter) and that makes you have to do those ugly things.

grounded_sage 2020-01-30T19:18:45.150600Z

@jimmy Yeah, there are no elements in between; that is the exact pattern. Given what hickory does, it may not be that suitable for web scraping then.

jimmy 2020-01-30T19:21:58.153400Z

I'm confused. I was just talking about why you need the _ ... parts. That is because hickory throws in extra nodes that aren't needed. You can still match those things with meander. It is just that you will need to deal with them because they are in your data. Is there something else that is tripping you up?

jimmy 2020-01-30T19:24:30.153900Z

Are you getting back extra results or something you didn't expect?

jimmy 2020-01-30T19:25:45.154600Z

(Also, this is a topic I should write a blog post on.)

jimmy 2020-01-30T20:24:20.155400Z

@grounded_sage I put together a gist of what I think you are doing so maybe we can help more. https://gist.github.com/jimmyhmiller/e9e710bce6ee60171560e1052ce49539 Let me know if I got anything wrong.

grounded_sage 2020-01-30T20:45:50.156900Z

I’ll take a look tomorrow; I'm caught up learning core.async atm. Yeah, once I wrap my head around Meander I will give a talk at the meetup here in Berlin. I want more people to know about this lib.

eraserhd 2020-01-30T21:23:45.158300Z

FYI, we've come to rely on a pattern that looks like (reduce #(%2 %1) input (m/search input PATTERN (fn [input] do some modification))).

eraserhd 2020-01-30T21:24:43.158700Z

it seems pretty general
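
A minimal sketch of the shape described above, in plain Clojure so it runs without Meander on the classpath. The helper find-edits is hypothetical: it stands in for the (m/search input PATTERN (fn [input] ...)) call and simply returns one update function per numeric value in a map. The names find-edits and apply-edits and the sample data are assumptions made for illustration, not taken from the linked gist.

```clojure
;; Hypothetical stand-in for (m/search input PATTERN (fn [input] ...)).
;; Returns a sequence of functions, one per key in m whose value is a
;; number. Each function takes the accumulated map and increments that key.
(defn find-edits
  [m]
  (for [[k v] m
        :when (number? v)]
    (fn [acc] (update acc k inc))))

;; The shape from the message: reduce over the returned functions,
;; calling each one (%2) on the accumulated value (%1).
(defn apply-edits
  [input]
  (reduce #(%2 %1) input (find-edits input)))

(def result (apply-edits {:a 1 :b "x" :c 10}))
;; result is {:a 2, :b "x", :c 11}
```

Note that in this sketch the edits are computed once, against the original input, and then applied one after another to the accumulated value, so a later edit sees the changes made by earlier ones.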

noprompt 2020-01-30T22:12:37.159500Z

That's actually a pretty intriguing shape, @eraserhd. Any chance you can share some examples?

eraserhd 2020-01-30T22:20:41.159800Z

there's two here: https://gist.github.com/eraserhd/2697af2eab84f7366fcc2c6c4bdc06bc

eraserhd 2020-01-30T22:21:08.160100Z

they might not be the greatest examples