meander

All things about https://github.com/noprompt/meander Need help and no one responded? Feel free to ping @U5K8NTHEZ
Bhougland 2020-01-29T03:31:44.116700Z

Would this be idiomatic "Meander"? I have a vector of maps read in by the ultra-csv package, and I just want to filter by the country column:

Bhougland 2020-01-29T03:32:24.117400Z

Sorry, I didn't actually include the code:

(defn find-records-for-country2 [records country]
  (m/search records
            (m/scan {:Country ~country :as ?country})
            ?country))
It seems to produce the results I want

👍 1
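For reference, a minimal self-contained sketch of the filter above. It assumes `meander.epsilon` is on the classpath; `sample-records` and the function names are made up for illustration, and the plain `filter` version is included only for comparison.

```clojure
(require '[meander.epsilon :as m])

;; Hypothetical sample data standing in for the rows read from the CSV.
(def sample-records
  [{:Country "Canada" :Sales 10}
   {:Country "France" :Sales 20}
   {:Country "Canada" :Sales 30}])

(defn find-records-for-country
  "Returns every map in records whose :Country equals country."
  [records country]
  (m/search records
    ;; ~country unquotes the function argument into the pattern;
    ;; :as binds the whole matching map to ?record.
    (m/scan {:Country ~country :as ?record})
    ?record))

;; Plain Clojure equivalent, for comparison.
(defn filter-records-for-country
  [records country]
  (filter #(= country (:Country %)) records))

(find-records-for-country sample-records "Canada")
```

Both functions should select the same two "Canada" rows from the sample data.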
Bhougland 2020-01-29T03:36:53.119600Z

Can you do aggregates similar to this Clojure code (which is probably not great, as I am new to Clojure)? Again, I have a vector of maps in this example, read in from a CSV:

;;group-by example
  (def groupby-example (map
                       (fn [[grp-key values]]
                         {:group grp-key
                          :sum (reduce + (map :Sales values))
                          :max (reduce max (map :COGS values))})
                       (group-by :Country csv-seq)))
csv-seq is the vector of maps in this example.

Bhougland 2020-01-29T12:30:42.122100Z

Thank you for the response, and thank you for the library.

👍 1
noprompt 2020-01-29T06:25:36.119800Z

This looks fine to me. 👍

noprompt 2020-01-29T07:00:28.120Z

Meander doesn’t have anything for doing aggregates as conveniently as, say, group-by. In fact, we tend to recommend group-by because, well, it’s just fine. Of course, you could pull it off with Meander, but it wouldn’t be as clean as what you have here. I am, however, interested in the idea of having some kind of aggregate-type thing, but I’m not sure what it would look like.
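For reference, a small runnable sketch of the group-by approach recommended here, using only `clojure.core`. The `sample-rows` data and the name `summarize-by-country` are hypothetical; the column names are taken from the code in the question.

```clojure
;; Hypothetical rows standing in for csv-seq.
(def sample-rows
  [{:Country "Canada" :Sales 10 :COGS 4}
   {:Country "France" :Sales 20 :COGS 9}
   {:Country "Canada" :Sales 30 :COGS 7}])

(defn summarize-by-country
  "Returns one summary map per country: total :Sales and maximum :COGS."
  [rows]
  (for [[country group] (group-by :Country rows)]
    {:group country
     :sum   (reduce + (map :Sales group))
     :max   (reduce max (map :COGS group))}))

(summarize-by-country sample-rows)
```

The order of the returned summaries follows the map produced by `group-by`, so it is safer not to rely on it.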

grounded_sage 2020-01-29T11:42:51.121900Z

How would one start with web scraping in Meander? I was planning on using Jsoup and following along with the lessons on Purely Functional TV. Though I’m wondering now if I should just collect the HTML using clj-http, as I am already using this in my code, convert it to hiccup, and then use Meander to fetch the information.

grounded_sage 2020-01-29T13:38:59.126200Z

I can’t seem to follow the web scraping with Meander.
This is too small of a sample: https://github.com/noprompt/meander/blob/epsilon/doc/cookbook.md#webscrape-html
This is way over the top for a beginner: https://github.com/noprompt/meander/blob/epsilon/examples/hiccup.clj
This returns nothing.

(m/search html-in-edn
   (m/$ [:main {:class "maincontents"}
         . _ ...
         [:h2 _ ?var]])
   [?var])
This returns the desired result.
(m/search cal-edn
   (m/$ [:h2 _ ?var])
   [?var])

jimmy 2020-01-29T13:55:28.128Z

@grounded_sage does your main tag end with an h2? If not, you need to add & _ afterwards to make it match.
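A minimal sketch of that point, assuming `meander.epsilon` is available. The `sample-page` hiccup is made up for illustration and is not the page from the question.

```clojure
(require '[meander.epsilon :as m])

;; Hypothetical hiccup: the h2 is followed by another child.
(def sample-page
  [:main {:class "maincontents"}
   [:p {} "intro"]
   [:h2 {} "Events"]
   [:p {} "footer"]])

;; Without `& _`, the [:h2 ...] pattern has to be the last child of :main.
(def without-rest
  (m/search sample-page
    [:main {:class "maincontents"} . _ ... [:h2 _ ?var]]
    ?var))

;; With `& _`, any remaining children after the h2 are allowed.
(def with-rest
  (m/search sample-page
    [:main {:class "maincontents"} . _ ... [:h2 _ ?var] & _]
    ?var))
```

Here `without-rest` should be empty, because the last child is a `:p`, while `with-rest` should contain "Events".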

jimmy 2020-01-29T13:56:00.128900Z

And yeah, the hiccup parser is supposed to be a very advanced example to show what Meander is capable of.

grounded_sage 2020-01-29T14:44:56.129600Z

Returns nothing.

[:main {:class "maincontents"}
                  . _ ...
                  [:h2 _ ?var]
                  & _ ]

jimmy 2020-01-29T14:45:16.130100Z

What is your hiccup?

grounded_sage 2020-01-29T14:45:33.130500Z

I thought my intuition was building on it but I’m uncertain now

jimmy 2020-01-29T14:46:09.131300Z

If you give the input I'm sure we can find the problem.

grounded_sage 2020-01-29T14:53:37.131700Z

(def google-edn (as-hiccup (parse (slurp google-url))))
  (m/search google-edn
            (m/$ [:html {:lang "de"}
                  . _ ... 
                  [:title _ ?title]
                  & _])
            ?title)

grounded_sage 2020-01-29T14:53:47.131900Z

Gives me nothing

grounded_sage 2020-01-29T14:55:15.132900Z

(m/search google-edn
            (m/$ [:title _ ?title])
            ?title)
and
(m/search google-edn
            (m/$ [:html {:lang ?lang} & _])
           ?lang)
These work.

jimmy 2020-01-29T14:57:17.134100Z

Guessing at the structure: Google has its title inside the head tag, right? If so, you'd need to model that or use another $.

jimmy 2020-01-29T14:57:50.134500Z

That one says that the title is a direct child of HTML.
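A small sketch of the difference, assuming `meander.epsilon` is available. `sample-doc` is a made-up, single-root hiccup vector (so no outer `m/$` is needed here), not the real Google page.

```clojure
(require '[meander.epsilon :as m])

;; Hypothetical document: the title lives inside :head, not directly under :html.
(def sample-doc
  [:html {:lang "de"}
   [:head {}
    [:meta {:charset "utf-8"}]
    [:title {} "Google"]]
   [:body {}
    [:p {} "hi"]]])

;; Treats :title as a direct child of :html, so nothing matches.
(def direct-child
  (m/search sample-doc
    [:html {:lang "de"} . _ ... [:title _ ?title] & _]
    ?title))

;; Models the intermediate :head element explicitly.
(def via-head
  (m/search sample-doc
    [:html {:lang ?lang}
     . _ ...
     [:head . _ ... [:title _ ?title] . _ ...]
     & _]
    [?lang ?title]))
```

With this sample, `direct-child` should be empty and `via-head` should contain the single pair `["de" "Google"]`.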

grounded_sage 2020-01-29T14:59:34.134700Z

Ah that’s my error

grounded_sage 2020-01-29T14:59:42.135Z

(m/search google-edn
          (m/$ [:html {:lang ?lang}
                . _ ...
                [:head . _ ... [:title _ ?title] . _ ...]
                & _])
          [?lang ?title])
This works

jimmy 2020-01-29T15:00:54.135500Z

Awesome. Hopefully that makes sense.

grounded_sage 2020-01-29T15:00:56.135700Z

Slowly getting there 🙂

jimmy 2020-01-29T15:02:02.136200Z

It is definitely a learning curve. But these are good, important steps to go through.

grounded_sage 2020-01-29T15:03:20.137800Z

Both (m/$ …) and (m/scan …) work in that nesting as well.

jimmy 2020-01-29T15:04:28.137900Z

Yep, so $ will let you traverse deeper. And scan will let you look for things at the same level.

👍 1
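A minimal sketch of the distinction between `m/$` and `m/scan`, assuming `meander.epsilon` is available. The `sample-tree` hiccup is made up for illustration.

```clojure
(require '[meander.epsilon :as m])

;; Hypothetical hiccup with a :span nested one level below the :li elements.
(def sample-tree
  [:ul {}
   [:li {} "a"]
   [:li {} [:span {} "b"]]])

;; m/scan only looks at the direct elements of the collection it is given.
(def li-contents
  (m/search sample-tree
    (m/scan [:li _ ?x])
    ?x))

;; The :span is not a direct element of the :ul vector, so scan finds nothing.
(def span-by-scan
  (m/search sample-tree
    (m/scan [:span _ ?x])
    ?x))

;; m/$ searches subtrees at any depth, so it reaches the nested :span.
(def span-by-dollar
  (m/search sample-tree
    (m/$ [:span _ ?x])
    ?x))
```

With this sample, `li-contents` should hold the two `:li` bodies, `span-by-scan` should be empty, and `span-by-dollar` should contain "b".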
grounded_sage 2020-01-29T23:15:07.143700Z

So this runs, but I’d like to trim down those repeating items in the (m/scan …) and also figure out how to get the commented-out pieces to work.

(m/search events-edn
            (m/$ [:main
                  . _ ...
                  (m/$ [_ {:class "cal-listitem"}
                        . _ ...

                        (m/scan [_ {:class "cal_date_show"}
                                 . _ ...
                                 [_ {:class "cal_date_date"} ?date]
                                 . _ ...
                                 [_ {:class "cal_date_time"} ?time]
                                 . _ ...
                                 [_ {:class "calinfo"} ?info]
                                 . _ ...
                                 & _])
                        ;. _ ... 
                        #_(m/$ [_ {:class "listinfo"}
                                . _ ...
                                [_ {:class "thetitles"} ?titles]
                                & _])

                        & _])
                  & _])
            
           [?date ?time ?info])
cal_date_show actually looks like the following, but I was unable to get the pattern to match:
[:div {:class "cal_date_show"} 
  [:div {:class "cal_date_day"} _]      ; This could simply be _ignored ??
  [:div {:class "cal_date_date"} ?date]
  [:div {:class "cal_date_time"} ?time]
  [:div {:class "calinfo"} ?info]]