etaoin

borkdude 2018-05-14T18:27:27.000795Z

Has anyone used etaoin as a web scraper instead of as a testing tool?

borkdude 2018-05-14T18:32:17.000512Z

I’m exploring this option since some sites have JS behavior, which makes it painful to scrape them with plain Jsoup

borkdude 2018-05-15T10:17:13.000229Z

@gklijs Does Norconex also do JavaScript rendering or purely HTML scraping?

borkdude 2018-05-15T10:17:52.000237Z

Oh, I read it does. Does it work well?

gklijs 2018-05-15T10:57:56.000454Z

It did say so on the website. I used it a couple of years ago and didn't need that feature.

gklijs 2018-05-14T19:34:16.000239Z

Not sure if it would be the easiest way. I’ve used http://www.norconex.com/collectors/collector-http/ as a web scraper before, and it was quite easy to set up. Also, you get things like crawl depth and data export, which you would need to build yourself with etaoin but are already included in Norconex.

dottedmag 2018-05-14T19:39:19.000161Z

@borkdude I do

dottedmag 2018-05-14T19:39:48.000438Z

Not really a scraper, but as an automation tool for stupid websites.

borkdude 2018-05-14T19:40:54.000146Z

Thanks, I didn’t know about this tool before

borkdude 2018-05-14T19:41:13.000202Z

Stupid websites?

dottedmag 2018-05-14T19:43:15.000099Z

Like, "we have a nice REST API to download your bank statements, but only after authenticating using terribly slow JS that implements 2-factor authentication"

borkdude 2018-05-14T19:49:20.000175Z

Ah yes, I see.