off-topic

https://github.com/clojurians/community-development/blob/master/Code-of-Conduct.md Clojurians Slack Community Code of Conduct. Searchable message archives are at https://clojurians-log.clojureverse.org/
jaihindhreddy 2021-03-06T08:39:05.171200Z

Will anyone on http://lobste.rs consider giving me an invite?

Ben Sless 2021-03-06T15:02:01.172100Z

Does anyone have a recommended Clojure library for simple text analysis in English?

borkdude 2021-03-06T15:07:03.172800Z

What do you mean by simple text analysis? Part of speech? You can try Stanford NLP. Here is a demo: https://corenlp.run/

borkdude 2021-03-06T15:07:24.173100Z

@simongray has made a little wrapper lib for this

simongray 2021-03-06T15:08:28.173600Z

yup, it’s available at https://github.com/simongray/datalinguist, but currently requires you to use deps.edn since I have not packaged it as a JAR yet. Nevertheless, it’s probably still the most full-featured CoreNLP experience you will get in Clojure right now.

simongray 2021-03-06T15:13:42.175100Z

Another option is to use CoreNLP directly through interop, but I don’t recommend that… there’s a reason I’m trying to wrap it.

borkdude 2021-03-06T15:14:03.175600Z

We use Stanford NLP at work: https://covid-search.doctorevidence.com/

Ben Sless 2021-03-06T15:16:47.176200Z

I was looking for that!

Ben Sless 2021-03-06T15:16:48.176400Z

Google and github did not cooperate with me

Ben Sless 2021-03-06T15:34:53.176600Z

oh wow the models are heavy. Does not seem suitable for a small script?

simongray 2021-03-06T15:46:56.176800Z

Yeah, they're usually a couple hundred MBs apiece AFAIK. I think most language models produced through machine learning tend to be quite heavy and the memory requirements are usually pretty substantial too for most of the interesting things you wanna do.

orestis 2021-03-06T18:03:31.178600Z

How do you use NLP? Like, as a better full-text search or doing more interesting stuff like trying to extract information from texts?

orestis 2021-03-06T18:07:51.179100Z

Ouch, Stanford and CoreNLP are GPL -- probably no go for us then 😞

simongray 2021-03-06T18:14:22.179200Z

Yup - it sucks

lread 2021-03-06T18:48:04.182100Z

I used the https://github.com/facebookarchive/duckling_old on a personal project a couple of years ago when I was starting my Clojure journey. I enjoyed the experience. The https://github.com/facebook/duckling.

Ben Sless 2021-03-06T20:42:19.184100Z

I just wanted a simple way to lint commit messages

lread 2021-03-06T21:52:59.185400Z

Ah @ben.sless, maybe just roll your own then? Depending on how sophisticated your linting is… that may be easier than figuring out some NLP thingy.
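
For reference, a roll-your-own commit message linter along the lines suggested above could look roughly like this. It is only a sketch: the rules (50-character subject, capitalized first letter, no trailing period, blank second line, a crude imperative-mood check) and the names `lint_commit_message`, `MAX_SUBJECT_LEN` and `NON_IMPERATIVE` are assumptions made for illustration, not anything proposed in the thread. It is written in Python for brevity; the same regex and string checks should port directly to Clojure.

```python
import re

# Hypothetical rules, chosen for illustration only.
MAX_SUBJECT_LEN = 50
# Crude heuristic: a first word ending in "ed" or "ing" is probably not
# imperative. This gives false positives (e.g. "Embed", "Bring").
NON_IMPERATIVE = re.compile(r"^\w+(ed|ing)$", re.IGNORECASE)


def lint_commit_message(message):
    """Return a list of human-readable problems found in a commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    # Nothing else is worth checking if there is no subject at all.
    if not subject.strip():
        return ["subject line is empty"]

    problems = []
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append(f"subject is longer than {MAX_SUBJECT_LEN} characters")
    if not subject[0].isupper():
        problems.append("subject should start with a capital letter")
    if subject.endswith("."):
        problems.append("subject should not end with a period")

    first_word = subject.split()[0]
    if NON_IMPERATIVE.match(first_word):
        problems.append(f"'{first_word}' does not look imperative")

    # Git convention: a blank line separates the subject from the body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems


if __name__ == "__main__":
    for problem in lint_commit_message("fixed the bug."):
        print(problem)
```

No language model is needed for checks at this level. Anything that depends on real grammar, such as reliably detecting imperative mood, is where a part-of-speech tagger like the ones discussed earlier would start to pay for its weight.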