clojure

New to Clojure? Try the #beginners channel. Official docs: https://clojure.org/ Searchable message archives: https://clojurians-log.clojureverse.org/
joshkh 2020-10-25T10:05:59.166400Z

this is probably a typo, right? https://github.com/cognitect/transit-clj#usage

(def reader (transit/reader in :json))

rakyi 2020-10-25T11:47:46.166800Z

I don’t see the typo. What do you mean specifically?

misha 2020-10-25T14:42:49.170100Z

greetings! what is the fastest way to reduce size of seq of maps by comparing only subset of keys, eg: [{:id (uuid) :a 1 :b 1} {:id (uuid) :a 1 :b 1}] => ;;just [{:id (uuid) :a 1 :b 1}]

misha 2020-10-25T14:46:36.171400Z

seq is millions of items. the only thing I can think of - is to put those extra unique fields into meta, and accumulate maps into set from the get go
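
A minimal sketch of the metadata idea, assuming the states are plain maps with an `:id` key (`strip-id` is a hypothetical helper, not something from the thread). Metadata takes no part in equality or hashing, so two states that differ only by `:id` collapse into a single set element:

```clojure
;; Hypothetical helper: move :id out of the map and into its metadata.
(defn strip-id [m]
  (with-meta (dissoc m :id) {:id (:id m)}))

(def s1 (strip-id {:id 1 :a 1 :b 1}))
(def s2 (strip-id {:id 2 :a 1 :b 1}))

;; Equal as values, even though the metadata differs.
(def same? (= s1 s2))

;; Accumulating into a set keeps a single element for both states.
(def seen (conj #{} s1 s2))
```

Which copy's metadata ends up in the set is an implementation detail I would not rely on; if the surviving `:id` matters, it is probably safer to track it explicitly.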

misha 2020-10-26T08:07:53.212400Z

looking into it, thanks

joshkh 2020-10-25T14:49:19.171900Z

specifically the in symbol. i had a flashback to coffeescript

p-himik 2020-10-25T14:50:00.172400Z

clojure.core/distinct

p-himik 2020-10-25T14:50:26.173100Z

Or do you mean comparing only by :a and :b?

p-himik 2020-10-25T14:50:53.173600Z

If so, what of the two :id keys will be used?

misha 2020-10-25T14:52:37.175300Z

the task itself is a game simulation, where you start from initial state, and given some rules, generate possible next states from it, and iterate until out of memory :) so with each iteration set of next states grows exponentially in addition to already calculated ones

misha 2020-10-25T14:53:19.175400Z

compare by equivalent of (dissoc m :id) or (select-keys m [:a :b])

p-himik 2020-10-25T14:53:55.175600Z

But what ID will you use? Or are you OK with a random ID?

p-himik 2020-10-25T14:54:29.175900Z

I mean the result, not for comparison.

p-himik 2020-10-25T14:55:15.176100Z

If dissoc + distinct + adding a new ID (all via transducers) is not fast enough, then I would look into tries.

misha 2020-10-25T14:55:17.176300Z

id format is not important, but what is important - next states will have :prev-state-id

alexmiller 2020-10-25T14:56:34.177100Z

That’s the input arg to read from

misha 2020-10-25T14:57:01.177300Z

so I 1) cannot assign just random ids after everything is done. 2) given millions of items = just scanning through them and assigning ids after some step - takes very long too
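
One way to keep the original ids while deduplicating in a single pass is a stateful transducer that remembers the comparison key and lets the first map with that key through unchanged. This is only a sketch: `distinct-by` is not in `clojure.core` and is defined here for illustration, and the sample `states` are made up.

```clojure
;; Hypothetical transducer: keeps the first item seen for each (f item).
(defn distinct-by [f]
  (fn [rf]
    (let [seen (volatile! #{})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result item]
         (let [k (f item)]
           (if (contains? @seen k)
             result                      ; duplicate key: drop the item
             (do (vswap! seen conj k)    ; new key: remember it
                 (rf result item)))))))))

;; Made-up sample data; ids are small integers for readability.
(def states
  [{:id 1 :a 1 :b 1}
   {:id 2 :a 1 :b 1}
   {:id 3 :a 2 :b 1}])

;; Compare by everything except :id, keep the first-seen map as-is.
(def deduped (into [] (distinct-by #(dissoc % :id)) states))
```

Because the surviving maps keep their own `:id`, any later `:prev-state-id` that points at a surviving state stays valid. The `seen` set still holds one key per distinct state, so memory grows with the number of distinct states.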

misha 2020-10-25T14:57:50.177500Z

(def in (ByteArrayInputStream. (.toByteArray out)))
(def reader (transit/reader in :json))

joshkh 2020-10-25T15:00:03.177700Z

ever stare at something for so long that you don't see the forest through the trees? 😳

misha 2020-10-25T15:09:13.177900Z

happens all the time, it's ok

matthewad 2020-10-25T15:32:30.178100Z

You could dissoc the id, put the original value in the meta data, use distinct, then retrieve the original data from the meta data. I have no idea if this will be suitable performance-wise, but it avoids the 2 issues you listed.

(defn distinct-ignore-id [items]
  (map #(:original-item (meta %))
       (distinct
        (map (fn [item]
               (with-meta (dissoc item :id)
                 {:original-item item}))
             items))))

matthewad 2020-10-25T15:38:00.178300Z

Sorry, I just noticed this is basically the same solution you suggested in the top level comment. Nevermind.

Nico 2020-10-25T17:08:49.181500Z

hi, if I want to take a string like this

this is a [test](test.md) of inline [links](<https://example.com>)
and remove the links but keep track of them, like this:
{:text "this is a test of inline links" :links [{:name "test" :path "test.md"} {:name "links" :path "<https://example.com>"}]}
what would be the best way to do this? (I am parsing a markdown file, but a full markdown parser is unavailable)

2020-10-25T17:18:34.181600Z

One way (not necessarily the best) is to use a parser like instaparse and write a small grammar for the things you want to recognize differently from the rest of the text. That might be tricky for handling arbitrary Github-flavored markdown, since I know that some of their constructs are dependent on what comes first on the line, and other constructs like [link to this text](something.md) can have the contents between [] spanning multiple lines.

2020-10-25T17:19:44.181800Z

This StackOverflow question has some answers that might lead you to a full parser for Github-flavored markdown, but perhaps not written in Clojure.

Nico 2020-10-25T17:21:19.182800Z

I know that I'm only going to have link contents on one line, and all I need to do is extract links rather than do proper markdown parsing

2020-10-25T17:21:25.183100Z

This sundown library is mentioned: https://github.com/vmg/sundown Its README says it has bindings for many languages, including Python, Ruby, JavaScript, Haskell, and Go, but I don't see Java there. The JavaScript library might be easily callable from ClojureScript.

Nico 2020-10-25T17:21:48.183900Z

I'm also running in a babashka environment, so the libraries that would be available aren't here

2020-10-25T17:22:04.184400Z

If you know such links are always going to be within a single line, you could attempt to use regex matching.

solf 2020-10-25T17:22:46.185100Z

Is there existing tooling that would take function docstrings from a project to populate an API section of its README? The idea being to avoid having to maintain documentation in two places

2020-10-25T17:23:16.185200Z

Do you ever expect the text to have comments in parentheses or square brackets, that aren't links? e.g. could someone write "the foo (which is variant of a bar) can be blurg [see reference 5]"

Nico 2020-10-25T17:25:10.185400Z

yeah

Nico 2020-10-25T17:25:46.185600Z

I know how to regex match to find if a line contains a link but not what the link is or how to then remove it

Nico 2020-10-25T17:30:43.185800Z

I'm not sure how you'd do that with regex

lukasz 2020-10-25T17:31:06.186Z

Something like this https://cljdoc.org or https://github.com/weavejester/codox ?

Ed 2020-10-25T17:33:20.186300Z

(let [s "this is a [test](test.md) of inline [links](<https://example.com>)"
      re #"\[([^\]]*)\]\(([^)]*)\)"]
  {:links (->> (re-seq re s) (map (fn [[_ n p]] {:name n :path p})))
   :text (clojure.string/replace s re "$1")})

Ed 2020-10-25T17:33:27.186500Z

maybe something like that?
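
The same idea wrapped in a function, as a sketch for the case raised earlier where a line also contains ordinary parentheses or square brackets that are not links. `extract-links` and `link-re` are names made up for this example; the pattern only matches a `[...]` that is immediately followed by `(...)`, and it assumes link names contain no `]` and paths contain no `)`.

```clojure
(require '[clojure.string :as str])

;; Group 1 is the link name, group 2 the link path.
(def link-re #"\[([^\]]*)\]\(([^)]*)\)")

(defn extract-links
  "Replaces each inline link in `line` with its name and collects the links."
  [line]
  {:text  (str/replace line link-re "$1")
   :links (mapv (fn [[_ name path]] {:name name :path path})
                (re-seq link-re line))})

(def result
  (extract-links
   "the foo (a kind of bar) has a [test](test.md) [see reference 5]"))
```

Here the plain `(a kind of bar)` and `[see reference 5]` are left in `:text` untouched, and only `test` ends up in `:links`. Everything used is in `clojure.core` and `clojure.string`, so it should not need any extra libraries.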

Nico 2020-10-25T17:37:27.186700Z

ah, thanks

Nico 2020-10-25T17:39:32.186900Z

didn't know re-seq existed

solf 2020-10-25T18:55:45.187300Z

Codox might be able to target the README, using a custom writer. It does seem rather over-featured for what I had in mind. I'm not sure how easy it would be to use it through clj

solf 2020-10-25T18:58:26.187500Z

I had in mind a light program, using probably clj-kondo under the hood, to just lift some docstrings from .clj files to the README and using an api like this imaginary one:

clj -A:readme "./src/api.clj"  --output README.md
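
A rough sketch of the lifting step, with a swapped-in technique: instead of static analysis with clj-kondo, it reads docstrings from var metadata at runtime, so the namespace has to be loaded first. `api-section` is a hypothetical name, the Markdown layout is just one possible choice, and writing the result into README.md is left out.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: one Markdown subsection per documented public var.
(defn api-section [ns-sym]
  (->> (ns-publics ns-sym)          ; map of symbol -> var
       (sort-by key)                ; stable, alphabetical order
       (keep (fn [[sym v]]
               (when-let [doc (:doc (meta v))]
                 (str "### `" sym "`\n\n" doc "\n"))))
       (str/join "\n")))

;; Example on a namespace that is already loaded.
(def section (api-section 'clojure.string))
```

Vars without a docstring are skipped. For a project namespace you would `require` it before calling `api-section`.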