clojure

New to Clojure? Try the #beginners channel. Official docs: https://clojure.org/ Searchable message archives: https://clojurians-log.clojureverse.org/
Tamas 2020-12-22T04:35:49.325300Z

To complete the example :-): (data or {}).get("domain", {}).get("name", "bar") That being said these days I end up with a get-in function in python code.

jumar 2020-12-22T07:18:39.325600Z

Also note that pmap will very likely run more than 2+cpus tasks at the same time due to chunking: https://github.com/jumarko/clojure-experiments/blob/master/src/clojure_experiments/experiments.clj#L556-L576

phronmophobic 2020-12-22T08:11:03.326Z

the above is a nice addition. I still prefer clojure to python by quite a bit, but python isn't so bad

Pavel Shatyor 2020-12-22T09:59:43.326300Z

Hi everyone! 😉 Folks, Node Congress has its CFP open. Please submit if you have something to say. The CFP is open until January 10. Join here: http://bit.ly/CFPNode

Tamas 2020-12-22T10:21:22.326500Z

same here! ie. python isn't bad but I prefer clojure

kari 2020-12-22T13:26:50.327900Z

Is https://github.com/MastodonC/kixi.stats the most used statistics library for Clojure nowadays?

Otis van Duijnen Montijn 2020-12-22T15:50:56.331500Z

I am trying to figure out what states a certain state of a state machine can transition to. I am now dealing with data that looks like this:

{:com.fulcrologic.fulcro.ui-state-machines/handler #unknown function (env)
 {return com.fulcrologic.fulcro.ui_state_machines.activate
  (env, new cljs.core.Keyword ("state", "button1", "state/button1", 936328262))}}
I need to extract the "state/button1" from this data. Does anyone have tips on how to do that?

borkdude 2020-12-22T15:54:08.332Z

it seems you are dealing with an opaque function object and not data?

Otis van Duijnen Montijn 2020-12-23T08:51:53.357600Z

It's CLJS

Otis van Duijnen Montijn 2020-12-22T15:56:10.332100Z

Do you know of any documentation on this I can use?

borkdude 2020-12-22T16:03:10.332300Z

documentation on what exactly?

Otis van Duijnen Montijn 2020-12-22T16:03:45.332500Z

I think I get what you meant now. Thanks for this. I probably need to call the function to figure out the return value

scottbale 2020-12-22T18:08:43.332900Z

What is the idiomatic way these days to implement a CLI in a project? I'm new to deps.edn - does it allow specifying a default main namespace, like the equivalent of :main in project.clj? I'm envisioning that a user could git-clone the repo and invoke some minimal clojure command to pass control to main method of intended namespace (which actually implements the CLI using tools.cli and provides usage info when invoked with no args, as per usual).

mike_ananev 2020-12-23T11:48:09.359900Z

@scottbale You may pick my template for app using cli https://github.com/redstarssystems/app-template

mike_ananev 2020-12-23T11:48:55.360200Z

This template is adapted for IDEA. Use make to control it.

clyfe 2020-12-22T18:21:15.333100Z

see -M (main) or -X (arbitrary fn) here: https://clojure.org/guides/deps_and_cli#_using_a_main https://clojure.org/reference/deps_and_cli

Ed 2020-12-22T18:27:36.333300Z

it looks a bit like javascript?? if you have a javascript function, you can call .toString on it to get its source and parse that string? is that what you mean?

scottbale 2020-12-22T18:40:24.333500Z

Thanks. I'm still working my way through all that documentation. So in my example a user would have to know to invoke

clojure -M -m cli
at a minimum. My question is, can the cli namespace somehow be specified in deps.edn? Is there some even more minimal clojure command like
clojure -M
or like a way to have project specific usage resulting from clojure -?. Or is it just not designed|intended for that?

scottbale 2020-12-22T18:42:13.333700Z

I suppose it's a moot question, once the project were packaged up as a proper release, presumably a jar file with a start script.

clyfe 2020-12-22T18:43:00.333900Z

in an alias: :main-opts ["-m" "my.ns.with.main" "arg1" "arg2"]

clyfe 2020-12-22T18:43:55.334200Z

then: clj -M:thealias
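
Put together, the alias approach might look like the deps.edn below. This is only a minimal sketch: my.ns.with.main and :thealias are the placeholder names used above, and the tools.cli version shown is an assumption (check for the current release).

```clojure
;; deps.edn (sketch; names are placeholders, version is an assumption)
{:paths ["src"]
 :deps {org.clojure/tools.cli {:mvn/version "1.0.194"}} ; assumed version
 :aliases
 {;; invoked as: clojure -M:thealias [extra args...]
  :thealias {:main-opts ["-m" "my.ns.with.main"]}}}
```

Arguments given after clojure -M:thealias are appended to :main-opts and reach -main, so the namespace can still print its usage when called with none.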

borkdude 2020-12-22T18:56:04.334400Z

@scottbale If your CLI only uses clojure.core and tools.cli and some other commonly used libs, you could also consider babashka, since that is built with this usage in mind.

borkdude 2020-12-22T18:57:43.334700Z

Another option (for regular JVM Clojure) is just writing a script which you can invoke directly, not with a main. Like here: https://gist.github.com/borkdude/e6f0b12f9352f3375e5f3277d2aba6c9

borkdude 2020-12-22T18:58:30.335Z

But typically (with deps.edn) you would write an alias with pre-defined main args like explained above.

scottbale 2020-12-22T18:59:43.335200Z

Thanks to you both, this is very helpful and exactly what I was wondering about: how is this typically done.

2020-12-22T20:19:03.335500Z

Yes, sometimes, but now it's just a design choice, not a limitation of the paradigm. Key is just a function implemented with:

(defn key
  "Returns the key of the map entry."
  [map-entry]
  (-key map-entry))
If it wanted, it could handle nil in any way.

2020-12-22T20:22:40.335800Z

I wasn't specifically singling out Python, more OO vs Functional.

2020-12-22T20:23:24.336Z

My point being, what if you wanted a .get that can handle None or any other type, maybe vector, etc.

2020-12-22T20:24:02.336200Z

In OO, all types would need to agree to share a .get interface, and provide an implementation for it

2020-12-22T20:27:11.336400Z

But also, in this particular case, ya I do find Python's handling of None on .get less than ideal. Think Clojure's handling is much nicer specifically because I think the above is a common source of bugs.
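
For comparison, here is a small sketch of how Clojure's get and get-in behave on nil and on other types. The data map is made up for illustration; the functions are all from clojure.core.

```clojure
;; get and get-in treat nil like an empty collection instead of throwing.
(def data {:domain nil}) ; illustrative data, not from the thread

(get nil :name)                       ;=> nil
(get nil :name "bar")                 ;=> "bar"
(get-in data [:domain :name] "bar")   ;=> "bar"
(get-in nil [:domain :name] "bar")    ;=> "bar"

;; The same function also accepts vectors and strings (lookup by index)...
(get [10 20 30] 1)                    ;=> 20
(get "abc" 1)                         ;=> \b
;; ...and returns nil for values that support no lookup at all.
(get 42 :name)                        ;=> nil
```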

2020-12-22T20:30:17.336600Z

And notwithstanding, I found this example because it was in our case 😅

2020-12-22T20:31:29.336800Z

Don't think so

2020-12-22T20:34:11.337Z

Hum, actually it does seem popular. This one is as well: https://generateme.github.io/fastmath/fastmath.stats.html

2020-12-22T20:34:44.337200Z

Can't say which one is most popular though. I'd say both are production ready if that's what you're worried about

2020-12-22T20:45:35.337400Z

I guess it depends who the user you are targeting is

2020-12-22T20:45:52.337600Z

If a Clojure dev, then use an alias. Then they don't even need to git clone or anything

2020-12-22T20:46:10.337800Z

They just add the alias to their deps.edn user config, and now they can use it

2020-12-22T20:54:53.338Z

@jumar I don't think you're correct here. The parallelization level is restricted by the thread pool it uses, chunking won't change that.

2020-12-22T21:06:17.338300Z

I think the difference is fastmath uses Java implementations under the hood, while kixi.stats is fully implemented in Clojure using transducers.

2020-12-22T21:44:37.338700Z

the parallelization is controlled by the lag between the launch of new futures and the deref; it uses future, which runs on an unbounded, expanding thread pool

2020-12-22T21:45:15.338900Z

chunking changes the behavior of (map #(future (f %)) coll) which is what actually creates the threads

2020-12-22T21:46:50.339100Z

so the answer is weird and complicated (another reason I don't like pmap) - chunking causes futures to be launched a chunk at a time, if the input is chunked, otherwise the number of futures in flight is controlled by the lag between future generation and future realization (which is done via the blocking deref)
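
The "a chunk at a time" behavior can be seen with plain map, without any threads. A minimal sketch (realized-by-first is a hypothetical helper name): it counts how many elements the mapping function touches when only the first result is requested. In pmap the mapping function is #(future (f %)), so each of those calls would start a future.

```clojure
(defn realized-by-first
  "Counts how many elements map processes when only the first result is requested."
  [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

;; range yields a chunked seq, so map processes a whole 32-element chunk at once.
(realized-by-first (range 100))                  ;=> 32
;; take over iterate is not chunked, so only one element is processed.
(realized-by-first (take 100 (iterate inc 0)))   ;=> 1
```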

2020-12-22T21:47:10.339300Z

(defn pmap
  "Like map, except f is applied in parallel. Semi-lazy in that the
  parallel computation stays ahead of the consumption, but doesn't
  realize the entire result unless required. Only useful for
  computationally intensive functions where the time of f dominates
  the coordination overhead."
  {:added "1.0"
   :static true}
  ([f coll]
   (let [n (+ 2 (.. Runtime getRuntime availableProcessors))
         rets (map #(future (f %)) coll)
         step (fn step [[x & xs :as vs] fs]
                (lazy-seq
                 (if-let [s (seq fs)]
                   (cons (deref x) (step xs (rest s)))
                   (map deref vs))))]
     (step rets (drop n rets))))
  ([f coll & colls]
   (let [step (fn step [cs]
                (lazy-seq
                 (let [ss (map seq cs)]
                   (when (every? identity ss)
                     (cons (map first ss) (step (map rest ss)))))))]
     (pmap #(apply f %) (step (cons coll colls))))))

2020-12-22T21:48:30.339500Z

the (drop n rets) creates the lag between creation of new futures and blocking deref to wait on them

2020-12-22T21:49:00.339700Z

breaking a common piece of advice to not mix lazy calculation with procedural side effects

2020-12-22T21:55:48.339900Z

Oh ya, my bad, I was thinking of agent send

2020-12-22T21:56:17.340100Z

I actually never deep dived the impl of pmap, hum..

2020-12-22T21:57:10.340300Z

Doesn't the implementation of step here unchunk?

2020-12-22T22:03:08.340600Z

;; changes to this atom will be reported via println

(def snitch (atom 0))

(add-watch snitch :logging
           (fn [_ _ old-value new-value]
             (print (str "total goes from " old-value " to " new-value "\n"))))

(defn exercise
  [coll]
  (doall
   (pmap (fn [x]
           (swap! snitch inc)
           (print (str "processing: " x "\n"))
           (swap! snitch dec)
           @snitch)
         coll)))
user=> (exercise (range 10))
total goes from 3 to 4
total goes from 4 to 5
total goes from 2 to 3
total goes from 1 to 2
total goes from 0 to 1
processing: 0
processing: 4
processing: 2
processing: 3
processing: 1
total goes from 5 to 4
total goes from 4 to 3
total goes from 1 to 0
total goes from 2 to 1
total goes from 3 to 2
total goes from 0 to 1
total goes from 1 to 2
processing: 6
processing: 7
total goes from 2 to 3
total goes from 3 to 4
total goes from 5 to 4
total goes from 4 to 5
processing: 8
total goes from 4 to 3
processing: 9
processing: 5
total goes from 3 to 2
total goes from 2 to 1
total goes from 1 to 0
(0 0 0 0 0 0 3 2 0 0)
max parallelism here is 5 - I'm going to try a version where I capture the max and exercise it more aggressively

2020-12-22T22:03:37.340800Z

Cool

2020-12-22T22:03:48.341Z

@didibus I am not good enough with lazy-seqs to read the pmap code and know whether it unchunks, so I'm working empirically

2020-12-22T22:04:14.341200Z

Haha, no one is 😛

2020-12-22T22:06:16.341400Z

yeah, here's my version of exercise that captures the max parallelism:

(defn exercise
  [coll]
  (let [biggest (atom 0)]
    (dorun
     (pmap (fn [x]
             (swap! snitch inc)
             (swap! biggest max @snitch)
             (print (str "processing: " x "\n"))
             (swap! snitch dec)
             @snitch)
           coll))
    @biggest))
(exercise (range 1000)) prints a lot more than I'm going to paste here, and returns 19

2020-12-22T22:06:41.341600Z

lmk if that's flawed, but to my eye that will accurately tell you the max futures spawned concurrently by pmap
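
One possible refinement, offered as a sketch rather than anything proposed in the thread: dereferencing the counter separately after the swap! can observe a value that other threads have already changed, so the recorded maximum may under-report the true peak. Since swap! returns the new value, that return value can be fed to max directly. The name max-in-flight and the 1 ms sleep standing in for real work are assumptions.

```clojure
(defn max-in-flight
  "Runs pmap over coll and returns the highest number of calls in flight at once."
  [coll]
  (let [in-flight (atom 0)
        biggest   (atom 0)]
    (dorun
     (pmap (fn [x]
             ;; swap! returns the new value: the count right after this increment
             (swap! biggest max (swap! in-flight inc))
             (Thread/sleep 1) ; stand-in for real work (assumption)
             (swap! in-flight dec)
             x)
           coll))
    @biggest))

(max-in-flight (range 1000)) ; result varies by machine and by run
```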

2020-12-22T22:07:01.341800Z

(nb range is chunked, which is why I'm using it here)

2020-12-22T22:08:23.342Z

Hum. Ya, looking at the code, it's kind of hard to get a full picture. I think the branch of if-let that uses cons will unchunk, but the other branch would not. And the drop n will also trigger the first chunk.

2020-12-22T22:10:12.342200Z

all the retries on that poor little atom make the output with bigger inputs absurd

2020-12-22T22:10:50.342400Z

or maybe that's caused by the printing contention...

2020-12-22T22:11:15.342600Z

Might be better to use a semaphore? I think a lock instead of atom's retry maybe would make this more clear?

2020-12-22T22:11:21.342900Z

(the reason all the prints call str is because otherwise the parts of the prints overlap in the output)

2020-12-22T22:11:28.343100Z

hmm

2020-12-22T22:12:18.343300Z

Oh, no I don't think that's what I meant. Whatever the thing that is a locking counter is called

2020-12-22T22:13:30.343500Z

Then again, hum... What if you changed the impl of pmap so that inside the future it incremented and decremented the counter before and after running f ?

2020-12-22T22:14:10.343700Z

that would be the same behavior, with more work to achieve it

2020-12-22T22:17:09.343900Z

hum..

2020-12-22T22:18:21.344100Z

I rewrote to an agent (doesn't retry), the prints are now in intelligible order, the answer is still high (33, 37, 38, 39, 36 ...)

2020-12-22T22:20:38.344300Z

max value in theory is 42 (32 chunk size + 8 processors + 2)

2020-12-22T22:25:43.344700Z

Ya, so that matches my interpretation of the code

2020-12-22T22:26:12.344900Z

The first branch I think unchunks, but the drop is what triggers the first chunk

2020-12-22T22:26:27.345100Z

So instead of getting n parallelization, you get size of first chunk

2020-12-22T22:26:39.345300Z

+n

2020-12-22T22:26:53.345500Z

+n hum..

2020-12-22T22:27:00.345700Z

(when you overlap the next chunk)

2020-12-22T22:28:26.345900Z

Oh boy, that's one confusing little function haha. It does seem like, it was written pre-chunking though, so I guess chunking just wasn't taken into account. Hum, I wonder if that explains why I see poor performance improvements from it in practice, like with chunking, the thread overhead is way too high for parallelization

2020-12-22T22:28:35.346100Z

it launches chunk-size futures at a time, but keeps an nproc+2 lag between the reader of the input and the reader of the future values; if your input is big enough to have multiple chunks, you can have more than chunk-size futures in flight

2020-12-22T22:29:37.346300Z

that could be - I consider it more like "an example of what you could do to parallelize a specific problem" that happened to make it into the codebase, and it doesn't match most people's problems

2020-12-22T22:30:36.346600Z

reducers are more general, but I haven't used them in anger and haven't seen much usage of them in the wild

2020-12-22T22:31:28.346800Z

Ya, I think having to require their namespace and the fact that only fold is still useful now that we have transducers makes them kind of DOA
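
As a rough illustration of the point about fold, here is a minimal sketch using only what ships with Clojure. The alias r is a common convention and is used here as an assumption.

```clojure
(require '[clojure.core.reducers :as r])

;; fold can reduce a vector in parallel via fork/join;
;; + serves as both the reducing and the combining function.
(r/fold + (r/map inc (vec (range 1000))))   ;=> 500500

;; The sequential equivalent with a transducer needs no extra namespace.
(transduce (map inc) + (range 1000))        ;=> 500500
```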

Eugen 2020-12-22T22:32:46.347Z

I've also built a CLI with https://github.com/l3nz/cli-matic (features on top of tools.cli ) and I had a good experience.

borkdude 2020-12-22T22:34:37.347300Z

docopt is also an option: https://github.com/nubank/docopt.clj (also works in babashka as a lib)
