beginners

Getting started with Clojure/ClojureScript? Welcome! Also try: https://ask.clojure.org. Check out resources at https://gist.github.com/yogthos/be323be0361c589570a6da4ccc85f58f.
grazfather 2021-02-24T02:06:36.276800Z

I have a function that’s sort of hard to pull apart to debug. What’s a good way to basically print out each intermediate step without hacking it apart?

defn my-merge-with
  [f & maps]
  (reduce (fn [acc v]
            (reduce (fn [iacc [key newval]]
                      (assoc iacc key (if-let [oldval (key iacc)]
                                        (f oldval newval)
                                        newval)))
                    acc (seq v)))
          {} maps))
fwiw don’t tell me what’s wrong with it lol

grazfather 2021-02-24T02:10:14.277200Z

So i figured out the bug just by throwing in printlns, but more interested in best practices

alexmiller 2021-02-24T02:12:50.277400Z

test smaller pieces

seancorfield 2021-02-24T02:52:32.280300Z

@grazfather Stu Halloway has a great talk online where he defined the REPL error handler to just report a very generic error (so there are no clues about the failure) and then he shows how to step through the code in the REPL, one expression at a time, so you can see what each piece does and where your code "breaks". I can't remember which talk that is, unfortunately. Then he has another talk called Debugging With The Scientific Method which is a really good overall look at how you should approach debugging in general: https://github.com/matthiasn/talk-transcripts/blob/master/Halloway_Stuart/DebuggingWithTheScientificMethod.md (transcript with slides, links to video if you prefer that format).

seancorfield 2021-02-24T02:53:10.281300Z

(can anyone remember the talk I'm thinking of with the generic error handler and the REPL exploration of the code?)

grazfather 2021-02-24T02:53:10.281400Z

I’ve seen his scientific method talk actually

grazfather 2021-02-24T02:53:23.281700Z

it’s more a problem that’s mechanical

grazfather 2021-02-24T02:53:41.282200Z

how do I non-annoyingly extract parts to test and extract information

seancorfield 2021-02-24T02:54:40.283Z

With the code above, my first suggestion would be to extra the two (fn [..] ..) forms as named private functions so you can explore those directly from the REPL.

seancorfield 2021-02-24T02:55:58.284400Z

Another trick is, in a comment form, to write def forms for the parameter names so you can eval subforms of the functions directly in the REPL. I just did this today in trying to debug some refactoring I was doing...

grazfather 2021-02-24T02:57:10.286Z

actually, yeah I have a question about that: How do I send just a subform to the repl in spacemacs? spc m e f sends the top level, and spc m e e just gives me the symbol name

seancorfield 2021-02-24T02:57:28.286600Z

For example, I added this RCF:

(comment
  (countries "fr")
  ;; so that I could test subforms in the function above:
  (def ^:private locale "fr_FR")
  (def ^:private featured? #{"GB" "FR"})
  (def ^:private as-clojure? true)
  .)
so I could evaluate pieces of this code via the REPL
all-countries  (map #(set/rename-keys % {:iso :value})
                                (countries (if (seq locale) (subs locale 0 2) "en")))
            featured       (into [] (filter (comp featured? :value) all-countries))
            collated-query (cond-> featured
                             (seq featured)
                             (conj {:name "----" :value "__"})
                             :all
                             (into all-countries)
                             (not as-clojure?)
                             (cfml/to-clj-struct))]

seancorfield 2021-02-24T02:58:17.287500Z

No idea about any variant of Emacs, sorry, but I know all Clojure-integrated editors do have a commend for sending just the "current" form at the cursor.

grazfather 2021-02-24T02:59:12.287800Z

yeah, cqq works well in fireplace

grazfather 2021-02-24T02:59:15.288100Z

no sweat I will figure it out

seancorfield 2021-02-24T02:59:30.288800Z

I would expect spc m e e to eval the form if your cursor is on or just after the )

grazfather 2021-02-24T02:59:37.289Z

I am definitely more accustomed to printf debugging or step by step

grazfather 2021-02-24T03:00:10.289600Z

Yeah, you’d think so, right?

seancorfield 2021-02-24T03:01:15.290400Z

With VS Code/Clover, I can "eval block" whenever my cursor is on or just after the opening ( or on or just after the closing ) in general...

robertfw 2021-02-24T03:02:03.291200Z

@grazfather it may not be exactly what you are looking for, I have been just finding my way around spacemacs eval as well and have been using spc m e v with my cursor on the opening bracket of the form

robertfw 2021-02-24T03:03:13.292600Z

Actually I use , e v, I believe spacemacs comes with , already set by default as the major mode leader key

seancorfield 2021-02-24T03:03:18.292800Z

Hmm, I don't see that listed on https://www.spacemacs.org/layers/+lang/clojure/README.html#evaluation @robertfrederickwarner?

seancorfield 2021-02-24T03:04:00.293400Z

(it certainly has a lot of "send and eval" variants though!)

robertfw 2021-02-24T03:04:28.294Z

spacemacs recently went through a big keybinding shuffle, I'm guessing that hasn't landed on master yet. Most installs recommend using the develop version

robertfw 2021-02-24T03:05:13.294300Z

https://develop.spacemacs.org/layers/+lang/clojure/README.html#evaluation is for the develop version

seancorfield 2021-02-24T03:05:29.294900Z

Oh joy 🙂 Calva did a similar reshuffle and folks are still "recovering" from that 🙂

robertfw 2021-02-24T03:05:52.295500Z

I almost ditched it after they dropped that! Massive shock to the system. But I got tired of maintaining my own emacs config pretty quickly and decided to just go with the flow

grazfather 2021-02-24T03:05:54.295600Z

@robertfrederickwarner thank you!

robertfw 2021-02-24T03:07:52.296600Z

glad to help 🙂 I recently completely redid my setup using the practicalli guide and their provided configs, it got everything working nicely out of the box. https://practicalli.github.io/spacemacs/

grazfather 2021-02-24T03:09:23.297200Z

Ah, and spc m e e works but you have to to be actually after it, so on the next line to eval a top level. I’ll stick with e f and now e v

grazfather 2021-02-24T03:09:32.297400Z

yeah practically brought me here

grazfather 2021-02-24T03:09:41.297900Z

and clojure force me into emacs 😅

robertfw 2021-02-24T03:09:52.298100Z

yeah I'm finding those two get me 99% of what I need

grazfather 2021-02-24T03:10:24.298600Z

Now I just need to figure out all the slurping stuff

robertfw 2021-02-24T03:10:44.298800Z

slurp and barf, don't leave home without em

grazfather 2021-02-24T03:16:15.299200Z

Yep! I have those well enough that I wish I could use them in my golang work

grazfather 2021-02-24T03:16:20.299400Z

thanks for the help, guys, good night

pez 2021-02-24T07:30:20.299700Z

And the reshuffle was about three of the bindings. 😃 You simply don’t mess with people’s muscle memory.

West 2021-02-24T07:53:38.304400Z

So I found this side-by-side comparison of Clojure and Clojurescript. https://www.freecodecamp.org/news/here-is-a-quick-overview-of-the-similarities-and-differences-between-clojurescript-and-javascript-c5bd51c5c007/ I'm sure this isn't an exhaustive list, but I'd be curious to know what other features there are. Has anyone else made a list like this? If not, how can I see JavaScript generated from ClojureScript, with variable and function names intact?

solf 2021-02-24T08:10:42.304700Z

Here you go: http://app.klipse.tech/

👀 1
West 2021-02-24T22:58:45.336Z

Awesomesauce!

Yang Xu 2021-02-24T09:20:17.306100Z

How to do some initialize things when the first load in a namespace?

dharrigan 2021-02-24T09:22:07.306700Z

Can you elaborate on what you mean by "initialize things" please?

Yang Xu 2021-02-24T09:23:50.307900Z

Some myself logic need to do before execute code of namespace.

Yang Xu 2021-02-24T09:26:02.309400Z

In other words, I just need a way to execute when first load namespace.

Yang Xu 2021-02-26T10:49:50.438800Z

Wow, Thank you for your help and give me some library for that.

dharrigan 2021-02-24T09:28:44.310700Z

I don't think there is a "pre-load" eval that would eval functions and do something before the namespace is loaded.

dharrigan 2021-02-24T09:28:49.310900Z

Someone else might know better

2021-02-24T10:56:32.313200Z

@xu20151211 do you mean load the dependencies from project.clj? when you load lein repl with dependencies it load all the library

2021-02-24T11:02:43.315300Z

hi all, i've read atom description : Atoms provide a way to manage shared, synchronous, independent state. They are a reference type like refs and vars. question is : how many atom can i made? does each atom = 1 thread? is there any side effect to use a lot of atom?

aratare 2021-02-24T11:13:29.317100Z

Hi there. Not sure if this is the right channel to ask this but I can't seem to deploy to Clojars. I'm going through the steps as mentioned on Clojars but whenever lein is sending the package I always get this:

Could not transfer artifact epsilon:epsilon:jar:v1.0.0 from/to clojars (<https://repo.clojars.org/>): Failed to transfer file <https://repo.clojars.org/epsilon/epsilon/v1.0.0/epsilon-v1.0.0.jar> with status code 401
I've configured deploy tokens and GPG accordingly. Any insight is appreciated. Thanks.

2021-02-24T14:42:10.317900Z

You can create as many atoms as you want. Each one only takes a relatively small amount of memory, perhaps around 32 bytes or so, depending upon the JVM version you are using. No threads are created when you create an atom. An atom can be accessed from any or all threads that are in your JVM process that have a reference to it.

2021-02-24T14:43:09.318100Z

Many Clojure developers find that if they store a collection inside of an atom, e.g. a Clojure map, that they have no need for a large number of atoms in a typical application. Rich Hickey mentions how many Clojure applications can easily get by with a few, or one, atom.

2021-02-24T14:45:10.318300Z

ohhh okay... i don't really get the Rich Hickey part?

2021-02-24T14:45:53.318500Z

I'm looking for one of his talks where he mentions the perhaps surprising fact that many Clojure applications, even complex ones, can often get by with very few atoms.

2021-02-24T14:46:10.318700Z

does he said, that by using atom, it'll make things/apps much more simpler?

2021-02-24T14:46:13.318900Z

i.e. You can create many atoms if you want, but there is often no need to do so.

2021-02-24T14:47:14.319100Z

You can write complex intertwined code even with atoms, refs, agents, etc. These mechanisms don't force simpler code. They can enable simpler code.

2021-02-24T14:49:56.319300Z

"An atom can be accessed from any or all threads that are in your JVM process that have a reference to it." This part is enlightening for me, we could update using atom and pass it to another thread then wow...

2021-02-24T14:51:26.319500Z

atoms, refs, and agents are intended to help simplify writing Clojure programs that access and update state in a multi-threaded program. atoms are sometimes used even in single-threaded programs, as a place to store some value that you want to be able to change over time.

2021-02-24T14:52:52.319700Z

yes, and by storing in memory, what memory? is it in the cache?

2021-02-24T14:53:19.319900Z

sorry this is just out of curiosity

2021-02-24T14:58:25.320800Z

What cache are you referring to? The CPU's L1/L2/etc. caches?

Yang Xu 2021-02-24T14:59:25.321700Z

@adrianimanuel No, Isn't relate about dependencies.

2021-02-24T15:01:02.321800Z

that i don't know. just curious where the atom stored? and if we stop/restart the repl is it still storing?

2021-02-24T15:01:44.322Z

All values in a Clojure program, whether they are contained within atoms, refs, agents, or not, are in the main memory of your system, which is usually DRAM hardware chips that loses its state if the power goes off, but CPUs make copies of parts of that memory in on-chip L1/L2/etc. caches temporarily near the time when they are accessing those values, just as they do when running programs written in any programming language.

2021-02-24T15:03:18.322200Z

CPU cache behavior is designed into the CPU hardware design, independent of programming language (although a few low-level programming languages, e.g. assembly language, give you mechanisms for having some effect on how those caches operate -- those are rarely explicitly used in high level languages like C, C++, Java, or Clojure)

2021-02-24T15:03:25.322400Z

ahh okay, thanks a lot for the explanation

2021-02-24T15:03:36.322600Z

now i know better

2021-02-24T15:04:14.322900Z

No values in Clojure are automatically stored to non-volatile storage like a hard disk or SSD, unless you make explicit calls that write the data there.

2021-02-24T15:05:34.323100Z

There are probably some third-party libraries, separate from Clojure, that provide such "automatically store updates to disk/SSD when you make changes", but those are separate libraries not built into Clojure.

2021-02-24T15:08:14.323300Z

okay noted

grazfather 2021-02-24T15:19:59.323500Z

that’s an authentication issue. either your GPG isn’t set up correctly or they don’t have your public key

Pravin 2021-02-24T16:03:16.328300Z

I have two functions, each gets me the minimum year and date respectively. What Iam seeing the function works good if i pass integer values and get me min value, but breaks when i pass date value, what will be the alternative of this function to make it work for date values ? (def data [{:date1 "2012-10-07", :emp "emp-1", :yrs 2}       {:date1 "2014-10-06", :emp "emp-2", :yrs 3}       {:date1 "2017-10-08", :emp "emp-3", :yrs 1}] (def dates (map :date1 data)) (def years (map :yrs data)) (def min-year-value (apply min years)) (println "Min years in employment is : " min-year-value) ;; returns 1 (def min-date-value (apply min dates)) (println "Min employment date is : " min-date-value) ;; throws exception: class java.lang.String cannot be cast to class java.lang.Number

grazfather 2021-02-24T16:20:51.328800Z

you need to use a date/datetime type to be able to compare dates

👍 1
vanelsas 2021-02-24T16:20:52.328900Z

It breaks because you are trying to find a minimum in a vector of strings instead of numbers. So (apply min dates) will fail. Take a look at a time library that helps to convert date strings to something that you can compare. Maybe this one? https://github.com/dm3/clojure.java-time

👍 1
aratare 2021-02-24T16:40:00.329400Z

So I've created a GPG key pair, signed credentials.clj and then do lein deploy clojars as instructed by clojars itself.

aratare 2021-02-24T16:40:15.329600Z

but it seems that clojars keeps refusing my file transfer.

2021-02-24T17:44:39.329800Z

any code at the top level of the namespace (usually this is just "outside a defn") will execute when the namespace is loaded

2021-02-24T17:45:11.330Z

usually needing one of these is a sign of a design problem (but it could be needed because of some dep like javafx that needs initialization)

2021-02-24T17:46:47.330200Z

for a more principled approach, there are libs like stuartsierra/component and integrant which are designed to manage "lifecycles" of code objects, which includes initializing all your namespaces, plus shut down / restart if needed https://github.com/stuartsierra/component https://github.com/weavejester/integrant

2021-02-24T19:23:58.333800Z

Hello fellow beginners! After taking a break from streaming my Clojure Startup in a Month project, I'm getting back into it this week. We start each stream with a warmup exercise. Today's comes from @pez of VS Code Calva renown, and it sounds really interesting! It will consist of implementing this 1800-year-old algorithm for finding prime numbers: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes If you'd like to watch me stumble through it (I've been programming in Clojure for all of two months), I'll be starting the stream in about 5-10 minutes: https://www.twitch.tv/a_fry_ We also have a channel if you'd like to join: #startup-in-a-month

3
mbjarland 2021-02-25T14:02:02.356600Z

I'll add another which is a bit more performant and using java BitSet:

(defn sieve [end]
  (let [bs (BitSet. end)]
    (loop [x 2]
      (if (= x end)
        (-&gt; (doto bs (.flip 0 end)) .stream .toArray vec)
        (do (doseq [y (range (* x 2) end x)]
              (.set bs y))
            (recur (.nextClearBit bs (inc x))))))))
and timing:
(time 
 (count 
  (sieve 1000000)))
"Elapsed time: 26.078883 msecs"
=&gt; 78500
(includes 0 and 1 which could be argued is incorrect)

mbjarland 2021-02-25T14:04:58.356800Z

guess I could add a (drop 2 ... to throw away the 0 and the 1. Anyway, this is fairly performant and could be argued cheating by walking into java land. Though I have to say, any time I've tried to push performance far enough, I end up using some java stuff in clojure.

2021-02-25T14:06:13.357Z

my laptop must be slow. It's 4x slower on my machine 😮

2021-02-25T14:07:06.357200Z

more than 10x actually when doing count.

(time
  (count
    (sieve 1000000)))
"Elapsed time: 375.747701 msecs"
=&gt; 78500
What spec is your machine?

mbjarland 2021-02-25T14:07:57.357400Z

: ) yeah, I'm running this on a workstation and I ran it a few times in the repl and picked one timing randomly. Not exactly scientific, the times tend to bounce between 25ish ms and 40ish

2021-02-25T14:08:34.357600Z

I need to have a talk with HR, I need a new laptop!

mbjarland 2021-02-25T14:09:24.357800Z

CPU: Intel Core i7-7700 @ 8x 4.2GHz
OS:  Ubuntu 20.10
JVM: 11.0.10.9.1-amzn

mbjarland 2021-02-25T14:09:49.358Z

not exactly new, but it was fast a few years back and these days not that much happens year to year so still pretty fast

mbjarland 2021-02-25T14:12:05.358200Z

interesting, this is the same machine running your sieve-primes above:

(time 
 (count 
  (sieve-primes 1000000)))
"Elapsed time: 1115.377674 msecs"
=&gt; 78499

2021-02-25T14:13:44.358400Z

yeah, my avg time for that over 5 runs was 1466.8

mbjarland 2021-02-25T14:14:52.358600Z

so not that big a difference there

mbjarland 2021-02-25T14:19:38.359Z

anyway, would love to see some more solutions, I learn a lot from reading different peoples solutions to the same problem

mbjarland 2021-02-25T14:19:59.359200Z

oh, there seems to be a #code-golf channel...

mbjarland 2021-02-25T14:25:03.359400Z

I reposted the above solutions there, hope that was ok @pez @qmstuart and @eagonmeng

mbjarland 2021-02-25T14:26:51.359600Z

@pez by the way, (pos? (rem x)) does divisibility

mbjarland 2021-02-25T14:31:37.359800Z

for correctness as it was pointed out above that 0 and 1 should not be there, a version excluding them with BitSet:

(defn sieve [end]
  (let [bs (BitSet. end)]
    (loop [x 2]
      (if (= x end)
        (drop 2 (-&gt; (doto bs (.flip 0 end)) .stream .toArray vec))
        (do (doseq [y (range (* x 2) end x)]
              (.set bs y))
            (recur (.nextClearBit bs (inc x))))))))

2021-02-25T14:44:17.360Z

Upgraded my java version, from 8 to 15, now your sieve does:

(time
  (count
    (sieve 1000000)))
"Elapsed time: 39.0181 msecs"
=&gt; 78500
Much better.

pez 2021-02-25T15:28:44.360200Z

I might as well post my fastest solution so far. (Not that I am super likely to jump down this rabbit hole again 😄 ) I also went for mutability.

(defn sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (concat '(2)
                  (filter #(aget primes %)
                          (range 3 (inc n) 2)))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))

mbjarland 2021-02-25T15:36:39.360400Z

and short-circuiting for n < 2 etc : )

mbjarland 2021-02-25T15:37:37.360600Z

that is fast...:

(time 
 (count 
  (sieve 1000000)))
"Elapsed time: 13.828079 msecs"
=&gt; 78498

mbjarland 2021-02-25T15:37:52.360800Z

I think the count adds something like 1ms or 2...

pez 2021-02-25T15:38:40.361Z

Maybe doall adds less?

mbjarland 2021-02-25T15:40:20.361200Z

it's interesting, now it's stuck around 35-40 ms and I can't make it run in ~13 again

mbjarland 2021-02-25T15:41:19.361400Z

and then it does this:

(time
 (do (doall (sieve 1000000))
     nil))
"Elapsed time: 10.447096 msecs"
yeah, not sure what the deal is with the fluctuating timings

mbjarland 2021-02-25T15:41:29.361600Z

some jvm hotspot optimization thing perhaps

pez 2021-02-25T15:41:38.361800Z

I use criterium for keeping sanity when timing these things. 😃

mbjarland 2021-02-25T15:42:06.362Z

yeah, agreed, the only sane way to do this

pez 2021-02-25T15:42:40.362200Z

Your machine is fast, btw!

mbjarland 2021-02-25T15:49:48.362400Z

(crit/quick-bench
  (doall (sieve 1000000)))
Evaluation count : 54 in 6 samples of 9 calls.
             Execution time mean : 14.509525 ms
    Execution time std-deviation : 3.175044 ms
   Execution time lower quantile : 11.937822 ms ( 2.5%)
   Execution time upper quantile : 18.133132 ms (97.5%)
                   Overhead used : 2.753197 ns
this is for your latest @pez, and my bitset version gives:
(crit/quick-bench
  (doall (sieve-bitset 1000000)))
Evaluation count : 30 in 6 samples of 5 calls.
             Execution time mean : 30.563095 ms
    Execution time std-deviation : 8.093254 ms
   Execution time lower quantile : 24.216931 ms ( 2.5%)
   Execution time upper quantile : 39.494854 ms (97.5%)
                   Overhead used : 6.554876 ns

pez 2021-02-25T15:51:10.362600Z

A difference could be that I start with an array with only odd numbers.

mbjarland 2021-02-25T15:51:23.362800Z

my machine by the way is an airtop2 from a few years back. Totally fanless, no moving parts, passive cooling and small form factor. The latest incarnation can be found here <https://fit-iot.com/web/products/airtop3/>

mbjarland 2021-02-25T15:52:14.363Z

yeah I suspect there are quite a few things I could improve performance vise with the bitset version. Perhaps tomorrow : )

😄 1
mbjarland 2021-02-25T15:53:02.363300Z

@qmstuart hmm...java 15. Will have to try that

mbjarland 2021-02-25T15:57:01.363500Z

fyi with java 15:

(crit/quick-bench
  (doall (sieve-bitset 1000000)))
Evaluation count : 36 in 6 samples of 6 calls.
             Execution time mean : 19.852500 ms
    Execution time std-deviation : 3.493472 ms
   Execution time lower quantile : 16.478030 ms ( 2.5%)
   Execution time upper quantile : 23.753494 ms (97.5%)
                   Overhead used : 7.001575 ns
=&gt; nil

(crit/quick-bench
  (doall (sieve-pez 1000000)))
Evaluation count : 66 in 6 samples of 11 calls.
             Execution time mean : 11.521028 ms
    Execution time std-deviation : 2.464373 ms
   Execution time lower quantile : 9.025629 ms ( 2.5%)
   Execution time upper quantile : 14.383504 ms (97.5%)
                   Overhead used : 7.001575 ns
=&gt; nil

2021-02-25T16:11:28.365400Z

Damn, there is some serious talent in this thread!

pez 2021-02-25T16:15:36.365600Z

Haha, in my case more like many hours invested in this silly problem: https://clojureverse.org/t/eratosthenes-party-time-a-k-a-feedback-wanted-on-this-implementation-of-eratosthenes-sieve/3801

pez 2021-02-25T19:58:15.403500Z

btw, I also tried bitsets back then. Had a version of my boolean-array solution that used bitsets instead. Didn’t succeed in bringing it to the same speed, but came pretty close:

(defn bs-sieve [^long n]
  (let [primes (java.util.BitSet. n)
        sqrt-n (long (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if-not (&lt; p sqrt-n)
          (concat '(2)
                  (remove #(.get primes %)
                          (range 3 (inc n) 2)))
          (do
            (loop [i (* p p)]
              (when (&lt; i n)
                (.set primes i)
                (recur (+ i p))))
            (recur (.nextClearBit primes (inc p)))))))))
A cool thing with that experiment was that it sort of landed me the job I have today. Long story, but it boils down to that I had use for my fiddlings with java bitsets during the job interview and tests. 😃

em 2021-02-26T00:26:24.409600Z

@pez

"Elapsed time: 4.2257 msecs"
holy smokes your version is fast, skipping evens in both candidate sequence and filter sequence shaves off ~2ms for me which is pretty good, it's a neat optimization Tried the same exact algorithm with transients to see what a more clojure-ish solution would be but ends up at around "Elapsed time: 62.2843 msecs" on the same spec

pez 2021-02-26T07:12:35.422100Z

I only learnt about transients quite recently. Can you share the solution, @eagonmeng?

mbjarland 2021-02-26T14:57:53.449600Z

@pez, yea, bitsets are applicable surprisingly often and something not everybody knows about...or has the energy to care about. I think if you've taken the time to find it, understand it, and know when to use it it is an indicator that you know what you are doing. Not surprised if it tipped the scales during an interview

❤️ 1
pez 2021-02-26T17:47:10.452400Z

Your bitset solution croaks Eric’s test script, btw, @mbjarland. I haven’t figured out why, the test never finishes, afaikt.

mbjarland 2021-02-26T18:11:09.453100Z

Could be because it returns 0 and 1? I can check once I'm by a repl again : )

em 2021-02-26T21:55:25.459500Z

@pez Sure! Here's my (slightly) more idiomatic clojure implementation with transients:

(defn sieve [n]
  (let [limit (inc n)
        primes (transient (into [] (repeat limit true)))
        sqrt-n (int (Math/ceil (Math/sqrt n)))
        filter-multiple (fn [primes p]
                          (reduce #(assoc! %1 %2 false)
                            primes
                            (range (* p p) limit (* p 2))))
        sieve-prime (fn [primes p]
                      (if (get primes p)
                        (filter-multiple primes p)
                        primes))]
    (let [mask (reduce sieve-prime primes (range 3 sqrt-n 2))]
      (cons 2 (filter #(get mask %) (range 3 limit 2))))))

em 2021-02-26T21:59:16.459700Z

A neat thing with transients is that since the ergonomics are the same, the purely clojure implementation without mutability is literally 2 edits away: remove the call to transient for primes at the top, and then take away the bang in filter-multiple, i.e. assoc! -> assoc. Transients gives me a roughly 5x speedup, the pure clojure implementation is ~400ms for me

pez 2021-02-26T22:06:24.460Z

On my machine your transients solution costs 4x the performance from mutating a byte-array. A cost I think I would be willing to pay in many, even most, cases.

em 2021-02-26T22:19:33.460200Z

Yeah definitely, that's well within what I'm happy to pay for a more consistent approach and interfacing with other persistent code. But it's pretty cool that transients mesh so well with just the normal stuff we do, so often in thread-safe hot loops it's a super quick drop in for a performance boost!

mbjarland 2021-02-28T16:41:43.005800Z

Ok @pez, @eagonmeng , @qmstuart and gang, took another look at the performance of my solution, results now after tweaks:

(crit/quick-bench
  (count (sieve-bitset 1000000)))
Evaluation count : 96 in 6 samples of 16 calls.
             Execution time mean : 6.476842 ms
    Execution time std-deviation : 79.562876 µs
   Execution time lower quantile : 6.411202 ms ( 2.5%)
   Execution time upper quantile : 6.574641 ms (97.5%)
                   Overhead used : 6.046188 ns
=&gt; nil
(crit/quick-bench
  (count (sieve-pez 1000000)))
Evaluation count : 72 in 6 samples of 12 calls.
             Execution time mean : 10.665594 ms
    Execution time std-deviation : 2.495503 ms
   Execution time lower quantile : 8.844994 ms ( 2.5%)
   Execution time upper quantile : 14.692171 ms (97.5%)
                   Overhead used : 6.046188 ns
and the latest code, replaced a range with a loop and a few other tweaks:
(defn sieve-bitset [end]
  (let [bs (BitSet. end)]
    (loop [x 2]
      (if (= x end)
        (drop 2 (-&gt; (doto bs (.flip 0 end)) .stream .toArray))
        (do (loop [y (+ x x)]
              (when (&lt; y end)
                (.set bs y)
                (recur (+ y x))))
            (recur (.nextClearBit bs (inc x))))))))

mbjarland 2021-02-28T17:00:43.006Z

running this on a different machine though so times probably not comparable to before

pez 2021-02-28T17:43:37.006300Z

Way to go!

pez 2021-02-28T17:44:32.006500Z

Eric’s test still can’t cope with it, though. 😃

pez 2021-02-28T17:54:00.006700Z

I get different results on my machine. Here your new version is faster than the previous one, but still takes 2X the time of mine.

(quick-bench (count (mbjarland-bs-sieve 1000000)))
Evaluation count : 12 in 6 samples of 2 calls.
             Execution time mean : 97,098755 ms
    Execution time std-deviation : 341,019335 µs
   Execution time lower quantile : 96,681139 ms ( 2,5%)
   Execution time upper quantile : 97,402338 ms (97,5%)
                   Overhead used : 14,531080 ns

(quick-bench (count (mbjarland-bs-sieve-2 1000000)))
Evaluation count : 12 in 6 samples of 2 calls.
             Execution time mean : 78,531183 ms
    Execution time std-deviation : 1,032172 ms
   Execution time lower quantile : 77,299070 ms ( 2,5%)
   Execution time upper quantile : 79,888896 ms (97,5%)
                   Overhead used : 14,531080 ns

(quick-bench (count (pez-bs-sieve 1000000)))
Evaluation count : 18 in 6 samples of 3 calls.
             Execution time mean : 43,063580 ms
    Execution time std-deviation : 564,826778 µs
   Execution time lower quantile : 42,403500 ms ( 2,5%)
   Execution time upper quantile : 43,831587 ms (97,5%)
                   Overhead used : 14,531080 ns

(quick-bench (count (pez-sieve 1000000)))
Evaluation count : 24 in 6 samples of 4 calls.
             Execution time mean : 27,789680 ms
    Execution time std-deviation : 344,536671 µs
   Execution time lower quantile : 27,344088 ms ( 2,5%)
   Execution time upper quantile : 28,228818 ms (97,5%)
                   Overhead used : 14,531080 ns

mbjarland 2021-02-28T18:04:21.006900Z

that is strange…what java version are you on?

mbjarland 2021-02-28T18:05:00.007100Z

me: Clojure 1.10.0 and java 15.0.1-amzn

pez 2021-02-28T20:49:15.010300Z

java --version
openjdk 15.0.2 2021-01-19
OpenJDK Runtime Environment Zulu15.29+15-CA (build 15.0.2+7)
OpenJDK 64-Bit Server VM Zulu15.29+15-CA (build 15.0.2+7, mixed mode, sharing)
Same Clojure version as you.

pez 2021-02-28T20:57:24.010500Z

I get the same results if I try with 15.0.2.7.1-amzn

mbjarland 2021-03-01T06:48:38.017500Z

Interesting discrepancy.

mbjarland 2021-03-01T06:49:46.017700Z

I run these within an intellij cursive repl. Should not matter though, a java process is a java process

pez 2021-03-01T07:16:12.018200Z

How does my bitset sieve fare on your computer? Pasting again for convenience

(defn pez-bs-sieve [^long n]
  (let [primes (java.util.BitSet. n)
        sqrt-n (long (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if-not (&lt; p sqrt-n)
          (concat '(2)
                  (remove #(.get primes %)
                          (range 3 (inc n) 2)))
          (do
            (loop [i (* p p)]
              (when (&lt;= i n)
                (.set primes i)
                (recur (+ i p p))))
            (recur (.nextClearBit primes (+ p 1)))))))))

pez 2021-03-01T08:00:27.019300Z

I made a faster version now. Shaves some 30% of my byte-array solution time, even:

(defn pez-bs-sieve-2 [^long n]
  (let [primes (doto (java.util.BitSet. n) (.set 0 (inc n)))
        sqrt-n (long (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 2]
        (if-not (&lt; p sqrt-n)
          (drop 2 (-&gt; primes .stream .toArray))
          (do
            (loop [i (* p p)]
              (when (&lt;= i n)
                (.clear primes i)
                (recur (+ i p))))
            (recur (.nextSetBit primes (+ p 1)))))))))

mbjarland 2021-03-01T14:51:16.022300Z

best out of a few runs:

(crit/quick-bench
 (pez-bs-sieve-2 1000000))
Evaluation count : 102 in 6 samples of 17 calls.
             Execution time mean : 6.035838 ms
    Execution time std-deviation : 101.771435 µs
   Execution time lower quantile : 5.921544 ms ( 2.5%)
   Execution time upper quantile : 6.155316 ms (97.5%)
                   Overhead used : 7.118834 ns

2021-03-01T14:52:55.022500Z

what is quick-bench from ?

mbjarland 2021-03-01T14:53:08.022700Z

<https://github.com/hugoduncan/criterium>

2021-03-01T14:53:14.022900Z

thanks!

mbjarland 2021-03-01T14:53:41.023100Z

@pez nice with the bit flip by setting the high bit! That was bothering me

mbjarland 2021-03-01T15:00:21.024700Z

and also optimizing with sqrt as all the ones below have already been caught...could be argued that it is then a modified algorithm, but works for me. : ) Nice, definitely the fastest so far.

pez 2021-03-01T16:05:49.026600Z

I think it could be argued to be part of the algo 😃 https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Pseudocode

pez 2021-03-01T17:34:28.027900Z

After running some more tests I think I have concluded that: • It is much faster to aset into a boolean-array than to .clear or .set bits in a BitSet • It is much faster to .stream .toArray from a BitSet than to filter out the values of a boolean-array So, I now tried to speed up my filtering by using reducers/filter and it does speed it up enough to make my boolean-array solution as fast as my BitSet one, on my machine. Here’s the latest incarnation:

(defn pez-ba-reducer-sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (concat '(2)
                  (into [] (r/filter #(aget primes %)
                                     (range 3 (inc n) 2))))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))
Now, I am thinking that on your machine, @mbjarland, the gain from reducers might be a bigger win than it is on mine. Can you test it for me? 😃

pez 2021-03-01T23:21:13.037100Z

Someone has tested the writing to BitSet vs to boolean[] more scientifically than I did, but came to similar results. https://www.baeldung.com/java-boolean-array-bitset-performance

pez 2021-03-02T11:30:58.041700Z

Since updating the byte-array is faster, I started to focus on how to get the results from it faster. The reducers solution was one such attempt. Also tried core.async/pipeline, but that was super slow, some 200X slower. It’s not a big enough problem for that tool. 😃 But then I got a tips to try with loop and that does make a difference. Not huge, but anyway:

(defn pez-ba-loop-sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (loop [res []
                 i 3]
            (if (&lt;= i n)
              (recur (if (aget primes i)
                       (conj res i)
                       res)
                     (+ i 2))
              (concat [2] res)))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))
It shaves 3ms from the reducers attempt. Will try to parallellize this one as well.

pez 2021-03-02T18:28:14.085400Z

Enter transient. 😃

(defn pez-ba-loop-transient-sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (loop [res (transient [])
                 i 3]
            (if (&lt;= i n)
              (recur (if (aget primes i)
                       (conj! res i)
                       res)
                     (+ i 2))
              (concat [2] (persistent! res))))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))

Evaluation count : 60 in 6 samples of 10 calls.
             Execution time mean : 10,599681 ms
    Execution time std-deviation : 163,784470 µs
   Execution time lower quantile : 10,417649 ms ( 2,5%)
   Execution time upper quantile : 10,831060 ms (97,5%)
                   Overhead used : 14,507923 ns
Contrast to the version I had when this thread started and I said I would not open up this box again. 😃
(defn pez-ba-sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (concat '(2)
                  (filter #(aget primes %)
                          (range 3 (inc n) 2)))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))

Evaluation count : 24 in 6 samples of 4 calls.
             Execution time mean : 25,913184 ms
    Execution time std-deviation : 569,484180 µs
   Execution time lower quantile : 25,135620 ms ( 2,5%)
   Execution time upper quantile : 26,538920 ms (97,5%)
                   Overhead used : 14,507923 ns

em 2021-03-03T08:08:44.108300Z

I think at this point the can of worms has been fully opened lol Double byte-array + transient for filter output is a neat way to get around the bottlenecks of either approach Curious though, I wonder how close these implementations are getting to bare metal Java solutions

pez 2021-03-03T14:38:04.121200Z

I think I have lessons gained in my optimizing of the Java-ish solution that I can bring over to my more Clojuresque solutions. Maybe there are at least two classes to this hunt?

pez 2021-03-03T23:36:06.142500Z

Here’s a version of that last loop w/ transients solution, where I parallelize the collection of the sieved indexes.

(defn pez-ba-loop-transient-parallel-sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (if (&lt; n 2)
      '()
      (loop [p 3]
        (if (&lt; sqrt-n p)
          (let [num-slices (if (and (even? n)
                                    (&gt; n 1000))
                             50
                             1)
                slice-size (quot n num-slices)
                futures (mapv (fn [^long slice-num]
                                (future
                                  (let [start (inc (* slice-num slice-size))
                                        end (dec (+ start slice-size))]
                                    (loop [res (transient [])
                                           i start]
                                      (if (&lt;= i end)
                                        (recur (if (aget primes i)
                                                 (conj! res i)
                                                 res)
                                               (+ i 2))
                                        (persistent! res))))))
                              (range num-slices))]
            (concat [2]
                    (drop 1 (into []
                                  (mapcat deref)
                                  futures))))
          (do
            (when (aget primes p)
              (loop [i (* p p)]
                (when (&lt;= i n)
                  (aset primes i false)
                  (recur (+ i p p)))))
            (recur  (+ p 2))))))))
Doesn’t gain me the slightest speed on my machine, though. I don’t quite understand why, because in a more syntetic test I had a 40% speed gain. Maybe on some other machine it makes more difference. In any case, now I sieve through 10M numbers in 70ms on my laptop. Chasing this particular local maxima might not be the way to make real progress.

pez 2021-03-04T10:17:12.152300Z

I now understand why I couldn’t see a gain in the sieve. 😃 Update coming. Haha.

mbjarland 2021-03-05T10:19:07.303600Z

for reference, I ran this on the same airtop workstation as the earlier benchmarks:

(crit/quick-bench
 (do
  (doall (pez-ba-loop-transient-parallel-sieve 1000000))
  nil))
Evaluation count : 156 in 6 samples of 26 calls.
             Execution time mean : 4.214238 ms
    Execution time std-deviation : 548.921344 µs
   Execution time lower quantile : 3.861080 ms ( 2.5%)
   Execution time upper quantile : 4.931209 ms (97.5%)
                   Overhead used : 7.105910 ns

mbjarland 2021-03-05T10:19:11.304Z

that is quite fast and also very stable across a number of quick-bench runs. The version without parallel:

(crit/quick-bench
 (do
  (doall (pez-ba-loop-transient-sieve 1000000))
  nil))
Evaluation count : 132 in 6 samples of 22 calls.
             Execution time mean : 5.424505 ms
    Execution time std-deviation : 868.418473 µs
   Execution time lower quantile : 4.685846 ms ( 2.5%)
   Execution time upper quantile : 6.411823 ms (97.5%)
                   Overhead used : 7.105910 ns
=&gt; nil
and same here, very stable across a number of runs. So on this machine the parallel code seems to make a difference, about 22% faster.

pez 2021-03-05T11:05:58.317500Z

Thanks! I was super curious. Seems the airtop does not outperform my macbook with as much for these. Btw, try the parallel one without the doall. Since it’s an eager solution, that will add a pass over the collection that is not necessary for the test.

mbjarland 2021-02-24T20:51:45.334200Z

Would love to see some clojure solutions to the sieve here. I think it would be quite instructive. I’ll start with one:

(defn sieve [end] 
  (loop [ps [] [x &amp; xs] (range 2 end)]
    (if (not x)
      ps
      (recur (conj ps x) (doall (remove #(= 0 (rem % x)) xs))))))
where doall is there to eliminate stack overflow

pez 2021-02-24T21:33:36.334500Z

That’s quite awesome! I happen to have my first solution to this saved. Totally slow and verbose:

(defn not-divisible-by [n d]
  (when-not (integer? (/ n d))
    n))

(defn sieve [n]
  (loop [found-primes [2]
         candidates (range 3 (inc n))]
    (let [highest-found-prime (last found-primes)
          remaining (-&gt;&gt; candidates
                         (map #(not-divisible-by % highest-found-prime))
                         (remove nil?))]
      (if (= remaining candidates)
        (concat found-primes candidates)
        (recur (conj found-primes (first remaining)) (rest remaining))))))

2021-02-24T22:11:49.334900Z

Welp ... you both made it a lot farther than I did 😛

2021-02-24T22:12:07.335100Z

Didn't finish it this time around, but I'll be continuing the challenge next afternoon EST.

pez 2021-02-24T22:14:41.335300Z

I’ll wait with posting some of my other solutions then. 😃

pez 2021-02-24T22:16:32.335500Z

I spent A LOT of time with it. Just say’n. Turned out it had some performance lessons for me to take. Haha.

West 2021-02-24T22:19:18.335700Z

Just followed!

2021-02-24T23:52:20.338600Z

This turned out pretty slow, but I wanted to use reduce. Thinking is create set of all the numbers. Then reduce that over the numbers 1..(sqrt n). Then to remove the values that are multiples of n is just set/difference with range with step of n.

(defn sieve-primes [upper]
  (let [xs (-&gt;&gt; (range 1 (inc upper))
                (set))]
    (sort (reduce #(if (or (= 1 %2) (not (%1 %2)))
                     %1
                     (let [multiples (range (* 2 %2) (inc upper) %2)]
                       (set/difference %1 multiples))) xs (range 1 (Math/sqrt (inc upper)))))))

2021-02-24T23:53:59.338800Z

(time (sieve-primes 1000000))
"Elapsed time: 1765.2869 msecs" 
slow 😞