clojure-spec

About: http://clojure.org/about/spec Guide: http://clojure.org/guides/spec API: https://clojure.github.io/spec.alpha/clojure.spec.alpha-api.html
johanatan 2020-08-21T03:37:29.033900Z

btw i think i found the problem

(apply concatv rest)
can blow up the stack / is not in tail position.

seancorfield 2020-08-21T03:40:23.034400Z

Oh, interesting... Does that introduce laziness somehow?

johanatan 2020-08-21T03:40:27.034500Z

(at least that is "one" problem)

johanatan 2020-08-21T03:40:52.034900Z

well, it's a recursive call

seancorfield 2020-08-21T03:41:20.035600Z

OH! Yeah, that sounds so obvious once you say it out loud! 🙂

johanatan 2020-08-21T03:42:06.036500Z

let me explain. sorry, i noticed that even if I remove the :fn I still get the crash with the original impl of concatv. but with this loop/recur one (inspired by the Stuart Sierra blog on concat) that crash won't occur:

(defn concatv
  "Strict version of concat."
  [head & tail]
  (loop [seqs (cons head tail) acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

seancorfield 2020-08-21T03:43:28.038Z

Seems weird to have head & tail when you only cons them together and never use them again.

johanatan 2020-08-21T03:44:01.038500Z

oh, could just be [& args] ?

johanatan 2020-08-21T03:44:06.038700Z

good point

seancorfield 2020-08-21T03:44:40.039400Z

You might want to consider whether (concatv) should work and what it should produce (I'd say yes and [] respectively).

johanatan 2020-08-21T03:45:18.040100Z

i'm just trying to match behavior of the original concat (which seems to agree w/ you) 🙂

cljs.user=> (concat)
()

johanatan 2020-08-21T03:45:45.040400Z

and yea that concatv does similarly:

cljs.user=> (concatv)
[]

seancorfield 2020-08-21T03:46:39.040700Z

I just tried it in Clojure and got an error

seancorfield 2020-08-21T03:48:03.041400Z

Oh, it works after changing to & args and (loop [seqs args ...

johanatan 2020-08-21T03:48:05.041500Z

hmm, mine was from lumo

johanatan 2020-08-21T03:48:12.041800Z

yea, sorry

johanatan 2020-08-21T03:48:14.042Z

didn't paste in here

johanatan 2020-08-21T03:48:24.042500Z

(defn concatv
  "Strict version of concat."
  [& args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

petterik 2020-08-21T16:45:58.047100Z

Did you consider (reduce into [] args)?

johanatan 2020-08-21T20:51:08.048700Z

Nope, does it work?

johanatan 2020-08-21T20:51:56.048900Z

Also, how much memory will it use? the loop/recur one is tail optimized so should use a constant amount of memory

seancorfield 2020-08-21T03:48:32.042700Z

The version with [head & tail] requires 1+ arguments 🙂

johanatan 2020-08-21T03:49:14.043400Z

haha, yea 2+ or 0+ makes more sense than 1+

johanatan 2020-08-21T04:01:06.043900Z

here is a "final" version that works (tested up to 75 iterations at a time):

(s/def ::any-coll
  (s/with-gen
    (s/coll-of any?)
    #(s/gen (s/coll-of any? :max-count 125))))

(s/fdef concatv
  :args (s/cat :args (s/* ::any-coll))
  :fn #(let [r (:ret %1)
             args (-> %1 :args :args)
             sumhash (fn [c] (apply + (mapv hash c)))]
         (and
          (= (count r) (apply + (mapv count args)))
          (= (sumhash r) (apply + (mapv sumhash args)))))
  :ret (s/coll-of any? :kind vector?))

(defn concatv
  "Strict version of concat."
  [& args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

1
johanatan 2020-08-21T04:06:33.045Z

so yea i think i was kinda confounded earlier by the fact that there was a potential stack overflow in the code under test. so even when i got the test code "right" the code under test could screw me

johanatan 2020-08-21T04:07:29.045900Z

but it was somewhat non-deterministic on both sides. bit of an entangled heisenbug

johanatan 2020-08-21T04:07:32.046100Z

🙂

johanatan 2020-08-21T04:08:35.046400Z

two entangled heisenbugs rather

seancorfield 2020-08-21T04:08:44.046600Z

Complected, even 🙂

johanatan 2020-08-21T04:08:51.046800Z

haha exactly

petterik 2020-08-21T16:45:58.047100Z

Did you consider (reduce into [] args)?

johanatan 2020-08-21T19:04:30.048500Z

small update: so ... there turned out to be 3 problems. my hunch about s/* was also correct. some of the crashes were emanating from it as well. here's a "more final" version (at least until I get another random, intermittent failure some time in the future):

(defn arbitrarily-partition [coll freq]
  (let [signals (take (count coll) (cycle (cons true (repeat freq false))))
        shuffled (shuffle signals)
        zipped (map vector shuffled coll)
        partitioned (partition-by first zipped)]
    (for [c partitioned]
      (map second c))))

(s/def ::coll-of-colls
  (s/coll-of (s/coll-of any?)))

(s/def ::distinct-coll-of-colls
  (s/with-gen
    ::coll-of-colls
    #(gen/let [elems (gen/set (s/gen any?) {:max-elements 150})
               freq (s/gen (s/int-in 1 10))]
       (arbitrarily-partition elems freq))))

(s/fdef concatenate
        :args (s/cat :args ::distinct-coll-of-colls)
        :fn #(let [r (:ret %1)
                   args (-> %1 :args :args)
                   sumhash (fn [c] (apply + (mapv hash c)))]
               (and
                (= (count r) (apply + (mapv count args)))
                (= (sumhash r) (apply + (mapv sumhash args)))))
        :ret (s/coll-of any? :kind vector?))

(defn- concatenate
  "Strict version of concat."
  [args]
  (loop [seqs args acc []]
    (if (empty? seqs)
      acc
      (recur (rest seqs) (into acc (first seqs))))))

(defn concatv
  "Strict version of concat."
  [& args]
  (concatenate args))

misha 2020-08-23T08:23:39.049100Z

I did not read entire discussion, so it might be irrelevant, but: there is a :distinct option in s/coll-of

(s/coll-of int? :distinct true) 

misha 2020-08-23T08:25:19.049400Z

or is it supposed to be "distinct for all 1st and 2nd level items"?

johanatan 2020-08-21T20:51:08.048700Z

Nope, does it work?

johanatan 2020-08-21T20:51:56.048900Z

Also, how much memory will it use? the loop/recur one is tail optimized so should use a constant amount of memory