One minor nit I saw recently was that I think defstruct allows qualified keywords as keys, but defrecord does not.
defstruct
allows all sorts of things as keys (in addition to qualified keywords):
dev=> (defstruct Bar :a 'b "d" [:e] :f/g ::h)
#'dev/Bar
dev=> (struct Bar 1 2 3 4 5 6)
{:a 1, b 2, "d" 3, [:e] 4, :f/g 5, :dev/h 6}
dev=> (get *1 [:e])
4
dev=> (::h *2)
6
dev=>
dev=> (defstruct Obj (Object.))
#'dev/Obj
dev=> (= (struct Obj 1) (struct Obj 1))
true
wow
So looks like defstruct is about as general as regular Clojure maps there?
Rich never added it in Clojure I guess, but David added it in ClojureScript maybe.
If (conj) returned (), then it would be consistent: in the absence of a collection, you get a (). It would at least be consistent with (conj nil 1), and the 1-arity is a special case that has no choice but to be identity because of transducers.
With regard to vector being more useful, yes I agree, and actually that's how I got here. I was doing:
(update m :a conj :b)
where the key :a doesn't exist the first time around, and that's how I realized that (conj nil :a) returns a list. I almost never want a list, and always a vector, so in that sense you could say it's great that (conj) returns a vector, since vectors are more useful. But that also creates an inconsistency, a bit of a gotcha you need to remember: when not given a collection, conj does not always default to the same type of coll; it depends on the arity.
But I'd love to learn that (conj) returning a vector and (conj nil :a) returning a list actually enable a ton of useful idioms, and that my particular case just isn't one of them. If that were true, it would convince me; otherwise I think it might just be a case of slightly accidental accretion.
Not complaining, more curious to learn some new idioms that maybe I don't know about. If there aren't any and conj is that way just as a kind of accretion artifact, that's fine, and luckily for me fnil saves the day in my case: (update m :a (fnil conj []) :b)
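For anyone skimming, here is a small REPL sketch of the gotcha and the fnil workaround described above. The map `m` is just a placeholder for illustration.

```clojure
;; Zero-arity conj returns an empty vector...
(conj)                            ;=> []
;; ...but conj onto nil builds a list.
(conj nil :a)                     ;=> (:a)

;; `m` is a hypothetical map in which :a is not present yet.
(def m {})

;; update passes nil for the missing key, so plain conj yields a list.
(update m :a conj :b)             ;=> {:a (:b)}

;; fnil swaps the nil for [] before conj sees it, so we get a vector.
(update m :a (fnil conj []) :b)   ;=> {:a [:b]}
```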
What's a defstruct under the hood? Something I've thought about recently is that it'd be nice to have an array-backed struct, where keys are static. Basically they'd be a bit like class fields, but much more lightweight than creating a class.
Like (defstatic Point3D :x :y :z). And then you'd do (get-field point :x) or something, but get-field would be a macro that rewrites this to (aget point 0)
Something of that sort
Oh, maybe structs are kind of what I'm thinking of, actually way better than what I was thinking. You need to use accessor, though, to get the kind of direct access I'm talking about, it seems
The Java implementation isn't too hard to follow. It is a persistent map of the "base" keys to small integer indexes in the range [0, n-1], stored once for each named defstruct, and then for each instance of such a persistent struct there is an array indexed by [0, n-1] of associated values. There is an optional persistent map to store other keys that might be assoc'd on later that are outside of those named when you create a named defstruct.
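A quick sketch of how that layout shows up at the REPL. The struct name `Pt` and the `:label` key are made up for illustration; the behavior shown (extra keys can be assoc'd, base keys cannot be removed) is what I would expect from the implementation described above.

```clojure
;; Pt is a hypothetical struct with three "base" keys.
(defstruct Pt :x :y :z)
(def p (struct Pt 1 2 3))

;; A key outside the base set is accepted; it lives in the extra map.
(def p+ (assoc p :label "corner"))
(:label p+)   ;=> "corner"
(:x p+)       ;=> 1

;; Base keys are fixed slots in the array, so removing one throws.
(def dissoc-result
  (try
    (dissoc p :x)
    (catch RuntimeException _ :cannot-remove-base-key)))
dissoc-result ;=> :cannot-remove-base-key

;; Extra keys can be removed again without trouble.
(= p (dissoc p+ :label)) ;=> true
```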
Yeah, looks like you don't need to use the accessor function to retrieve elements from a struct, but if you do not, then performance is pretty much like a persistent array map or persistent hash map, depending on the number of keys.
Ya, so its quite close to what I was saying. The struct fields are stored in an Object array. There's a map from the field name to the index where their value is. And there is another map for dynamically "added" fields. Accessor will create a function that already looked up the field in the map and closed over the index, so when you use the accessor afterwards, it's just getting you the value directly from the array by index. At least that's my 30 second glance.
That all agrees with my 4-minute glance
I'm actually not sure why regular gets of its fields seem to be faster than for regular maps hum...
Like getting the index from a small map of field -> index and then looking it up in an array is faster than just a normal map lookup?
What test are you running that shows regular gets of a defstruct's keys are faster than for regular maps?
The one @borkdude posted
(def ks [:a :b :c :d :e :f :g :h :i :j :k :l :m :n :o :p])
(def Foo (apply create-struct Foo ks))
(def s (struct Foo 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6))
(time (let [s s] (dotimes [i 1000000000] (:a s))))
"Elapsed time: 17160.9286 msecs"
(def z (zipmap ks (range)))
(time (let [z z] (dotimes [i 1000000000] (:a z))))
"Elapsed time: 24000.9011 msecs"
I would guess that is because the regular map there is a hash map, but the defstruct is like an array-map for keys in the 'base' set, and your test is always accessing the first one in the array.
Try always accessing key :p in the struct map, and I would guess you will notice a difference in performance.
I was thinking array-map vs hash-map probably, but ya, I forgot that for an array map, if I'm using the first key it'd be way faster. Let me try
wait, no cancel that explanation.
The defstruct should have a hash map for the keys, too, I think.
wouldn't 12 keys though result in a hash-map?
Still faster with struct
(time (let [s s] (dotimes [i 1000000000] (:p s))))
"Elapsed time: 22069.8155 msecs"
(time (let [z z] (dotimes [i 1000000000] (:p z))))
"Elapsed time: 30195.8514 msecs"
Trying to dig now to see if it actually uses a hash-map for that many keys, or an array-map. Not there yet.
I think that's the line: return new Def(keys, RT.map(v));
so it seems to delegate to the RT map constructor, which I think would create an array-map if length under 8 and a hash-map if greater
Oh, HASHTABLE_THRESHOLD = 16;
I thought it was 8
but the v passed to RT.map has 2 elements per key, one for the key, one for the associated value
Ah yes, that's why it is actually 8
The threshold is on the length of the list of key/value pairs. So set at 16, it means a map of 8 map entries
So pretty sure in this example, both are using HashMap
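One way to check the threshold reasoning without reading the Java source is to ask for the class of maps on either side of 8 entries. This is a sketch, assuming a recent Clojure on the JVM; the classes in the comments are what I would expect there.

```clojure
(def ks [:a :b :c :d :e :f :g :h :i :j :k :l :m :n :o :p])

;; 8 entries (16 array slots) still fits in an array map.
(class (zipmap (take 8 ks) (range)))
;=> clojure.lang.PersistentArrayMap

;; A 9th entry pushes it over the threshold.
(class (zipmap (take 9 ks) (range)))
;=> clojure.lang.PersistentHashMap

;; The 16-key map from the benchmark is therefore a hash map.
(class (zipmap ks (range)))
;=> clojure.lang.PersistentHashMap
```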
Yep, PersistentHashMap class for the value of the keyslots field for the defstruct keys you are using, so I don't currently have a guess what the performance difference might be caused by.
(verified via reflective access to the value of the field keyslots of an actual JVM object created by defstruct, rather than reading the code)
There is a pretty significant difference in run times from one key to another in your results, too, so I'm not getting very excited over the performance differences yet.
True
FYI the hash maps are different between those two cases, because your expression (apply create-struct Foo ks)
is adding the var Foo itself as one of the fields of the struct. You probably want to change that to (apply create-struct ks)
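For reference, a sketch of the benchmark setup with that fix applied, so the struct has exactly the 16 keyword keys. The values are the same ones used in the original snippet.

```clojure
(def ks [:a :b :c :d :e :f :g :h :i :j :k :l :m :n :o :p])

;; create-struct takes only the keys; no Foo in the argument list.
(def Foo (apply create-struct ks))
(def s (struct Foo 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6))

(count s)  ;=> 16
(:a s)     ;=> 1
(:p s)     ;=> 6
```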
Doesn't seem to make a difference :man-shrugging:
Maybe some weird JVM optimization
Anyways, the difference is pretty small (if you don't run it like a million times)
Using accessor is really where you see a big speedup
Presumably because the accessor does the hash map lookup for you, and then using it is just a simple array access?
correct
Ya and since each instance of the same struct will have the fields in the same order in the array, you can use the same accessor for all of them. So it's not just like a cache over a key lookup.
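A minimal sketch of that point, using a hypothetical Point3D struct: the accessor is built once from the struct definition and then works on every instance.

```clojure
;; Point3D is a hypothetical struct used only for illustration.
(defstruct Point3D :x :y :z)

;; accessor resolves :z to its slot index once, up front.
(def get-z (accessor Point3D :z))

(def p1 (struct Point3D 1 2 3))
(def p2 (struct Point3D 10 20 30))

;; The same accessor works for every instance of Point3D.
(get-z p1)  ;=> 3
(get-z p2)  ;=> 30

;; Ordinary keyword lookup still works; it just goes through the key map.
(:z p1)     ;=> 3
```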
I shared a form a while ago to probe the interests of clojurians around message queue technologies. As promised here are the raw results: https://account606590.typeform.com/report/sirkUS13/xTQgUFApGNu34PSV TL;DR: people are familiar with Kafka, prefer hosted SaaS products that are easy to get started with. I think I can work in that direction
Is there an easy way to prettyprint maps without commas?
You might not want to use another library, but https://github.com/kkinnear/zprint lets you do that and many other formatting options
I'll try this, thanks
took me a second to find it in the docs but it's here: https://github.com/kkinnear/zprint/blob/7a8c0e82943a0ad87135ff749ae5795063cb0f74/doc/using/repl.md#configure-zprint
Works great, thanks!
Just have to set {:map {:comma? false}} and it works
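Roughly what that looks like, as a sketch that assumes zprint is on the classpath. As far as I know, `:comma?` under `:map` defaults to true, so it is `false` that drops the commas. The sample map is made up.

```clojure
;; Assumes the zprint library has been added as a dependency.
(require '[zprint.core :as zp])

;; One-off: pass the options map as the second argument.
(zp/zprint {:a 1 :b 2 :c {:d 3}} {:map {:comma? false}})

;; Or set it once for the rest of the REPL session.
(zp/set-options! {:map {:comma? false}})
(zp/zprint {:a 1 :b 2 :c {:d 3}})
```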
@jjttjj you can always use regular Java cache libraries like Caffeine
I might be missing something but that still has a map-like interface, ie you still need a key/value for everything, right? I'm looking for a vector-like interface where I just conj stuff to the end
I could be missing something in their docs, I might just not know the right terminology to look for
is there a "standard" on when to use a map as one parameter vs a sequence of parameters? for example...
(defn foo [{:keys [foo-a foo-b]}] ...)
(foo {:foo-a "bar" ...
; vs
(defn foo [foo-a foo-b] ...)
(foo "bar" ...
Not a standard, but certainly if you expect the list of values to grow in the future, a map is more easily expandable than a list of parameters.
Yeah I often agonize over these choices. I think for 2 arguments, with no further information, I would default to positional args but of course there are a lot of factors. more than 3 arguments it gets harder.
A map lets you name each one, rather than remember their position in a list of args, and remembering position in a list of args over about 4 or so, or even manually checking such calls while looking at the definition of the function, can be pretty taxing.
sounds like I'm not alone, thanks @jjttjj @andy.fingerhut
One thing that some people worry about on the disadvantages of using a map there, is that if you later want to change the function so that a new key/value pair is required in order for the function call to work, there is no automatic editor/IDE support that I know of that will automate the process of finding all calls, other than the obvious one of "search for the function name throughout your code to find all calls". That approach works for the separate arguments approach, too, of course, but there are also lint tools like clj-kondo that could help you find wrong-arity calls.
interesting, good point! making everything a map would make refactoring challenging for sure
One thing that kind of trips me up is sometimes I know I'm going to want a map for args, because I have a lot of args and/or want to have the option to add more later easily. But then I'm not quite sure how to break things up. Do I just use one big map for all args? Do I use a separate leading map for "component" type args and a separate trailing "options" map, possibly with some positional args thrown in between (things that are definitely intrinsic to the nature of the function and probably won't change)? I've wondered if the Clojure philosophy of "maps should be open, use qualified keys, and be combined liberally" from the spec-related talks is an argument for just one big map of arguments for any function that is going to use map args as a convenience measure
I guess, more specifically, components and options are two totally separate use cases for map arguments, do you mix them together or keep them separate?
Usually I make required args fixed and options go last in a map
Not a hard rule
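A sketch of that convention, with a made-up function: the required argument is positional and the optional settings ride in a trailing map with defaults. `fetch-page`, `:timeout-ms`, and `:retries` are hypothetical names.

```clojure
;; Hypothetical function: `url` is required, everything else is optional.
(defn fetch-page
  ([url] (fetch-page url {}))
  ([url {:keys [timeout-ms retries]
         :or   {timeout-ms 5000 retries 0}}]
   ;; A real implementation would do I/O; here we just echo the settings.
   {:url url :timeout-ms timeout-ms :retries retries}))

(fetch-page "https://example.com")
;=> {:url "https://example.com", :timeout-ms 5000, :retries 0}

;; New options can be added later without touching existing callers.
(fetch-page "https://example.com" {:retries 3})
;=> {:url "https://example.com", :timeout-ms 5000, :retries 3}
```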
Any philosophy talks go out the window when talking about something low level like caching
@jjttjj does this look good?
https://guava.dev/releases/23.0/api/docs/com/google/common/collect/EvictingQueue.html
I think that's just a ring/sliding buffer with an integer capacity, whereas I need just a vector where inserts have a TTL. I'll have to dig deeper into the guava stuff though, it might be in there somewhere
I've used the "one map for components, one map for options" approach in a few places and quite liked it. The app structure was such that the components were being injected via a map already, and could easily be passed into dependent sub functions following the same pattern, and when it came time for testing, I could use generators to create my options and then do my own injection of the component map with various component mocks.
how about the other way around now... when destructuring, when should the destructuring happen in the signature vs a let:
(defn foo [{bar :bar
{{a :a} :b} :c}])
vs
(defn foo [mamap]
(let [{bar :bar
{{a :a} :b} :c} mamap]))
In the signature allows tooling to display the shape more easily
And it removes a layer of nesting for the let
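To make the tooling point concrete, here is a small sketch comparing the :arglists metadata of the two styles (that metadata is what `doc` and most editors display). The function names are made up.

```clojure
;; Destructuring in the signature: the shape is part of the arglist.
(defn in-signature [{bar :bar {{a :a} :b} :c}]
  [bar a])

;; Destructuring in a let: the arglist only shows an opaque name.
(defn in-let [m]
  (let [{bar :bar {{a :a} :b} :c} m]
    [bar a]))

(:arglists (meta #'in-signature))  ;=> ([{bar :bar, {{a :a} :b} :c}])
(:arglists (meta #'in-let))        ;=> ([m])

;; Both behave the same when called.
(in-signature {:bar 1 :c {:b {:a 2}}})  ;=> [1 2]
(in-let       {:bar 1 :c {:b {:a 2}}})  ;=> [1 2]
```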
I welcome more Clojure based SaaS
I've proposed a solution to this on http://ask.clojure.org. If Spec were to conform function specs at compile time the same way it does for macros, then you could have a compile-time error for this if using inline named args (fn [& {:keys []}])
https://ask.clojure.org/index.php/9876/conform-function-specs-at-read-time-similar-to-macros
Anyone have an example for doing an embedded repl?
I keep trying variations of this but it never works.
(clojure.core.server/start-server {:name "repl" :port 5563 :accept clojure.core.server/repl})
I see this error on the server during connect: java.lang.ClassCastException: clojure.core.server$repl cannot be cast to clojure.lang.Named
it wants a symbol, not the function itself
it resolves the symbol
yup, that works
thanks
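For the record, a sketch of the working call: the only change from the failing version is quoting the accept function so that a namespaced symbol is passed. Port 5563 is just the one from the question.

```clojure
(require '[clojure.core.server :as server])

;; :accept takes a namespaced symbol, which the server resolves itself.
(def srv
  (server/start-server {:name   "repl"
                        :port   5563
                        :accept 'clojure.core.server/repl}))

;; Connect from another terminal with any TCP client pointed at port 5563.

;; Shut the named server down when finished.
(server/stop-server "repl")
```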
user=> (doc clojure.core.server/start-server)
-------------------------
clojure.core.server/start-server
([opts])
Start a socket server given the specified opts:
:address Host or address, string, defaults to loopback address
:port Port, integer, required
:name Name, required
:accept Namespaced symbol of the accept function to invoke, required
:args Vector of args to pass to accept function
:bind-err Bind *err* to socket out stream?, defaults to true
:server-daemon Is server thread a daemon?, defaults to true
:client-daemon Are client threads daemons?, defaults to true
Returns server socket.
nil
granted, i should have looked closer at the docs. I was thrown off because the code example just provides a function reference: https://archive.clojure.org/design-wiki/display/design/Socket%2BServer%2BREPL.html
Most people start a socket REPL using the JVM property so you don't need any code inside your process @doubleagent
I remember adding that to :jvm-opts in my leiningen project and it produced an error. Maybe I did it wrong, though. :man-shrugging:
That's what that page is showing, BTW: -Dclojure.server.NAME="{:address \"127.0.0.1\" :port 5555 :accept clojure.repl/repl}"
That's an argument to java itself when starting up the process. You can pass it to the Clojure CLI via the -J option.
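If you would rather not type the property each time, one option is a deps.edn alias. This is a sketch: the alias name :socket-repl and the server name "repl" are arbitrary, and it uses clojure.core.server/repl as the accept function since that is the one that worked in the start-server call above.

```clojure
;; deps.edn (the alias name :socket-repl is a hypothetical choice)
{:aliases
 {:socket-repl
  {:jvm-opts
   ["-Dclojure.server.repl={:port 5555 :accept clojure.core.server/repl}"]}}}
```

Starting the process with that alias enabled (for example `clj -A:socket-repl`) should open the socket REPL on port 5555 without any code inside the process.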