Using next-jdbc and hikari-cp to connect to my Postgres database, when I query a row, I’m getting an #inst "2020-08-14T00:26:30.000000000-00:00"
on the timestamp value. I’m trying to get it to transform the value to an OffsetDateTime.
This is my row builder.
(jdbc/execute-one! @data-source
                   query
                   {:builder-fn (result-set/as-maps-adapter
                                 result-set/as-unqualified-maps
                                 result-set/read-column-by-index)
                    :return-keys true})
When I query, it gives me this value now
#object[com.zaxxer.hikari.pool.HikariProxyResultSet 0x45deee29 "HikariProxyResultSet@1172237865 wrapping org.postgresql.jdbc.PgResultSet@97151ca"]
an #inst tagged literal is the printed form of a few different kinds of date/time objects, so that doesn't actually tell you how to proceed to get what you want
but it is almost certainly a java.sql.Timestamp
@audyarandela For the builder, why are you not just using result-set/as-unqualified-maps?
which is a subclass of java.util.Date
result-set/read-column-by-index is not a valid second argument for as-maps-adapter, so your values will not be read correctly.
so you just figure out how to go from java.util.Date to whatever you want
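A minimal sketch of that last step, assuming UTC is the offset you want (the helper name here is made up, this isn't a next-jdbc API):

```clojure
(import '(java.sql Timestamp)
        '(java.time Instant ZoneOffset))

;; java.sql.Timestamp extends java.util.Date, so .toInstant is available;
;; from an Instant you attach whichever offset you decide is right (UTC here).
(defn timestamp->offset-date-time [^Timestamp ts]
  (.atOffset (.toInstant ts) ZoneOffset/UTC))

(str (timestamp->offset-date-time
      (Timestamp/from (Instant/parse "2020-08-14T00:26:30Z"))))
;;=> "2020-08-14T00:26:30Z"
```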
So I did just use as-unqualified-maps, but the value in the DB is 2020-12-11 16:02:57.217602 and I want it as the OffsetDateTime 2020-12-11T16:02:57.217602-06:00. Do I just take that value and do a java-time conversion outside of next-jdbc, or can next-jdbc convert it before returning the row?
oh
you are running into a timezone issue as well
Yes, I hate working with time lol
you are storing dates without a timezone, and your timezone and your database's configured timezone don't agree
The answer is complicated. #sql would be the best channel to follow-up in. Yeah, timezones are hard 😞
you are going to have a bad time, because if I recall most jdbc drivers default to assuming a date is in utc if no timezone is stored
Yeah, that's starting to make more sense now. Thank you both for the replies!
PostgreSQL has both timestamp and timestamp-with-tz types (making the problem worse, in my opinion).
The only sane approach with databases and timezones is to have your server set to UTC, your JVM set to UTC, and your database set to UTC, and to work entirely in UTC, converting to/from zoned-times just at the edges. And even then it's actually a bit more complicated than that.
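For the JVM piece of that, one hedged example (or just pass -Duser.timezone=UTC on the command line, which amounts to the same thing):

```clojure
;; Force the JVM's default time zone to UTC at startup, so date/time
;; code that falls back to the default zone behaves consistently:
(java.util.TimeZone/setDefault (java.util.TimeZone/getTimeZone "UTC"))

(.getID (java.util.TimeZone/getDefault))
;;=> "UTC"
```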
aye yi yi
Ok, i’ll have to go back to the drawing board. Thanks again!
There's a next.jdbc.date-time namespace that might help you a bit, if you're not already requiring that -- see the Tips & Tricks page for the PostgreSQL section (it's also briefly mentioned near the bottom of the Getting Started guide).
I like storing and manipulating all dates as long Unix timestamps with milliseconds (always UTC), it saves from lots of troubles.
Yeah, I've seen that as advice for dealing with dates/times. It seems a bit extreme when you can have the affordance of actual date/time values in UTC.
Yeah, I guess if the SQL queries use date functions, it’s in many ways beneficial to use proper date types. Funnily though, from my experience in finance, every time a developer doesn’t store dates as Unix time stamps, someone’s going to stay up at night at least twice a year, when summer/winter time change.
Aye, finance is its own peculiar world...
I’m almost done with Brave and True. A great and fun introduction to the language! But I would now like to read a book that’s a bit more direct and demanding in terms of learning Clojure. Possibly aimed at experienced programmers who aren’t new to immutability, FP etc. but just to Lisps. Any good recommendations? I’m especially experienced with C# so any “Clojure for Java devs” type book would probably work well too.
https://pragprog.com/titles/shcloj3/programming-clojure-third-edition/ is a very good choice
🙂
For a very, very good explanation of each of the core functions (the standard library, so to speak), this book <https://www.manning.com/books/clojure-the-essential-reference> is, well, essential! 🙂
Thanks a lot!
// **foo.js**
$(document).ready(function ($) {
  initBarD();
  // ...
});

// **bar.js**
function initBarD() {
  $('.aclass').on('click', function () {
    $(this).toggleClass('is-active');
    $(this).find('.cm').slideToggle();
  });
  // ...
}
** index.html for cljs (reagent/re-frame) code **
<body>
  <div id="app"></div>
  <!-- ... -->
  <div class="aclass"><div class="cm"></div></div>
  <!-- ... -->
  <script src="/js/bar.js"></script>
  <script src="/js/foo.js"></script>
  <script src="/js/app.js"></script>
</body>
For some reason the function initBarD is not working. I tried to set an extern but still the same thing....
Is defrecord considered good practice? I was kind of surprised to see it in the language. Not even sure why. Just felt a bit... type-y?
@anders152 defrecord is usually a way to get good performance for often-used keys, while also having the flexibility of adding other keys. Also you can implement protocols with it, unlike with regular maps. Well, you can now, via metadata.
I see defrecord as one of the many examples of Clojure being a very practical language. Like, yeah, this is a "dirty" performance trick tied to internals of the JVM, but sometimes you need this performance.
What's a reasonable way to get map-like behavior in a more memory efficient way?
I've got a map that serializes to 140mb. When I deserialize it into memory, it takes up 1.5gb due to the overhead of a hash-map. The only behavior I need is index lookup. The keys are strings, the value are maps of depth one. The reason for the value being a map is mostly convenience. It could be a pair of lists that I could manually seq through if that would be more memory efficient. I don't need O(1) lookup on the value. I do need O(1) lookup on the outermost key/value pairs.
I thought about maybe using something like SQLite and looking up from disk with an added LRU cache. The data is basically a dictionary of English words, so for performance, I'll mostly be hitting the same values the most often. Is there an alternative anyone else knows of?
{("profanity" "unholy") {"its" 2},
("ants" "triumph") {nil 1},
("hiding" "our") {"of" 1, "expose" 3, "above" 1},
("won't" "intervention") {"divine" 1, "an" 1},
("pines" "weeping") {"the" 1},
("let" "give") {"to" 1},
("memory" "undead") {"an" 1},
("waters" "one") {"as" 1},
("that" "palms") {"the" 1, "these" 1},
("you" "tonite") {"but" 1, "volume" 1}
,,,
}
I tried using defrecord for the inner values. That resulted in worse memory size.
Instead of {("hiding" "our") (hash-map "of" 1 "expose" 3 ,,,)}
it was {("hiding" "our") (->MyRecord ["of" "expose" ,,,] [1 3 ,,,])}
Do the same words occur many times, e.g. "the" appears thousands of times?
Yes. Very much so.
The reason I ask is that the default way of reading such a data structure will cause each occurrence of "the" to be a different JVM object allocated in memory.
If they were Clojure keywords, that would not be the case, because each Clojure keyword, for time efficiency sake (but also it can help with memory efficiency) is guaranteed to be the same JVM object in memory.
Ah. That was a question I had and I assumed all of the references of "the" pointed to the same spot in memory.
I do not know of an off-the-shelf way to change the behavior of a typical Clojure data reader to "intern" strings the way that is done for keywords, without doing some code changes inside of those data readers.
"intern" being a verb that is often used for this process of trying to reuse the same identical object when it appears again.
It also is not clear exactly how much memory it would save, but perhaps you would be interested in a quick experiment to find out?
Excellent. I will try just using keywords for everything and converting them back to strings before display.
Be aware that the set of characters that can appear in a keyword, and be print/read-round-trippable is smaller than for strings.
if they are all letters, you are fine. It is characters like : ' / that can cause troubles in reading (and others)
The footprint difference is 400mb. 1.9g to ~1.5g. That's awesome. I think I'll still need more savings on top of that.
(name (keyword "won't"))
-> "won't"
works fine. I think that's the only one in my data.
I believe (String/intern x)
does interning on arbitrary strings if it helps, without a need for keywords. https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#intern%28%29
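A quick REPL sketch of what interning buys you:

```clojure
;; Two distinct String objects with equal contents...
(def a (String. "the"))
(def b (String. "the"))

(identical? a b)
;;=> false

;; ...but interning returns the one canonical copy from the string pool:
(identical? (.intern a) (.intern b))
;;=> true
```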
Also, (array-map ...) should be more memory-efficient than (hash-map ...) with its lookup complexity tradeoffs, but we're going into deep byte-hoarding territory. https://clojuredocs.org/clojure.core/array-map
I've done some experiments, and reading Eric's data using the standard Clojure reader produces array maps for most of them already, since most of his maps are small.
Hmmm, and it seems that forcing the creation of array maps with large numbers of keys is O(n^2) (guess on my part at the moment, but would make sense if so), so not recommended for maps with 90,000 entries 🙂
Yep, it is O(n^2), to check for duplicate keys. As expected, array maps are good for small maps, not huge ones.
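You can see the cutover at the REPL (the exact threshold is an implementation detail, currently 8 entries for literal maps):

```clojure
;; Small map literals come back as array maps:
(class {:a 1 :b 2})
;;=> clojure.lang.PersistentArrayMap

;; Past the threshold, Clojure switches to hash maps:
(class (zipmap (range 20) (range 20)))
;;=> clojure.lang.PersistentHashMap
```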
what about -XX:+UseStringDeduplication ?
As an interesting curiosity, both vim and Emacs take more memory to load that file than Clojure does 🙂
Probably because of some mode that the .edn suffix causes them to invoke, not sure.
@ericihli Most of your maps are small, so they are by default implemented in Clojure as an array map, not a hash map. The memory overhead for an array-map with 1 key in it (which most of your maps have) is 56 bytes. For 1 million such maps, the memory overhead is about 56 MBytes. That doesn't sound like where most of the memory is coming from.
2.8 million keys. Each key is a list of 2 strings (or keywords). Each value is an array-map (or hash-map) with an average of 3 kv pairs of string (or keyword) -> int.
yeah, you asked something similar before and shared your data file, which I still have a copy of and have been experimenting a bit more with this morning. 2.8 million 1-key array-maps takes about 150 MBytes of RAM.
for just the part that is "overhead", that is, over and above the memory required to represent the key and the value
The average might be 3, but I see 2 million out of those 2.8 million with 1 key/value pair
Try this (if your data is in x1) and you will see how few of them are hash maps: (frequencies (map class (vals x1)))
Ah thanks. Just came across this that I'm going to try too. > -XX:+UseCompressedOops Enables the use of compressed pointers (object references represented as 32 bit offsets instead of 64-bit pointers) for optimized 64-bit performance with Java heap sizes less than 32gb.
I thought that was the default if your max heap is under something like 16 GBytes
Ah. I didn't know that.
The cljol library I wrote can help visualize small data structures in a JVM, and quickly show whether references are 4 or 8 bytes in size. Don't try it on your whole data structure or it will run out of memory.
Is this the right way of thinking about memory size by reading the code? https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentList.java
private final Object _first;
private final IPersistentList _rest;
private final int _count;
2 pointers (4 bytes each?) + 1 int (another 4 bytes?).
https://github.com/clojure/clojure/blob/0df3d8e2e27fb06fa53398754cac2be4878b12d1/src/jvm/clojure/lang/ASeq.java#L16
transient int _hash;
transient int _hasheq;
+ 2 ints
https://github.com/clojure/clojure/blob/0df3d8e2e27fb06fa53398754cac2be4878b12d1/src/jvm/clojure/lang/Obj.java#L17
final IPersistentMap _meta;
+ 1 pointer
yes, and cljol can show you the exact layout of the fields in memory in your running JVM, too.
Here is a gallery of images created using it (again, on small data structs) demonstrating compressed OOPS, and also 1-byte-per-char strings that might help you, if you are not already taking advantage of it: https://github.com/jafingerhut/cljol/blob/master/doc/README-gallery.md
What JDK are you using? If 8, then all strings are consuming 2 bytes per char in memory, even if they are ASCII. JDK 9 and later can store ASCII strings in 1 byte per char: https://github.com/jafingerhut/cljol/blob/master/doc/README-gallery.md#compact-strings-in-java-9-and-later
OpenJDK 1.8
It might not reduce memory a lot for you, since most of your strings are probably short, and it will not eliminate the per-string overhead of JVM objects, which is noticeable.
The most it could save would be the size of your file, and it probably won't save that much.
Are you trying to reduce this memory usage because 1.4 GB is too much RAM, for some reason? Or because you want to scale this up to 10x or 100x larger data sets?
Started off as a toy project that I wanted to be able to run on a $5 digitalocean instance with 1g of ram. Not worth paying $40/mo. I think I can get by using swap, but it results in ~10 minute startup times for the app. Haven't yet seen how it affects running performance but I imagine it will be unbearable. On top of that, this is a minimum set of data and I want to know how I could scale it if needed.
Just looking at a random sampling of MapEntries using cljol
it looks like they are averaging ~400 bytes each. Lots of overhead from _meta, _hash, _count, _first, _rest, and of course the strings, although those could be interned or keywordized.
But I'm realizing a lot of that is just for human readability. The keys can be just hashes in memory as long as I have a way to eventually convert the hashes to strings.
I think I'm going to have 1 file that is a map of
{(hash '("foo" "bar")) {(hash "buzz") 1 (hash "bazz") 3}
 ,,,}
And then have an on-disk database of hashes to human-readable things.
There are no MapEntry objects that exist when creating a map, I believe, only when doing something like seq on a map.
I'd recommend using something like cljol to determine which objects are actually taking up memory.
but yes, in general, there are fields in there that you do not use in your application, very likely.
Ah. So I was doing
(require '[cljol.dig9 :as d])
(def m (into {} (take 5 dl/darklyrics-markov-2)))
(d/view m)
But that's the only reason I am seeing MapEntry objects, because the take is "like seq"?
Store it directly in an array of bytes as a trie, no need to serialize/deserialize; the size on disk directly matches the size in memory
It is a little weird, I know, but you should do (d/view [m])
(it takes a sequence of top-level objects, all drawn in gray)
(to make it easier to see sharing of memory between multiple root data structures, when it exists)
Because you did (into {} (take 5 ...))
the result should be a map. I do not see MapEntry objects in the resulting data structure when I do that.
@hiredman I'm trying to parse what you said into what you mean and my brain's throwing exceptions. Are you saying instead of:
{("foo" "bar") {"baz" 1 "buz" 3}
("foo" "biz") {"baz" 2 "buz" 5}
,,,}
rather:
{"foo" {"bar" {"baz" 1 "buz" 3}
"biz" {"baz" 2 "buz" 5}}
,,,}
That's my understanding of trie from a quick google search. Instead of
{"abc": ,,,
"abd": ,,,
"abe": ,,,}
rather
{"a" {"b" {"c" ,,, "d" ,,, "e" ,,,}}}
I guess
you can have a trie and different levels depending, the bit level, the byte level, the character level, etc
if you are committed to just lowercase english words, you could do a trie at the letter level, so you would have a tree with 26 way branches
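A character-level version is easy to sketch with nested maps before dropping to byte arrays (the :count key is just an illustrative terminator, branching happens as-needed per character):

```clojure
;; Add a word: walk one level per character, bump a counter at the end.
(defn trie-add [trie word]
  (update-in trie (concat (seq word) [:count]) (fnil inc 0)))

;; Look a word up: same path, nil if the word was never added.
(defn trie-lookup [trie word]
  (get-in trie (concat (seq word) [:count])))

(def t (reduce trie-add {} ["abc" "abd" "abe"]))
;; the shared "ab" prefix is stored once:
;; {\a {\b {\c {:count 1}, \d {:count 1}, \e {:count 1}}}}

(trie-lookup t "abd")
;;=> 1
```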
(defn operation2 [f & args] (apply f args))
(operate2 str "It " "should " "concatenate" ) ;; it works
(operate2 + [1 2 3] ) ;;not working
(operate2 + 1 2 3 ) ;;not working
Hello all, I am learning the basics of Clojure and there is a small exercise which I did not understand. Could someone let me know why the function operate2 does not work with the “+” function but works with “str”? Considering that both are functions, it should work, shouldn’t it?
a trie directly in a byte array is mostly useful as a static lookup table, not so useful for doing updates on
Ah. I see. https://www.aclweb.org/anthology/W09-1505.pdf#page=2 The figure on the right of that page makes sense.
It should work alright. I notice that you have defined operation2 in the defn block while what you're calling in the examples is actually operate2 - any chance there's an old function definition lurking behind operate2?
You are right. Thank you very much. It is time to take a break 🙂, too much information. Just a quick question: now it works if I pass the numbers (operation2 + 1 2 3) but it does not work when I pass a vector (operation2 + [1 2 3]). Should it work when passing a vector?
No. Because that would be the equivalent of (+ [1 2 3]) which would not work.
It shouldn't. The & args passes the rest of the arguments to the function wrapped in a collection, so (operate2 + [1 2 3]) would end up doing something like (apply + [[1 2 3]]).
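Spelling that out at the REPL (using the operation2 from the exercise above):

```clojure
(defn operation2 [f & args] (apply f args))

;; args is (1 2 3) and apply spreads it back out:
(operation2 + 1 2 3)
;;=> 6

;; here args is ([1 2 3]), i.e. this becomes (+ [1 2 3]) and throws:
;; (operation2 + [1 2 3])

;; if the numbers already live in a vector, apply at the call site instead:
(apply operation2 + [1 2 3])
;;=> 6
```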
Understood. Thank you so much dorab and pyry
Hi, I have a REPL question. I started a REPL and am trying to do an HTTP request with HttpClient, and the REPL throws java.lang.IllegalStateException: Client/Server mode has not yet been set, but if I build the jar and run it, it works fine. Do I need to run the REPL with some kind of flag or something?
That sounds like a problem I was having recently where there was a version incompatibility between a couple of libraries I was using.
If I remember rightly, I fixed it by upgrading to http-kit 2.5.0
😮 @manutter51 yes! I updated to 2.5.0 and the problem has gone thank you!
I’m building a function that will conj to a list of maps, but if there is a duplicated entry it should replace it. I got it working but it looks ugly 😞 What is the idiomatic way of solving this? Is there a function that already does this?
(defn conj-on-conflict [state new-map]
  (let [id (:id new-map)
        maps (:maps state)
        duplicated (first (filter #(= (:id %) id) maps))
        maps (if duplicated
               (reduce (fn [accum map]
                         (if (= (:id map) id)
                           (conj accum new-map)
                           (conj accum map)))
                       [] maps)
               (conj maps new-map))]
    (assoc state :maps maps)))
(comment
  (let [state {:maps [{:id 1 :val 1}]}
        new-map {:id 1 :val 2}
        empty-state {:maps []}]
    (conj-on-conflict state new-map)       ;=> {:maps [{:id 1, :val 2}]}
    (conj-on-conflict empty-state new-map) ;=> {:maps [{:id 1, :val 2}]}
    ))
@miguel994 Does the order in the list matter? Otherwise you can use a set to solve your problem
No it doesn’t. Can you elaborate? Duplicated in this case means that it has the same id.
I see, so not the entire entry. That's what a set is; an unordered collection of entries that is guaranteed to contain no duplicates (and provides very fast lookup for contains?)
In that case, why not use a map that contains the other maps, using the ids as keys?
That would require transforming the data to be a map of maps and then transforming it back again to a list of maps, which is fine, I guess.
Depends on what you want to do with them! What's the use case?
I would like something like INSERT ON CONFLICT from Postgres.
I want to assoc back to the state which has a spec that requires a list of maps
As a general rule, optimizing down to the lowest constant factors at the bit level is the realm of C/C++/assembly, in my opinion, not the JVM. There are certainly more memory-dense data representations than Clojure's that can be done on the JVM, too. Clojure's built-in data structures are not really optimized for minimum memory, more so for fast "effectively constant time" lookups and modifications, and for generality in the types that can be stored in keys/values/vector-elements/set-elements, which has some cost in memory, versus creating specialized structures that can only hold 32-bit integers, for example.
The biggest hammer you have for reducing memory usage by a factor of 4 or more would be looking for a Java library that does what you want, where memory usage has been taken into account in its design.
Hmmm, ok. What I'm curious about is why it has to be a list? I would say that the idiomatic way to solve this "overwrite entry by id" problem is to use a map
Often in Clojure data structure choice is a big part of problem solving
Yes, probably i’m just being stubborn. 😅 Thanks for the inputs
A cool thing about Clojure is that most collection data structures implement seq, meaning you can use map, filter, reduce etc. on them without any modifications! So if the order doesn't matter, you'll most probably be fine with a map if you iterate over the state later in your program.
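e.g. seq on a map yields [k v] pairs, so the usual sequence fns just work:

```clojure
;; map over a map: each element arrives as a [key value] pair
(map (fn [[k v]] [k (inc v)]) {:a 1 :b 2})
;;=> ([:a 2] [:b 3])

;; reduce over a map: sum the values
(reduce (fn [acc [_ v]] (+ acc v)) 0 {:a 1 :b 2})
;;=> 3
```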
You could map the sequence of maps replacing things with the same id, it would be more general and avoid a lot of code.
Something like:
(defn conj-on-conflict
  [{:keys [maps] :as state} {:keys [id] :as new-map}]
  (assoc state
         :maps (map #(if (= (:id %) id)
                       new-map
                       %)
                    maps)))
Probably, a little bit more organized than that
Actually, you would have to add a step to verify duplicates, but that could be done with some instead of filter, since some returns when it finds the first truthy value.
What if maps is an empty list?
then it will not conj to the list
So, instead of (first (filter #(= (:id %) id) maps)), you could use (some #(= (:id %) id) maps).
Sure, I missed that. Wait a sec.
You could convert your map sequence to a map, where the keys would be the ids and the values would be the maps.
Then you assoc the new-map using its id as key.
Now, if you do a vals on this final map, you will have a sequence of maps with no duplicates.
(defn conj-on-conflict
  [{:keys [maps] :as state} {:keys [id] :as new-map}]
  (assoc state
         :maps (vals
                (assoc (into {} (map #(vector (:id %) %) maps))
                       id new-map))))
I'm going through some Advent of Code days just to get a feel for Clojure (and loving it so far). Tried rewriting a solution using ->> but keep stumbling into an issue. This is the code:
(->> "day1.txt"
     io/resource
     slurp
     str/split-lines
     (map #(Integer. %))
     #(for [l1 % l2 % l3 %] [(+ l1 l2 l3) (* l1 l2 l3)]))
If I remove the last function with a list comprehension everything works as expected. But that line makes my program complain about:
Execution error (IllegalArgumentException) at test.day1/eval1660 (day1.clj:12).
Don't know how to create ISeq from: test.day1$eval1660$fn__1667
I'm sure I'm missing something obvious here...
@anders152 #(...) expands to (fn [..] ...), so you get (fn [...] (for ...) <result of computation>)
Ah. I need to wrap it in parens?
that works but isn't what most people would recommend
I'm all ears! 😄
usually we restructure, e.g. putting the rest of the chain inside the binding block of the for
or making a function with a name calling for, and putting that in the chain
Second one I understand - first one... could you show me an example?
Or link me to something to read on it
in your case you'd need let first since you use the same input multiple times
but in the normal case (for [x (->> ...)] ...)
- instead of for being inside ->>, ->> can be inside for
What about something like this?
Ah! I understand. I'll just define a separate function in this case. Definitely makes it easier to read too.
(let [list-comp #(for [l1 % l2 % l3 %] [(+ l1 l2 l3) (* l1 l2 l3)])]
  (->> "day1.txt"
       io/resource
       slurp
       str/split-lines
       (map #(Integer. %))
       list-comp))
even better, no need to do a def of course
once it has a name, I'd no longer use #()
and a local binding to define it
That was just a quick refactor to get around the quirky interaction between ->> and #(). Normally I’d go for a defn.
right, while we are reviewing the code, I'd use #(Long/parseLong %) instead of #(Integer. %) too
I’ve seen (read-string %) variants for parsing ints a couple of times; is it general?
it works for longs, it also has the ability to execute arbitrary code, or create arbitrary object types
also, of course, #(read-string %) can be replaced by read-string
nice, thanks!
it's more specialized, and gives the same numeric datatype you'd get for a standard literal in a file
any other things that jump out just let me know
also [(+ l1 l2 l3) (* l1 l2 l3)] can be simplified to ((juxt + *) l1 l2 l3)
user=> ((juxt + *) 2 3 4)
[9 24]
Interesting, reading on juxt now. Looks useful and powerful
my experience is it's fun, and a delight to find uses for, but relatively rare :D
Definitely fits the "I want to apply a series of functions to a series of values and get the summary of all the function executions in a nice format" type problem...
I use juxt mostly with map: (map (juxt :id :name) some-hashmaps)
I like it for sort-by as well
instead of #([(:id %) (:name %)])?
instead of #(vector (:id %) (:name %))
that's invalid, it turns into calling a [] with zero args, which blows up
oh right, that was even mentioned as a gotcha somewhere...
#() and ->/->> are much weirder than they first seem, because they operate in the realm of syntax transforms, not program logic
Yeah, I mostly wanted to see if I understood them correctly. Not sure they really fit my solution here
user=> (->> (+ a b) (let [a 19 b 23]))
42
I do have one more question. What's the standard way of getting values out of a map when passed to a function? For vectors I can obviously just match on the position, but on maps?
you can destructure using :keys, :syms, :strs or the full map destructure syntax {var "some-key"}
the full guide to destructuring https://clojure.org/guides/destructuring
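a couple of the common forms, for a taste:

```clojure
;; keyword keys with :keys
(defn greet [{:keys [first-name last-name]}]
  (str "Hello, " first-name " " last-name))

(greet {:first-name "Ada" :last-name "Lovelace"})
;;=> "Hello, Ada Lovelace"

;; string keys via the full form, binding to whatever local name you like
(let [{n "name"} {"name" "Ada"}]
  n)
;;=> "Ada"
```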
Thank you, will read
Yes, that makes a lot of sense and its way better than what I had, thank you 🙏
I'm glad to help
is it possible to have one arity of an anonymous fn call another arity? what name do you use in that situation?
naming the fn solves it (you need a direct call by name, recur doesn't work across arities)
And, I just saw another discussion below, and noticed something.
#(vector (:id %) %) could actually be replaced by just (juxt :id identity)
you can give an anonymous fn a name like this: (fn fn-name ([x] ...) ([x y] ...))
aha, thanks
giving it a name that way also makes stack traces a bit more useful
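for example (sum* is just an illustrative name):

```clojure
(def sum-to
  (fn sum*
    ([n] (sum* n 0))               ; one arity calls the other by its name
    ([n acc] (if (zero? n)
               acc
               (recur (dec n) (+ acc n))))))

(sum-to 4)
;;=> 10
```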
Is it possible to tell the clojure command line app where your deps.edn file is?
(If, say, you want to run clojure from a different directory.)
Maybe something like:
clojure -d ./sub-dir/deps.edn -m cljs.main --compile foo
Or do I have to do something like:
(cd sub-dir && clojure -m cljs.main --compile foo)
?
I think it's easier to cd to the dir with the deps.edn, and tell it your source tree is on some other path, yeah
but maybe there's a new option for specifying a deps file, since I last checked
@noisesmith Sad, okay, thanks. 🙂
no, there's not
https://clojure.org/reference/deps_and_cli#_deps_edn_sources
CLJ_CONFIG=/path/to/dir_with_deps_edn clj ...
May help
But that's the user deps, not the project deps