Good Morning!
Good morning
Morning
moin
morning
Morning
good AM me
morning.
@ordnungswidrig cool. Glad it worked. What stack did you use in the end?
tech.ml and vega
Anyone here got a good suggestion for something like nippy, but that writes out individual records to a file rather than just one big take-it-or-leave-it data structure?
file per record?
I've done something before with baldr and record separators, but that felt a bit janky
file per record would overwhelm the OS file handles I think. There are about 2-10 million records
I like the speed of nippy, and the compression is pretty good too, but I lose a lot of compression by needing to split things up, and I lose a lot of file efficiency by having each file be a single vector of records that has to be read in all at once
probably not the performance you are looking for, but this is the main reason for ednl https://github.com/lambdaisland/edn-lines
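For anyone following along, the one-record-per-line idea can be sketched with just clojure.edn and clojure.java.io. This is not the edn-lines API; append-records! and edn-lines-reducible are hypothetical names made up for the illustration, and it assumes plain data (maps, vectors, etc.) that pr-str prints on a single line.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

(defn append-records!
  "Appends each record to `file` as one EDN form per line.
  Hypothetical helper, not part of any library."
  [file records]
  (with-open [^java.io.Writer w (io/writer file :append true)]
    (doseq [r records]
      (.write w (pr-str r))
      (.write w "\n"))))

(defn edn-lines-reducible
  "Returns a reducible that opens `file` only while it is being reduced,
  parses one record per line, and stops early when the reducing
  function returns a `reduced` value. Hypothetical helper."
  [file]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (with-open [^java.io.BufferedReader r (io/reader file)]
        (loop [acc init]
          (if (reduced? acc)
            @acc
            (if-let [line (.readLine r)]
              (recur (f acc (edn/read-string line)))
              acc)))))))
```

Something like (into [] (take 10) (edn-lines-reducible "records.ednl")) would then parse only the first ten lines and close the file afterwards. It won't be anywhere near nippy for speed or size.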
thx 🙂
as this is often the eduction channel, I've been looking at @ben.hammond's blog post here: https://juxt.pro/blog/ontheflycollections-with-reducible and thinking that you don't need a reducible for the whole directory of files, you just need a reducible for each file type. You can then have a vector of eductions over those reducibles, which would give you all your short-circuiting / reduced? functionality if you did something like
(eduction ;; changed from sequence thanks to Ben Hammond's advice
  cat
  [(eduction mappify-record (reducible-type-1 file-1))
   (eduction mappify-record (reducible-type-1 file-2))])
you can use either sequence or eduction here, depending on whether you want the results cached in memory (sequence) or recalculated each time they are consumed (eduction), from what I understand
(errors of misunderstanding of the blog post are mine)
I think this simplifies the chaining-reducible bit. I think the real magic is happening in cat
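A small sketch of that idea that runs at the REPL, in case it helps. The in-memory counting-reducible is a hypothetical stand-in for the per-file reducibles (reducible-type-1 isn't shown here), and the counter is only there to show how many records actually get pulled when the consumer stops early.

```clojure
(defn counting-reducible
  "Hypothetical stand-in for a per-file reducible: reduces over `records`
  and bumps `counter` once for every record handed to the reducing fn."
  [records counter]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (loop [acc init, rs (seq records)]
        (cond
          (reduced? acc) @acc
          (nil? rs)      acc
          :else          (do (swap! counter inc)
                             (recur (f acc (first rs)) (next rs))))))))

;; counts records read across both "files"
(def reads (atom 0))

;; stand-in for the mappify-record transducer from the snippet above
(def mappify-record (map (fn [n] {:id n})))

(def all-records
  (eduction
    cat
    [(eduction mappify-record (counting-reducible [1 2 3] reads))
     (eduction mappify-record (counting-reducible [4 5 6] reads))]))

;; take 4 stops the reduction partway through the second "file"
(def first-four (into [] (take 4) all-records))
;; first-four => [{:id 1} {:id 2} {:id 3} {:id 4}]
;; @reads     => 4
```

cat reduces each inner eduction in turn and passes the reduced signal back out, so records 5 and 6 are never touched. Reducing all-records a second time would run the whole thing again, since an eduction doesn't cache anything.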
looks like transit, based on fressian, might be the sweet spot? Looks like you can read and write individual objects from a stream. https://cognitect.github.io/transit-clj/#cognitect.transit/read
and there is a reducible friendly wrapper already https://gitlab.com/pjstadig/reducibles
An eduction of a reducible might not implement ISeq, at which point things start breaking
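To make that concrete, here's a minimal sketch, assuming a reducible that only implements IReduceInit (bare-reducible and seq-attempt are made-up names for the example). Reduce-based consumers work, but anything that needs a seq falls over, as far as I can tell with an IllegalArgumentException.

```clojure
(def bare-reducible
  ;; Only knows how to reduce itself: no ISeq, no Iterable.
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (loop [acc init, xs [1 2 3]]
        (cond
          (reduced? acc) @acc
          (empty? xs)    acc
          :else          (recur (f acc (first xs)) (rest xs)))))))

(def ed (eduction (map inc) bare-reducible))

;; Reduce-based consumers are fine:
(reduce + 0 ed) ;; => 9
(into [] ed)    ;; => [2 3 4]

(defn seq-attempt
  "Tries a seq-based operation on the eduction and reports failure.
  Catches Exception broadly rather than relying on the exact type."
  []
  (try
    (first ed)
    (catch Exception _ :not-seqable)))

(seq-attempt) ;; => :not-seqable
```

So first, map, filter and friends (the lazy seq versions) are out unless the underlying reducible is also seqable or Iterable; into, reduce, transduce and run! are the safe ways to consume it.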
Ah, TIL.