Is it possible to replace pldb in core.logic with something like LMDB, and if so, how do I go about doing so?
I did see this - https://github.com/clojure/core.logic/wiki/Extending-core.logic-(Datomic-example)
In particular, I'm having trouble loading a large data set into memory with pldb
how large is large? We use pldb with very large data sets
But obviously not so large that we can’t load it in memory
well the size of the file I am loading from is close to 6GB
When I add the facts, it blows up to 19GB in memory
Could this be due to intermediates created by Clojure data structures?
Are you indexing a lot? There’s a lot of work to create the in memory index
I did not optimize that code for memory efficiency.
Yes, I do have a few indices
I would say our “large” is 100-1000MB, which is definitely a lot less than your “large”
At some point an actual database is better. I don’t think that example is current, and if I recall it wasn’t actually a very good example
I probably can’t help too much though. I wrote pldb, but our use of core.logic and pldb has been stable for many years now, so it’s not code I touch on a day to day basis anymore. And since core.logic is under the clojure CA and dev process, I haven’t been terribly motivated to actively contribute
I think the size bloat could be due to indexing as you pointed out - maybe I can play around with it and see if it reduces further
If you don’t need them, then remove them. But if you do, you’ll just be trading memory for CPU
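A minimal sketch of the index trade-off in pldb, assuming the `^:index` metadata on `db-rel` arguments works as I remember it. The relation names and facts are hypothetical, made up for illustration. As I understand it, each indexed column keeps an extra lookup map alongside the plain set of facts, which is where the additional memory goes.

```clojure
(require '[clojure.core.logic :as l]
         '[clojure.core.logic.pldb :as pldb])

;; Hypothetical relation with both columns indexed: faster lookups on a
;; ground argument, at the cost of extra in-memory lookup structures.
(pldb/db-rel has-symptom-indexed ^:index patient ^:index symptom)

;; Same shape without indices: less memory, but queries scan every fact.
(pldb/db-rel has-symptom patient symptom)

(def facts
  (pldb/db
    [has-symptom :p1 :fever]
    [has-symptom :p2 :cough]
    [has-symptom-indexed :p1 :fever]
    [has-symptom-indexed :p2 :cough]))

;; Both relations answer the same query; only the lookup strategy differs.
(def unindexed-result
  (pldb/with-db facts
    (l/run* [q] (has-symptom q :fever))))

(def indexed-result
  (pldb/with-db facts
    (l/run* [q] (has-symptom-indexed q :fever))))
```

Dropping `^:index` from columns that are never queried with a ground value is the cheap experiment to run first.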
Also I've realized that Clojure data structures by default occupy quite a bit of memory when operated on, unless we use transients
So I will also try serializing the db to disk and reading it back, to see if that reduces the size
I’m fairly certain the pldb code does not do that, but it might. It’s been a loong time 🙂
You can definitely serialize it. At one point we were saving pldbs in riak
if anything, serializing it will make it larger
ok
serializing will remove any structural sharing in the data
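A small sketch of what losing structural sharing looks like, assuming a text format such as EDN is used for the round trip (other serialization formats may behave differently). The data here is made up for illustration.

```clojure
(require '[clojure.edn :as edn])

;; One vector referenced from two places: in memory it exists only once.
(def shared (vec (range 1000)))
(def original {:a shared :b shared})

(def shared-before?
  (identical? (:a original) (:b original)))

;; pr-str writes the vector out twice, and reading it back builds two
;; separate vectors.
(def round-tripped (edn/read-string (pr-str original)))

(def equal-after?
  (= original round-tripped))

(def shared-after?
  (identical? (:a round-tripped) (:b round-tripped)))
```

The values are still equal after the round trip, but the two entries no longer point at the same object, so the reloaded data can take more memory than the original.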
Thank you for your help @norman
I am pretty new to Clojure and core.logic
I'm using it for a clinical decision support tool, and as such its knowledge base is pretty large
https://gist.github.com/terjesb/3181018 might be a good place to start if you want to use core.logic without storing things in memory
oh cool - thanks @hiredman -
the dataset here being the lucene index?
in that code, sure, but you can do something similar to extend it to other datastores. I was thinking of it more as an example
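A rough sketch of the general idea, not the code from the gist: a goal is a function from a substitution to a stream of substitutions, so a relation can fetch rows from any store at query time. Everything named here (`store`, `rows-for`, `all-rows`, `has-symptomo`) is hypothetical, and the in-memory map only stands in for a real external store such as LMDB or Lucene.

```clojure
(require '[clojure.core.logic :as l])

;; Hypothetical stand-in for an external store keyed by patient.
(def store {:p1 [:fever :cough] :p2 [:rash]})

;; Keyed lookup: what a real store would answer with a single get.
(defn rows-for [patient]
  (for [s (get store patient)] [patient s]))

;; Full scan: fallback when the patient is not yet known.
(defn all-rows []
  (for [[p ss] store, s ss] [p s]))

;; Custom relation: resolve the patient in the current substitution,
;; fetch candidate rows, and unify each one with the query terms.
(defn has-symptomo [patient symptom]
  (fn [a]
    (let [p    (l/walk* a patient)
          rows (if (l/lvar? p) (all-rows) (rows-for p))]
      (l/to-stream
        (map (fn [row] ((l/== [patient symptom] row) a)) rows)))))

(def p1-symptoms
  (l/run* [q] (has-symptomo :p1 q)))

(def rash-patients
  (l/run* [q] (has-symptomo q :rash)))
```

The keyed branch is the part that matters for large data: when the first argument is ground, only the matching rows ever need to be read from the store.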
ok - I understand - thanks