core-logic

bajrachar 2018-07-12T15:42:05.000412Z

Is it possible to replace pldb in core.logic with something like LMDB, and if so, how do I go about doing so?

bajrachar 2018-07-12T15:44:39.000120Z

I did see this - https://github.com/clojure/core.logic/wiki/Extending-core.logic-(Datomic-example)

bajrachar 2018-07-12T15:45:32.000218Z

I'm having trouble in particular with loading a large data set into memory with pldb

2018-07-12T15:46:23.000329Z

how large is large? We use pldb with very large data sets

2018-07-12T15:47:09.000567Z

But obviously not so large that we can’t load it in memory

bajrachar 2018-07-12T15:51:53.000649Z

well the size of the file I am loading from is close to 6GB

bajrachar 2018-07-12T15:52:37.000407Z

When I add in the facts - it blows up all the way up to 19GB in memory

bajrachar 2018-07-12T15:53:19.000384Z

This could be due to intermediates created via clojure data structures?

2018-07-12T15:56:51.000502Z

Are you indexing a lot? There’s a lot of work to create the in memory index

2018-07-12T15:57:33.000272Z

I did not optimize that code for memory efficiency.

bajrachar 2018-07-12T15:58:07.000389Z

Yes, I do have a few indices

2018-07-12T15:58:12.000153Z

I would say our “large” is 100-1000MB, which is definitely a lot less than your “large”

2018-07-12T15:59:14.000207Z

At some point an actual database is better. I don’t think that example is current, and if I recall it wasn’t actually a very good example

2018-07-12T16:01:16.000347Z

I probably can’t help too much though. I wrote pldb, but our use of core.logic and pldb has been stable for many years now, so it’s not code I touch on a day to day basis anymore. And since core.logic is under the clojure CA and dev process, I haven’t been terribly motivated to actively contribute

bajrachar 2018-07-12T16:02:26.000067Z

I think the size bloat could be due to indexing as you pointed out - maybe I can play around with it and see if it reduces further

2018-07-12T16:03:29.000240Z

If you don’t need them, then remove them. But if you do, you’ll just be trading memory for CPU
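
For reference, a minimal sketch of how indices are declared in pldb: `^:index` metadata on a `db-rel` argument asks pldb to maintain an extra lookup structure for that position, which is where the additional memory goes. The relation name `indicated-for` and the facts are made up for illustration and are not from this conversation.

```clojure
(ns kb.index-sketch
  (:require [clojure.core.logic :as l]
            [clojure.core.logic.pldb :as pldb]))

;; Hypothetical relation, invented for this sketch.
;; ^:index asks pldb to keep an index on `drug`; `condition` stays unindexed.
(pldb/db-rel indicated-for ^:index drug condition)

;; An immutable fact database built from a few made-up facts.
(def kb
  (pldb/db
    [indicated-for :metformin :type-2-diabetes]
    [indicated-for :lisinopril :hypertension]))

;; A query with a ground `drug` can use the index instead of scanning every fact.
(def conditions
  (pldb/with-db kb
    (l/run* [q] (indicated-for :metformin q))))
```

Dropping `^:index` from an argument that is never ground in queries should save the memory for that index without slowing those queries down.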

bajrachar 2018-07-12T16:04:27.000095Z

Also I've realized that clojure data structures by default occupy quite a bit of memory when operated on - unless we use transient

bajrachar 2018-07-12T16:04:56.000476Z

so - I will also try serializing the db to disk and reading it back, to see if that reduces the size

2018-07-12T16:05:19.000505Z

I’m fairly certain the pldb code does not do that, but it might. It’s been a loong time 🙂

2018-07-12T16:05:43.000511Z

You can definitely serialize it. At one point we were saving pldbs in riak

2018-07-12T16:05:58.000068Z

if anything, serializing it will make it larger

bajrachar 2018-07-12T16:05:58.000122Z

ok

2018-07-12T16:07:01.000397Z

serializing will remove any structural sharing in the data
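
A small sketch of the structural-sharing point, using a plain map as a toy stand-in (an assumption for illustration, not pldb's actual internal layout). The same fact vector is referenced from two places in memory, but after a print/read round trip each reference becomes its own copy.

```clojure
(require '[clojure.edn :as edn])

;; Toy stand-in for a fact store: NOT pldb's real data layout.
(def fact [:metformin :type-2-diabetes])
(def toy-db {:all-facts #{fact}
             :by-drug   {:metformin #{fact}}})

;; In memory, both entries point at the very same vector object.
(def shared-before?
  (identical? (first (:all-facts toy-db))
              (first (get-in toy-db [:by-drug :metformin]))))

;; Printing writes the fact out twice; reading builds two separate vectors.
(def restored (edn/read-string (pr-str toy-db)))

(def shared-after?
  (identical? (first (:all-facts restored))
              (first (get-in restored [:by-drug :metformin]))))
```

The restored value is still `=` to the original; only the sharing between the two entries is gone.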

bajrachar 2018-07-12T16:07:04.000401Z

Thank you for your help @norman

bajrachar 2018-07-12T16:07:19.000287Z

I am pretty new to Clojure and core.logic

bajrachar 2018-07-12T16:08:18.000539Z

using it for a clinical decision support tool - and as such its knowledge base is pretty large

2018-07-12T16:09:00.000352Z

https://gist.github.com/terjesb/3181018 might be a good place to start if you want to use core.logic without storing things in memory

bajrachar 2018-07-12T16:10:25.000201Z

oh cool - thanks @hiredman -

bajrachar 2018-07-12T16:11:02.000051Z

the dataset here being the lucene index?

2018-07-12T16:30:00.000159Z

in that code sure, but you can do something similar to extend it to other datastores, I was thinking of it more as example
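
A rough sketch of that general shape, under stated assumptions: a goal in core.logic is a function from a substitution to a stream of substitutions, so a custom goal can fetch candidates from an external store and unify against them. Here `fake-store`, `lookup`, and `indicated-foro` are hypothetical names, and the "store" is just an in-memory map standing in for LMDB, Lucene, or similar. This is not the code from the gist.

```clojure
(ns kb.store-sketch
  (:require [clojure.core.logic :as l]))

;; Stand-in for an external datastore; a real one would hit disk here.
(def fake-store {:metformin  [:type-2-diabetes]
                 :lisinopril [:hypertension]})

;; Hypothetical lookup: key -> seq of stored values (nil if absent).
(defn lookup [k] (get fake-store k))

(defn indicated-foro
  "Goal: `condition` is stored under `drug`. Requires `drug` to be ground."
  [drug condition]
  (fn [subs]
    (let [d (l/walk* subs drug)]
      (if (l/lvar? d)
        nil ; unbound key: fail rather than scan the whole store
        ;; Unify `condition` with each fetched value, one answer per match.
        (l/to-stream
          (map (fn [c] ((l/== condition c) subs))
               (lookup d)))))))

(def result (l/run* [q] (indicated-foro :lisinopril q)))
```

Only the values fetched for the requested key are ever held in memory, which is the point of pushing the lookup out to the datastore.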

bajrachar 2018-07-12T16:30:36.000148Z

ok - I understand - thanks