datahike

https://datahike.io/, Join the conversation at https://discord.com/invite/kEBzMvb, history for this channel is available at https://clojurians.zulipchat.com/#narrow/stream/180378-slack-archive/topic/datahike
2020-11-10T11:33:25.047900Z

Hello, I believe Datahike flushes each tx to the index in storage. Are there any plans to implement a live in-memory index that accumulates novelty and periodically flushes to storage, similar to Datomic? The idea is that peers (potentially many) maintain a live in-memory index; the transactor reflects txs to peers; at query time a live merge join happens, joining the in-memory index with the indexes already flushed to storage. It stands to reason this would improve write performance significantly. I realize this is a major architectural change. To start with, it would necessitate a separate transactor process. I didn't see this discussed anywhere on the roadmap, so I was just wondering if this is on the radar at all. Perhaps a conscious choice was made NOT to do this? Thank you.
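
To make the idea above concrete, here is a minimal Python sketch of a sorted index that keeps novelty in memory, flushes it periodically, and merges both parts at read time. It is only an illustration under stated assumptions: `LiveIndex`, `flush_threshold`, and the `(e, a, v, tx)` tuple layout are hypothetical names chosen for this example, a plain list stands in for storage, and retractions are ignored. It is not how Datahike or Datomic implement their indexes.

```python
from bisect import insort
from heapq import merge


class LiveIndex:
    """Toy sorted index: a flushed segment plus in-memory novelty.

    Hypothetical illustration only; datoms are (e, a, v, tx) tuples
    and only assertions (no retractions) are modelled.
    """

    def __init__(self, flush_threshold=3):
        self.durable = []   # stands in for index data already in storage
        self.novelty = []   # sorted datoms not yet flushed
        self.flush_threshold = flush_threshold
        self.flush_count = 0

    def transact(self, datoms):
        # Writes only touch the in-memory novelty, keeping it sorted.
        for datom in datoms:
            insort(self.novelty, datom)
        # Storage is touched only once enough novelty has accumulated.
        if len(self.novelty) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Merge novelty into the durable segment and start over.
        self.durable = list(merge(self.durable, self.novelty))
        self.novelty = []
        self.flush_count += 1

    def datoms(self):
        # Query-time view: merge of two sorted sequences.
        return list(merge(self.durable, self.novelty))


idx = LiveIndex(flush_threshold=3)
idx.transact([(1, "name", "Ada", 100)])
idx.transact([(2, "name", "Bob", 101)])
print(idx.flush_count, idx.datoms())  # both datoms visible before any flush
```

In this sketch a reader always sees the union of flushed data and novelty, while the cost of writing to storage is paid once per batch rather than once per transaction.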

timo 2020-11-10T14:29:05.052500Z

Hi @geodrome. Thanks for your question. We are thinking about a distributed Datahike, but as you said, it is a major challenge. Right now we are working on delivering a Datahike server that is not embedded inside your application. We are also considering whether a transactor is a good solution for Datahike; it is on our roadmap, but it will probably take us some time to implement. Concerning your suggestion of an in-memory index: as far as I understand (and I am not an expert by any means), the hitchhiker-tree that Datahike uses does not flush every transaction to storage. But @konrad.kuehne or @whilo are probably better placed to answer that.
