datahike

https://datahike.io/, Join the conversation at https://discord.com/invite/kEBzMvb, history for this channel is available at https://clojurians.zulipchat.com/#narrow/stream/180378-slack-archive/topic/datahike
alekcz 2020-05-20T06:40:25.175Z

@whilo I'm almost done with Carmine. Should be done by the end of tomorrow. I'll tackle couchdb over the weekend.

alekcz 2020-05-20T12:22:55.176400Z

@whilo Done with Redis. Added tests. And added github action. https://github.com/alekcz/konserve-carmine

2020-05-20T13:04:14.177800Z

I'm just passing by, but does a backend for replikativ mean having a backend for datahike?

alekcz 2020-05-20T21:33:26.179600Z

@jeroenvandijk yeah it does. One will need to set up the datahike "connector", but that's a trivial task once you have a backend. This was the entire connector for the firebase store I put together:

(ns datahike-firebase.core
  (:require [datahike.store :refer [empty-store delete-store connect-store scheme->index]]
            [hitchhiker.tree.bootstrap.konserve :as kons]
            [konserve-fire.core :as fire]
            [superv.async :refer [<?? S]]))


(defmethod empty-store :fire [config]
  (kons/add-hitchhiker-tree-handlers
   (<?? S (fire/new-fire-store (:db config) :env (:env config) :root (:root config)))))

(defmethod delete-store :fire [config]
  (let [store (<?? S (fire/new-fire-store (:db config) :env (:env config) :root (:root config)))]
    (fire/delete-store store)))

(defmethod connect-store :fire [config]
  (<?? S (fire/new-fire-store (:db config) :env (:env config) :root (:root config))))

(defmethod scheme->index :fire [_]
  :datahike.index/hitchhiker-tree)

alekcz 2020-05-20T21:43:51.179800Z

@whilo @konrad.kuehne I just pulled the latest update to konserve feature_metadata_support. I see you've added -get-version in the filestore implementation. It results in a compilation error because it's not in the protocol definition.

2020-05-20T22:07:11.182500Z

@alekcz360 that’s really cool. Any downsides/upsides between these backends? Or does it all depend on the underlying backend?

alekcz 2020-05-20T22:13:26.185200Z

@jeroenvandijk I'm not an expert on the topic by any stretch of the imagination. As far as I understand, datahike flushes the hitchhiker-tree to the backend at regular intervals asynchronously, so the store's speed doesn't particularly affect datahike's performance.

1👍
whilo 2020-05-21T09:02:47.194200Z

@jeroenvandijk @alekcz360 Yes, flushing is decoupled, but not asynchronous. Making it asynchronous would definitely be doable. One way to achieve it right now is to use Redis or the filestore without fsync'ing.

1👍
alekcz 2020-05-20T22:14:30.186400Z

@whilo could probably give a more precise answer on that.

alekcz 2020-05-20T22:19:14.189400Z

I'd need to understand datahike a bit better to give you pros and cons of using a particular store in relation to datahike.

alekcz 2020-05-20T22:24:52.192200Z

If datahike is out of the picture and you're just using konserve as a store, then it really depends on the underlying backend.

whilo 2020-05-21T08:59:01.193900Z

Basically all backends boil down to being used as key-value stores. So it depends on how quickly you can store a binary blob in your backend or get it from there. That can include the speed of network IO in your system. Since the indices are persistent, fragments can also be cached locally in memory on each peer, independent of the backend.

whilo 2020-05-21T09:09:56.194400Z

One thing to keep in mind is that some backends, like CouchDB, LevelDB or Redis, still use threads to unblock core.async. This can add overhead to system load if you write a lot to Datahike. Datahike should also write all fragments at once in its flush procedure rather than sequentially. That is straightforward to fix, but we have not managed to do it yet: https://github.com/replikativ/hitchhiker-tree/blob/master/src/hitchhiker/tree.cljc#L484

1👍
alekcz 2020-05-20T22:27:32.192900Z

@whilo Done with CouchDB. Added tests. And added a github action. https://github.com/alekcz/konserve-clutch

alekcz 2020-05-20T22:29:00.193200Z

I'll need to make changes to both redis and couchdb once we've added -get-version to the protocol and pushed it to clojars.

2020-05-20T22:29:29.193400Z

So with datahike it's all the same? Like how Datomic has the same characteristics across its different backends?