datahike

https://datahike.io/, Join the conversation at https://discord.com/invite/kEBzMvb, history for this channel is available at https://clojurians.zulipchat.com/#narrow/stream/180378-slack-archive/topic/datahike
tvaisanen 2021-05-28T05:51:18.010600Z

I'm evaluating if it would be possible to use with AWS Lambda. Is there any risks in having multiple instances doing writes at the same time? The native image support sounds really interesting.

viesti 2021-05-28T05:56:23.012700Z

Hi! I also share same interest with @vaisanen.toni 🙂 I'm not that familiar with datahike & replikativ, so it's not that clear to me how one could, for example, interact with a datahike database, backed by, say a jdbc store, from JVM/Clojure and from ClojureScript, by multiple users concurrently.

viesti 2021-05-28T05:58:48.014200Z

I guess the https://github.com/replikativ/datahike-jdbc backend is limited to JVM/Clojure only, for example (as the name suggest :)), so maybe in such a case, we'd need to build a "api" for sending queries/transactions

viesti 2021-05-28T05:59:21.015Z

> Is there any risks in having multiple instances doing writes at the same time?
This concurrency aspect is especially interesting to me too 🙂

viesti 2021-05-28T06:23:20.017700Z

trying to remember how stuff goes with Datomic Cloud, now I remembered the https://docs.datomic.com/on-prem/overview/clients-and-peers.html, which sends queries to a peer server, wondering that there would be a "serverless" datahike, then maybe something similar could be done. The https://github.com/replikativ/datahike/tree/206-cljs-support branch sounds really interesting, but don't know how far the P2P idea still 🙂 I guess the server part there could be a serverless implementation, maybe 🙂

kkuehne 2021-05-28T12:22:48.019800Z

Hey @viesti, we are also looking into creating a client-server variant: you could run https://github.com/replikativ/datahike-client serverless and run a https://github.com/replikativ/datahike-server somewhere for storage. Both are still in an early stage but would that be something that you're interested in?

👀 1
viesti 2021-05-28T13:40:35.020500Z

@konrad.kuehne thanks for bringing the projects up, this certainly looks interesting 🙂

viesti 2021-05-28T13:43:03.021600Z

can there be multiple instances of the server for high-availability/zero downtime?

viesti 2021-05-28T13:49:49.024600Z

I don't yet have a good grasp on the concurrency nature. Can there be multiple writers and how do transactions work (I'm a bit new in how these go in the datalog world)? Thinking of a situation where one for example does a read, then based on the read, conditionally a write. Or would this require transaction function support from the server.
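
To make the read-then-conditionally-write concern concrete, here is a minimal sketch in Python of one common approach, optimistic compare-and-swap with retry. It does not use Datahike or any Datahike API: `VersionedStore`, `write_if_version` and `submit_answer_once` are hypothetical names invented for illustration, and the store is a toy in-memory object. The point is only that a write which names the version it was based on can be rejected when another writer got in between the read and the write.

```python
import threading


class VersionedStore:
    """Toy in-memory store (hypothetical, NOT Datahike) with a version counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._version = 0

    def read(self):
        # Return the current value together with the version it belongs to.
        with self._lock:
            return self._value, self._version

    def write_if_version(self, expected_version, new_value):
        # Compare-and-swap: apply the write only if nobody wrote since our read.
        with self._lock:
            if self._version != expected_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def submit_answer_once(store, answer):
    """Store the answer only if none exists yet; retry if a concurrent write interfered."""
    while True:
        value, version = store.read()
        if value is not None:
            return False  # the condition no longer holds, so skip the write
        if store.write_if_version(version, answer):
            return True  # our write was based on an up-to-date read


store = VersionedStore()
results = []
threads = [
    threading.Thread(target=lambda a=a: results.append(submit_answer_once(store, a)))
    for a in ("yes", "no", "maybe")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With three concurrent writers, exactly one of them wins and the other two see that an answer already exists. A server-side transaction function would move the same check next to the single writer instead of retrying from each client.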

viesti 2021-05-28T13:50:07.025100Z

Could the server itself be a serverless implementation, say with a Lambda?

kkuehne 2021-05-28T18:30:18.028700Z

We're still in the drafting phase for multi-read and write setup for the server in https://github.com/replikativ/datahike/pull/332. You could do conditional writes with https://cljdoc.org/d/io.replikativ/datahike/0.3.6/doc/entity-specs. Additionally we're working on a more broad invariant library. A serverless system could be really interesting but we haven't tested it yet since the server is work in progress. What kind of data would you like to use with Datahike?

👍 1
viesti 2021-05-28T20:05:16.029100Z

We're dealing with questionnaires, with question text, their type (free, list of options, maybe follow ups etc.) answers, maybe multiple versions of answers, might have more metadata in the future, so Datalog flexibility is a plus. Data might be per questionnaire/document at first (thinking about per document database file) but there might be value in having all documents in a single database. The system is probably not used 24/7 so having a database system that doesn't require a constantly running server sounds appealing. Having said that, there probably are ways to tackle "pausing" a server and taking care that data is backed up, but the idea of using managed services, like DynamoDB+S3 sound interesting too.