datahike

https://datahike.io/, Join the conversation at https://discord.com/invite/kEBzMvb, history for this channel is available at https://clojurians.zulipchat.com/#narrow/stream/180378-slack-archive/topic/datahike
whilo 2021-05-30T09:47:32.032700Z

@viesti We recently also got a native-image with a command line interface working https://github.com/replikativ/datahike/blob/336-native-image-cli/doc/cli.md. This might be particularly interesting to your use case because a) it has a very fast startup time and low memory usage compared to a full JVM and b) it can safely coordinate multiple writers without requiring a coordinating server/daemon process. There still can only be a single process writing to the database value at each point in time (so no automatic conflict resolution yet), but the coordination is happening through the transactional semantics of the underlying key-value store.
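
To illustrate the idea of coordinating writers through the store rather than through a daemon, here is a minimal sketch. It is not Datahike's implementation: `KVStore`, `compare_and_swap`, the `"root"` key, and the retry loop are all hypothetical names and choices made for illustration, assuming only that the underlying store offers some atomic conditional update.

```python
import threading


class KVStore:
    """Toy in-memory key-value store (hypothetical stand-in for a real backend)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def compare_and_swap(self, key, expected, new):
        """Atomically set key to `new` only if it still holds `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False  # another writer committed first
            self._data[key] = new
            return True


def transact(store, tx_fn, max_retries=10):
    """Apply tx_fn to the current database value and commit the result.

    The commit only succeeds if the root is unchanged since it was read,
    so at most one writer wins at each point in time. Retrying on conflict
    is an assumption of this sketch, not something the source describes.
    """
    for _ in range(max_retries):
        old = store.get("root")
        new = tx_fn(old)
        if store.compare_and_swap("root", old, new):
            return new
    raise RuntimeError("too many concurrent writers")
```

A writer that read a stale root simply fails its conditional update, so no separate coordinating process is needed; the store's atomic update is the only synchronization point.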

👀 1
viesti 2021-05-30T15:53:28.033400Z

> By prefixing the path with `db:` to the query engine you can pass multiple db configuration files and join over arbitrary many databases. Everything else is read in as `edn` and passed to the query engine as well.

This I think is an aspect that I haven't fully thought about 🙂

viesti 2021-05-30T16:30:24.038900Z

Hmm, I was thinking that if there were a DynamoDB backend, then with a Lambda whose concurrent invocations are set to 1, we'd have a single writer process, and the storage itself would be "serverless" too. I guess with the native-image CLI, this could be done by storing the database file(s) in, say, EFS. Or in S3, read from S3 at start and synced back after each write (or before the Lambda timeout). Maybe a consistent background sync in both cases (EFS / S3) would be good.
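
A rough sketch of the "read at start, sync back after write" flow, under stated assumptions: `FakeObjectStore` is a hypothetical in-memory stand-in for S3 (no real AWS calls are made), `handle_write` and `write_fn` are made-up names, and the pattern is only safe if a single invocation runs at a time, as with concurrency set to 1.

```python
import os
import shutil
import tempfile


class FakeObjectStore:
    """Hypothetical stand-in for an object store such as S3: key -> bytes."""

    def __init__(self):
        self.objects = {}

    def download(self, key, path):
        """Copy the object to a local path; return False if it does not exist."""
        if key not in self.objects:
            return False
        with open(path, "wb") as f:
            f.write(self.objects[key])
        return True

    def upload(self, path, key):
        with open(path, "rb") as f:
            self.objects[key] = f.read()


def handle_write(remote, key, write_fn):
    """Pull the database file, apply one write locally, push it back.

    If write_fn raises, nothing is uploaded, so the remote copy stays
    at the last successfully synced state.
    """
    workdir = tempfile.mkdtemp()
    local = os.path.join(workdir, "db")
    try:
        remote.download(key, local)  # read from remote storage at start
        result = write_fn(local)     # the single writer mutates the local copy
        remote.upload(local, key)    # sync back after the write
        return result
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

With more than one concurrent invocation, the last upload would silently overwrite the others, which is why the single-writer restriction matters here.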

whilo 2021-05-30T21:50:00.039Z

The CLI does not have a way to access temporal databases yet, but you could also hold onto a snapshot in your bash scripts.

whilo 2021-05-30T21:51:30.040100Z

@viesti I saw that you joined Discord. We can continue this discussion there, which is nicer because it retains history and allows us to have more Datahike-related subchannels.

👍 1