asami

Asami, the graph database https://github.com/threatgrid/asami
quoll 2021-02-16T13:50:29.021800Z

@peter.royal! Fancy seeing you in the Clojure universe 🙂

Steven Deobald 2021-02-16T15:58:54.022Z

Wow, small world.

osi 2021-02-16T16:45:51.023700Z

👋 i keep an eye on it 😀 … i’m not actively using clojure, but i continue to be interested in (gestures around) these approaches to data modeling/etc. the same stuff that brought us together initially

osi 2021-02-16T17:17:49.024200Z

i had seen your name pop up in the #crux channel @steven427 - makes me wonder what you’re up to now 😀

quoll 2021-02-16T17:30:12.026900Z

Well, you’ll see code in Asami that is a direct result of a bar discussion in SF over a decade ago 🙂 https://github.com/threatgrid/asami/blob/storage/src/asami/durable/pool.cljc#L95

quoll 2021-02-16T17:32:41.028800Z

The Clojure version of reading IDs like this is at https://github.com/threatgrid/asami/blob/storage/src/asami/durable/decoder.clj#L191 i.e. check the top bits, and if they match a known datatype, then decode the remainder of the Long value into the data

quoll 2021-02-16T17:33:35.029800Z

So all the short strings, numbers, dates and keywords are being stored inside a long, and not in an index
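
A minimal sketch of the idea described above: check the top bits of a 64-bit ID and, if they match a known datatype, decode the rest of the value directly instead of consulting an index. The tag values, the 4-bit/60-bit split, and the names `encode_inline` and `decode_inline` are assumptions made for illustration; this is not Asami's actual bit layout or API.

```python
# Hypothetical layout (NOT Asami's real encoding): the top 4 bits are a type
# tag, and the remaining 60 bits carry the value itself.
TAG_SHIFT = 60
PAYLOAD_MASK = (1 << TAG_SHIFT) - 1
TAG_LONG = 0x8    # small integer stored inline
TAG_STRING = 0x9  # string of up to 7 UTF-8 bytes stored inline


def encode_inline(value):
    """Return a 64-bit ID containing the value, or None if it needs the index."""
    if isinstance(value, int) and not isinstance(value, bool):
        if -(1 << 59) <= value < (1 << 59):
            # Two's-complement truncation to 60 bits.
            return (TAG_LONG << TAG_SHIFT) | (value & PAYLOAD_MASK)
        return None
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) <= 7:
            # Length in bits 56-59, bytes in bits 0-55.
            body = int.from_bytes(data.ljust(7, b"\0"), "big")
            return (TAG_STRING << TAG_SHIFT) | (len(data) << 56) | body
    return None


def decode_inline(id64):
    """Decode an ID produced by encode_inline back into its value."""
    tag = id64 >> TAG_SHIFT
    payload = id64 & PAYLOAD_MASK
    if tag == TAG_LONG:
        # Sign-extend the 60-bit payload.
        if payload & (1 << 59):
            payload -= 1 << 60
        return payload
    if tag == TAG_STRING:
        length = payload >> 56
        raw = (payload & ((1 << 56) - 1)).to_bytes(7, "big")
        return raw[:length].decode("utf-8")
    raise ValueError("not an inline value; look it up in the index")
```

Values that do not fit (long strings, very large numbers) return `None` from `encode_inline` and would be stored in the index as usual. Python integers are unbounded, so the ID appears here as an unsigned value; in a language with a signed 64-bit long, a set top bit would make it negative.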

quoll 2021-02-16T17:33:53.030100Z

Geee…. I wonder where that idea came from? 😉

Steven Deobald 2021-02-16T17:33:56.030300Z

@peter.royal It's a good question, really. A very long story short: spent most of 2020 (otherwise known as Pandemic Season Classic) in North India, most of that in Kashmir. Was working on a digital library for http://pariyatti.org at the time, which is why I'm in #asami and #rdf ... ended up choosing #crux for simplicity / bandwidth reasons. Started working halftime for JUXT this month, partly due to all the Crux questions I was asking, I think. Still working on Pariyatti the other half of the time. Still very curious about RDF, SPARQL, Real Datalog, proper graphs (Crux's edges are the graph db equivalent of duck typing unless one layers on extra semantics). I lurk in here so that by 2025 I might have some idea of what the heck is going on.

Steven Deobald 2021-02-16T17:34:04.030600Z

@peter.royal Where are you these days?

osi 2021-02-16T17:37:51.030800Z

ha! i had completely forgotten about that 😀

quoll 2021-02-16T17:38:53.031400Z

It speeds up “storage” in the data pool quite a lot!

quoll 2021-02-16T17:40:01.032800Z

I always meant to do it for Mulgara, but then I didn’t have any reason to be working on Mulgara anymore, and I just didn’t like using Java either.

quoll 2021-02-16T17:40:17.033200Z

But when I started again from scratch, it was one of the first things to happen!

osi 2021-02-16T17:40:24.033500Z

oh nice! i’m at Netflix, doing biz apps for our studio. ultimately, it’s a giant data management problem, and my mind is “poisoned” by the RDF work I did in the mid-2000s at a failed startup (which is what led me to meet @quoll, back in the Mulgara days). as Rich’s ideas on modeling/constraints are aligned with mine, I like to broadly follow his work. temporality in data has been on my mind, which led me to Datomic and then Crux

osi 2021-02-16T17:41:52.033600Z

heh, i’m the opposite - clojure and i don’t “mesh” well enough. i tried using it again 2yr back to solve a problem, and when i came back after a month and couldn’t understand my own code, i realized there would be no hope for a successful introduction in my team

Steven Deobald 2021-02-16T17:45:14.035Z

Crazy! I'd love to hear a story about the "poisoning" one day ... and your philosophy on the whole space, actually. It's not the easiest thing to hear opinions about and even harder to hear war stories about.

quoll 2021-02-16T17:54:33.035100Z

While this is definitely a risk with Clojure, in practice I find that it can be controlled:
• Writing in a well structured way (which takes practice!)
• Documenting with comments. There is FAR TOO MUCH Clojure code out there without good documentation in it. This is definitely a hurdle.
• Ensuring that the entire team learns to do the same as above.

quoll 2021-02-16T17:55:02.035300Z

I too have struggled with code that I returned to and had to re-learn.

osi 2021-02-16T17:55:17.035500Z

heh, i’m (slowly) working on writing thoughts down. a very, very scoped form was done as a conference talk, https://www.youtube.com/watch?v=JGtybIKUdh4&list=PLKKQHTLcxDVbJtlef15003TYaEkq1ZY8c&index=18

quoll 2021-02-16T17:55:53.035800Z

It is extremely easy to write Clojure without documentation, making it impenetrable to your future self. Which is why structuring and documentation habits become so important

osi 2021-02-16T17:56:41.036Z

All of that makes a lot of sense. I had spiked something with the hopes of trying to introduce more clojure, but my experience made me realize that it would be too much of an uphill battle, and that I’d rather push the concepts from clojure with a java implementation, even though it’d be less than ideal.

Steven Deobald 2021-02-16T18:15:16.036200Z

Could you elaborate on Command + State -> Events = a pure function? In practice this is never really true, right (since disk tends to be involved)? Are you thinking more of philosophy / theory here? Or have you exploited this somewhere?

osi 2021-02-16T18:19:17.036400Z

As theory, yes. I have built a system that exploits this. The state necessary for that process is in memory. It is only the state necessary to enforce the rules needed to generate events. It’s a subset of all the information the events represent. The loop is then in-process: read a command from disk, apply it to the state, generate prospective events, apply them back to the state to ensure that it moves forward. If all of that succeeds, then persist/publish the events and record the disposition of the command

osi 2021-02-16T18:21:30.036600Z

the disk-related parts are intermixed, but I’m treating that as an implementation concern. that core loop doesn’t depend upon it, which is leveraged for testing the system. the way that persistence/publication is managed isn’t dependent upon the loop. the strategy can be changed independently. (right now we’re entirely in postgresql with tables, but using kafka similar to how crux does should be possible)
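
A minimal sketch of the loop described above, assuming a toy account domain: a pure `decide` function turns Command + State into events, a pure `evolve` function applies each event back to the in-memory state, and persistence is passed in as a callback so the core does not depend on it. The domain and every name here (`State`, `Rejected`, `decide`, `evolve`, `handle`) are hypothetical; osi's actual system is not shown in the conversation.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class State:
    # Only what is needed to enforce the rules, not everything the events say.
    balance: int = 0


class Rejected(Exception):
    """Raised when a command violates a rule."""


def decide(state, command):
    """Pure: Command + State -> prospective events."""
    kind, amount = command
    if amount <= 0:
        raise Rejected("amount must be positive")
    if kind == "deposit":
        return [("deposited", amount)]
    if kind == "withdraw":
        if amount > state.balance:
            raise Rejected("insufficient funds")
        return [("withdrawn", amount)]
    raise Rejected(f"unknown command {kind!r}")


def evolve(state, event):
    """Pure: State + Event -> next State."""
    kind, amount = event
    if kind == "deposited":
        return replace(state, balance=state.balance + amount)
    if kind == "withdrawn":
        return replace(state, balance=state.balance - amount)
    raise ValueError(f"unknown event {kind!r}")


def handle(state, command, persist):
    """Run one iteration of the loop; persist events only if everything succeeds."""
    try:
        events = decide(state, command)
        new_state = reduce(evolve, events, state)
    except Rejected as err:
        return state, f"rejected: {err}"
    persist(events)  # swap in a database table, Kafka, or a list in tests
    return new_state, "accepted"
```

Because `persist` is just a callable, a test can pass `log.extend` on a plain list, while a deployment could write to PostgreSQL or publish to a log, without touching `decide` or `evolve`.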

Steven Deobald 2021-02-16T18:22:12.036800Z

Huh, neat. Makes sense.

Steven Deobald 2021-02-16T18:22:32.037Z

Yeah, it feels like you're really building something quite Crux-like here. 🙂

osi 2021-02-16T18:41:39.037400Z

Yes, it is certainly Crux-like. I’ve had thoughts about materializing the output into Crux to leverage its temporal queries. As far as my current implementation, it’s all in Postgres as our internal operations are easier with the fewest moving parts a team has to manage.

quoll 2021-02-16T19:28:49.037600Z

If it’s a spike, then that’s what I would recommend too. But I also recommend practicing with Clojure. It may be uphill for you at this point, but it really does get a lot easier! 🙂

Steven Deobald 2021-02-16T19:39:55.037800Z

Seems legit.

osi 2021-02-16T22:26:37.038Z

It was a spike of sorts - using Clojure to do a DB -> GraphQL translation with #lacinia. I had plans to try and use Clojure (and spec specifically) for a data pipeline, but then I was switched to a different project that didn’t have (as) strong of a need