asami

Asami, the graph database https://github.com/threatgrid/asami
bocaj 2021-06-25T14:35:48.174400Z

I’m very excited to get into Asami for the graph algorithms, and I’m looking for anecdotal experience about its current limits.

quoll 2021-06-25T16:48:03.175Z

Limited to in-memory graphs right now

quoll 2021-06-25T16:49:58.177700Z

There are only a few things. It can find shortest paths between nodes. It can identify all subgraphs. It can find a subgraph that a node is part of. It also integrates into Loom, and will respond to the algorithm APIs there as well
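
For readers unfamiliar with the operation, here is a minimal sketch of what "shortest path between nodes" means on a small in-memory graph. It is plain Python using breadth-first search, not Asami's API or implementation; the function name `shortest_path`, the edge-pair input format, and the choice to treat edges as undirected are all assumptions made for illustration.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    # Build an undirected adjacency map from (a, b) edge pairs (an assumption).
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    if start not in adjacency or goal not in adjacency:
        return [start] if start == goal else None

    # Breadth-first search, remembering each node's predecessor.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # goal is in a different subgraph
```

A `None` result means the two nodes sit in different disconnected subgraphs, which is how this operation relates to the subgraph operations mentioned above.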

2021-06-25T17:14:23.181600Z

I’m currently writing some SPARQL queries to detect whether SKOS concept schemes conform to various topologies, i.e. whether they’re flat, a tree, a DAG, a cyclic graph, etc. Those tasks turn out to be pretty trivial queries, and will almost certainly be trivial in Asami too. The most complex of them is detecting cycles, and that’s trivial with property paths.
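
As a point of comparison for anyone doing this outside the database, here is a minimal sketch of cycle detection on a directed graph held in memory. It uses an iterative depth-first search rather than the SPARQL property path described above, so it is a swapped-in technique and not the query from this message. The function name `has_cycle` and the `(source, target)` edge-pair format are hypothetical; the pairs could stand for something like broader/narrower links in a concept scheme.

```python
def has_cycle(edges):
    """Return True if the directed graph of (source, target) pairs has a cycle."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)
        successors.setdefault(target, [])

    # WHITE: unvisited, GREY: on the current DFS path, BLACK: fully explored.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in successors}

    for root in successors:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(successors[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if colour[child] == GREY:
                    return True  # back edge to a node on the current path
                if colour[child] == WHITE:
                    colour[child] = GREY
                    stack.append((child, iter(successors[child])))
                    break
            else:
                # All children explored, so this node leaves the current path.
                colour[node] = BLACK
                stack.pop()
    return False
```

The explicit stack avoids Python's recursion limit on deep hierarchies, and memory use grows with the number of nodes and edges.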

👍 3
2021-06-28T15:01:46.189600Z

I don’t know. For my implementation it’ll be dependent on the underlying database, as it’s done in a single SPARQL query with a property path, and the graphs I’m working with are small enough to live in memory anyway. I would expect the query to be memory bound, though it’ll depend on the implementation. For my data, cycles are a very rare edge case, and one of the main reasons for testing for them is to invalidate certain graphs and avoid running algorithms that may not terminate in the presence of cycles over the data.

refset 2021-06-28T20:58:01.202500Z

> the graphs I’m working with are small enough to live in memory anyway
Ah okay, that definitely tips it in favour of just doing things outside of whatever durable store you use 🙂

bocaj 2021-06-25T18:09:35.181700Z

Great! I use ubergraph in memory now to find connected components, so I think it will work just fine!

quoll 2021-06-25T18:13:16.183300Z

I ought to write up how best to use these things. They’re there but they are probably hard to find 😕

quoll 2021-06-25T18:14:25.185Z

It’s a case of identifying a need that my team has (or occasionally, something that I think seems interesting) and then my team decides not to bother using it.

😀 1
quoll 2021-06-25T18:15:37.186800Z

Identifying all disconnected subgraphs was an operation that my team definitely needed. And then they decided to do something else after I made it. Sigh.

quoll 2021-06-25T18:16:04.187800Z

But it’s open source, so I’m hoping that others may gain benefit from it 😊

bocaj 2021-06-25T18:42:17.188100Z

I know the feeling! I like the docs a lot, actually

quoll 2021-06-25T18:42:39.188300Z

Thank you!

quoll 2021-06-25T18:43:16.188500Z

But this is prompting me that I need to write up some of the analytics operations

bocaj 2021-06-25T18:43:18.188700Z

I’m building an ID graph from our company IDs (netid, email, hr-id) to find issues. I think this will work.
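
To make the idea concrete, here is a rough sketch of grouping identifiers into connected components with a union-find structure. It is plain Python, not Asami or ubergraph code. The field names come from the message above, but the record layout, the function name `id_components`, and the sample values in the usage note are all hypothetical.

```python
def id_components(records):
    """Group identifiers that are linked through shared records."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record in records:
        # Tag each value with its field so a netid never collides with an hr-id.
        ids = [(field, value) for field, value in record.items() if value]
        for identifier in ids:
            union(ids[0], identifier)

    # Collect every identifier under the representative of its component.
    groups = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return list(groups.values())
```

One possible use, assuming each person should have exactly one hr-id: flag any component that contains more than one.

```python
records = [
    {"netid": "jd1", "email": "jd@example.com", "hr-id": "100"},
    {"netid": "jd1", "email": "jdoe@example.com", "hr-id": "101"},
    {"netid": "as2", "email": "as@example.com", "hr-id": "200"},
]
suspicious = [
    component
    for component in id_components(records)
    if sum(1 for field, _ in component if field == "hr-id") > 1
]
```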

refset 2021-06-25T18:53:45.189Z

Cool! I talked a bit about cyclic queries in the context of Crux for a recent Clojure Provo meetup, and wrote some tests to benchmark performance: https://github.com/juxt/crux/pull/1499 Crux is slower than, e.g. DataScript, but doesn't OOM as the data set increases. How have you found memory usage to be whilst detecting cycles?