asami

Asami, the graph database https://github.com/threatgrid/asami
quoll 2020-11-19T07:07:32.167Z

That’s Thanksgiving Day in the USA. My family would kill me!

quoll 2020-11-20T14:17:20.181700Z

I should be available, though later in the day would be better, since @noprompt can’t attend if it’s early (he is in California and has young children to get to school)

noprompt 2020-11-20T17:19:35.181900Z

I’d say just ping me and I will join if I can.

whilo 2020-11-24T05:03:25.184Z

Ok, I can do later in the day, it is just bad for others in Europe. Let me ask who else wants to join and then we can see whether we can do it later.

quoll 2020-11-19T07:15:03.171400Z

Meanwhile… I’m a little disappointed at the speed, but right now I’m able to take a document, split it by spaces, and then index every resulting string:
• Document size: 725060 bytes
• 117797 strings
• Index size: 885455 bytes
On my notebook computer:
• Time to index: 21 seconds (yes, this is disappointing)
• Time to rebuild document from the index: 3 seconds
This is my first attempt. I want to tweak the index a little, to see what speed/size changes I get. The tree nodes are currently large, which I thought would be OK, but maybe not.

quoll 2020-11-19T07:16:26.172400Z

This is exercising the Data Pool. Now that this works correctly, I’m moving on to the Triple Store (OK, it’s a quad store. Shh)

quoll 2020-11-19T11:49:09.178Z

To explain the operation above… for every string to be inserted, the code looked it up in the index. If it was there, then the appropriate ID for the string was returned. If not, it was inserted, and a new ID was created. This was done at a rate of about 5600 times per second, or ~180μs each time.
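The find-or-insert step described here can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not Asami's code: the names `DataPool`, `find_or_insert`, `lookup`, `index_document` and `rebuild_document` are all hypothetical, and a plain dict and list stand in for the on-disk tree index.

```python
class DataPool:
    """Toy in-memory stand-in for a data pool (hypothetical, not Asami's API)."""

    def __init__(self):
        self._ids = {}       # string -> ID
        self._strings = []   # ID -> string (the ID is the list position)

    def find_or_insert(self, s):
        """Return the existing ID for s, or store s and return a new ID."""
        existing = self._ids.get(s)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._ids[s] = new_id
        self._strings.append(s)
        return new_id

    def lookup(self, id_):
        """Return the string that was stored under id_."""
        return self._strings[id_]


def index_document(pool, text):
    """Split on single spaces and map every resulting string to its ID."""
    return [pool.find_or_insert(token) for token in text.split(" ")]


def rebuild_document(pool, ids):
    """Reverse of index_document: look up each ID and rejoin with spaces."""
    return " ".join(pool.lookup(i) for i in ids)
```

Splitting on a single space (rather than on any whitespace) keeps empty strings for repeated spaces, so the rebuilt text matches the input exactly. The figures quoted above are consistent with each other: 117797 strings in 21 seconds is about 5600 operations per second, or roughly 180μs per operation.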

quoll 2020-11-19T11:51:13.180800Z

This is a little bit vague, because sometimes a string was converted into a number instead of being stored. I am thinking of turning that feature off to get a benchmark on the storage.
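One way such a string-to-number shortcut could work, with a switch for benchmarking pure storage, is sketched below. Everything here is an assumption for illustration: the `InliningPool` class, the `inline_numbers` flag, the `INLINE_FLAG` tag bit and the rule for which strings qualify are hypothetical and do not describe Asami's actual encoding.

```python
INLINE_FLAG = 1 << 63  # hypothetical tag bit: the ID carries the value itself


class InliningPool:
    """Toy pool that can encode plain decimal strings directly in the ID."""

    def __init__(self, inline_numbers=True):
        self.inline_numbers = inline_numbers  # set False to benchmark storage only
        self._ids = {}       # string -> ID
        self._strings = []   # ID -> string

    def _can_inline(self, s):
        # Only canonical ASCII decimals that fit below the tag bit, so the
        # original string can be recovered exactly from the number.
        return (self.inline_numbers and s.isascii() and s.isdigit()
                and len(s) <= 18 and str(int(s)) == s)

    def find_or_insert(self, s):
        if self._can_inline(s):
            return INLINE_FLAG | int(s)   # nothing is written to storage
        existing = self._ids.get(s)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._ids[s] = new_id
        self._strings.append(s)
        return new_id

    def lookup(self, id_):
        if id_ & INLINE_FLAG:
            return str(id_ & (INLINE_FLAG - 1))  # strip the tag bit
        return self._strings[id_]

    def stored_count(self):
        """Number of strings that actually went into storage."""
        return len(self._strings)
```

With `inline_numbers=False` every string goes through the stored path, which is the configuration a storage-only benchmark would want.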