Alpha5 is now done. Same as Alpha4, but queries are significantly faster on large datasets
putting Alpha5 thru some paces
Well, it's alpha so we can find the big problems and address them before it's called a "release" :slightly_smiling_face:
gosh I'm rusty at clojure 8^)
also, do not buy Brach's 24 Flavours tiny Jelly Beans
their attempt at Jelly Belly knockoffs... It's like they didn't realize that the flavors must harmonize when you shovel a handful into your maw
ok, think I broked it
New times:
```
lein test netgrok.core-test
Importing o1 into asami
Importing to asami
"Elapsed time: 19520.553168 msecs"
Imported 0
Importing o2 into asami
Importing to asami
"Elapsed time: 208748.131329 msecs"
Imported 0
```
So yah, I can confirm your estimate on the perf win with Alpha5
I'm guessing that you're saying "imported 0" to mean a count of tempids?
yah, just ignored that this time since I haven't updated my tests yet -- still drinking first cup of coffee
Have a look at the count on tx-data. That's the number of statements inserted
If you want the number of entities inserted… do a count on your input :slightly_smiling_face:
The tempids map is so you can provide a negative number for :db/id on an entity, and it will generate an ID for you and tell you what your negative number got mapped to (like Datomic)
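For the record, a minimal sketch of that convention — the URI, the attribute name, and the deref on the transact result are assumptions here, not confirmed alpha5 API:
```
;; Hypothetical: supply our own negative :db/id, then read the generated
;; ID back out of :tempids. d/db-connect is the call used in this thread;
;; the map-shaped, deref-able transact result is assumed (Datomic-style).
(require '[asami.core :as d])

(def c (d/db-connect "asami:mem://example"))

(let [{:keys [tempids]} @(d/transact c {:tx-data [{:db/id       -1
                                                   :device/name "router"}]})]
  (get tempids -1))  ;; => whatever node ID -1 was mapped to
```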
I have some utility functions for exploring the shape of the data and the schema
next step is to do some query clause generators
for functional composition of where clauses...
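None of this exists yet — a hypothetical sketch of what such generators might look like, building where clauses as plain data so they compose:
```
;; Hypothetical clause generators: each returns a vector of where-clause
;; fragments; `where` concatenates them. This is ordinary data manipulation
;; over Datomic-style query vectors, not Asami API.
(defn has-attr
  "Match entities ?e whose attribute a has value v."
  [a v]
  [['?e a v]])

(defn attr-val
  "Bind ?v to the value of attribute a on ?e."
  [a]
  [['?e a '?v]])

(defn where
  "Compose fragments into a single :where clause."
  [& fragments]
  (vec (apply concat fragments)))

;; Usage, with a guessed attribute name:
;; (d/q (into '[:find ?e :where] (where (has-attr :ip.flags "0x00000040")))
;;      (d/db (conn)))
```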
just doing export-data a bunch helped me grok what is happening
export-data gives you a view of everything, but if you insert individual things (or just small numbers of entities) then have a look at the contents of tx-data in the results of the transaction. That shows you the triples that were generated and inserted.
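For instance (a sketch — the entity is invented, `conn` is an existing connection, and the deref-able transact result is assumed):
```
;; Insert one small entity, then inspect the generated triples.
(let [tx @(d/transact conn {:tx-data [{:device/name "laptop"
                                       :device/ip   "192.168.86.20"}]})]
  [(count (:tx-data tx))  ;; number of statements inserted
   (:tx-data tx)])        ;; the triples themselves
```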
I'm curious how many triples you got from your data that took 3m28s to load. (I have to work to improve this)
data coming up...
```
lein test netgrok.core-test
Importing o1 into asami
Importing to asami
"Elapsed time: 19489.224782 msecs"
Imported 31881 statements
Importing o2 into asami
Importing to asami
"Elapsed time: 209601.057757 msecs"
Imported 296271 statements
```
Thanks for that
heading out for brunch
For anyone wondering, Craig is allowed to push me around in here. He's no longer at Cisco, but it was his bright idea that I write my own graph database.
Yup? What's happened?
Locked up importing a few thousand objects
I'll break up the txn
and we'll see what's happening. I must first eliminate my own stupidity...
Actually, breaking up the transaction is a bad thing to do. How big is the file that you're importing?
(bad, because you end up expanding the indexes significantly)
Also, if you've made a mistake, I'd like to know about that too. I should document gotchas, and mitigate some of the more obvious ones
yah, so it's not locked up, but just slow. Mind you I'm throwing a lot of large complex objects at it
I'll get data to you shortly
"large complex" is going to be an issue. Zuko (the module that breaks it up into triples) is now faster than it used to be, but there's still a lot of work for it to do
Yah, I'm thinking it's a chance to instrument the whole thing with metrics data
```
Importing o1 into asami
Importing to asami
"Elapsed time: 53402.866849 msecs"
Imported 4503
Importing o2 into asami
Importing to asami
"Elapsed time: 644524.77233 msecs"
Imported 41484
```
```
FAIL in (load-test) (core_test.clj:21)
Test loading and parsing
expected: (= (count o1) (count (:tempids tx1)))
  actual: (not (= 271 4503))

lein test :only netgrok.core-test/load-test

FAIL in (load-test) (core_test.clj:22)
Test loading and parsing
expected: (= (count o2) (count (:tempids tx2)))
  actual: (not (= 2570 41484))
```
So the failures are me expecting the entity count to be the input object count. The difference tells you just how complex some of the objects are, with many nested entities...
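To make that concrete, a single input object like this (keys modeled on the tshark-style strings that show up later in the thread) is one top-level entity but several nodes once the nesting is pulled apart:
```
;; One input object, but three entities once flattened into triples:
;; the packet itself, the "layers" map, and the "frame" map each get
;; their own node ID — which is why the :tempids count here runs so
;; much higher than the input object count.
{"layers" {"frame" {"frame.len"       "74"
                    "frame.protocols" "eth:ethertype:ip:tcp"}}}
```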
It creates lots of temporary IDs, but unless you ask, I would generally think they'd match the provided objects. :thinking_face:
I'm assuming that some or all of this data can be shared?
not sure. it's packet dumps from my home network
I will find some representative data
The :tempids in the tx would include the nested objects right?
I didn't think so (unless you asked it to). So unless I've forgotten something it's a problem
It creates lots of IDs, but that map is supposed to just be for the top level entities, and things you've provided your own temporary IDs to
yah, so that seems wrong then
I provided no temp IDs for anything
So, I ran the same thing with the in-memory DB
```
lein test netgrok.core-test
Preparing o1
"Elapsed time: 0.005525 msecs"
Preparing o2
"Elapsed time: 8.82E-4 msecs"
Importing o1 into asami
Importing to asami
"Elapsed time: 262.289293 msecs"
Imported 4503
Importing o2 into asami
Importing to asami
"Elapsed time: 2622.369329 msecs"
Imported 41484
```
Is zuko involved in that too?
yes
Itās a library that pulls entities apart into triples
ok, so it's not in zuko then eh
So I am coercing string keys in JSON to keywords... Out of... habit?
Would I be violating any assumptions of Asami if I did not do that?
no
or… I hope not :slightly_smiling_face:
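For reference, the coercion in question is just the usual JSON habit; clojure.walk is in the standard library:
```
;; Recursively turn string keys into keywords, e.g. after parsing JSON.
(require '[clojure.walk :as walk])

(walk/keywordize-keys {"ip.flags" "0x00000040"
                       "layers"   {"frame" {"frame.len" "74"}}})
;; => {:ip.flags "0x00000040", :layers {:frame {:frame.len "74"}}}
```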
I need to check my understanding here:
```
[:tg/node-929806 "ip.flags" "0x00000040"]
[:tg/node-623767 "layers" :tg/node-623768]
[:tg/node-623767 :db/ident :tg/node-623767]
[:tg/node-623767 :tg/entity true]

netgrok.core> (d/entity (d/db (conn)) :tg/node-623767)
{}
```
I would not expect that to be an empty entity
the triples are from: (d/export-data (d/db (conn)))
this is asami alpha5 running in memory
(conn) is just (d/db-connect URI) ...
so I'm making a new connection using the DB uri, and a new DB...
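i.e., something like this (the URI value is a placeholder):
```
;; The (conn) helper as described: a fresh connection from the DB URI.
(def uri "asami:mem://netgrok")  ;; placeholder

(defn conn [] (d/db-connect uri))
```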
Ah, ok, if I don't coerce keys to keywords in the structs, entity loading fails
So I think that's a boog?
could be
interesting!
this is in memory DB
I'm working with the cleaned up data right now, so the attributes are all keywords. I'll try with the strings shortly
Oh, thatās interesting too
yah, since the entity func is basically per storage...
export-data is my new pal
yah, so the file I sent you, I believe, has string keys
it does, yes
:smiling_face_with_3_hearts: this is a pleasant way to explore data
BTW, the large number of entities in tempids was expected, but I'm revisiting it, and I think they should not be included.
So Iām going to update Zuko to remove them
ok, gotten more familiar with the query language. Able to identify all the devices on my network, and start digging into their behavior
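Something in the spirit of that (hypothetical — :ip.src is a guessed attribute name after keywordizing, not confirmed schema):
```
;; Find the distinct source addresses seen in the captured packets.
(d/q '[:find ?src
       :where [?e :ip.src ?src]]
     (d/db (conn)))
```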
The query language is impressive, Paula
calling it a day tho, so I stop obsessing over it