👋 was wondering if Asami had support for Datalog rules?
It's done through Naga
Asami was originally part of Naga. It got extracted as a graph DB as it gained more features
Nice, are there any examples on how to integrate the two?
By default, it's going to use Asami
It takes a little effort to ask for Datomic, though it works with that as well
It's not really a command line tool. It's designed as an API
The command-line program was built as an example of how to use it. Have a look at the run-all function for this:
https://github.com/threatgrid/naga/blob/main/src/naga/cli.clj#L74
Thanks for the detailed explanation! What's the best way to learn about what's supported in rules? I was looking at the schemas in zuko but I'm still not quite sure if what I'm trying to do is supported.
(I was looking to implement a rule whose head would increment a counter matched in the body, but I'm not sure those eval-style patterns are valid)
Zuko was just a place to put internal libraries that were shared between Naga and Asami so that there wasn't an interdependency between them (no one wants to load up Asami if they're doing rules on Datomic)
That is possible, actually
Using the rule macro (so not in predicate form):
[?entity :value' ?new-v] :- [?entity :type "counter"] [?entity :value ?v] [(inc ?v) ?new-v]
Pulling that apart:
[?entity :type "counter"] [?entity :value ?v]
This part just finds the entity (I just chose to use "type counter", but you may have an ID, or something else), and picks up the numerical value in ?v
[(inc ?v) ?new-v]
This increments the value of ?v
and binds it to ?new-v
Then the head of the clause uses:
[?entity :value' ?new-v]
Note how the :value property is annotated with a trailing quote character? This means "update" the property, rather than just asserting it. If it were simply asserted, then the entity would have 2 :value attributes (which is legal), and you'd have to look for the max of them. But by using the update annotation it will remove the old value and add the new one
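To make the assert-vs-update distinction concrete, here is a toy model in plain Clojure. This is not Asami's implementation — just a sketch of the semantics described above, using a set of `[entity attribute value]` triples as a stand-in database:

```clojure
(defn assert-triple
  "Plain assertion: adds the triple, keeping any existing values."
  [db [e a v]]
  (conj db [e a v]))

(defn update-triple
  "Update (the trailing-quote annotation): removes any existing
  values for the entity/attribute pair, then adds the new one."
  [db [e a v]]
  (-> (into #{} (remove (fn [[e' a' _]] (and (= e e') (= a a')))) db)
      (conj [e a v])))

(def db #{[:counter-1 :type "counter"]
          [:counter-1 :value 1]})

;; Plain assertion leaves two :value triples on the entity:
(assert-triple db [:counter-1 :value 2])
;; => #{[:counter-1 :type "counter"] [:counter-1 :value 1] [:counter-1 :value 2]}

;; Update replaces the old value, as :value' does in the rule head:
(update-triple db [:counter-1 :value 2])
;; => #{[:counter-1 :type "counter"] [:counter-1 :value 2]}
```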
The problem here is how often the rule runs! This rule will be triggerable by:
• another rule setting an entity to be a counter (unlikely)
• an entity being given a new value
Updates are breaking the semi-naïve reasoning mechanism, so if anyone else updates their :value then it won't retrigger this rule (hey, that's lucky!)
(when I said "entity being given a new value" I meant a value that didn't exist before. Updates won't affect it)
In general, you could only expect that rule to be run one time per engine-invocation. I'm just explaining the edge cases to cover my bases 🙂
👍
Interesting, do I understand correctly that in this example the IDB facts derived by Naga get transacted back to Asami and will be available both for a) querying IDB facts directly in Asami, and b) deriving more facts in another pass of Naga?
Also curious about time-space complexities, I assume each Naga execution will compute rules against all facts in Asami? I.e. there's no rete-style network that would only consider facts inserted since the previous run?
On the first question…
a) The data is transacted back into Asami and can be accessed via querying.
b) It iterates until completion. You won't get any more facts unless you input more data.
That second one raises an important issue: Naga extends Datalog to allow declarations of new entities. This means that you can shoot yourself in the foot if you create a loop. For instance, don't say:
Parent(X,Y) :- Person(X).
Person(Y) :- Parent(X, Y).
i.e. All people have parents, and all parents are people. This will result in an OOM.
> I assume each naga execution will compute rules against all facts in asami? I.e. there's no rete-style network that would only consider facts inserted since the previous run?
This is incorrect. There is a rete-style network. However, instead of populating each node of the network with data that flows through it, it uses the indexes to do this work instead. i.e. The memory required for each node in the network is actually a slice out of one of the indexes
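The divergence in that rule pair can be sketched in plain Clojure. Each pass derives a parent for every person, and because the parent rule introduces a fresh entity for Y (which is itself a person on the next pass), the fact set grows without bound. This is not Naga code; `fresh-entity` is a stand-in for Naga's new-entity declaration:

```clojure
(defn fresh-entity []
  (gensym "entity"))

(defn apply-rules
  "One pass: every person gains a (new) parent, and every parent is a person."
  [{:keys [people parents]}]
  (let [new-parents (map (fn [p] [(fresh-entity) p]) people)
        parents'    (into parents new-parents)]
    {:people  (into people (map first parents'))
     :parents parents'}))

;; Start with one person and iterate: the population doubles every pass,
;; so a real engine would run until it exhausts memory.
(->> (iterate apply-rules {:people #{:alice} :parents #{}})
     (map (comp count :people))
     (take 5))
;; => (1 2 4 8 16)
```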
> There is a rete-style network. However, instead of populating each node of the network with data that flows through it, it uses the indexes to do this work instead Very cool, is this what the queues and constraint data achieve with the dirty tracking?
I gather that you're talking about the code in naga.engine?
Quick overview⌠1. The initialization find the connection between productions (or the heads) of rules, and the dependencies (or the bodies) of other rules. 2. Every rule get scheduled by putting them on the queue, with each element of the bodies marked as âdirtyâ 3. The engine picks up the head of the queue, and checks each element in the body to see if theyâre dirty. If the queue is empty, then the engine is finished. 4. If everything is clean, then the engine moves onto the next rule. 5. Otherwise, the engine checks the dirty elements to see if the data they rely on has actually changed (âdirtyâ just means that they potentially changed). 6. All âdirtyâ elements that hasnât experienced a change will be marked as clean. If there have been no changes, then move onto the next rule. 7. If there were any changes, then execute the rule, inserting the productions. 8. Mark all downstream dependencies as dirty, and schedule those rules by adding them to the queue. 9. Return to step 3
There are a couple of features not being used…
• Salience. "Way back when…" I was thinking that it would be nice to offer some rules priority over other rules. This is the "salience" value. Higher values get inserted into the queue before anything of a lower value. This works, but for now everything has the same salience value, so it just works as a FIFO.
• Productions. At the moment the only kind of production is to insert data. The plan was always to offer a function that could be called with the data that led to the production. The most flexible thing here would be a message queue or a pub/sub on Redis. This should still happen, but I'm still waiting for someone to ask for it 🙂
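The salience insertion policy described above can be sketched in a few lines (again, an illustrative sketch, not Naga's actual scheduler). Higher-salience rules go ahead of lower ones, and equal salience degrades to FIFO:

```clojure
(defn schedule
  "Insert `rule` after all queued rules of >= salience, before lower ones."
  [queue rule]
  (let [[ahead behind] (split-with #(>= (:salience %) (:salience rule)) queue)]
    (vec (concat ahead [rule] behind))))

(map :name
     (reduce schedule []
             [{:name :a :salience 0}
              {:name :b :salience 5}
              {:name :c :salience 0}]))
;; => (:b :a :c)  — :b jumps the queue; :a and :c stay in FIFO order
```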
I gave a description of how all of this works at Clojure/conj 2016: https://youtu.be/8rRzESy0X2k?list=PLZdCLR02grLofiMKo0bCeLHZC0_2rpqsz