@simongray Naga has a data syntax for rules; it’s not the Datomic syntax, but instead looks to be a direct translation of the Datalog Horn clause syntax Naga supports: https://github.com/threatgrid/naga/blob/964a71e300c947ead28f1b344ed9bccdbf97ff9b/test/naga/test_rules.cljc#L21 I’m not familiar with Datomic rules, and have only played with Naga briefly, but after looking at them again the main differences appear to be: 1. (I think) Naga rules must use only unary or binary predicates. 2. Naga rules are materialised once at startup; Datomic’s are AFAICT materialised at query time. I’m guessing Datomic’s rules expand into query fragments through a process similar to macro-expansion… presumably with gensyms etc. to prevent unintended bindings.
Ooh, that Clojure data Prolog style syntax looks interesting.
I was asking about Zach Oakes' O'Doyle Rules, though, which uses Datomic-style triples/facts, but otherwise doesn't have anything to do with Datomic.
ah sorry, I don’t know… IIRC Clara is based on the RETE algorithm, so you might learn more by looking at the differences between RETE and Datalog. Not sure about O’Doyle, I only heard of it the other day… Looking at the README, it implies it’s possible to create infinite loops in it, which means it may be Turing complete. Datalog guarantees termination (and consequently so should Naga).
I wouldn’t doubt that for a second, given that RETE is more powerful than Datalog; but consequently it misses Datalog’s safety properties (re termination guarantees etc.)
Definitely. It’s essentially a rule scheduler. Datalog imposes policies that ensure boundary conditions on that scheduling
Actually, that was an interesting thing for me to learn about. I implemented RETE, and I needed to configure what each rule did, and what the scheduling dependencies between the rules were. At that point, I was describing all of that using an RDF graph. But the rule descriptions were in Horn clauses (because that’s how everyone wrote them at the time, and I was used to writing them that way too). I was thinking that instead of writing Horn clauses on paper, translating from Horn to the graph representation, and entering all the graph data by hand, I should write a parser that could read Horn clauses and convert them into the graph representation automatically. The only problem was figuring out the connections between them. But then I had my epiphany… I knew how to automatically find those dependencies! (I can even tell you which street in Chicago I was walking along when I realized this.)
So I implemented that, and it worked really well.
At this point, I realized that due to some of the things I’d done in the implementation of RETE, I had a fully compliant Datalog engine
It was a total surprise 🙂
So I put in the effort to learn more about Datalog
And my implementation experience informed me a lot
Sorry, not following what the epiphany was?
That I could identify the dependencies between rules automatically.
Those dependencies are an important part of configuring RETE
ok by querying for them?
No. The form of a Horn clause is:
pred_y(v1,v2) :- pred_x(v3,v4), ...
If the head of any clause matches any expression in the body of any clause, then this forms a dependency of the latter on the former
yup
(I did a bit of prolog many moons ago)
“Matching” means that, for each of the 3 values in pred(v1,v2), either both contain identical concrete values (atoms, numbers, etc.), or either one is a variable
Note that pred is included in that match, which means that 2nd order expressions are also allowed
and the simple translation of pred(v1,v2) to [v1 pred v2] simplifies this significantly
Also, pred(v) => [v :type pred]
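A minimal sketch of that matching idea in Clojure (illustrative only, not Naga’s actual code; all the names here are made up):

(defn dl-var? [x]
  ;; Datalog variables are symbols whose names start with ?
  (and (symbol? x) (= \? (first (name x)))))

(defn element-matches? [a b]
  ;; two elements match when either is a variable, or they are equal
  (or (dl-var? a) (dl-var? b) (= a b)))

(defn triple-matches? [head-triple body-triple]
  ;; the predicate position is compared too, so a variable there
  ;; gives the 2nd order matching mentioned above
  (every? identity (map element-matches? head-triple body-triple)))

(defn depends-on? [rule-b rule-a]
  ;; rule-b depends on rule-a when any triple in rule-a's head
  ;; matches any triple in rule-b's body
  (boolean (some (fn [h] (some #(triple-matches? h %) (:body rule-b)))
                 (:head rule-a))))

;; e.g. sibling(B,C) :- parent(A,B), parent(A,C), as triples:
(def sibling {:head '[[?b :sibling ?c]]
              :body '[[?a :parent ?b] [?a :parent ?c]]})
(depends-on? sibling sibling)   ;; => false (:sibling never matches :parent)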
yes I saw that in naga
and if you really want it… pred(v1,v2,v3…) => [v1 pred ?gen2] [?gen2 :tg/first v2] [?gen2 :tg/rest ?gen3] [?gen3 :tg/first v3] …
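Something like this could generate that expansion (a sketch; only :tg/first and :tg/rest come from the line above, everything else is invented):

(defn expand-nary [pred [v1 & vs]]
  ;; pred(v1,v2,v3,...) becomes [v1 pred ?gen...] followed by a chain
  ;; of :tg/first / :tg/rest cells, with no nil terminator
  (let [cells (vec (repeatedly (count vs) #(gensym "?gen")))]
    (into [[v1 pred (first cells)]]
          (mapcat (fn [cell v nxt]
                    (cond-> [[cell :tg/first v]]
                      nxt (conj [cell :tg/rest nxt])))
                  cells vs (concat (rest cells) [nil])))))

(expand-nary :pred '[?v1 ?v2 ?v3])
;; => [[?v1 :pred ?gen2] [?gen2 :tg/first ?v2]
;;     [?gen2 :tg/rest ?gen3] [?gen3 :tg/first ?v3]]   (gensym'd names)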
rdf lists 😬
but yeah makes sense
slightly different… I don’t terminate them with nil. I found it was just not needed
(this is internally, where I can presume a closed world)
Is that the epiphany, that n-ary predicate matching is simplified by reducing everything to triples?
no
The epiphany was just that I could find the dependencies automatically
oh ok 🙂
Because everything was unary and binary predicates (expressed in triples), it was easy to see this
yeah
when it was n-ary predicates, it wasn’t clear. But since I was working with the triples architecture, it became very obvious
I guess that’s kind of what I meant by querying for them
weeeeeeell… yes, but… this was while parsing and creating an internal representation. It wasn’t yet in a graph, and therefore wasn’t queryable
Naga is my 3rd implementation of this (my first was in Java, and my second was in Clojure).
ok yeah — I guess you’ll want to unify all the clauses to the same identity when they match.
The dependency identification in Naga is done here: https://github.com/threatgrid/naga/blob/main/src/naga/engine.cljc#L149
Recursive work through the body is done in the function https://github.com/threatgrid/naga/blob/main/src/naga/rules.cljc#L174
Back when it was simply: pred1(v1,v2), pred2(v3,v4)…
then it was much easier
thanks for the explanation 🙇
but now the body can be:
[?a :attr ?b] [?b :attr2 "x"] (not (or [?b :attr3 "y"] [?a :attr3 "z"]))
The head can contain multiple statements as well. So that’s an extra complexity 🙂
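Assuming the r macro syntax from the test file linked at the top (head triples, then :-, then body patterns), a rule with that kind of body and a multi-triple head might look something like this sketch (attribute names invented for illustration):

(require '[naga.rules :refer [r]])

(def example-rule
  (r "example"
     [?a :flagged true] [?a :flag-source ?b]   ;; multi-triple head
     :-
     [?a :attr ?b]
     [?b :attr2 "x"]
     (not (or [?b :attr3 "y"] [?a :attr3 "z"]))))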
But yeah… you’ll see that the code I just gave you can process rules that have bodies with expressions like this. No “graph” is created to describe the rule, just standard Clojure structures.
It has shortcomings, but in general I’m a little proud of it 😊
ah ok so the rules aren’t actually stored in the graph
you just use the representation to materialise the RETE network
correct
Though, that very first implementation in Java did so.
But it did the whole analysis of the dependencies first, and only when it had all of the information did it insert it all into the graph in one step
Strictly speaking, it could have done it in 2 steps… insert the basic rules first, then use queries to extract what was needed to determine dependencies, and then insert those dependencies. But it would actually put more work on the system to do that
I’d imagine it’d be trivial to implement a subset of OWL/RDFS in Naga
Here you go. This is SKOS, using a subset of OWL
RDFS inferencing was actually my test case for the first implementation (in Java)
fantastic
Most of it is declarative. Lines 126 onwards are the most interesting
Like I said, I was very proud of this 🙂
ah I forgot you can use predicates as variables
That’s what I was referring to when I mentioned “2nd order expressions”
yeah that’s handy here
I thought those horn clauses looked a little strange at first!
😄
They definitely do!
That’s really tough to pull off in Prolog
does this perform well?
(though, I’m still learning how to implement Prolog)
cut is a mind melter
Oh yes. It performs VERY well. After all, it’s well bounded and it runs in RETE
My first attempt at building the rules structures was for Mulgara (that SKOS code was actually from the Mulgara project). The directory containing it all is: https://github.com/quoll/mulgara/tree/master/src/jar/krule/java/org/mulgara/krule
ah thanks for sharing
what is the license for those rules?
It was part of Mulgara, which is OSL
ok OSL
But it’s copyright to me. I should explicitly CC them
I want to point out something else, since I’m here. Naga includes adapters to different databases, has a priority queue implementation, and various things in other files. But if you keep it to JUST the structure definitions used for rules, and the code that executes the engine, then:
$ wc -l rules.cljc engine.cljc schema/structs.cljc
215 rules.cljc
176 engine.cljc
76 schema/structs.cljc
467 total
Meanwhile, the code that defines the structures and executes the engine (with far fewer features) in Java is:
$ wc -l *.java
139 ConsistencyCheck.java
1079 KruleLoader.java
52 KruleStructureException.java
243 QueryStruct.java
389 Rule.java
360 RuleStructure.java
2262 total
I had to trace through the Mulgara code to figure out how rules get executed, how they’re loaded, etc., and it took me over 10 minutes. (It’s in a whole lot of other directories.) But that’s all glue to incorporate it into the database system. The meat of it is in that single directory. But the whole thing is embarrassingly complex!
That’s my little love letter to Clojure ❤️
AFAIK you can do the same with RETE (create infinite recursions)
Ah, I just addressed some of this in the thread above (sorry... I’m slow on my phone)
Naga was Datalog, with the restriction of binary/unary predicates. (Higher arity actually isn’t hard to do, but I never needed it 🤷‍♀️)
But it’s been extended in ways that are described in the thread
Full query syntax is allowed in the rules too. Including not, or, filters, bindings, etc. The head can also have multiple triples to be asserted for a single match (allowing multiple attributes for a single generated entity)
Also, Naga rules are materialized as needed. If you’re using a macro, then yes, that has to be at startup in ClojureScript (because ClojureScript macros are evaluated as Clojure code during compilation), but not in Clojure. But the r macro is just a convenience around the rule function, which takes:
• head: a seq of triple patterns
• body: a query :where clause expression
• name: optional
This function can be called at runtime, even in Cljs.
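For instance (a sketch: I’m assuming rule lives in naga.rules next to r, and the head/body/name argument order from the list above):

(require '[naga.rules :as nr])

(def runtime-rule
  (nr/rule '[[?b :sibling ?c]]                 ;; head: seq of triple patterns
           '[[?a :parent ?b] [?a :parent ?c]]  ;; body: a :where clause
           "shared-parent"))                   ;; name: optional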
We do it at Cisco. Build rules on the fly, then execute them against a connection to Asami
Just wanted to say thank you for this massive explanation 🙂
Datalog can be implemented via RETE