I found a paper about the LEAPS algorithm, which apparently outperforms RETE by using (gasp) laziness. I wonder if any LEAPS stuff was incorporated into Clara?
This one: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.96.5371&rep=rep1&type=pdf
@eraserhd I don’t see the date on this for some reason
It's old, like 94 or something.
but I know I’ve read about LEAPS before.
It is comparing against the traditional Rete used in OPS5 and perhaps a few other systems at the time
things were quite a bit different in those I believe
Clara takes advantage of batch-oriented fact propagation
I am of the opinion that that is the biggest perf win
over any sort of laziness. However, Drools (a popular JVM/Java-based rules engine) went to a custom algo they decided was different enough to get a new name
“PHREAK”
in Drools 6, they wrote some good stuff on it
but it was meant to be lazier and to do things like cut parts of the rete tree off when they aren’t needed
the unfortunate part of that upgrade was that Drools went from eager, single-fact propagation to this lazier, batched propagation in one step, so it's hard to tell which change helped more
and I think the batching is the bigger win
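To make the single-fact vs. batched distinction concrete, here is a toy sketch in plain Clojure. It is not how Clara or Drools implement propagation; `propagate-one-at-a-time`, `propagate-batched`, `count-activations`, and the `downstream!` callback are hypothetical names used only for illustration. It counts how many times a downstream node is activated for the same set of facts.

```clojure
(defn propagate-one-at-a-time
  "Eager, single-fact style: each matching fact triggers its own
  downstream activation."
  [pred downstream! facts]
  (doseq [fact facts]
    (when (pred fact)
      (downstream! [fact]))))

(defn propagate-batched
  "Batched style: filter the whole batch first, then activate the
  downstream node once with every matching fact."
  [pred downstream! facts]
  (let [matched (filterv pred facts)]
    (when (seq matched)
      (downstream! matched))))

(defn count-activations
  "Runs `propagate` over `facts` and returns how many times the
  downstream callback was invoked."
  [propagate pred facts]
  (let [calls (atom 0)]
    (propagate pred (fn [_batch] (swap! calls inc)) facts)
    @calls))

(def facts (range 1000))

(count-activations propagate-one-at-a-time even? facts) ;; => 500
(count-activations propagate-batched even? facts)       ;; => 1
```

In this toy model, whatever fixed overhead an activation carries is paid once per batch instead of once per matching fact.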
So the topic of being lazier in Clara has come up before
but hasn’t been done since it isn’t clear how much you really gain from that over the batched propagations.
that is assuming you have queries that you intend to use
if you had like 10 queries and were only going to want to perform 1 of them a lot of the time or something like that, then there may be a bigger win to delaying things
I'm only a little way in, and I haven't seen the laziness part yet (even though it's claimed in the abstract). So far it has claimed that the biggest win is not needing to materialize facts in memory. I don't understand it yet.
also, keep in mind, these older papers
they have some good material for sure, but you have to be aware of the environments they were dealing with
e.g. sometimes they are really emphasizing using minimal memory (was more constrained then), or higher allocation costs etc
it’s just something to be aware of, still good material out there and most of it is pretty old
also, sometimes they are explaining a situation that is most helpful when dealing with a large number of rules, other times it's for dealing with a large number of facts, and occasionally perhaps a large number of both is discussed
yup. But I read things like this as a hobby, honestly.
I'm not suddenly suggesting that we must implement this 😄
no, it’s good to discuss and to think about
In fact, it would be neat if there was a bibliography for Clara Rules.
quite a bit of perf-related work has been done in Clara already
some of that was just impl details and other things were tweaks to propagation or often accumulators
Neat... I got that impression. It performs super well for us.
@eraserhd Agreed with @mikerod’s previous comments - Another interesting way that this played out (both myself and Mike spent a while doing perf optimizations on Clara) is that in practice, it turned out that even for large cases (hundreds of thousands of facts/tens of thousands of rules) the constant factors seemed to be most important. Hashing turned out to be a large percentage of work performed for example. A lot of these optimizations (on the Clojure side) are in the memory.cljc namespace with lots of Java interop etc. That said, this was in use-cases where we didn’t really have lots of data that we were just going to end up discarding, and there’s definitely cases where more laziness could be useful.
Clara makes some small mention of background here https://github.com/cerner/clara-rules/wiki/Introduction#the-rules-engine
The paper referenced there is http://reports-archive.adm.cs.cmu.edu/anon/1995/CMU-CS-95-113.pdf
It is concerned with the ops system I think. It’s long and not all that relevant to current stuff, but there are some nice fundamental chapters in it
Starts with a very basic description of a simple rete impl, and then discusses some interesting ways to improve it, such as left and right unlinking
and Drools has a lot of extra stuff going on that is left out of Clara, but they do have some good docs on approaches https://docs.jboss.org/drools/release/6.2.0.CR2/drools-docs/html/HybridReasoningChapter.html#ReteOO
That’s their newer one. Not the same as Clara, but there are some commonalities in some of the ideas. It also may explain some of the deficiencies in the overly simple approach traditionally taken.
Nice, thank you! I've queued all that up for bedtime reading.
sure, sorry it isn’t super organized - reference dump
I'm trying out Clara, but I'm hitting an issue, perhaps I'm using it wrong, here's the simplest snippet that shows my issue:
(defrecord Request [resource])

(defn condition
  [x y]
  (= x y))

(defrule some-rule
  [Request (condition ?resource resource)]
  =>
  (println ?resource))

(mk-session)
I'm getting an error about ?resource not being bound, however if I replace condition with = this works.
I didn't see anything in the Boolean Expressions documentation about not being able to use other functions, however if that's the case, how does one go about doing this?
[Request (= ?resource resource)]
[:test (condition? ?resource)]
=>
(prn ?resource)
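Put together as a complete namespace, the suggestion above might look like the following sketch. It assumes clara-rules is on the classpath; the `allowed?` predicate, the `run-example` helper, and the `example.request-rules` namespace name are made up for illustration. The first condition binds ?resource from the record's field, and the :test condition then calls an ordinary function on the bound value.

```clojure
(ns example.request-rules
  (:require [clara.rules :refer [defrule mk-session insert fire-rules]]))

(defrecord Request [resource])

;; Hypothetical predicate standing in for `condition?` from the chat.
(defn allowed?
  [resource]
  (contains? #{"GET" "HEAD"} resource))

(defrule print-allowed-resource
  ;; `(= ?resource resource)` binds ?resource to the field value.
  [Request (= ?resource resource)]
  ;; With ?resource bound, an ordinary function can be called in a :test.
  [:test (allowed? ?resource)]
  =>
  (println ?resource))

(defn run-example
  "Inserts two requests and fires the rules; only the GET one satisfies
  the :test condition."
  []
  (-> (mk-session 'example.request-rules)
      (insert (->Request "GET") (->Request "POST"))
      (fire-rules)))
```

Calling (run-example) should print GET once, since the POST request fails the :test.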
Probably not relevant and more of an FYI, but if the condition? is simply equality it could be done with a third argument to =.
(r/defrule some-rule
  [Request (= ?resource resource "GET")]
  =>
  (println ?resource))
[Request (= resource "GET")]
=>
(prn "GET")
Where it will only match if resource is GET 😅
you can also do
[?request <- Request (= resource "GET")]
=>
(prn ?request)
Will print the "full record"
Thanks @ethanc! This is actually something that I did need (for a different rule)
How does this work? In this contrived example condition takes two arguments but here it's just one. 🤔
Or perhaps you could point me to the literature for this
(= ?resource resource) is not a function call. It's just DSL syntax that binds the value of resource (from the record) to ?resource (the symbol)
Ah gotcha, thank you! This points me in the right direction, I'll continue to play with Clara 😃
some functions you can use directly, like (contains? #{:foo} resource)
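As a sketch of that last point, a plain predicate such as contains? can sit directly in a fact condition, and `?request <-` binds the whole record. Here the predicate only touches the fact's own field, so nothing needs to be bound first. This assumes clara-rules is available; the `example.constraints` namespace and the `safe-requests` query are invented for the example.

```clojure
(ns example.constraints
  (:require [clara.rules :refer [defquery mk-session insert fire-rules query]]))

(defrecord Request [resource])

(defquery safe-requests
  "Requests whose resource is in a fixed set."
  []
  ;; `?request <-` binds the matching fact itself, not just a field.
  [?request <- Request (contains? #{"GET" "HEAD"} resource)])

(def session
  (-> (mk-session 'example.constraints)
      (insert (->Request "GET") (->Request "POST"))
      (fire-rules)))

;; Each query result is a map keyed by the bound variables.
(map :?request (query session safe-requests))
```

Only the GET request should come back from the query, as a full Request record.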