clara

http://www.clara-rules.org/
eraserhd 2018-10-24T13:42:22.000100Z

I found a paper about the LEAPS algorithm, which apparently out-performs RETE by using (gasp!) laziness. I wonder if any LEAPS stuff was incorporated into Clara?

2018-10-24T14:02:18.000100Z

@eraserhd I don’t see the date on this for some reason

eraserhd 2018-10-24T14:02:46.000100Z

It's old, like 94 or something.

2018-10-24T14:02:49.000100Z

but I know I’ve read about LEAPS before.

2018-10-24T14:03:07.000100Z

It is comparing to the traditional rete used in OPS5 and perhaps a few others at the time

2018-10-24T14:03:14.000100Z

things were quite a bit different in those I believe

2018-10-24T14:03:25.000100Z

Clara takes advantage of batch-oriented fact propagation

2018-10-24T14:03:51.000100Z

I am of the opinion that that is the biggest perf win

2018-10-24T14:04:20.000100Z

over any sort of laziness. However, Drools (popular JVM/Java based rules engine) went to a custom algo they decided was sufficiently different to get a new name

2018-10-24T14:04:23.000100Z

“PHREAK”

2018-10-24T14:04:31.000100Z

in Drools 6, they wrote some good stuff on it

2018-10-24T14:04:48.000100Z

but it was meant to be lazier and to do things like cut parts of the rete tree off when they aren’t needed

2018-10-24T14:05:16.000100Z

the unfortunate part of that upgrade was that Drools went from eager, single-fact propagation to this lazier, batched propagation

2018-10-24T14:05:22.000100Z

and I think the batching is the bigger win

2018-10-24T14:05:45.000100Z

So the topic of being lazier in Clara has come up before

2018-10-24T14:06:12.000100Z

but hasn’t been done since it isn’t clear how much you really gain from that over the batched propagations.

2018-10-24T14:06:25.000100Z

that is assuming you have queries that you intend to use

2018-10-24T14:06:51.000200Z

if you had like 10 queries and were only going to want to perform 1 of them a lot of the time or something like that, then there may be a bigger win to delaying things
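For context, a minimal sketch of what "a query you intend to use" looks like from the caller's side. The namespace `example.queries`, the `Request` record, and the query name `requests-for` are made up for illustration; only `defquery`, `mk-session`, `insert`, `fire-rules`, and `query` come from `clara.rules`.

```clojure
(ns example.queries
  (:require [clara.rules :refer [defquery mk-session insert fire-rules query]]))

;; Hypothetical fact type, used only for this sketch.
(defrecord Request [resource])

;; Hypothetical query: all Request facts for a given resource.
(defquery requests-for
  [:?resource]
  [?request <- Request (= ?resource resource)])

;; Build a session from the rules and queries in this namespace,
;; insert two facts, and fire the rules.
(def session
  (-> (mk-session 'example.queries)
      (insert (->Request "GET") (->Request "POST"))
      (fire-rules)))

;; Each result is a map of bindings; :?request holds the matched fact.
(map :?request (query session requests-for :?resource "GET"))
```

In a session with many queries of this kind, the discussion above is about whether work done for queries that are never called could be deferred.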

eraserhd 2018-10-24T14:06:51.000300Z

I'm only a little way in, and I haven't seen the laziness part yet (even though it's claimed in the abstract). It so far has claimed that the biggest win is not needing to materialize facts in memory. I don't understand it yet.

2018-10-24T14:07:06.000100Z

also, keep in mind, these older papers

2018-10-24T14:07:17.000100Z

they have some good material for sure, but you have to be aware of the environments they were dealing with

2018-10-24T14:07:57.000100Z

e.g. sometimes they are really emphasizing using minimal memory (was more constrained then), or higher allocation costs etc

2018-10-24T14:08:21.000100Z

it’s just something to be aware of, still good material out there and most of it is pretty old

2018-10-24T14:08:59.000200Z

also, sometimes things are explaining a situation that is most helpful when dealing with a large number of rules, other times it's for dealing with a large number of facts, and occasionally perhaps a large number of both is discussed

eraserhd 2018-10-24T14:09:21.000100Z

yup. But I read things like this as a hobby, honestly.

eraserhd 2018-10-24T14:09:34.000100Z

I'm not suddenly suggesting that we must implement this 😄

2018-10-24T14:09:56.000200Z

no, it’s good to discuss and to think about

eraserhd 2018-10-24T14:09:58.000100Z

In fact, it would be neat if there was a bibliography for Clara Rules.

2018-10-24T14:10:24.000100Z

quite a bit of perf-related work has been done in Clara already

2018-10-24T14:10:45.000200Z

some of that was just impl details and other things were tweaks to propagation or often accumulators

eraserhd 2018-10-24T14:11:25.000100Z

Neat... I got that impression. It performs super well for us.

2018-10-25T16:55:18.000100Z

@eraserhd Agreed with @mikerod’s previous comments - Another interesting way that this played out (both myself and Mike spent a while doing perf optimizations on Clara) is that in practice, it turned out that even for large cases (hundreds of thousands of facts/tens of thousands of rules) the constant factors seemed to be most important. Hashing turned out to be a large percentage of work performed for example. A lot of these optimizations (on the Clojure side) are in the memory.cljc namespace with lots of Java interop etc. That said, this was in use-cases where we didn’t really have lots of data that we were just going to end up discarding, and there’s definitely cases where more laziness could be useful.

2018-10-24T14:12:00.000100Z

Clara makes some small mention of the background here https://github.com/cerner/clara-rules/wiki/Introduction#the-rules-engine

2018-10-24T14:12:14.000100Z

The paper referenced there is http://reports-archive.adm.cs.cmu.edu/anon/1995/CMU-CS-95-113.pdf

2018-10-24T14:12:51.000100Z

It is concerned with the ops system I think. It’s long and not all that relevant to current stuff, but there are some nice fundamental chapters in it

2018-10-24T14:13:13.000100Z

Starts with a very basic description of a simple rete impl, and then discusses some interesting ways to improve it, such as left and right unlinking

2018-10-24T14:14:02.000100Z

and Drools has a lot of extra stuff going on that is left out of Clara, but they do have some good docs on approaches https://docs.jboss.org/drools/release/6.2.0.CR2/drools-docs/html/HybridReasoningChapter.html#ReteOO

2018-10-24T14:16:13.000100Z

That’s their newer one. Not the same as Clara, but there are some commonalities in some of the ideas. It also may explain some of the deficiencies in the overly simple approach traditionally taken.

eraserhd 2018-10-24T14:26:56.000100Z

Nice, thank you! I've queued all that up for bedtime reading.

2018-10-24T14:38:51.000100Z

sure, sorry it isn’t super organized - reference dump

2018-10-24T22:15:51.000100Z

I'm trying out Clara, but I'm hitting an issue, perhaps I'm using it wrong, here's the simplest snippet that shows my issue:

(defrecord Request [resource])

(defn condition
  [x y]
  (= x y))

(defrule some-rule
  [Request (condition ?resource resource)]
  =>
  (println ?resource))

(mk-session)

I'm getting an error about ?resource not being bound, however if I replace condition with = this works. I didn't see anything in the Boolean Expressions documentation about not being able to use other functions, however if that's the case, how does one go about doing this?

souenzzo 2018-10-24T22:21:01.000100Z

[Request (= ?resource resource)]
[:test (condition? ?resource)]
=>
(prn ?resource)
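A fuller sketch of the same idea, assuming the custom check can be written as a one-argument predicate. The namespace `example.rules` and the predicate `allowed?` are hypothetical names; the point is that `?resource` is first bound with `=` inside the fact condition, and the custom function is then applied to the already-bound variable in a `:test` condition.

```clojure
(ns example.rules
  (:require [clara.rules :refer [defrule mk-session insert fire-rules]]))

(defrecord Request [resource])

;; Hypothetical predicate standing in for the custom check.
(defn allowed? [resource]
  (contains? #{"GET" "HEAD"} resource))

(defrule some-rule
  ;; Bind ?resource to the record's resource field.
  [Request (= ?resource resource)]
  ;; Apply an arbitrary function to the bound variable.
  [:test (allowed? ?resource)]
  =>
  (println ?resource))

;; Load the rules from this namespace, insert one fact, and fire.
(-> (mk-session 'example.rules)
    (insert (->Request "GET"))
    (fire-rules))
```

With a `Request` whose resource is not in the set, the `:test` condition fails and the right-hand side should not run.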

ethanc 2018-10-25T13:25:46.000100Z

Probably not relevant and more of an FYI, but if the condition? is simply equality it could be done with a third argument to =.

(r/defrule some-rule
  [Request (= ?resource resource "GET")]
  =>
  (println ?resource))

souenzzo 2018-10-25T13:42:59.000100Z

[Request (= resource "GET")]
=>
(prn "GET")

Here it will only match if resource is "GET" 😅

souenzzo 2018-10-25T13:43:51.000100Z

you can also do

[?request <- Request (= resource "GET")]
=>
(prn ?request)

This will print the "full record"

2018-10-25T13:59:30.000100Z

Thanks @ethanc ! This is actually something that I did need (for a different rule)

souenzzo 2018-10-24T22:21:17.000100Z

@jvtrigueros

2018-10-24T22:33:28.000100Z

How does this work? In this contrived example condition takes two arguments but here it's just one. 🤔 Or perhaps you could point me to the literature for this

souenzzo 2018-10-24T22:39:17.000100Z

(= ?resource resource) is not a function call. It's just DSL syntax to bind the value of resource (from the record) to ?resource (the symbol)
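One way to see the binding behaviour is a rule that uses the same variable in two conditions. This is only a sketch: the namespace `example.bindings` and the `Permission` record are invented for illustration. The first `(= ?resource resource)` binds the variable; the second one then only matches facts whose field has that same value.

```clojure
(ns example.bindings
  (:require [clara.rules :refer [defrule mk-session insert fire-rules]]))

(defrecord Request [resource])

;; Hypothetical second fact type, used only to show the join.
(defrecord Permission [resource])

(defrule permitted-request
  ;; Binds ?resource to the Request's resource field.
  [Request (= ?resource resource)]
  ;; ?resource is already bound, so this matches only Permissions
  ;; whose resource field equals it.
  [Permission (= ?resource resource)]
  =>
  (println "permitted:" ?resource))

;; Only the "GET" request has a matching Permission.
(-> (mk-session 'example.bindings)
    (insert (->Request "GET") (->Request "POST") (->Permission "GET"))
    (fire-rules))
```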

2018-10-24T22:47:46.000100Z

Ah gotcha, thank you! This points me in the right direction, I'll continue to play with Clara 😃

souenzzo 2018-10-24T23:58:08.000100Z

you can use some functions directly, like (contains? #{:foo} resource)
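A small sketch of that, with the predicate placed in the fact condition itself next to a binding. The namespace `example.predicates` and the rule name are hypothetical. The key difference from the failing snippet is that the function is applied to the field `resource`, not to a variable that has not been bound yet.

```clojure
(ns example.predicates
  (:require [clara.rules :refer [defrule mk-session insert fire-rules]]))

(defrecord Request [resource])

(defrule read-only-request
  ;; Two constraints on the same fact: a binding, then a plain
  ;; Clojure predicate over the record's resource field.
  [Request (= ?resource resource) (contains? #{"GET" "HEAD"} resource)]
  =>
  (println "read-only:" ?resource))

;; The "POST" request fails the contains? check and is ignored.
(-> (mk-session 'example.predicates)
    (insert (->Request "GET") (->Request "POST"))
    (fire-rules))
```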