clara

http://www.clara-rules.org/
jdt 2018-05-30T12:53:26.000158Z

The problem is with ?other-user-id, I suspected it would be a problem when I wrote it. I'm guessing I need to assert two facts first, one that binds together the first two clauses and their user, and one that binds together the two :and clauses with ?other-user-id, and then do my existence check. Is there an easier way?

jdt 2018-05-30T12:57:26.000002Z

I'm just building this little prototype rule-based scheduler in clara to see if it will handle things at scale. It won't have many rules, but it will have a pragmatic maximum of 1-2 million base facts before firing the rule engine. Am I wasting my time, or will that overwhelm the system? (The bulk of the facts are job requests, from which we then infer things like whether there's a worker with adequate resources, round robin behaviors, and so on.)

jdt 2018-05-30T13:12:22.000266Z

(I'm building the new intermediate binding of WorkerViableJob and ActiveUserJobCount. I just wondered if there was a way to correlate the two records in the :and subclause that considers them without the use of an intermediate fact.)

2018-05-30T15:48:26.000295Z

@dave.tenny I’d perhaps worry about that being a fairly slow way to do what you are saying given that you have so many facts in the session

jdt 2018-05-30T15:49:07.000129Z

"that" being using a rules system?

2018-05-30T15:49:29.000360Z

I think if you have enough memory for that many objects, then the rules could perform alright. However, you will have to look out for bad algorithmic complexity in some rules if they are going to be processed over a very large number of facts. For example, if you do some wide-open joins between sets of facts with size N and size M, you’ll get N×M comparisons in the join criteria

jdt 2018-05-30T15:49:58.000045Z

yeah, trying to keep things tightly joined, and avoid derived facts that I don't need where possible

2018-05-30T15:50:03.000849Z

Oh, I mean the rule you gave above may be slow, if there are millions of WorkerViableJob and ActiveUserJobCount

jdt 2018-05-30T15:50:17.000175Z

No, there are only millions of raw job requests.

2018-05-30T15:50:23.000780Z

I think that a rules engine, such as Clara, may be capable of having reasonable perf characteristics, but you’d have to be careful for the “hot spots”

jdt 2018-05-30T15:50:54.000049Z

There are relatively few WorkerViableJobs (those are jobs for which we know we have worker resources available, and we limit those to a small number of oldest/most-applicable jobs in any fire-rules run)

2018-05-30T15:50:54.000287Z

It’s best to quickly filter down to a smaller set of facts before performing the more involved joins between sets of facts

2018-05-30T15:51:28.000318Z

I would not fear intermediate facts either

2018-05-30T15:51:41.000105Z

I think it’ll help your rule given above even

jdt 2018-05-30T15:52:04.000839Z

Yeah, already fixed that by asserting facts representing bindings of the clause.

jdt 2018-05-30T15:52:17.000360Z

I just wondered if there was a better way

jdt 2018-05-30T15:53:24.000573Z

I also repeatedly fall into the trap of [:not A] => A in my rules, because I only want to make an A if some other particular thing isn't true. Finding my way around it, but feels like I fight the problem a lot, whether or not the inserted A is unconditional.

2018-05-30T15:53:25.000030Z

perhaps

jdt 2018-05-30T15:54:48.000842Z

Even though I don't have too many rules now, there's actually a lot we want to do in our job management, affinities, special cases for "new job submitters" to give them optimal user experience in interacting with the jobs whose results they want, etc.

jdt 2018-05-30T15:55:12.000096Z

So that's why I'm spending a bit too much time to see if this is viable, I think rules could be a real win here.

2018-05-30T15:55:30.000520Z

So in your above example, I wonder if you could make use of an accumulator like:

[?lowest <- (acc/min :n-jobs) :from [ActiveUserJobCount (= ?job-type job-type)]]

2018-05-30T15:55:43.000340Z

however, I don’t know that I immediately get the full semantics (like what the RHS does)

2018-05-30T15:55:58.000852Z

but doing that would give you the min job count for a given :job-type in terms of the ActiveUserJobCount facts
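
A minimal sketch of that accumulator in a complete, runnable form. The record fields other than job-type and n-jobs, the namespace name, and the sample data are assumptions for illustration; the original rule is not shown in this log. A query is used instead of a rule so the result is easy to inspect.

```clojure
(ns example.min-jobs
  (:require [clara.rules :refer [defquery fire-rules insert mk-session query]]
            [clara.rules.accumulators :as acc]))

;; Hypothetical shape of the fact; only job-type and n-jobs appear in the chat.
(defrecord ActiveUserJobCount [user-id job-type n-jobs])

;; One result per distinct ?job-type, with ?lowest bound to the smallest n-jobs.
(defquery lowest-count-by-type []
  [?lowest <- (acc/min :n-jobs) :from [ActiveUserJobCount (= ?job-type job-type)]])

(def results
  (-> (mk-session 'example.min-jobs)
      (insert (->ActiveUserJobCount 1 :batch 3)
              (->ActiveUserJobCount 2 :batch 1)
              (->ActiveUserJobCount 3 :interactive 4))
      (fire-rules)
      (query lowest-count-by-type)))

;; Each result is a map of bindings with keys :?job-type and :?lowest.
(def lowest-by-type
  (into {} (map (juxt :?job-type :?lowest) results)))
```

Passing `:returns-fact true` to acc/min should bind the whole fact with the lowest count instead of just the number, which may be closer to what a "pick the least busy user" rule needs.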

2018-05-30T15:56:24.000735Z

For this one:
> I also repeatedly fall into the trap of [:not A] => A in my rules, because I only want to make an A if some other particular thing isn’t true
I don’t know of a fix-all. It’s a case-by-case thing. Not sure what sort of scenario keeps getting you into it.

jdt 2018-05-30T15:56:28.000511Z

For one of my [:not A] => A scenarios I tried [acc/count ... and checked for a count less than the limit I wanted, but the problem is it won't fire if the count is zero, even if the accumulator initializes with zero

jdt 2018-05-30T15:58:08.000119Z

My scenarios are things like "this job is something we want to proceed with if some other job isn't eligible", a gross generalism, could be any fact, not just jobs. It often boils down to counting situations. E.g. only dispatch at most two at a time on a worker in one fire-rules loop.

jdt 2018-05-30T15:58:50.000747Z

My approach now is to generate very minimal sets of candidates in a fire-rules session, then query the results, dispatch jobs, update relevant counters-as-facts, then run fire-rules again in a loop.
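
A rough sketch of that fire, query, update, fire-again loop, assuming hypothetical JobRequest, WorkerSlots, and Candidate records and a single counter per worker. The real scheduler's facts and dispatch step are not shown in this log, so everything named here is illustrative only.

```clojure
(ns example.dispatch-loop
  (:require [clara.rules :refer [defquery defrule fire-rules insert insert!
                                 mk-session query retract]]))

;; Hypothetical facts: a pending job, and a counter-as-fact of free worker slots.
(defrecord JobRequest [job-id])
(defrecord WorkerSlots [worker-id free])
(defrecord Candidate [job-id worker-id free])

;; insert! is a logical insert, so candidates disappear again when the
;; JobRequest or WorkerSlots fact that supported them is retracted.
(defrule propose-candidate
  [JobRequest (= ?job-id job-id)]
  [WorkerSlots (= ?worker-id worker-id) (= ?free free) (pos? free)]
  =>
  (insert! (->Candidate ?job-id ?worker-id ?free)))

(defquery candidates []
  [?c <- Candidate])

(defn run-rounds
  "Dispatches one candidate per round until none remain.
   Returns a vector of [job-id worker-id] pairs."
  [session]
  (loop [session (fire-rules session)
         dispatched []]
    (if-let [{:keys [job-id worker-id free]} (:?c (first (query session candidates)))]
      ;; A real system would dispatch the job here; this sketch only records it.
      (recur (-> session
                 (retract (->JobRequest job-id) (->WorkerSlots worker-id free))
                 (insert (->WorkerSlots worker-id (dec free)))
                 (fire-rules))
             (conj dispatched [job-id worker-id]))
      dispatched)))

(def dispatched
  (run-rounds
    (-> (mk-session 'example.dispatch-loop)
        (insert (->JobRequest 1) (->JobRequest 2) (->JobRequest 3)
                (->WorkerSlots :w1 2)))))
```

With three jobs and two free slots, the loop should stop after two dispatches because the (pos? free) constraint no longer matches. Which two jobs get picked is not specified by this sketch.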

jdt 2018-05-30T15:59:45.000941Z

The counters I need to maintain are mainly worker resource availability and active user job counts partitioned by type of job.

jdt 2018-05-30T15:59:59.000779Z

Okay, well, hopefully I'm nearing some kind of first load test, we'll see what happens.

jdt 2018-05-30T16:00:13.000590Z

Advice always appreciated.

2018-05-30T17:11:46.000565Z

@jdt The Clara count accumulator should fire with an initial value of 0. Do you have an example where it does not? Also keep in mind that you can create your own accumulators with arbitrary domain-specific logic. So say "choose the top two at most" could be done. My instinct here is that it sounds like the problems you describe might be addressable with accumulators without any insert-unconditional logic, although as always hard to say without knowing the problem space. It sounds like @mikerod was suggesting that as well.

2018-05-30T17:13:34.000400Z

See the writing accumulators section at http://www.clara-rules.org/docs/accumulators/
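
As an illustration of "choose the top two at most", here is a sketch of a custom accumulator built with acc/accum. The WorkerViableJob fields, the oldest-n helper, and the sample data are hypothetical; only the record name comes from this conversation. The optional :retract-fn is omitted for brevity.

```clojure
(ns example.top-two
  (:require [clara.rules :refer [defquery fire-rules insert mk-session query]]
            [clara.rules.accumulators :as acc]))

;; Hypothetical fields; the real record is not shown in the chat.
(defrecord WorkerViableJob [job-id worker-id submitted-at])

(defn oldest-n
  "Accumulator that collects matching facts and returns at most n of them,
   oldest (smallest :submitted-at) first."
  [n]
  (acc/accum
    {:initial-value []
     :reduce-fn conj
     :convert-return-fn (fn [jobs]
                          (vec (take n (sort-by :submitted-at jobs))))}))

;; One result per worker, with ?jobs holding at most two jobs for that worker.
(defquery dispatchable []
  [?jobs <- (oldest-n 2) :from [WorkerViableJob (= ?worker-id worker-id)]])

(def results
  (-> (mk-session 'example.top-two)
      (insert (->WorkerViableJob 1 :w1 10)
              (->WorkerViableJob 2 :w1 20)
              (->WorkerViableJob 3 :w1 30)
              (->WorkerViableJob 4 :w2 5))
      (fire-rules)
      (query dispatchable)))

;; Map of worker-id to the job-ids chosen for it.
(def chosen
  (into {} (map (fn [r] [(:?worker-id r) (mapv :job-id (:?jobs r))]) results)))
```

Because the limit lives inside the accumulator, no counter fact or insert-unconditional! is needed to cap the number of jobs per worker in this sketch.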

2018-05-30T17:14:58.000397Z

Also, regarding the cost of joins: that varies depending on the type of join. I'd suggest reading http://www.clara-rules.org/docs/hash_joins/ if you're working with millions of facts
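
A small sketch of the difference between the two kinds of join constraint, using hypothetical JobRequest and Worker records with made-up fields. As I understand the hash joins page, an equality on a shared binding (here ?job-type) can be matched by key, while an arbitrary expression such as the >= test has to be evaluated per pair, so it helps to have an equality constraint narrow the pairs first.

```clojure
(ns example.joins
  (:require [clara.rules :refer [defquery fire-rules insert mk-session query]]))

;; Hypothetical facts for illustration only.
(defrecord JobRequest [job-id job-type mem-needed])
(defrecord Worker [worker-id job-type mem-free])

(defquery viable-pairs []
  [JobRequest (= ?job-type job-type) (= ?job-id job-id) (= ?needed mem-needed)]
  ;; (= ?job-type job-type) joins on a shared binding.
  ;; (>= mem-free ?needed) is an arbitrary expression over an earlier binding.
  [Worker (= ?job-type job-type) (= ?worker-id worker-id) (>= mem-free ?needed)])

(def results
  (-> (mk-session 'example.joins)
      (insert (->JobRequest 1 :gpu 8)
              (->JobRequest 2 :cpu 2)
              (->Worker :w1 :gpu 16)
              (->Worker :w2 :gpu 4)
              (->Worker :w3 :cpu 2))
      (fire-rules)
      (query viable-pairs)))

;; Set of [job-id worker-id] pairs that satisfy both constraints.
(def pairs
  (set (map (juxt :?job-id :?worker-id) results)))
```

Dropping the ?job-type constraint from the Worker condition would leave only the >= test, which is the wide-open N×M shape warned about earlier in the thread.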

jdt 2018-05-30T17:59:22.000193Z

I thought the behavior I observed with the acc/count condition not firing seemed consistent with the documented behavior on this page: http://www.clara-rules.org/docs/accumulators/. However, I suspect I read it wrong and they were talking about other accumulators not firing when there weren't facts matching the condition, instead of acc/count. I don't have the example in code any more, so I will have to revisit it later if necessary. Meanwhile I'll check out those other links you posted.

2018-05-30T18:38:43.000294Z

@dave.tenny if an accumulator has a “truthy” :initial-value, its condition in a rule will be considered satisfied even if no facts exist to match the accumulator’s fact match criteria

2018-05-30T18:39:32.000367Z

default :initial-value is nil, so the default would not be true, however acc/count initializes to 0, so a condition that uses it will be satisfied when no facts match the condition.

2018-05-30T18:40:12.000090Z

e.g. [?count <- (acc/count) :from [NoMatchEver]] would bind ?count to 0 and the condition would be satisfied.
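
That condition can be checked with a small self-contained session. NoMatchEver comes from the message above; the CountSeen record, the rule and query names, and the namespace are made up for this sketch.

```clojure
(ns example.count-zero
  (:require [clara.rules :refer [defquery defrule fire-rules insert!
                                 mk-session query]]
            [clara.rules.accumulators :as acc]))

;; A fact type that is never inserted, plus a hypothetical marker fact
;; used to record what the rule saw.
(defrecord NoMatchEver [])
(defrecord CountSeen [n])

(defrule record-count
  [?count <- (acc/count) :from [NoMatchEver]]
  =>
  (insert! (->CountSeen ?count)))

(defquery counts-seen []
  [?seen <- CountSeen])

;; No facts are inserted at all before firing.
(def results
  (-> (mk-session 'example.count-zero)
      (fire-rules)
      (query counts-seen)))

;; The counts recorded by the rule.
(def seen-counts
  (mapv (comp :n :?seen) results))
```

One caveat I believe applies, though nobody confirms it in this thread: if the fact condition inside the accumulator introduces a binding that nothing earlier in the rule binds, e.g. (= ?job-type job-type), there is no ?job-type group to report a zero for, so a rule of that shape would not fire on an empty match set.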

jdt 2018-05-30T18:40:15.000330Z

I definitely had a [?x <- (acc/count ...)] that was not being successful, or at least a [:test (do (prn ...) true) ] following that accum condition was not printed, but perhaps they're not evaluated sequentially. My rule definitely wasn't firing, but again I no longer have the code to reason about it.

2018-05-30T18:41:00.000145Z

weird, I’d have to see it