I understood that the order of my queries should make no difference in Datalog, is that correct?
I am wondering, because these two give wildly different results, despite the only difference being the order of the missing? and subproject clauses:
[:find [?subproject ...]
:in $ % ?project-phid
:where
[?project :object/phid ?project-phid]
[(missing? $ ?subproject :project/milestone)]
(subproject ?subproject ?project)]
[:find [?subproject ...]
:in $ % ?project-phid
:where
[?project :object/phid ?project-phid]
(subproject ?subproject ?project)
[(missing? $ ?subproject :project/milestone)]]
@urzds Not correct at all, order makes all the difference.
You should put the one that reduces the resultset the most first (if that's possible to know).
Datalog is effectively a constraint solver, and it applies each constraint in order.
Can you give an example? I expected it to behave like (filter pred-b (filter pred-a db)), where I would expect pred-a and pred-b to be interchangeable.
Actually, I'm not sure it should affect the resultset, but it definitely can affect performance by limiting the intermediate resultsets.
So it does make some difference
When I used negations, order radically affected the result set.
Order seems to matter a lot.
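A small Python model of the two claims above, not Datomic itself and not how any real query engine is implemented. It assumes every clause only narrows rows that are already bound, like the nested filter example. Under that assumption the final result is the same in either order, while the size of the intermediate result differs. The data and the names `is_open`, `is_urgent` and `run` are made up for illustration. The sketch says nothing about clauses such as missing? or negation, which is where people in this thread saw the result set change.

```python
# Toy "database": 100 task records with two independent boolean attributes.
tasks = [{"id": i, "open": i % 2 == 0, "urgent": i % 10 == 0} for i in range(100)]


def is_open(task):
    return task["open"]


def is_urgent(task):
    return task["urgent"]


def run(clauses, rows):
    """Apply the clauses in order; return the final rows and each intermediate size."""
    sizes = []
    for clause in clauses:
        rows = [row for row in rows if clause(row)]
        sizes.append(len(rows))
    return rows, sizes


result_ab, sizes_ab = run([is_open, is_urgent], tasks)
result_ba, sizes_ba = run([is_urgent, is_open], tasks)

print(result_ab == result_ba)  # True: same final result in both orders
print(sizes_ab)                # [50, 10]: the broad clause first keeps 50 rows around
print(sizes_ba)                # [10, 10]: the selective clause first keeps only 10
```

This is the sense in which putting the most selective clause first helps: the work done by later clauses shrinks, even though the answer is unchanged.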
How do I make a variable a union of two sets? e.g. when I want to select all ?task that are both in ?required-project and in either ?top-level-project or ?subproject:
'[:find ?task
:in $ % ?top-level-project
:where
(subproject ?subproject ?top-level-project)
[?project is either ?top-level-project or ?subproject] ; <<--
[?task :task/projects ?project]
[?required-project :project/name "Required"]
[?task :task/projects ?required-project]]
The last thing I tried was:
'[:find ?task
:in $ % ?top-level-project
:where
(or-join [?task]
(and
(subproject ?subproject ?top-level-project)
[?task :task/projects ?subproject])
[?task :task/projects ?top-level-project])
[?required-project :project/name "Required"]
[?task :task/projects ?required-project]]
But that appears to ignore the whole part inside or-join and only selects ?task in ?required-project. I noticed, because I can set ?top-level-project to something that definitely does not exist, and it still claims to find a lot of matches.
'[:find ?task
:in $ % ?top-level-project
:where
(or-join [?project]
(subproject ?project ?top-level-project)
(and
[?top-level-project :unique-id ?top-level-id] ; stunt to make ?project also an alias for ?top-level-project
[?project :unique-id ?top-level-id]))
[?task :task/projects ?project] ; match against the union constructed above
[?required-project :project/name "Required"]
[?task :task/projects ?required-project]]
also has the same result as:
'[:find ?task
:in $ % ?top-level-project
:where
[?required-project :project/name "Required"]
[?task :task/projects ?required-project]]
even if ?top-level-project refers to something that does not exist.
This variant works:
'[:find [?task ...]
:in $ % [?top-level-project ...]
:where
(subproject-or-self ?project ?top-level-project)
[?task :task/projects ?project]
[?required-project :project/name "Required"]
[?task :task/projects ?required-project]]
with rules:
'[[(subproject ?e1 ?e2)
[?e1 :project/parent ?e2]]
[(subproject ?e1 ?e2)
[?e1 :project/parent ?t]
(subproject ?t ?e2)]
[(subproject-or-self ?e1 ?e2)
[?e1]
(subproject ?e1 ?e2)]]
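For readers less familiar with recursive rules, here is a rough Python sketch of what the query above is meant to compute: the top-level project together with all of its direct and indirect subprojects, intersected with the tasks in the "Required" project. It models the intent only, not Datomic's evaluation. The `parent` and `task_projects` data and the function names are hypothetical.

```python
# Hypothetical data: child project -> parent project, and task -> set of projects.
parent = {"sub-a": "top", "sub-b": "top", "sub-a-1": "sub-a", "other": "elsewhere"}
task_projects = {
    "t1": {"top", "required"},
    "t2": {"sub-a-1", "required"},
    "t3": {"sub-b"},
    "t4": {"other", "required"},
}


def subprojects(root):
    """All direct and indirect children of root, like the two subproject rules."""
    found = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for child, par in parent.items():
            if par == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def tasks_in(root, required="required"):
    """Tasks that are in `required` and in root or any of its subprojects."""
    candidates = subprojects(root) | {root}  # the union asked about earlier
    return sorted(
        task
        for task, projects in task_projects.items()
        if required in projects and projects & candidates
    )


print(tasks_in("top"))             # ['t1', 't2']
print(tasks_in("does-not-exist"))  # []
```

The second call is the sanity check used earlier in the thread: a top-level project that does not exist should yield no tasks at all.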