datascript

Immutable database and Datalog query engine for Clojure, ClojureScript and JS
urzds 2019-05-16T13:17:14.110400Z

I understood that the order of the clauses in my query should make no difference in Datalog, is that correct?

urzds 2019-05-16T13:18:53.111500Z

I am wondering, because these two give wildly different results, despite the only difference being the order of the missing? and subproject clauses:

[:find [?subproject ...]
 :in $ % ?project-phid
 :where
 [?project :object/phid ?project-phid]
 [(missing? $ ?subproject :project/milestone)]
 (subproject ?subproject ?project)]

[:find [?subproject ...]
 :in $ % ?project-phid
 :where
 [?project :object/phid ?project-phid]
 (subproject ?subproject ?project)
 [(missing? $ ?subproject :project/milestone)]]

2019-05-16T13:24:05.111800Z

@urzds Not correct at all, order makes all the difference.

2019-05-16T13:24:32.112400Z

You should put the clause that reduces the result set the most first (if that's possible to know).

2019-05-16T13:26:58.113800Z

Datalog is effectively a constraint solver, and it applies each constraint in order.

urzds 2019-05-16T13:37:33.114800Z

Can you give an example? I expected it to behave like (filter pred-b (filter pred-a db)), where I would expect pred-a and pred-b to be interchangeable.

2019-05-16T13:38:08.115500Z

Actually, I'm not sure it should affect the result set, but it definitely can affect performance by limiting the intermediate result sets.

2019-05-16T13:38:28.115800Z

So it does make some difference

2019-05-16T13:43:43.117400Z

When I used negations, order radically affected the result set.

2019-05-16T13:43:55.117700Z

Order seems to matter a lot.
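
Editor's note: a minimal sketch of the ordering point for the two queries above, assuming a recent DataScript release is on the classpath. The sample data (entity ids, the milestone value 7) is made up for illustration. The rule that binds ?subproject is placed before the missing? predicate, on the assumption that missing? tests a concrete entity and therefore needs ?subproject to be bound by an earlier clause.

```clojure
(require '[datascript.core :as d])

;; Hypothetical sample data: project 1 is the parent of 2 and 3;
;; only project 3 has a milestone.
(def db
  (d/db-with (d/empty-db {:project/parent {:db/valueType :db.type/ref}})
             [{:db/id 1 :object/phid "PHID-1"}
              {:db/id 2 :project/parent 1}
              {:db/id 3 :project/parent 1 :project/milestone 7}]))

;; Same recursive rule as used elsewhere in this conversation.
(def rules
  '[[(subproject ?e1 ?e2)
     [?e1 :project/parent ?e2]]
    [(subproject ?e1 ?e2)
     [?e1 :project/parent ?t]
     (subproject ?t ?e2)]])

(def result
  (d/q '[:find [?subproject ...]
         :in $ % ?project-phid
         :where
         [?project :object/phid ?project-phid]
         (subproject ?subproject ?project)               ; binds ?subproject
         [(missing? $ ?subproject :project/milestone)]]  ; then filters it
       db rules "PHID-1"))
```

With this data, only project 2 is a subproject of "PHID-1" that lacks a milestone. How the other ordering behaves may depend on the DataScript version, so it is not shown here.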

urzds 2019-05-16T15:42:09.121600Z

How do I make a variable a union of two sets? E.g. when I want to select all ?task that are both in ?required-project and in either ?top-level-project or ?subproject:

'[:find ?task
  :in $ % ?top-level-project
  :where
  (subproject ?subproject ?top-level-project)
  [?project is either ?top-level-project or ?subproject] ; <<-- pseudocode
  [?task :task/projects ?project]
  [?required-project :project/name "Required"]
  [?task :task/projects ?required-project]]

urzds 2019-05-16T16:12:29.124300Z

The last thing I tried was:

'[:find ?task
  :in $ % ?top-level-project
  :where
  (or-join [?task]
    (and
      (subproject ?subproject ?top-level-project)
      [?task :task/projects ?subproject])
    [?task :task/projects ?top-level-project])
  [?required-project :project/name "Required"]
  [?task :task/projects ?required-project]]
But that appears to ignore the whole part inside or-join and only selects ?task in ?required-project. I noticed this because I can set ?top-level-project to something that definitely does not exist, and the query still claims to find a lot of matches.
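
Editor's note: a likely explanation, offered as an assumption rather than a confirmed diagnosis, is that or-join only unifies the variables named in its join list with the rest of the query. With (or-join [?task] ...), the ?top-level-project inside the or-join would then be a fresh, unbound variable instead of the :in argument. The sketch below lists ?top-level-project in the join list as well. It assumes a recent DataScript release (not checked against the 2019 version), and the sample projects and tasks are hypothetical.

```clojure
(require '[datascript.core :as d])

;; Hypothetical data: project 11 is a child of 10; 12 is "Required"; 13 is unrelated.
(def db
  (d/db-with
    (d/empty-db {:project/parent {:db/valueType :db.type/ref}
                 :task/projects  {:db/valueType   :db.type/ref
                                  :db/cardinality :db.cardinality/many}})
    [{:db/id 10 :project/name "Top"}
     {:db/id 11 :project/name "Sub" :project/parent 10}
     {:db/id 12 :project/name "Required"}
     {:db/id 13 :project/name "Other"}
     {:db/id 100 :task/projects [10 12]}
     {:db/id 101 :task/projects [11 12]}
     {:db/id 102 :task/projects [13 12]}
     {:db/id 103 :task/projects [10]}]))

(def rules
  '[[(subproject ?e1 ?e2)
     [?e1 :project/parent ?e2]]
    [(subproject ?e1 ?e2)
     [?e1 :project/parent ?t]
     (subproject ?t ?e2)]])

(def result
  (d/q '[:find [?task ...]
         :in $ % ?top-level-project
         :where
         ;; ?top-level-project is named in the join list so that it
         ;; unifies with the :in argument inside both branches.
         (or-join [?task ?top-level-project]
           (and
             (subproject ?subproject ?top-level-project)
             [?task :task/projects ?subproject])
           [?task :task/projects ?top-level-project])
         [?required-project :project/name "Required"]
         [?task :task/projects ?required-project]]
       db rules 10))
```

With this data the query should select tasks 100 and 101: both are in "Required", one directly in project 10 and one in its subproject 11. Task 102 is excluded because project 13 is unrelated, and task 103 because it is not in "Required".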

urzds 2019-05-16T16:42:50.128600Z

'[:find ?task
  :in $ % ?top-level-project
  :where
  (or-join [?project]
    (subproject ?project ?top-level-project)
    (and
      [?top-level-project :unique-id ?top-level-id] ; stunt to make ?project also an alias for ?top-level-project
      [?project :unique-id ?top-level-id]))
  [?task :task/projects ?project] ; match against the union constructed above
  [?required-project :project/name "Required"]
  [?task :task/projects ?required-project]]
This also has the same result as:
'[:find ?task
  :in $ % ?top-level-project
  :where
  [?required-project :project/name "Required"]
  [?task :task/projects ?required-project]]
even if ?top-level-project refers to something that does not exist

urzds 2019-05-16T17:26:58.130Z

This variant works:

'[:find [?task ...]
  :in $ % [?top-level-project ...]
  :where
  (subproject-or-self ?project ?top-level-project)
  [?task :task/projects ?project]
  [?required-project :project/name "Required"]
  [?task :task/projects ?required-project]]
with rules:
'[[(subproject ?e1 ?e2)
   [?e1 :project/parent ?e2]]
  [(subproject ?e1 ?e2)
   [?e1 :project/parent ?t]
   (subproject ?t ?e2)]
  [(subproject-or-self ?e1 ?e2)
   [?e1]
   (subproject ?e1 ?e2)]]
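
Editor's note: clauses inside a single rule body are combined with "and", so subproject-or-self as posted requires (subproject ?e1 ?e2) to hold and does not add the project itself. One way to express the "or" is to define the rule twice, once per alternative. The sketch below does that with an identity binding for the "self" case. This is an alternative formulation, not the author's rule, and it assumes a recent DataScript release and that ?e2 is already bound when the rule is called. The sample projects are hypothetical.

```clojure
(require '[datascript.core :as d])

;; Hypothetical data: 11 is a child of 10, 12 is a child of 11, 13 is unrelated.
(def db
  (d/db-with (d/empty-db {:project/parent {:db/valueType :db.type/ref}})
             [{:db/id 10 :project/name "Top"}
              {:db/id 11 :project/name "Sub" :project/parent 10}
              {:db/id 12 :project/name "Sub-sub" :project/parent 11}
              {:db/id 13 :project/name "Other"}]))

(def rules
  '[[(subproject ?e1 ?e2)
     [?e1 :project/parent ?e2]]
    [(subproject ?e1 ?e2)
     [?e1 :project/parent ?t]
     (subproject ?t ?e2)]
    ;; "self" alternative: bind ?e1 to the same entity as ?e2
    [(subproject-or-self ?e1 ?e2)
     [(identity ?e2) ?e1]]
    ;; "subproject" alternative: any direct or transitive child of ?e2
    [(subproject-or-self ?e1 ?e2)
     (subproject ?e1 ?e2)]])

(def result
  (d/q '[:find [?project ...]
         :in $ % ?top-level-project
         :where
         (subproject-or-self ?project ?top-level-project)]
       db rules 10))
```

With this data the rule should yield projects 10, 11 and 12 for top-level project 10, and the result can then be joined against [?task :task/projects ?project] as in the query above.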