Looks much better. You could possibly tidy it a little further like this: https://github.com/threatgrid/naga/pull/134 but arguably you'll be hitting diminishing returns pretty soon.
It's coincidental that you'd be mentioning some of this, since it came up this week.
I'm tempted to move cli.clj to a separate project entirely, or at least a module under this one (I've never played with modules before, so I don't know how useful that would be).
The CLI was actually written as an example of how to use Naga. It ended up doing more than I first expected, and it was kinda cool. But it's really not supposed to be part of the project.
I'm also tempted to remove Asami as a dependency, since it's not needed at all if you want to run Naga with Datomic. But since we're using it here, it wasn't going to hurt me to leave it where it was. It also makes the CLI useful. Maybe if the Asami and Datomic adapters are made into modules, then the CLI can be separated out entirely, and depend on Naga + Asami-adapter? This is more Leiningen than I know right now.
The reason it came up this week was that the CLI was being referenced by :main which compiled Naga and all its dependencies. That was a horrible mistake to discover! I started by removing :main altogether, but put it back when I discovered the ^:skip-aot metadata. But if I split it out, then a lot of these problems go away.
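For readers who have not seen the ^:skip-aot metadata, here is a minimal project.clj fragment in the shape of the standard Leiningen app template. The project name, version strings, and the naga.cli namespace are assumptions for illustration, not Naga's actual build file.

```clojure
;; project.clj (fragment): a sketch, NOT Naga's real build file.
;; The project name, versions, and the naga.cli namespace are assumptions.
(defproject naga-example "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.3"]]
  ;; Without ^:skip-aot, naming a :main namespace makes Leiningen
  ;; AOT-compile it, along with everything it requires.
  :main ^:skip-aot naga.cli
  ;; AOT-compile only when building an uberjar.
  :profiles {:uberjar {:aot :all}})
```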
Finally... Zuko needs Cheshire, due to its ability to parse JSON. I could use clojure/data.json but that's not as fast, so I was reluctant to go that way. The JSON related code is not actually core Naga functionality, but it's something we use a lot.
I'll give some thought to all of this, and I'll also look into how modules are put together. If you have any more to contribute I'd love to hear it!
Honestly... Naga isn't all that complex. Executing a rule just means turning the body into a where clause, and then projecting the results into groups of 3, which then get inserted as statements for every result row.
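As a rough illustration of "projecting the results into groups of 3", the sketch below substitutes each result row into the patterns of a rule head. It assumes result rows are maps from variable to value, which is a simplification chosen for this example and not necessarily how Naga represents them. The names project-row and project-results are hypothetical.

```clojure
;; A sketch only: rows are assumed to be maps of variable -> value.
(defn project-row
  "Substitutes the bindings of one result row into each head pattern."
  [head-patterns bindings]
  (map (fn [pattern] (mapv #(get bindings % %) pattern)) head-patterns))

(defn project-results
  "Produces the triples to insert for every result row."
  [head-patterns rows]
  (mapcat #(project-row head-patterns %) rows))

(def head '[[?a :sibling ?b] [?b :sibling ?a]])
(def rows [{'?a :fred '?b :barney}])

(project-results head rows)
;; => ([:fred :sibling :barney] [:barney :sibling :fred])
```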
The tricks are in things like:
1. identifying the parts of the where clauses which can be affected by parts of the output from any rule.
2. fill the queue with rules
3. If the queue is empty, exit
4. take the first rule from the queue, and check if any of the parts in its body (the :where clause patterns) have changed. If not, return to step 3.
5. cache the new results of the parts of the :where clause patterns (for comparison next time). We use semi-naïve reasoning which means that we need only store the count. If we start getting more aggressive about negation operations, then this may need to turn into a hash (which is more expensive)
6. something changed, so run the rule. This executes the :where clause, and projects each row into a group of triples. These are all inserted.
7. check if any rules had parts that can be changed by this rule. If so, add them to the queue. (The queue will ignore any duplicates)
8. go back to step 3
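The steps above can be sketched as a toy engine over an in-memory set of triples. This is not Naga's code: the rule map shape (:name, :body, :head), the nested-loop query, and every function name here are assumptions made for the example. It only caches per-pattern counts, as described in step 5.

```clojure
;; A toy sketch of the loop, NOT Naga's implementation.
(defn variable? [x] (and (symbol? x) (= \? (first (name x)))))

(defn match-pattern
  "Extends bindings so that pattern matches triple, or returns nil."
  [bindings pattern triple]
  (reduce (fn [b [p t]]
            (let [p (get b p p)]            ; substitute an already-bound variable
              (cond (variable? p) (assoc b p t)
                    (= p t) b
                    :else (reduced nil))))
          bindings
          (map vector pattern triple)))

(defn query
  "Naive nested-loop join of the patterns against a set of triples."
  [store patterns]
  (reduce (fn [rows pattern]
            (for [row rows
                  triple store
                  :let [row' (match-pattern row pattern triple)]
                  :when row']
              row'))
          [{}]
          patterns))

(defn project [head rows]
  (for [row rows, pattern head]
    (mapv #(get row % %) pattern)))

(defn may-affect? [head-pattern body-pattern]
  (every? (fn [[h b]] (or (variable? h) (variable? b) (= h b)))
          (map vector head-pattern body-pattern)))

(defn downstream
  "Step 1: names of rules whose body could be changed by this rule's output."
  [rules rule]
  (for [r rules
        :when (some (fn [h] (some #(may-affect? h %) (:body r))) (:head rule))]
    (:name r)))

(defn run-rules [store rules]
  (let [by-name (into {} (map (juxt :name identity)) rules)]
    (loop [store store
           queue (mapv :name rules)                       ; step 2
           counts {}]                                     ; rule name -> pattern counts
      (if (empty? queue)                                  ; step 3
        store
        (let [rule-name (first queue)                     ; step 4
              queue (subvec queue 1)
              rule (by-name rule-name)
              new-counts (mapv #(count (query store [%])) (:body rule))]
          (if (= new-counts (counts rule-name))
            (recur store queue counts)                    ; nothing changed
            (let [rows (query store (:body rule))         ; step 6
                  store' (into store (project (:head rule) rows))
                  to-add (remove (set queue) (downstream rules rule))] ; step 7
              (recur store'
                     (into queue to-add)
                     (assoc counts rule-name new-counts)))))))))    ; step 5

(def rules
  [{:name :parent->ancestor
    :body '[[?a :parent ?b]]
    :head '[[?a :ancestor ?b]]}
   {:name :ancestor-transitive
    :body '[[?a :ancestor ?b] [?b :ancestor ?c]]
    :head '[[?a :ancestor ?c]]}])

(def facts #{[:a :parent :b] [:b :parent :c] [:c :parent :d]})

(run-rules facts rules)
;; the result includes the derived triple [:a :ancestor :d]
```

The sketch does not generate new entities and has no negation, so a count per pattern is enough to detect change here.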
Also, when generating new entities (unbound variables in the head of the rule), the :where clause is updated to exclude any results which will generate an entity that is exactly equal to one that already exists (this is a cute bit of query rewriting, and was actually the impetus to get not into Asami).
There... now you know how to build a rule engine
BTW, I'm not at work tomorrow
In https://github.com/threatgrid/asami/wiki/Entity-Structure:
(#datom [:tg/node-10499 :db/ident :tg/node-10498 1 true]
 #datom [:tg/node-10499 :tg/entity true 1 true]
 #datom [:tg/node-10499 :name "Fitzwilliam" 1 true]
 #datom [:tg/node-10499 :home "Pemberley" 1 true])
Am I correctly assuming that's a typo and should be:
#datom [:tg/node-10499 :db/ident :tg/node-10499 1 true]
Fixed this. Thank you for the feedback!
No problem, I just was trying to grok what the :db/ident was (and how it differs from Datomic's :db/ident); and the example made me do a double-take. Two questions that I had unanswered after reading the docs:
1. Why have both :db/id and :db/ident, if I can explicitly set :db/id myself and :db/ident is also treated as a global identifier of a single node? Is it an indexing issue? Or are there certain API functions that expect ident, but not id? For example, I checked MemoryDatabase d/entity but it actually considers both as valid inputs.
2. Is there any interest in supporting a lookup ref syntax à la Datomic (e.g. [:email "x@y.com"]) in the future? Or is that considered out of scope? This also came up as I tried to understand Asami's identities. All I could find was a closed issue without a followup: https://github.com/threatgrid/asami/issues/97
I'll address #1 to start with:
It's a little different to Datomic. :db/ident is an explicit attribute added to entities. It can be any value.
If you don't supply one, then Asami allocates it, defaulting to using a loopback on the node. For the in-memory value, that node is represented by a keyword with a prefix of :tg/node-. In Datomic, you might explicitly state that you want a node using datomic.Peer/tempid, and after it's inserted then it looks like a number (distinguished from actual long values by appearing in the "entity" position of a statement, or if it's in the "value" position, then it gets determined by the attribute datatype). Asami's in-memory store just uses these magic keywords (the on-disk store... which is Real Soon Now... uses long values internally, not keywords). Anyway, the :db/ident will either be what you specify, or it will refer to itself.
:db/id is different. It is an implicit attribute that does not appear in the database. Instead, it's used to refer to the entity that is represented by the node.
To explain this, I want to explicitly describe the entity structures in the graph (I realize that you'll know a lot of it, but I want to make sure we're in the same place). Consider a simple entity:
{:db/ident "simple"
 :foo "bar"}
This has 2 attributes: :db/ident and :foo. To represent this in a graph, I need to have a node that will represent them. Let's call that node my-entity. The graph for this entity can then be specified with the edges:
[my-entity :db/ident "simple"]
[my-entity :foo "bar"]
Allocating a node to represent structures like this means that we can also build nested structures:
{:db/ident "nested"
 :foo "hello"
 :bar {:foo "world"}}
The top level structure will be allocated a node (call it outer) and the nested structure will be allocated its node (call it inner):
[outer :db/ident "nested"]
[outer :foo "hello"]
[outer :bar inner]
[inner :foo "world"]
Of course, in an in-memory database in Asami, outer may be :tg/node-1 and inner might be :tg/node-2.
The question is, how can I refer to the node that represents an entity if I am just using the entity map style of structure? The entity has various attributes (like :db/ident and :foo), but no direct way to refer to the node itself. This is what :db/id does. In fact, Datomic uses :db/id to do exactly the same thing when inserting.
So if I say that I want to insert an entity of:
{:db/id :my-marvelous-entity
 :db/ident "mine"
 :foo "hello"
 :bar {:foo "world"}}
Then the statements that this will be turned into are:
[:my-marvelous-entity :db/ident "mine"]
[:my-marvelous-entity :foo "hello"]
[:my-marvelous-entity :bar :tg/node-3]
[:tg/node-3 :foo "world"]
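The mapping from an entity map to statements can be sketched as a small recursive function. This is a simplification for illustration, not Asami's implementation: it ignores arrays and the extra bookkeeping attributes such as :tg/entity, and the names new-node and entity->triples are hypothetical.

```clojure
;; A sketch only, NOT Asami's code.
(def node-counter (atom 0))

(defn new-node
  "Allocates a fresh node keyword such as :tg/node-1."
  []
  (keyword "tg" (str "node-" (swap! node-counter inc))))

(defn entity->triples
  "Flattens a (possibly nested) entity map into [node attribute value] triples.
  Returns [node triples]. :db/id names the node and is not stored itself."
  [entity]
  (let [node (or (:db/id entity) (new-node))]
    [node
     (vec (mapcat (fn [[attr value]]
                    (if (map? value)
                      ;; nested entity: link to its node, then emit its triples
                      (let [[child child-triples] (entity->triples value)]
                        (cons [node attr child] child-triples))
                      [[node attr value]]))
                  (dissoc entity :db/id)))]))

(entity->triples {:db/id :my-marvelous-entity
                  :db/ident "mine"
                  :foo "hello"
                  :bar {:foo "world"}})
;; the nested map gets a freshly allocated :tg/node-N as its node
```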
I can even add a :db/id to the nested entity.
I hadn't really thought about the ref syntax, but it should be doable. It's similar to using :db/ident in that it has to do a lookup. I just need to make sure I don't forget any codepaths that could be affected by it
Actually... :db/ident is already a kind of lookup ref, so the machinery is basically there
The main difference is that I used the {:db/ident value} syntax for that, instead of [:db/ident value]
(I don't recall when Lookup Refs were introduced into Datomic, but they weren't there early on. Asami reflects an older set of APIs in Datomic)
Yeah, thanks for the explanation. The way I see it, the primary difference is :db/ident in Asami is a "global" lookup ref and in Datomic is a "namespaced" lookup ref.
I'll probably just do the lookup ref, look up the data, and if there's more than one value, throw an ex-info
That'll offer the best compatibility, I think
I was wondering if perhaps it's more difficult, because of the open-world assumption and no schema that explicitly states that some attribute will be used as a reference lookup-ref
But offering a way to do a lookup + throwing error if more than one exists may be a nice way for compatibility.
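A minimal sketch of that idea (look the ref up, and throw if more than one entity matches) over a plain collection of triples. The function name resolve-lookup-ref and the shape of the ex-data are assumptions; nothing here is an existing Asami API.

```clojure
;; A sketch only: no schema, so uniqueness is checked at lookup time.
(defn resolve-lookup-ref
  "Finds the single node with [node attr value] in a collection of triples.
  Returns nil when nothing matches; throws when the match is ambiguous."
  [triples [attr value :as lookup-ref]]
  (let [nodes (distinct (for [[e a v] triples
                              :when (and (= a attr) (= v value))]
                          e))]
    (if (next nodes)
      (throw (ex-info "Lookup ref matches more than one entity"
                      {:lookup-ref lookup-ref :nodes (vec nodes)}))
      (first nodes))))

(def data [[:n1 :email "x@y.com"]
           [:n2 :email "z@y.com"]
           [:n3 :email "z@y.com"]])

(resolve-lookup-ref data [:email "x@y.com"])
;; => :n1

(try (resolve-lookup-ref data [:email "z@y.com"])
     (catch clojure.lang.ExceptionInfo e (ex-data e)))
;; => {:lookup-ref [:email "z@y.com"], :nodes [:n2 :n3]}
```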
This is my understanding of the difference:
;; Transactions - Datomic - need to make sure :person/name is unique for people
[{:db/id "parent"
  :person/name "Jill"}
 {:person/name "Susie"
  :person/parent "parent"}]
;; or
[{:person/name "Susie"
  :person/parent {:person/name "Jill"}}]
;; or, assuming Jill already exists...
[{:person/name "Susie"
  :person/parent [:person/name "Jill"]}]
;; Transactions - Asami - need to make sure :db/ident is unique for all datoms
[{:db/ident "jill"
  :person/name "Jill"}
 {:person/name "Susie"
  :person/parent {:db/ident "jill"}}]
;; or
[{:person/name "Susie"
  :db/ident "susie"
  :person/parent {:db/ident "jill"
                  :person/name "Jill"}}]
;; or, assuming Jill already exists...
[{:person/name "Susie"
  :db/ident "susie"
  :person/parent {:db/ident "jill"}}]
;; Datomic - queries
[[?e :person/parent [:person/name "Jill"]]
 [?e :person/name ?name]]
;; Asami - queries
[[?p :db/ident "jill"]
 [?e :person/parent ?p]
 [?e :person/name ?name]]
Are you sure about queries allowing that?
From the Datomic docs:
> Lookup refs have the following restrictions:
> - The specified attribute must be defined as either :db.unique/value or :db.unique/identity.
> - When used in a transaction, the lookup ref is evaluated against the specified attribute's index as it exists before the transaction is processed, so you cannot use a lookup ref to lookup an entity being defined in the same transaction.
> - Lookup refs cannot be used in the body of a query though they can be used as https://docs.datomic.com/on-prem/query/query.html#multiple-inputs.
That last thing suggests that you can't use them in a query
https://docs.datomic.com/cloud/transactions/transaction-data-reference.html#entity-identifiers
I'm pretty sure queries allow it, because we write a lot of code that depends on that sugar syntax :]
(sorry, need to go afk ~1h)
It's a reasonably easy transformation on the query, but a surprising one, given that there is explicit documentation that says that you can't do it
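One way such a transformation could look: replace a lookup ref in the value position of a pattern with a fresh variable, and add a second pattern that binds that variable. This is a sketch under assumptions, not how Datomic or Asami actually do it. It treats any two-element vector starting with a keyword as a lookup ref, so a literal vector value of that shape would be misread, and it does not handle refs in the entity position. The function names are hypothetical.

```clojure
;; A sketch only, NOT the Datomic or Asami implementation.
(defn lookup-ref?
  "True for a two-element vector whose first element is a keyword."
  [x]
  (and (vector? x) (= 2 (count x)) (keyword? (first x))))

(defn expand-lookup-refs
  "Rewrites each [e a [attr value]] pattern into [e a ?ref] [?ref attr value]."
  [patterns]
  (vec (mapcat (fn [[e a v :as pattern]]
                 (if (and (= 3 (count pattern)) (lookup-ref? v))
                   (let [ref-var (gensym "?ref")]   ; fresh variable per ref
                     [[e a ref-var]
                      [ref-var (first v) (second v)]])
                   [pattern]))
               patterns)))

(expand-lookup-refs
 '[[?artist :artist/name ?artist-name]
   [?artist :artist/country [:country/name "Belgium"]]])
;; the second pattern becomes two patterns joined on a generated ?ref variable
```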
@quoll from the Datomic docs:
> Resolving Entity Identifiers in V Position
> Datomic performs automatic resolution of https://docs.datomic.com/on-prem/schema/identity.html#entity-identifiers, so that you can generally use entity ids, idents, and lookup refs interchangeably.
https://docs.datomic.com/on-prem/query/query.html#limitations
ah, I see now:
> Lookup refs cannot be used in the body of a query though they can be used as https://docs.datomic.com/on-prem/query/query.html#multiple-inputs.
I think it might actually be easier to just let them through in a query. So instead of their example of:
(q '[:find ?artist-name
     :in $ ?country
     :where [?artist :artist/name ?artist-name]
            [?artist :artist/country ?country]]
   db [:country/name "Belgium"])
it would actually be easier for me to just accept this instead:
(q '[:find ?artist-name
     :in $
     :where [?artist :artist/name ?artist-name]
            [?artist :artist/country [:country/name "Belgium"]]]
   db)
If I accept that, then it would automatically support the parameter query as well
^ I just verified in a REPL that both versions of that kind of query work in Datomic (assuming :country/name is a unique identity)
so apparently the docs are a little misleading (or not up-to-date)
probably not up to date
Datomic's API keeps evolving
lookup-refs were introduced here: https://docs.datomic.com/on-prem/changes.html#0.9.4556
Which is one reason for certain missing features. I haven't kept up with Datomic over time
But maybe it changed b/c of this?
> Allow lookup refs for V position in users of VAET index, including :db.fn/retractEntity.
https://docs.datomic.com/on-prem/changes.html#0.9.4766
I've been using them like that in queries since I can remember; although your links to the docs had started to make me doubt my sanity.
That's OK
I was using Datomic several years before Lookup refs came out. I've learned a few new things since, but I haven't put the time in to learn everything
(they came out in February 2014)
Anyway, it's not going to be top of my queue, but I've added this: https://github.com/threatgrid/asami/issues/112
Thanks for taking the time and sorry for the long tangents!
No problem. I want to interact with people more, even if it's just to convince everyone that the project is alive
... and responsive
Huh... yes. I must have copy pasted from something else and then tried to hand tweak for consistency
I'm in a car right now. I'll try to fix when I get home