asami

Asami, the graph database https://github.com/threatgrid/asami
2021-02-24T10:14:36.024700Z

@quoll: So I’m toying with reasoning about RDF triples with naga, starting with rdfs ranges. I have the following pabu rule: rdf:type(Z,X) :- rdfs:range(A, X), A(Y, Z). However, to make that inference work, I also need the axiomatic triples loaded into naga. Obviously the easiest thing to do is to just load all my data as axioms into naga, which works; however, most (if not all) of my axioms will be defined in user-supplied data. E.g. for the above range rule to fire I also need at least two more axiomatic triples: one like [:skos/inScheme :rdfs/range :skos/ConceptScheme] (which is trivially provided by loading vocabularies as axioms), and another like [:ref/female :skos/inScheme :scheme/genders]. This final class of axiomatic triples will more commonly be found in my user-supplied data, so I need a way for any given graph of user data to find them and load them into naga. I can see I have all the information to do this. Presumably I just need to look at the graph of user data, and filter it to the set of predicates that are grounded in the antecedents of my rules?
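
(Editor's sketch: to make the range rule above concrete, here is what it computes over the two example triples, written in plain Clojure. This is only an illustration of the rule's meaning; it is not how naga evaluates rules, and infer-range-types is a made-up name.)

```clojure
;; The two example triples mentioned in the message above.
(def triples
  [[:skos/inScheme :rdfs/range :skos/ConceptScheme]
   [:ref/female :skos/inScheme :scheme/genders]])

(defn infer-range-types
  "Hypothetical helper: for every [a :rdfs/range x] and every [y a z],
  infer [z :rdf/type x]. Mirrors rdf:type(Z,X) :- rdfs:range(A,X), A(Y,Z)."
  [triples]
  (set
   (for [[a p x] triples
         :when (= p :rdfs/range)          ; rdfs:range(A, X)
         [_y pred z] triples
         :when (= pred a)]                ; A(Y, Z)
     [z :rdf/type x])))

(infer-range-types triples)
```

With both triples present, the only inference is that :scheme/genders has type :skos/ConceptScheme; drop either triple and nothing is inferred, which is why both have to be loaded before the rule runs.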

2021-02-24T10:35:35.032Z

One other thing… I can see when building the naga program that the axioms are defined as a lazy sequence, which then appears to get indexed by the storage layer. This makes sense: even though it’s forward chaining, you presumably don’t want to do a linear search across axioms in rules that reference multiple axioms. My question is about code like this:

(def program (rules/create-program (:rules rdfs-rules)
                                   (concat user-data
                                           (:axioms rdfs-rules))))

(defn apply-rules [db]
  (engine/run db program))

(-> index/empty-graph
    (graph/graph-transact 0 [user-data] nil)
    asami/as-connection
    apply-rules
    first
    :connection
    asami/db
    asami/graph
    (graph-q '[:find ?a ?v :where [:scheme/genders ?a ?v]])
    )

If I use engine/run, I’ll presumably index those axioms twice. To avoid this, I’m guessing I’ll want to reimplement engine/run so that it takes the axioms from my graph of user-data?

2021-02-24T10:36:26.032300Z

(the above is obviously scratch code btw)

quoll 2021-02-24T14:18:46.035800Z

The idea of having “axioms” in a program is for statements that are necessary and the rest of the program won’t run without them, hence you want to ensure that they are present every time. But there is nothing particularly special about them, and they can be inserted whenever and wherever you like… so long as it’s before the program runs. Assuming that a user may want to insert data once was why I took the approach of loading the data separately in the CLI program: https://github.com/threatgrid/naga/blob/main/cli/src/naga/cli.clj#L76 I apologize that this code uses the Naga APIs for storage and not the transact function. That’s historical.

quoll 2021-02-24T14:30:57.040800Z

You could try something like this instead:

(def program (rules/create-program (:rules rdfs-rules) []))

(-> index/empty-graph
    (graph/graph-transact 0 user-data nil)             ;; load user data
    (graph/graph-transact 0 (:axioms rdfs-rules) nil)  ;; load axioms explicitly
    asami/as-connection
    (engine/run program)                               ;; program does not contain axioms
    first
    :connection
    asami/db
    asami/graph
    (graph-q '[:find ?a ?v :where [:scheme/genders ?a ?v]])
    )

2021-02-24T23:38:39.041900Z

@quoll: Thanks, that’s essentially what I wrote first of all, but it doesn’t seem to work; it returns an empty seq:

2021-02-24T23:38:55.042Z

2021-02-25T09:24:42.043900Z

Pretty sure the error is happening between lines 34 and 32. Inspecting the data returned by line 34, you can see the connection contains only the explicit triples and none of the inferred ones:

2021-02-25T09:24:57.044100Z

2021-02-25T09:27:23.044500Z

Inspecting the data at line 32 looks like this:

2021-02-25T09:27:50.044700Z

2021-02-25T09:31:16.045100Z

which yeah you’re right — I think it looks a bit garbled.

2021-02-25T09:34:11.045400Z

ok looks like the issue might be with asami’s graph-transact?! Inspecting this :

(-> index/empty-graph
    (graph/graph-transact 0 [user-data] nil)
    )

2021-02-25T09:34:18.045600Z

yields:

2021-02-25T09:34:21.045800Z

{:spo
 {[:ref/female :skos/inScheme :scheme/genders]
  {[:skos/inScheme :rdfs/range :skos/ConceptScheme] #{nil}}},
 :pos
 {[:skos/inScheme :rdfs/range :skos/ConceptScheme]
  {nil #{[:ref/female :skos/inScheme :scheme/genders]}}},
 :osp
 {nil
  {[:ref/female :skos/inScheme :scheme/genders]
   #{[:skos/inScheme :rdfs/range :skos/ConceptScheme]}}}}

2021-02-25T09:38:34.046400Z

ok I think I just spotted the issue…

2021-02-25T09:40:36.046600Z

Ok @quoll you’ll be relieved that this is a bug in my code, not in asami/naga 🙂

(def user-data [[:ref/female :skos/inScheme :scheme/genders]
                [:skos/inScheme :rdfs/range :skos/ConceptScheme]
                ])
,,,

    (graph/graph-transact 0 [user-data] nil)
that graph-transact line should be: (graph/graph-transact 0 user-data nil)
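
(Editor's sketch: the garbled index shown earlier follows directly from the extra vector. With [user-data], the sequence of "triples" has one element, and destructuring it as [s p o] binds the first real triple to s, the second to p, and nil to o. The toy index below is a plain-Clojure stand-in, assuming a simple subject -> predicate -> #{objects} shape; toy-spo is a made-up name and this is not asami's indexing code.)

```clojure
(def user-data [[:ref/female :skos/inScheme :scheme/genders]
                [:skos/inScheme :rdfs/range :skos/ConceptScheme]])

(defn toy-spo
  "Hypothetical helper: builds a subject -> predicate -> #{objects} map."
  [triples]
  (reduce (fn [idx [s p o]]
            ;; (fnil conj #{}) starts a fresh set the first time s/p is seen
            (update-in idx [s p] (fnil conj #{}) o))
          {}
          triples))

;; Wrapped: one bogus "triple" whose subject and predicate are whole triples.
(toy-spo [user-data])

;; Unwrapped: two ordinary triples.
(toy-spo user-data)
```

The wrapped call yields a single entry keyed by the two triples with #{nil} as the object set, the same shape as the :spo output pasted above; the unwrapped call indexes :ref/female and :skos/inScheme as separate subjects.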

2021-02-25T09:41:37.046800Z

With that correction, asami does indeed return the correct / expected results!! :partywombat:

quoll 2021-02-25T15:51:31.047Z

OK… but I should probably have some better data checking in there. Silently failing while building invalid data structures isn’t particularly user friendly!
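
(Editor's sketch: one cheap form such a check could take, written in plain Clojure. check-triples is a made-up helper, not part of asami or naga, and it assumes a triple is any sequential collection of exactly three elements.)

```clojure
(def user-data [[:ref/female :skos/inScheme :scheme/genders]
                [:skos/inScheme :rdfs/range :skos/ConceptScheme]])

(defn check-triples
  "Hypothetical helper: returns triples unchanged, or throws ex-info
  if any element is not a sequential collection of three items."
  [triples]
  (doseq [t triples]
    (when-not (and (sequential? t) (= 3 (count t)))
      (throw (ex-info "Expected a [subject predicate object] triple"
                      {:bad-triple t}))))
  triples)

(check-triples user-data)        ; passes, returns user-data

(try
  (check-triples [user-data])    ; the accidental extra vector
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))                ; carries the offending element
```

A count check like this would catch the mistake in this thread, since the wrapped element has two items rather than three. It would not catch a wrapped vector that happens to hold exactly three triples, so it is a partial guard at best.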

2021-02-25T15:58:26.047200Z

Perhaps… I did wonder if your plumatic schema stuff might have caught it, if I’d turned it on in dev with s/set-fn-validation! but I’ve just tried and it seems it still gets through

quoll 2021-02-25T15:59:08.047400Z

There are diminishing returns with checking for everything, and the cost can be high

quoll 2021-02-25T15:59:48.047600Z

My main reason for using plumatic schema has been documenting APIs. It really helps a lot! It’s also nice when it catches a bug 🙂

2021-02-25T15:59:59.047800Z

agreed — but it can work well to do it in development/repl contexts

2021-02-25T16:02:18.048Z

For example I’ve just added

(comment
  (require '[schema.core :as s])
  (s/set-fn-validation! true))
At the end of my file, so when I hack on this next I can evaluate that and hopefully reduce any other mistakes like this.

quoll 2021-02-25T16:02:58.048300Z

I was making shortcuts last night by NOT using schema in my code. You’re going to guilt me into putting it back in 🙂

2021-02-25T16:12:10.049100Z

Well problems like this come with the territory of using a dynamic language. If I thought type systems were more important than everything else I’d be using Haskell or Idris day to day. You’ve clearly pulled in plumatic schema because you feel it benefits cases like this though, so it probably makes sense to keep leveraging it; but please don’t let me guilt you into anything. What you have already is great! 🙇

quoll 2021-02-25T16:13:43.049400Z

99% of why I use Schema is to document what a function does. This helps me write and debug code, since I know what is supposed to go in and out. It also documents it for someone else who wants to look at it, but really, I do it for me 🙂

quoll 2021-02-25T16:14:00.049600Z

catching bugs is just a side effect

2021-02-25T16:17:27.049900Z

Anyway now I’m over this hurdle I shall look forward to playing with naga in my vanishingly little spare time

quoll 2021-02-25T16:20:11.050100Z

I have 3 children, so I can relate to this 🙂

2021-02-25T16:21:04.050300Z

I have 2 under 2 😂

2021-02-24T23:39:53.042800Z

However moving the axioms from the user-data into the naga program does:

2021-02-24T23:41:27.042900Z