asami

Asami, the graph database https://github.com/threatgrid/asami
quoll 2021-02-24T00:17:06.013100Z

@rickmoynihan you may be amused to note that your PR not only prompted me to rearrange Naga, but it also resulted in submitting a PR to lein-modules 😂

1👍
2021-02-24T10:14:36.024700Z

@quoll: So I’m toying with reasoning about RDF triples with naga, starting with rdfs ranges. I have the following pabu rule: rdf:type(Z,X) :- rdfs:range(A, X), A(Y, Z). However, to make that inference work, I also need the axiomatic triples loaded into naga. Obviously the easiest thing to do is to just load all my data as axioms into naga, which works; however, most (if not all) of my axioms will be defined in user-supplied data. For the above range rule to fire I need at least two more axiomatic triples. The first is something like [:skos/inScheme :rdfs/range :skos/ConceptScheme], which is trivially provided by loading vocabularies as axioms. The second is something like [:ref/female :skos/inScheme :scheme/genders]. This second class of axiomatic triples will more commonly be found in my user-supplied data, so I need a way for any given graph of user data to find them and load them into naga. I can see I have all the information to do this. Presumably I just need to look at the graph of user data, and filter it to the set of predicates that are grounded in the antecedents of my rules?
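As a rough illustration of what that range rule computes, here is a minimal sketch in plain Clojure, independent of Naga and Asami. The function names `infer-range-types` and `rule-relevant-triples` are hypothetical, and triples are assumed to be plain `[subject predicate object]` vectors of keywords. Because the predicate `A` in the rule is a variable, the only grounded predicate is `rdfs:range`, so the filter below also keeps any triple whose predicate has a declared range.

```clojure
;; Sample data taken from the message above.
(def triples
  [[:ref/female :skos/inScheme :scheme/genders]
   [:skos/inScheme :rdfs/range :skos/ConceptScheme]])

(defn infer-range-types
  "Hypothetical helper: for every [a :rdfs/range x] and [y a z] in triples,
  produces [z :rdf/type x]. Mirrors the rule
  rdf:type(Z,X) :- rdfs:range(A, X), A(Y, Z)."
  [triples]
  (set
   (for [[a p x] triples
         :when (= p :rdfs/range)
         [_y a' z] triples
         :when (= a' a)]
     [z :rdf/type x])))

(defn rule-relevant-triples
  "Hypothetical helper: keeps the range declarations plus every triple
  whose predicate has a declared range."
  [triples]
  (let [ranged (set (for [[a p _x] triples
                          :when (= p :rdfs/range)]
                      a))]
    (filter (fn [[_s p _o]]
              (or (= p :rdfs/range)
                  (contains? ranged p)))
            triples)))

(infer-range-types triples)
(rule-relevant-triples triples)
```

In this tiny data set both triples are relevant to the rule, so the filter keeps everything; with more user data it would drop triples that use predicates without a declared range.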

2021-02-24T10:35:35.032Z

One other thing… I can see when building the naga program that the axioms are defined as a lazy sequence, which then appears to get indexed by the storage layer. This makes sense: even though it’s forward chaining, you presumably don’t want to do a linear search across axioms in rules that reference multiple axioms. My question is about code like this:

(def program (rules/create-program (:rules rdfs-rules)
                                   (concat user-data
                                           (:axioms rdfs-rules))))

(defn apply-rules [db]
  (engine/run db program))

(-> index/empty-graph
    (graph/graph-transact 0 [user-data] nil)
    asami/as-connection
    apply-rules
    first
    :connection
    asami/db
    asami/graph
    (graph-q '[:find ?a ?v :where [:scheme/genders ?a ?v]]))
If I use engine/run I’ll presumably index those axioms twice. I’m guessing to avoid this I’ll want to reimplement engine/run to take the axioms from my graph of user-data?

2021-02-24T10:36:26.032300Z

(the above is obviously scratch code btw)

quoll 2021-02-24T14:18:46.035800Z

The idea of having “axioms” in a program is for statements that are necessary and that the rest of the program won’t run without, hence you want to ensure that they are present every time. But there is nothing particularly special about them, and they can be inserted whenever and wherever you like… so long as it’s before the program runs. Assuming that a user may want to insert data once is why I took the approach of loading the data separately in the CLI program: https://github.com/threatgrid/naga/blob/main/cli/src/naga/cli.clj#L76 I apologize that this code uses the Naga APIs for storage and not the transact function. That’s historical.

quoll 2021-02-24T14:30:57.040800Z

You could try something like this instead:

(def program (rules/create-program (:rules rdfs-rules) []))

(-> index/empty-graph
    (graph/graph-transact 0 user-data nil)             ;; load user data
    (graph/graph-transact 0 (:axioms rdfs-rules) nil)  ;; load axioms explicitly
    asami/as-connection
    (engine/run program)                               ;; program does not contain axioms
    first
    :connection
    asami/db
    asami/graph
    (graph-q '[:find ?a ?v :where [:scheme/genders ?a ?v]]))

2021-02-24T23:38:39.041900Z

@quoll: Thanks, that’s essentially what I wrote first of all, but it doesn’t seem to work; it returns an empty seq:

2021-02-24T23:38:55.042Z

2021-02-25T09:24:42.043900Z

Pretty sure the error is happening between lines 34 and 32. Inspecting the data returned by line 34, you can see the connection contains only the explicit triples and none of the inferred ones:

2021-02-25T09:24:57.044100Z

2021-02-25T09:27:23.044500Z

Inspecting the data at line 32 looks like this:

2021-02-25T09:27:50.044700Z

2021-02-25T09:31:16.045100Z

which yeah, you’re right, I think it looks a bit garbled.

2021-02-25T09:34:11.045400Z

ok looks like the issue might be with asami’s graph-transact?! Inspecting this:

(-> index/empty-graph
    (graph/graph-transact 0 [user-data] nil))

2021-02-25T09:34:18.045600Z

yields:

2021-02-25T09:34:21.045800Z

{:spo
 {[:ref/female :skos/inScheme :scheme/genders]
  {[:skos/inScheme :rdfs/range :skos/ConceptScheme] #{nil}}},
 :pos
 {[:skos/inScheme :rdfs/range :skos/ConceptScheme]
  {nil #{[:ref/female :skos/inScheme :scheme/genders]}}},
 :osp
 {nil
  {[:ref/female :skos/inScheme :scheme/genders]
   #{[:skos/inScheme :rdfs/range :skos/ConceptScheme]}}}}

2021-02-25T09:38:34.046400Z

ok I think I just spotted the issue…

2021-02-25T09:40:36.046600Z

Ok @quoll you’ll be relieved that this is a bug in my code, not in asami/naga 🙂

(def user-data [[:ref/female :skos/inScheme :scheme/genders]
                [:skos/inScheme :rdfs/range :skos/ConceptScheme]])
,,,

    (graph/graph-transact 0 [user-data] nil)
that graph-transact line should be: (graph/graph-transact 0 user-data nil)

2021-02-25T09:41:37.046800Z

With that correction, asami does indeed return the correct / expected results!! :partywombat:
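To see why the extra vector produced the odd index shown earlier, here is a minimal sketch in plain Clojure. `spo-index` is a hypothetical stand-in for a subject/predicate/object index and is not Asami's implementation; it only assumes that each statement is destructured positionally as `[s p o]`.

```clojure
(def user-data
  [[:ref/female :skos/inScheme :scheme/genders]
   [:skos/inScheme :rdfs/range :skos/ConceptScheme]])

(defn spo-index
  "Hypothetical naive index: subject -> predicate -> #{objects}.
  Each statement is destructured positionally as [s p o]."
  [statements]
  (reduce (fn [idx [s p o]]
            (update-in idx [s p] (fnil conj #{}) o))
          {}
          statements))

;; Wrapped by mistake: the single 'statement' is the whole vector of triples,
;; so the first triple becomes the subject, the second the predicate,
;; and the object is nil.
(spo-index [user-data])

;; Passed directly: each triple is indexed as intended.
(spo-index user-data)
```

Under these assumptions the wrapped call reproduces the shape of the :spo entry in the dump above, with whole triples in the subject and predicate positions and nil as the object.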

quoll 2021-02-25T15:51:31.047Z

OK… but I should probably have some better data checking in there. Silently failing while building invalid data structures isn’t particularly user friendly!
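One possible shape for such a check, sketched in plain Clojure rather than taken from Asami: `triple?` and `check-triples` are hypothetical names, and the sketch assumes every node is a scalar (keyword, string, number), which may be stricter than what Asami actually accepts.

```clojure
(defn triple?
  "Hypothetical predicate: true when x is a sequential collection of exactly
  three non-collection values. Assumes nodes are scalars."
  [x]
  (and (sequential? x)
       (= 3 (count x))
       (not-any? coll? x)))

(defn check-triples
  "Hypothetical guard: returns statements unchanged when every element is a
  triple, otherwise throws ex-info carrying the offending statement."
  [statements]
  (doseq [st statements]
    (when-not (triple? st)
      (throw (ex-info "Not a [subject predicate object] triple"
                      {:statement st}))))
  statements)

(def user-data
  [[:ref/female :skos/inScheme :scheme/genders]
   [:skos/inScheme :rdfs/range :skos/ConceptScheme]])

;; Passes and returns the data unchanged.
(check-triples user-data)

;; Throws, because the single element is a vector of triples, not a triple.
(try
  (check-triples [user-data])
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
```

The `not-any? coll?` test matters here: a wrapped vector that happens to hold exactly three triples would otherwise pass the length check.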

2021-02-25T15:58:26.047200Z

Perhaps… I did wonder if your plumatic schema stuff might have caught it if I’d turned it on in dev with s/set-fn-validation!, but I’ve just tried and it seems it still gets through

quoll 2021-02-25T15:59:08.047400Z

There are diminishing returns with checking for everything, and the cost can be high

quoll 2021-02-25T15:59:48.047600Z

My main reason for using plumatic schema has been documenting APIs. It really helps a lot! It’s also nice when it catches a bug 🙂

2021-02-25T15:59:59.047800Z

agreed, but it can work well to do it in development/repl contexts

2021-02-25T16:02:18.048Z

For example I’ve just added

(comment
  (require '[schema.core :as s])
  (s/set-fn-validation! true))
At the end of my file, so when I hack on this next I can evaluate that and hopefully reduce any other mistakes like this.

quoll 2021-02-25T16:02:58.048300Z

I was making shortcuts last night by NOT using schema in my code. You’re going to guilt me into putting it back in 🙂

2021-02-25T16:12:10.049100Z

Well, problems like this come with the territory of using a dynamic language. If I thought type systems were more important than everything else I’d be using Haskell or Idris day to day. You’ve clearly pulled in plumatic schema because you feel it benefits cases like this though, so it probably makes sense to keep leveraging it; but please don’t let me guilt you into anything. What you have already is great! 🙇

quoll 2021-02-25T16:13:43.049400Z

99% of why I use Schema is to document what a function does. This helps me write and debug code, since I know what is supposed to go in and out. It also documents it for someone else who wants to look at it, but really, I do it for me 🙂

1👍
quoll 2021-02-25T16:14:00.049600Z

catching bugs is just a side effect

2021-02-25T16:17:27.049900Z

Anyway, now I’m over this hurdle I shall look forward to playing with naga in my vanishingly little spare time

quoll 2021-02-25T16:20:11.050100Z

I have 3 children, so I can relate to this 🙂

2021-02-25T16:21:04.050300Z

I have 2 under 2 😂

1❤️
2021-02-24T23:39:53.042800Z

However moving the axioms from the user-data into the naga program does:

2021-02-24T23:41:27.042900Z