datalog

simongray 2020-09-11T10:32:56.012Z

any kind of standard format to import/export datalog triples? I’m thinking something simple like ndjson, but for datalog triples. I realise it could easily be implemented, but I was just wondering if anything had been agreed on. I tried expanding my search to include newline-delimited EDN and I found this, which I guess is good enough: https://github.com/lambdaisland/edn-lines
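
For reference, such a file would just hold one EDN vector per line, something like this (entity ids and attributes made up for illustration):

```clojure
[1 :person/name "Alice"]
[1 :person/knows 2]
[2 :person/name "Bob"]
```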

whilo 2020-09-11T17:21:50.013700Z

@simongray This is how our simple export and import functions in Datahike work as well: we just print one datom per line. The problem for general import and export is mapping the entity and attribute ids between different triple stores. What are you trying to do?
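
Roughly like this (a minimal sketch, not the actual Datahike implementation, and assuming datoms destructure as `[e a v tx]`):

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

;; Write each datom as an EDN vector on its own line.
(defn export-datoms [datoms file]
  (with-open [w (io/writer file)]
    (doseq [[e a v tx] datoms]
      (.write w (pr-str [e a v tx]))
      (.write w "\n"))))

;; Read the datoms back, one EDN form per line.
(defn import-datoms [file]
  (with-open [r (io/reader file)]
    (mapv edn/read-string (line-seq r))))
```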

simongray 2020-09-14T14:21:35.032300Z

@whilo I actually have a bunch of RDF data from a WordNet, but I’m not sure if getting it into one of the Clojure Datalog dbs makes sense, or if I’m better off using either Apache Jena or Neo4j. I have no experience with either.

refset 2020-09-14T14:48:26.032500Z

@simongray we've done a lot of benchmarking for Crux using RDF bench suites (LUBM and WatDiv, specifically), so there's quite a bit of code you could use or borrow in crux-bench and crux-rdf, e.g. https://github.com/juxt/crux/blob/master/crux-rdf/src/crux/rdf.clj and https://github.com/juxt/crux/blob/master/crux-bench/src/crux/bench/watdiv.clj

🤘 1
simongray 2020-09-11T17:25:08.013800Z

Just doing preliminary research for a research project. Why would mapping entity and attribute ids be an issue? Aren't the mappings implied by the triples?

refset 2020-09-11T18:07:01.014Z

in the classic Datomic datoms model, entity ids are intended for internal-only usage and shouldn't be communicated (or persisted!) outside the boundaries of the particular instance of the database system. By contrast, Crux took the opposite approach and requires the user to provide an explicit :crux.db/id value for each document
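
For context, a Crux put looks roughly like this (`node` is a placeholder for a started node, and the attributes are made up):

```clojure
(require '[crux.api :as crux])

;; Every document carries a user-supplied :crux.db/id, so ids can
;; survive being exported and imported across database instances.
(crux/submit-tx node
                [[:crux.tx/put
                  {:crux.db/id :person/alice
                   :person/name "Alice"}]])
```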

👍 1
refset 2020-09-11T19:00:23.027600Z

Are there any avid users of sub-queries / "nested queries" here? I am curious about the kinds of practical use-cases that people have come across. We added sub-query support to Crux a couple of days ago (releasing next week), with the immediate motivation being the ability to transpose TPC-H queries without touching Clojure. This is unrelated to our prior crux-sql TPC-H work. Some test examples, for context: https://github.com/juxt/crux/blob/master/crux-test/test/crux/query_test.clj#L1186
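
Very roughly, the sub-query sits in a predicate position and its result relation gets destructured into the outer query's logic variables; a sketch along the lines of the linked tests (exact syntax may differ between versions):

```clojure
;; Rough sketch of a Crux sub-query clause; the inner query binds
;; x = 2, its result relation is destructured via [[x]], and the
;; outer clause then computes y = x + 1.
(crux/q (crux/db node)
        '{:find [y]
          :where [[(q {:find [x]
                       :where [[(identity 2) x]]})
                   [[x]]]
                  [(+ x 1) y]]})
;; should yield #{[3]}
```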

lilactown 2020-09-11T19:04:46.028200Z

I did not even know that was a thing. Is that supported by Datomic and/or DataScript too?

simongray 2020-09-11T19:12:52.029100Z

I see.

whilo 2020-09-11T19:13:57.029900Z

@lilactown Yes, you can just call query again, but you need to provide an explicit binding between the surrounding query and the nested query. Datomic had some restrictions on how you can pass databases around, last time I checked.
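
For example (a sketch with DataScript and made-up attributes; `conn` is assumed to be an existing connection): run the inner query first, then bind its result set explicitly as an input relation of the outer query:

```clojure
(require '[datascript.core :as d])

;; Inner query: find all entities with an age over 30.
(def adults
  (d/q '[:find ?e
         :where [?e :person/age ?age]
                [(> ?age 30)]]
       @conn))

;; Outer query: bind the inner result set as an input relation.
(d/q '[:find ?name
       :in $ [[?e]]
       :where [?e :person/name ?name]]
     @conn adults)
```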

whilo 2020-09-11T19:18:31.030Z

You can also explicitly pass the entity ids in Datomic, Datahike, Datalevin and DataScript, if you want. The difficulty is typically figuring out mappings between different databases after the fact. RDF has solutions for that: basically, you need to scope all the ids properly.
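
For example, in DataScript you can supply the entity id yourself (a minimal sketch):

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {}))

;; Supplying the entity id (1) explicitly means the same id can be
;; reproduced when the datoms are imported into another database.
(d/transact! conn [[:db/add 1 :person/name "Alice"]
                   [:db/add 1 :person/likes "pizza"]])
```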

➕ 1
lilactown 2020-09-11T19:55:42.030800Z

By “just call query again”, you mean calls to d/q?

pithyless 2020-09-11T21:37:31.031800Z

#TIL; also, this doesn't seem to be mentioned anywhere in the on-prem docs: https://docs.datomic.com/on-prem/query.html#built-in-expressions Is this just an oversight, are the cloud docs usually kept more up-to-date, or is this just not supported on-prem?