datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
2020-09-08T17:02:14.294100Z

I posted a question on the dev forum, but because it's on an old thread, I'm not sure it'll draw any attention, so I was wondering whether anyone here in Slack might be able to provide some insight into the behaviour of unique composite tuples that we are seeing? https://forum.datomic.com/t/upsert-behavior-with-composite-tuple-key/1075?u=jcarnegie

favila 2020-09-09T17:58:23.365700Z

what was it before?

favila 2020-09-09T18:01:03.365900Z

> is your scenario combining upserting of the components of the ref also with upserting of the composite ref itself?

That’s indeed what is happening. You are trying to allow repo upserting (one of the components of the ref) while also allowing the commit entity to upsert (the composite ref itself)

favila 2020-09-09T18:05:39.366100Z

I think this may be a complecting in :db.unique/identity itself. To identify an entity you generally can’t use refs--refs are internal identifiers, but :db/unique is to mark external identifiers

favila 2020-09-09T18:05:54.366300Z

but :db.unique/identity also has this upserting behavior you want

favila 2020-09-09T18:06:03.366500Z

which I’m guessing you want to use here for deduplication

favila 2020-09-09T18:07:57.366700Z

so, if the repository id never changes, and the commit->repo reference never changes, and repo id is always available to the application at tx time (I don’t see how it couldn’t be with this schema design) consider denormalizing by putting the repo id on the commit entity

favila 2020-09-09T18:08:42.366900Z

you can do this a few ways

favila 2020-09-09T18:09:28.367100Z

1. add :commit/repo-id, and make the :commit/id use that as one of its components (I suggest putting repo id first for better indexing)

favila 2020-09-09T18:09:54.367300Z

2. just write :commit/id as a tuple with those two values (don’t use a composite, just a tuple). This has the advantage of not adding a datom, but the disadvantage of being less clear

favila 2020-09-09T18:12:09.367700Z

both these have the advantage that you can now produce lookup refs for commits without a db: [:kipz.commit/id [commit-sha-string repo-id-string]]
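
A minimal sketch of option 1 as schema data, assuming a hypothetical attribute name `:kipz.commit/repo-id` (it is not in the schema posted in this thread) and putting the repo id first as suggested. The helper `commit-lookup-ref` is also hypothetical; it only shows that the lookup ref can be built from two strings without a database, with values in the same order as `:db/tupleAttrs`.

```clojure
;; Hypothetical variant of the commit schema: the commit carries its own
;; copy of the repo id as a string, so the composite has no ref component.
(def commit-schema
  [{:db/ident       :kipz.commit/sha
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}

   ;; Assumed attribute name, not part of the original schema.
   {:db/ident       :kipz.commit/repo-id
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}

   ;; Composite of two scalar attributes, repo id first.
   {:db/ident       :kipz.commit/id
    :db/valueType   :db.type/tuple
    :db/tupleAttrs  [:kipz.commit/repo-id :kipz.commit/sha]
    :db/unique      :db.unique/identity
    :db/cardinality :db.cardinality/one}])

(defn commit-lookup-ref
  "Builds a lookup ref for a commit from external values only.
  The value order must match :db/tupleAttrs above."
  [repo-id sha]
  [:kipz.commit/id [repo-id sha]])

(commit-lookup-ref "repo-id-1" "commit-sha-1")
```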

favila 2020-09-09T18:12:41.367900Z

this is not possible for ref composites generally--it’s that notion of external identity again

favila 2020-09-09T18:14:04.368100Z

alternatively, if you just want to enforce uniqueness, consider not using upserting or a composite attribute at all. You can query first and speculatively create entities if they’re not found, and use :db/ensure to allow the transaction to fail if you violate the constraint.

favila 2020-09-09T18:14:11.368300Z

(i.e. optimistic commit style)

favila 2020-09-09T18:14:47.368500Z

you can use some, none, or all indexes according to your preference and the concurrency of the workload

favila 2020-09-09T18:17:04.368700Z

e.g. here, you could look up the repo and use that id; and if not found create a repo but allow the tx to fail (using :db.unique/value instead of identity) if someone else made the same repo in the meantime. you can recalculate and reissue the tx

favila 2020-09-09T18:17:17.368900Z

(that doesn’t actually need :db/ensure at all)
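
The lookup-or-create loop described above can be sketched without any Datomic calls. Here `lookup` and `create!` are hypothetical stand-ins for a query and a transaction against a `:db.unique/value` attribute, and the sketch assumes a uniqueness conflict surfaces as an `ExceptionInfo`; check how your Datomic API actually reports it.

```clojure
(defn lookup-or-create!
  "Optimistic lookup-or-create. `lookup` returns the existing entity or nil.
  `create!` tries to create it and throws if another writer got there first.
  On a conflict the lookup is repeated, up to `max-attempts` times."
  [lookup create! max-attempts]
  (loop [attempt 1]
    (or (lookup)
        (let [result (try
                       (create!)
                       (catch clojure.lang.ExceptionInfo e
                         ;; Assumed conflict signal; rethrow once out of attempts.
                         (if (< attempt max-attempts)
                           ::retry
                           (throw e))))]
          (if (= ::retry result)
            (recur (inc attempt))
            result)))))

;; Usage with an in-memory stand-in for the database: the first create!
;; loses a simulated race, and the retry finds the other writer's entity.
(def store (atom {}))
(def create-calls (atom 0))

(lookup-or-create!
 #(get @store "repo-id-1")
 (fn []
   (if (= 1 (swap! create-calls inc))
     (do (swap! store assoc "repo-id-1" :created-by-other-writer)
         (throw (ex-info "unique conflict" {})))
     :created-by-us))
 3)
```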

favila 2020-09-09T18:17:46.369100Z

anyway, those are just some ideas

favila 2020-09-09T18:18:37.369300Z

I don’t think composite tuple upserting constraint resolution is a realistic expectation, because of performance: the transactor has a global write lock on the db (essentially) while it’s doing all this tempid resolution and composite tuple maintenance, so it has to be as fast as possible

favila 2020-09-09T18:21:14.369500Z

that said, you can always write a transaction function that does what you want. it would take the repo and the commits plus some DSL for your own tempid replacement for the other assertions you want to make on those entities, do the lookup-or-create, then replace your tempid and return the expanded transaction. Essentially implementing the upserting logic yourself before the transactor does tempid resolution

2020-09-15T14:02:28.448200Z

> so, if the repository id never changes, and the commit->repo reference never changes, and repo id is always available to the application at tx time (I don’t see how it couldn’t be with this schema design) consider denormalizing by putting the repo id on the commit entity

Yeah - I kind of added that external repo-id to simplify the example, but perhaps that just confused things. I had wanted repos to have unique composite tuples made from other attributes too.

2020-09-15T14:07:27.448600Z

Again - we've moved forwards with generating our own unique id attributes for all entities grounded in the attributes of those entities, and this leaves us free to use non-unique composite tuples as we like. This gives us the overall behaviour we like. However, to me, this feels like exactly the sort of constraint problem I want my database to solve for me and doesn't seem unreasonable - at least from the outside. In any case, I'm still wondering which use cases these unique composite tuples (as they are currently implemented) are suitable for.

2020-09-15T14:07:45.448800Z

Thanks for all your insights! 🙂

favila 2020-09-15T14:40:25.449Z

they are suitable for ensuring uniqueness violations fail a tx (vs upsert), and for having more-selective lookups

favila 2020-09-08T17:17:20.294400Z

This is a caveat of unique-identity composite ref tuples

favila 2020-09-08T17:17:34.294600Z

I would even say a “gotcha”

favila 2020-09-08T17:18:35.294800Z

however, I’m not sure there’s an easy fix. “upsertion” works by looking for a to-be-applied assertion with a tempid and an upserting attr and resolving the tempid to an existing id if the value matches an existing id

favila 2020-09-08T17:19:24.295Z

composite tuples need to look at a just-completed transaction, see which composite component attributes were “touched”, and add an additional datom to update the composite

favila 2020-09-08T17:19:43.295200Z

having upsertion resolve to a composite tuple would create a cycle here

favila 2020-09-08T17:20:30.295400Z

instead of two simple phases, it would become a constraint problem

2020-09-08T17:22:44.295600Z

Yeah, I see what you mean. It's just that this is the sort of thing that transaction isolation could give us, right?

favila 2020-09-08T17:23:26.295800Z

I’m not sure what you mean?

favila 2020-09-08T17:37:49.296Z

is your scenario combining upserting of the components of the ref also with upserting of the composite ref itself?

favila 2020-09-08T17:38:39.296200Z

I’m trying to imagine why you don’t either have an entity id already, or know you are creating the entity and thus cannot conflict

2020-09-08T17:49:03.296400Z

Well, yeah, this can be solved by issuing multiple transactions, but I'm trying to avoid that. The system itself receives events (the entities) from different sources at different times/orders with partial data - enough to create id's of (potentially) new entities that are required refs of other entities. So in general, we can't know, without issuing queries, if a particular entity already exists. So we want to upsert all the time, and we need a single transaction. Like I said in my post, we've solved this by generating unique id fields from our own external definition of composite-ids - and this is a bit of a pain (we have to manage the lifecycle of this schema and related code between clients). I'm wondering how folk are using these (unique) composite-tuples in the real world given how they currently work.
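
The thread does not show how these ids are generated. One hypothetical way to derive a stable id from an entity's identifying values is to hash them in a fixed order, so every client computes the same string. `derived-id` and the attribute `:kipz.commit/uid` (assumed to be a `:db.unique/identity` string) are illustrative names only.

```clojure
(import '(java.security MessageDigest))

(defn derived-id
  "Returns a hex SHA-256 digest of the ordered component values.
  Hypothetical helper, not part of Datomic."
  [& parts]
  (let [canonical (pr-str (vec parts)) ; e.g. ["repo-id-1" "commit-sha-1"]
        digest    (.digest (MessageDigest/getInstance "SHA-256")
                           (.getBytes canonical "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

(defn commit-tx
  "Transaction data in which both entities carry a scalar identity value,
  so no unique composite tuple is involved."
  [repo-id sha]
  [{:db/id        "r1"
    :kipz.repo/id repo-id}
   {:db/id            "c1"
    :kipz.commit/uid  (derived-id repo-id sha) ; assumed attribute
    :kipz.commit/sha  sha
    :kipz/repo        "r1"}])

(commit-tx "repo-id-1" "commit-sha-1")
```

Printing the parts as a vector before hashing keeps `("ab" "c")` and `("a" "bc")` from producing the same id.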

2020-09-08T17:50:07.296600Z

> I’m not sure what you mean?

What I mean is, it seems feasible that this constraint could be solved within the transaction if the datomic team wanted to implement this. I understand that it's currently not the case.

2020-09-08T17:52:02.296800Z

> is your scenario combining upserting of the components of the ref also with upserting of the composite ref itself?

I'm still trying to grok this 🙂 but I think so. I can post a little schema and transaction if you're interested?

favila 2020-09-08T17:59:13.297Z

yeah, I am

2020-09-08T19:14:16.297200Z

[
     ;; commit entity
     {:db/ident :kipz.commit/sha
      :db/cardinality :db.cardinality/one
      :db/valueType :db.type/string}

     {:db/ident :kipz/repo
      :db/cardinality :db.cardinality/one
      :db/valueType :db.type/ref}

     {:db/ident :kipz.commit/id
      :db/valueType :db.type/tuple
      :db/unique :db.unique/identity
      :db/tupleAttrs [:kipz.commit/sha
                      :kipz/repo]
      :db/cardinality :db.cardinality/one}

     ;; repo entity
     {:db/ident :kipz.repo/id
      :db/cardinality :db.cardinality/one
      :db/unique :db.unique/identity
      :db/valueType :db.type/string}

     {:db/ident :kipz.repo/name
      :db/cardinality :db.cardinality/one
      :db/valueType :db.type/string}

     {:db/ident :kipz.repo/owner
      :db/cardinality :db.cardinality/one
      :db/valueType :db.type/string}]

2020-09-08T19:14:35.297400Z

[[:db/add "r1" :kipz.repo/id "repo-id-1"]
 [:db/add "r1" :kipz.repo/name "repo-name-1"]
 [:db/add "r1" :kipz.repo/owner "repo-owner-1"]
 [:db/add "c1" :kipz.commit/sha "commit-sha-1"]
 ;; always fails without this, fails after first time with it
 ;; this is the line that the docs say we should never write, but it only works
 ;; with a specific known "r1" eid
 [:db/add "c1" :kipz.commit/id ["commit-sha-1" "r1"]]
 [:db/add "c1" :kipz/repo "r1"]]

2020-09-08T19:16:50.297900Z

:kipz.repo/id has been made a scalar to help show the issue

nando 2020-09-08T22:31:35.304400Z

Datomic beginner here. I have a question about schema evolution from initial experience. Developing a simple web app, I began with the following query to populate a form:

(defn find-nutrient [eid]
  (d/q '[:find ?eid ?name ?grams-in-stock ?purchase-url ?note 
         :keys eid name grams-in-stock purchase-url note 
         :in $ ?eid
         :where [?eid :nutrient/name ?name]
         [?eid :nutrient/grams-in-stock ?grams-in-stock]
         [?eid :nutrient/purchase-url ?purchase-url]
         [?eid :nutrient/note ?note]]
       (d/db conn) eid))
Got all CRUD operations working as expected. Delightful. Decided to add categories of nutrients to this app to work through using ref types. Made appropriate changes to schema and codebase, added a few categories, dropdown populates, all good, changed the find-nutrient function to the following:
(defn find-nutrient [eid]
  (d/q '[:find ?eid ?name ?grams-in-stock ?purchase-url ?note ?category-eid
         :keys eid name grams-in-stock purchase-url note category-eid
         :in $ ?eid
         :where [?eid :nutrient/name ?name]
         [?eid :nutrient/grams-in-stock ?grams-in-stock]
         [?eid :nutrient/purchase-url ?purchase-url]
         [?eid :nutrient/note ?note]
         [?eid :nutrient/category ?category-eid]]
       (d/db conn) eid))
Oops, and now the form no longer populates with existing data, because none of the entities have a category.

Nassin 2020-09-08T22:39:28.306800Z

have you read about the pull api?

Nassin 2020-09-08T22:39:39.307100Z

sounds more like a job for it

nando 2020-09-08T22:41:36.307800Z

Ok, I will look into it.

Nassin 2020-09-08T22:44:06.309Z

yes, did you add the new attribute to existing entities?

nando 2020-09-08T22:45:19.310200Z

I'm thinking ahead if this could be a potential issue with the evolution of a production app.

nando 2020-09-08T22:46:23.310900Z

There are only a few entities, so one has and the others don't.

nando 2020-09-08T22:49:08.312Z

I'll work on modifying the function to use the pull api and see what happens.

Nassin 2020-09-08T22:52:51.314300Z

yes, the pull api is designed for this

nando 2020-09-08T23:03:31.316200Z

Ok, got it working. Is there a handy way to specify the keys used in the map that is returned using the pull api? I didn't find one in a scan of the docs.

Nassin 2020-09-08T23:05:21.316800Z

like this? https://docs.datomic.com/cloud/query/query-pull.html#as-option

Nassin 2020-09-08T23:07:29.317700Z

clojure has select-keys
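
A small sketch of the `select-keys` route in plain Clojure: keep only the attributes the form needs and rename them with `clojure.set/rename-keys`. The `form-keys` mapping and `->form-map` are hypothetical names. The short keys mirror the `:keys` clause of the earlier query.

```clojure
(require '[clojure.set :as set])

;; Mapping from pulled attribute keys to the short keys the form uses.
(def form-keys
  {:db/id                   :eid
   :nutrient/name           :name
   :nutrient/grams-in-stock :grams-in-stock
   :nutrient/purchase-url   :purchase-url
   :nutrient/note           :note})

(defn ->form-map
  "Keeps only the known attributes of a pulled map and renames them.
  Attributes missing from the entity are simply absent from the result."
  [pulled]
  (-> pulled
      (select-keys (keys form-keys))
      (set/rename-keys form-keys)))

(->form-map {:db/id 1
             :nutrient/name "Vitamin A"
             :nutrient/grams-in-stock 40})
;; => {:eid 1, :name "Vitamin A", :grams-in-stock 40}
```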

nando 2020-09-08T23:09:25.318900Z

{:db/id 4611681620380876878,
 :nutrient/name "Vitamin A",
 :nutrient/grams-in-stock 40,
 :nutrient/purchase-url "http://www.link.com",
 :nutrient/note "beta carotene and palmitate",
 :nutrient/category #:db{:id 96757023244374}}
Here's the data returned from the pull. I'll see what I can do with it.

nando 2020-09-08T23:10:03.319300Z

Maybe I should be using the fully qualified keys in my web forms?

Nassin 2020-09-08T23:20:02.320200Z

don't see why not, as long as you are inside the same process one should rely on them IMO

Nassin 2020-09-08T23:25:56.320700Z

sometimes they aren't pretty to work with though

nando 2020-09-08T23:26:44.321400Z

How would I flatten :nutrient/category #:db{:id 96757023244374}}

nando 2020-09-08T23:27:26.321900Z

I'm not yet familiar with what #: designates

nando 2020-09-08T23:28:19.322300Z

I should just try to do it myself ...

Joe Lane 2020-09-08T23:30:37.323900Z

@nando That is just clojure shorthand for :nutrient/category {:db/id 96757023244374} .

nando 2020-09-08T23:31:26.324500Z

Oh! That helps a lot! Then a get-in should do it.
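
For reference, a short sketch in plain Clojure using the pulled map from above: the `#:db{...}` form reads as an ordinary map whose keys are in the `db` namespace, so `get-in` reaches the nested id directly. The var name `nutrient` is only for illustration.

```clojure
(def nutrient
  {:db/id 4611681620380876878
   :nutrient/name "Vitamin A"
   :nutrient/grams-in-stock 40
   :nutrient/purchase-url "http://www.link.com"
   :nutrient/note "beta carotene and palmitate"
   :nutrient/category #:db{:id 96757023244374}})

;; The namespaced-map literal is just shorthand for qualified keys.
(= #:db{:id 96757023244374} {:db/id 96757023244374})
;; => true

;; Flatten the nested ref down to its entity id.
(get-in nutrient [:nutrient/category :db/id])
;; => 96757023244374

;; For an entity without a category, get-in returns nil instead of failing.
(get-in (dissoc nutrient :nutrient/category) [:nutrient/category :db/id])
;; => nil
```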

Joe Lane 2020-09-08T23:34:06.326Z

Well, wait, what is the type of ref that :nutrient/category is pointing at?

nando 2020-09-08T23:34:47.326200Z

I'm sufficiently sorted out. Thanks very much @kaxaw75836 & @lanejo01

nando 2020-09-08T23:37:18.327Z

@lanejo01 It's essentially a string, a category name

nando 2020-09-08T23:38:13.327700Z

{:db/ident :nutrient/category
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/one}
{:db/ident :category/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db/unique :db.unique/identity
:db/doc "Nutrient category"}

Joe Lane 2020-09-08T23:40:35.329700Z

Cool. If you're using d/pull or using pull in a query you can supply a pull pattern like this.

(d/pull db '[:nutrient/name 
             :nutrient/grams-in-stock 
             :nutrient/purchase-url 
             :nutrient/note
             {:nutrient/category [:category/name]}] eid)

nando 2020-09-08T23:43:04.330400Z

Ok, got it!

Joe Lane 2020-09-08T23:44:50.331100Z

Have fun, reach out if you have more questions!

nando 2020-09-08T23:46:19.331300Z

This is fun!

Joe Lane 2020-09-08T23:47:54.331600Z

Are you using dev-local?

nando 2020-09-08T23:48:21.331900Z

Yes, I am.

Joe Lane 2020-09-08T23:49:46.333200Z

Cool, I would love to hear some feedback on your experience.

Joe Lane 2020-09-08T23:50:44.334900Z

If you were interested, of course 🙂

Joe Lane 2020-09-08T23:51:41.335700Z

Either way, glad to hear you think it's fun!

nando 2020-09-08T23:57:22.340200Z

I've always wanted to use datomic, for years, but it was difficult to find a sensible path in as a solo developer. For me, datomic justified learning clojure. Anyway, some weeks back I decided to bite the bullet and dive into developing an app to learn Clojure. I thought I was going to use next.jdbc and a relational database. Well, I had trouble getting the mysql driver to work with the mysql version installed on my dev laptop ... and the next day I decided to hell with it, I'm going to find a way to use datomic instead! So I started looking around and found Stu's message here that dev-local had been released the day before. 🙂