datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
2020-08-30T04:51:02.120800Z

Does every transaction in Datomic Cloud go via DynamoDB? And if so, can DynamoDB Streams be used to listen/subscribe for transactions?

jaret 2020-09-01T13:51:25.172600Z

Yes, every transaction is written to durable storage. However, Cloud's use of DDB is opaque unless accessed from Datomic. So, I am not sure what insight you would get by reviewing the stream of writes other than "writes are occurring," which you could also see by monitoring with metrics in CloudWatch (txdatoms, txbatchbytes) or by reviewing the Cloud Dashboard (txes, Txbytes DDB usage, write count). I am interested in hearing more about what you're envisioning with DDB Streams; maybe there is something here we should look at as a feature for Datomic. 🙂

2020-09-01T19:53:49.180Z

I was thinking of the tx-report-queue functionality in the peer library and how that might be possible in Cloud.

2020-08-30T06:12:55.120900Z

Yes, it could be, but then you need to pull the entity, use the attribute to get a new db, and pull the second entity from that.

kennytilton 2020-08-30T13:18:13.125500Z

So just getting serious about Datomic I decided to list all the datoms in a newly created DB:

:db.error/insufficient-binding Insufficient binding of db clause: [?eid ?a ?v] would cause full scan
Reminds me of Mommy telling me I am going to put my eye out if I keep doing sth (rather too hopefully, I might add). I just want to see the initial set of predicates. I already listed all the :db/idents, that was fun. Hmm, maybe a lower level API call hitting indexes? I'll see what I can see.

kennytilton 2020-08-30T13:19:24.125600Z

Sorry, query was:

(d/q '[:find ?eid ?a ?v
       :in $
       :where [?eid ?a ?v]]
     (d/db cw))
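
Kenny mentions having already listed the :db/idents. A query of roughly the following shape would do that, because binding the attribute position avoids the full scan that the fully unbound clause [?eid ?a ?v] triggers. The query shape is an assumption for illustration, not Kenny's actual code, and the `d/q` call is left in a comment since it needs a live connection (`cw` is the name used in the message above).

```clojure
;; Sketch: the same query with the attribute bound to :db/ident,
;; so the clause no longer requires scanning every datom.
(def idents-query
  '[:find ?eid ?ident
    :where [?eid :db/ident ?ident]])

;; Against a real database (assumes d is datomic.client.api and cw is a connection):
;; (d/q idents-query (d/db cw))
```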

cjsauer 2020-08-30T13:25:44.126400Z

@hiskennyness you might try using the datoms api to do that: https://docs.datomic.com/cloud/query/raw-index-access.html

cjsauer 2020-08-30T13:28:17.128400Z

(datoms db {:index :eavt}) would give you an Iterable of all datoms in the db (on mobile, untested) https://docs.datomic.com/client-api/datomic.client.api.html#var-datoms
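
A minimal sketch of that suggestion, assuming datoms support keyword lookup of :a (which the `(map :a)` later in this thread relies on). `attr-ids` is a hypothetical helper name, and the sample data is made up so the snippet runs without a Datomic connection. Note that :a yields attribute entity ids, not idents.

```clojure
;; Hypothetical helper: distinct attribute ids among the first n datoms.
;; `datoms` is any seqable whose items support keyword lookup of :a,
;; e.g. the result of (d/datoms db {:index :eavt}) from datomic.client.api.
(defn attr-ids
  [datoms n]
  (->> datoms
       (take n)
       (map :a)
       distinct))

;; Against a real database (untested here; `conn` is assumed to exist):
;; (attr-ids (d/datoms (d/db conn) {:index :eavt}) 100)

;; Made-up stand-in data so the sketch runs on its own:
(def sample-datoms
  [{:e 10 :a 10 :v :db/ident}
   {:e 10 :a 40 :v 21}
   {:e 11 :a 10 :v :db.install/partition}])

(attr-ids sample-datoms 3) ; => (10 40)
```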

kennytilton 2020-08-30T13:31:27.128800Z

Doh!

:db.error/insufficient-binding Insufficient binding of db clause: [?eid ?a ?v] would cause full scan
That was
(->> (d/datoms (d/db cw)
               {:index :avet})
     (take 3)
     (map :a))
They saw us coming. 🙂

kennytilton 2020-08-30T13:33:56.129Z

Oh, hang on.....

kennytilton 2020-08-30T13:48:01.130600Z

OK, that works, my REPL was a mess. Thx!

cjsauer 2020-08-30T13:58:32.130900Z

Cool!

kennytilton 2020-08-30T14:26:02.131100Z

In the beginning....

([[:fressian/tag]]
 [[:db/txInstant]]
 [[:db/valueType]]
 [[:db.install/attribute]]
 [[:db/cardinality]]
 [[:db/fulltext]]
 [[:db.install/valueType]]
 [[:db/tupleType]]
 [[:db.install/partition]]
 [[:db/ident]]
 [[:db/unique]]
 [[:db/doc]])
Now I have to google "fressian". 🙂

kennytilton 2020-08-30T19:09:31.131300Z

Bingo: https://github.com/Datomic/fressian

kennytilton 2020-08-30T20:32:58.133100Z

OK, who picked the word "ident" for a "name"? :db/doc "Attribute used to uniquely name an entity." 🙂

Aron 2020-08-31T09:54:04.161700Z

fwiw, ident is also in the dictionary; it just means identification. I certainly see this choice as a much better name than "name", which is such an overloaded name for names that trying to google anything with it would be extremely frustrating. Definition of identification: a: an act of identifying : the state of being identified; b: evidence of identity

kennytilton 2020-08-31T12:11:12.161900Z

Ah, but the identity of :db/ident is 10. :db/ident is just an enum, an alias, an aka. A good counterargument here is @favila's point that idents survive as external references where the numeric ID does not, but that is just an example of the power of idents as implemented by Datomic and given some operation on a database. (What would alter entity-ids of the "same" entity?) If one looks inside Datomic, one would see that the true identity of :db/ident is 10. 10 gets linked to :db/ident by that :db/ident being the value where the attribute is :db/ident and --wait for it -- the entity-id is 10. :db/ident, after all, is just a namespaced keyword. This brings us to that other quagmire, tempid. With tempid we see we can have the numeric entity-id absent a :db/ident. And not the other way around. Do we sense the walls closing in on :db/ident? :)

favila 2020-08-31T12:50:23.162200Z

I think this is confusing two different concerns. an entity-id’s only purpose is to join facts asserted about the same “thing”. In that sense it is an “identity”, but a very weak one that isn’t aware of the meaning of the data. It’s also weak because it’s kind of an implementation detail of datomic: the only guarantee is that they will be referentially consistent, not that their values will be stable. In the google dictionary, this is the second meaning of the word “identity”

favila 2020-08-31T12:51:17.162400Z

in the data modeling domain, “identity” is the assertion that makes an entity “be” that thing. so :db/ident’s identity is not 10, 10's identity is :db/ident

favila 2020-08-31T12:51:58.162600Z

note also the attribute schema for these : :db/unique :db.unique/identity, i.e., this is an attribute that, when asserted, gives an entity an identity

favila 2020-08-31T12:55:00.163Z

> What would alter entity-ids of the “same” entity?
These are rare or hypothetical in practice, but:
1. cognitect has said in the past that it couldn’t rule out entity renumbering in future versions of datomic. This is actually useful for performance because you can rearrange commonly-accessed values to be together in the datom sort order. (You can do this manually with on-prem using partitions, which are the top 20ish bits of an entity id. cloud stopped exposing partition control.)

favila 2020-08-31T12:56:57.163200Z

2. “decanting”, which is essentially a “git rebase”-like operation. You run through the transaction log of a db, and reapply the transactions with transformations to a second db. At Clubhouse we did this in order to renumber entity ids with partitions for performance. A key property of this operation is that entity ids are not guaranteed the same between the two dbs.

favila 2020-08-31T12:58:37.163400Z

3. base schema version changes. Datomic at some point introduced new base schema attributes (the version that introduced tuples.) To do this you need to install new attributes with the d/administer-system function. The entity id of these new attributes depends on what transaction in your particular db performed the upgrade--they are not the same on all dbs like the older attributes are.
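
The "decanting" idea in point 2 can be illustrated with a toy model in plain Clojure. This is only a sketch under stated assumptions: datoms are modeled as [e a v] vectors, the log as a sequence of transactions, `decant` is a hypothetical function, and ref values and real transformations are ignored. It is not how Datomic or any decanting tool is implemented.

```clojure
;; Toy model: replay a transaction log into a "second db", handing out
;; fresh entity ids in order of first appearance. Ref values are ignored.
(defn decant
  [log first-new-eid]
  (let [old-eids (distinct (for [tx log, [e _ _] tx] e))
        remap    (zipmap old-eids (iterate inc first-new-eid))]
    (vec (for [tx log, [e a v] tx]
           [(remap e) a v]))))

;; Made-up log with two transactions:
(def log
  [[[101 :movie/title "Harvey"]]
   [[102 :movie/title "Harvey"]
    [101 :movie/year 1950]]])

(decant log 5000)
;; => [[5000 :movie/title "Harvey"]
;;     [5001 :movie/title "Harvey"]
;;     [5000 :movie/year 1950]]
```

The facts stay joined consistently (both facts about old entity 101 land on 5000), but the numbers themselves differ between the two dbs, which is the property described above.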

kennytilton 2020-08-31T12:59:49.163600Z

Good points all. I am going to save all these for my Datomic tutorial before they scroll off the Clojurian history.

kennytilton 2020-08-31T13:21:23.163800Z

One perhaps acceptable result I am seeing, where tempids are involved, is that datomic identities can be identity-less. I just transacted this twice:

[:db/add "Hi, Mom!" :movie/title "Harvey"]
:movie/title is not unique, in the tutorial schema or life, so good. But then the two entity IDs assigned are meaningless, if we agree that nothing that cannot survive the above DB transformations can be considered meaningful. E.g., after, say, a decanting, we can still retrieve those two entities along with their arbitrary eids, but we cannot pair them off before/after. I guess that is OK. The physical DB has identity, but the abstract application DB identity relies on the developer managing identity capably. How'm I doin?

favila 2020-08-31T13:31:24.164Z

sure, but I still think using the word “identity” to think about entity or temp ids is going to be confusing in the long run. Identity is from/for humans, ids are for machines

favila 2020-08-31T13:32:12.164200Z

a tempid is a degenerate case of an entity id: it’s an entity id that is referentially consistent, but scoped to a particular transaction instead of a particular db, so it’s even more short-lived

favila 2020-08-31T13:33:47.164400Z

if necessary to preserve references in the db, the tempid will be “upgraded” to a newly minted entity id that has a longer lifetime; or if it is involved in an assertion about an identity attribute, datomic will ensure that all other facts in the transaction about that tempid will be asserted on the same entity that has the same identity
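
The two outcomes described above (mint a new entity id, or resolve to the existing entity that already holds the identity value) can be sketched as a toy model. Everything here is an assumption for illustration: `resolve-tempids` is a hypothetical function, datoms are plain [e a v] vectors, and :user/email and :user/name are made-up attributes. Datomic's real resolution rules are richer than this.

```clojure
;; Toy model of tempid resolution within one transaction.
;; db           - collection of existing [e a v] datoms
;; unique-attrs - set of attributes treated as unique identities
;; next-eid     - first entity id available for minting
;; tx           - seq of [tempid a v] assertions
(defn resolve-tempids
  [db unique-attrs next-eid tx]
  (let [existing (into {} (for [[e a v] db :when (unique-attrs a)] [[a v] e]))]
    (first
     (reduce (fn [[resolved n] tempid]
               (if-let [e (some (fn [[t a v]]
                                  (when (= t tempid) (existing [a v])))
                                tx)]
                 ;; An identity assertion matches an existing entity: reuse it.
                 [(assoc resolved tempid e) n]
                 ;; Otherwise mint a new entity id.
                 [(assoc resolved tempid n) (inc n)]))
             [{} next-eid]
             (distinct (map first tx))))))

(resolve-tempids #{[7 :user/email "a@example.org"]}
                 #{:user/email}
                 100
                 [["t1" :user/email "a@example.org"]
                  ["t1" :user/name "Ann"]
                  ["t2" :movie/title "Harvey"]])
;; => {"t1" 7, "t2" 100}
```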

favila 2020-08-31T13:34:51.164600Z

in an attribute/assertion-centric data model, entities are very intangible

favila 2020-08-31T13:35:20.164800Z

an entity id is even less than an autoincrement column in a relational db, which is already not very much

favila 2020-08-31T13:38:00.165Z

maybe another angle on this: an entity is a map-projection of all the facts asserted about one thing. that thing may not have an identifier--the only thing that identifies it is the collection of assertions about it

favila 2020-08-31T13:38:14.165200Z

(or the entity id--if you peek under the covers)
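
A small sketch of "an entity is a map-projection of all the facts asserted about one thing", using plain Clojure data rather than Datomic. The [e a v] vectors, the entity ids, :movie/year, and the `entity-map` helper are all made up for illustration, and only cardinality-one attributes are handled.

```clojure
;; Toy datoms as [e a v] vectors.
(def datoms
  [[101 :movie/title "Harvey"]
   [101 :movie/year 1950]
   [102 :movie/title "Harvey"]])

(defn entity-map
  "Projects every fact asserted about entity id `e` into a map."
  [datoms e]
  (into {} (for [[e' a v] datoms :when (= e' e)] [a v])))

(entity-map datoms 101) ; => {:movie/title "Harvey", :movie/year 1950}
(entity-map datoms 999) ; => {}
```

In this model the entity id does nothing except join facts: 101 and 102 carry the same title, and only the id keeps them apart.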

favila 2020-08-31T13:39:04.165400Z

this is another way of saying “entity ids are only for joining facts” and don’t really grant an entity identity

favila 2020-08-31T13:39:49.165600Z

If this were pure mathematics I’m sure they’d find a way to do this without an entity id

kennytilton 2020-08-31T14:21:41.167Z

"that thing may not have an identifier--the only thing that identifies it is the collection of assertions about it" and "this is another way of saying “entity ids are only for joining facts” and don’t really grant an entity identity". But how do you have the "it" in "collection of assertions about it" without sth designating object identity? Put another way, the "only" in "only for joining facts" seems unjust. 🙂 Joining the facts is where object identity begins, it seems.

favila 2020-08-31T14:30:41.167200Z

yes, that’s the epistemological model. there is no “there” there but what is said of it

favila 2020-08-31T14:33:25.167400Z

here’s a thought question which may clarify: when does an entity exist?

favila 2020-08-31T14:34:05.167600Z

can an entity id be said to exist or not exist? If so, what makes it exist or not exist?

Aron 2020-08-31T14:57:34.167800Z

If you can show it, it exists. Also, ident is not identity nor id. It's evidence of identification (or encoding for a process of identification). If you have an ident, you still need a db and additional work to get an identity. (hopefully I didn't stretch the analogy too far :D)

kennytilton 2020-08-31T19:55:44.168800Z

"when does an entity exist?" An entity exists by definition. "When" depends on the frame of reference. Our frame of reference is a datomic database. There, when we assert a fact we get an entity ID, whether we like it or not. And we get an entity, but perhaps a ghost if we use a tempid with no :db/ident attribute. Ghosts thus are on the user. So I agree that ghosts and the edge cases of re-partitioning and decanting mean entity IDs are not part of object identity: that must be arranged by the user, via db/ident, a great, well, name for it!

favila 2020-08-31T20:29:17.169Z

I think the question is a trick question. Entities (and entity ids) don’t meaningfully exist or not-exist the same way a row in a sql table exists. Only assertions about entities (i.e. datoms) exist. if nothing is said about an entity, you can still pull that entity id and see nothing. I think you can even do this if the entity id has never been used before. You can also use an entity id only as the object (ref value) of a datom--so now you have something which is the object of assertions but never the subject of any assertions. when you write an application, “does this record exist” isn’t usefully answered by looking at entity ids but by looking for a datom with a unique-value or unique-identity attribute and matching value.
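
The last point, answering "does this record exist" by looking for a datom with a unique attribute and a matching value, might look like this in a toy model. The [e a v] vectors, the :user/email attribute, and the `record-exists?` helper are assumptions for illustration; against a real database this would be a query or lookup on the unique attribute instead.

```clojure
;; Toy datoms as [e a v] vectors; :user/email stands in for a unique attribute.
(def datoms
  [[7   :user/email "a@example.org"]
   [7   :user/name  "Ann"]
   [101 :movie/title "Harvey"]])

(defn record-exists?
  "True when some datom asserts `value` for `attr`, regardless of entity id."
  [datoms attr value]
  (boolean (some (fn [[_ a v]] (and (= a attr) (= v value))) datoms)))

(record-exists? datoms :user/email "a@example.org") ; => true
(record-exists? datoms :user/email "b@example.org") ; => false
```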

Aron 2020-08-31T20:34:20.169300Z

'by definition'? where is the definition stored then? 🙂 how do you access it?

kennytilton 2020-08-31T22:35:47.169700Z

"You can also use an entity id only as the object (ref value) of a datom--so now you have something which is the object of assertions but never the subject of any assertions." I think I tried using 42 as an entity-id (not a ref) and got yelled at. I should think I would at least have to have created a :db/entity with a nice name and used the resulting eid. In which case there would be one fact: the :db/ident value, yes?

Aron 2020-09-01T05:16:21.170Z

I don't think you should care so much about entity ids. And no, I don't think an entity can exist just with one datom, but I could be wrong, haven't checked it.

seancorfield 2020-09-01T05:47:42.170200Z

@hiskennyness :
> Good points all. I am going to save all these for my Datomic tutorial before they scroll off the Clojurian history.
This channel is archived to Zulip as https://clojurians.zulipchat.com/#narrow/stream/180378-slack-archive/topic/datomic which is a full, searchable history back to whenever the @zulip-mirror-bot was added to this channel.

seancorfield 2020-09-01T05:48:24.170600Z

(I figured this was a good opportunity to remind folks of the free, searchable archive of many channels here!)

Aron 2020-08-30T20:43:40.133300Z

why

favila 2020-08-30T22:04:26.134200Z

Ident as in “identifier”, stronger than a name

2020-08-30T23:00:54.135Z

Does Datomic keep a log of all the queries that were made?

2020-08-31T16:32:05.168Z

Here is that video: https://www.youtube.com/watch?v=7lm3K8zVOdY

2020-08-31T17:03:57.168200Z

Thanks, I'll give it a watch.

2020-09-01T10:51:17.171100Z

For development we have a custom implementation of d/q-protocol which sends queries to tap>. It lets us see the performance of specific queries. We just wrap the existing connection in our system-map.
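
A simplified sketch of the same idea. It swaps in a plain wrapper function rather than the custom protocol implementation described above, whose details are not given here. `timed-q` is a hypothetical name; it assumes a query function that takes a single arg-map with a :query key, as the client API's `d/q` does, and uses `tap>` from Clojure 1.10+.

```clojure
(defn timed-q
  "Calls `q-fn` with `arg-map`, reports the elapsed time via tap>,
   and returns the query result unchanged."
  [q-fn arg-map]
  (let [start  (System/nanoTime)
        result (q-fn arg-map)
        ms     (/ (- (System/nanoTime) start) 1e6)]
    (tap> {:query (:query arg-map) :elapsed-ms ms})
    result))

;; With Datomic this might be called as
;; (timed-q d/q {:query '[...] :args [db]}).
;; A stand-in query function keeps the sketch runnable on its own:
(timed-q (fn [_] #{[:db/ident]})
         {:query '[:find ?a :where [_ :db/ident ?a]]})
```

Registering a tap such as `(add-tap prn)` at the REPL would print each report as queries run.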

2020-08-30T23:05:07.135100Z

It works better than "name" because "name" would be hard to search for. As in, name is too widely used to refer to something this specific.

2020-08-30T23:07:33.135300Z

Which is exactly what favila is saying upon reflection.

kennytilton 2020-08-30T23:21:58.135500Z

Puh-leaze! 🙂 We are now corrupting our coding to optimize SEO?! Man, is this full circle or what? Btw, had they used the full expansion "identifier" I would not have my shorts in such a bunch. ps. Yes, it is pretty funny when we have to google something that is such an ordinary word. Funny as in hopeless!

2020-08-30T23:49:36.135800Z

@hiskennyness I meant it would be harder to communicate to other developers. In either case the name "name" or "ident" will need further clarification. But I agree "name" also works.