datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
kennytilton 2020-08-31T00:00:19.136200Z

As a noob, ident had my head spinning because everywhere I looked in Datomic I saw IDs. And Datomic is similar to AllegroGraph, which I grok OK, so it is not even alien technology. I guess idents are closest to enum in programmerspeak; they let us give a number a mnemonic (another term I would consider, along with alias). But wait, we are in the Land of Hickey, what does the dictionary say? "name noun 1. a word by which a thing is known, addressed, or referred to." Lovely. Btw, I do not like identifier because the entity-id is an identifier, and the real one needing no translation. Fun stuff. 🙂

2020-08-31T00:18:08.136400Z

Yep, it takes time to learn the language 🙂.

val_waeselynck 2020-08-31T00:29:38.139200Z

I would be very surprised if it did 🙂 that said, it probably keeps a cache of compiled Datalog queries somewhere in the JVM.

val_waeselynck 2020-08-31T00:31:41.139400Z

Datomic has a philosophy of "I'm not slowing down because you're watching me", it seems to me that logging every query would go against that philosophy, especially given the aim that most queries be fast walks through in-memory data structures.

2020-08-31T00:37:40.139600Z

seems reasonable. Having that log isn't a goal of mine, if it existed it might be useful though. I'm considering how to extend the datalog language so it could be used across multiple databases. I have no real plans on doing this, just a thought experiment.

2020-08-31T01:00:44.139900Z

So as-of doesn't give all attached facts past the given time? It gives just what was true as of that time.
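
A minimal sketch of that as-of behaviour, assuming the on-prem peer library (`datomic.api`) is on the classpath and a throwaway in-memory database is acceptable. The `:item/count` attribute, the `:item/widget` ident and the URI are made up for illustration.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://as-of-sketch-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def conn (d/connect uri))

;; Hypothetical schema: one cardinality-one long attribute.
@(d/transact conn [{:db/ident       :item/count
                    :db/valueType   :db.type/long
                    :db/cardinality :db.cardinality/one}])

;; Assert a value, remember that point in time, then overwrite the value.
(def tx1 @(d/transact conn [{:db/ident :item/widget :item/count 1}]))
(def t1 (d/basis-t (:db-after tx1)))
@(d/transact conn [[:db/add :item/widget :item/count 2]])

(def count-query '[:find ?n . :where [:item/widget :item/count ?n]])

;; Same query, two views of the same database.
(def now-count  (d/q count-query (d/db conn)))
(def then-count (d/q count-query (d/as-of (d/db conn) t1)))
```

Here `then-count` sees only what was true at `t1`, while `now-count` sees the later overwrite.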

favila 2020-08-31T01:10:15.140400Z

It can already do this?

favila 2020-08-31T01:14:04.144300Z

Not sure if you are thinking exactly of this, but you can already supply multiple data sources, :in $ds1 $ds2 then reference in pattern clauses by [$ds1 e-match a-match ...] or in rules by ($ds1 rulename ...)
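
A small sketch of the multi-source form, assuming the on-prem peer library (`datomic.api`). Plain collections of tuples stand in for the two databases so the example stays short; the data and the source names `$people` and `$cities` are invented.

```clojure
(require '[datomic.api :as d])

;; Two independent data sources, each a collection of [email value] tuples.
(def people [["ann@example.com" "Ann"]
             ["bob@example.com" "Bob"]])
(def cities [["ann@example.com" "Paris"]
             ["bob@example.com" "Oslo"]])

;; Each pattern clause names the source it matches against;
;; ?email joins the two sources.
(def name-and-city
  (d/q '[:find ?name ?city
         :in $people $cities
         :where
         [$people ?email ?name]
         [$cities ?email ?city]]
       people cities))
```

The same `:in` / `[$src ...]` shape applies when the inputs are real database values obtained from `d/db`.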

favila 2020-08-31T01:15:22.145900Z

Not having to mention the default $ name most of the time is syntax sugar

favila 2020-08-31T01:23:35.153500Z

Idents have enough unique properties vis-à-vis entity ids that it’s worth giving them another name imo. Entity id values are not user-controllable, are not a public contract, and should not be stored durably for long periods outside the system. Idents are all of those things (user-controllable, a public contract, safe to store durably), and are also reassignable and guaranteed unique. You can maybe think of them as a special case of lookup ref where the :db/ident attr is implied, although historically lookup refs came later.
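
To make the "special case of lookup ref" reading concrete, here is a hedged sketch using the on-prem peer library (`datomic.api`) against a throwaway in-memory database. The `:color/red` ident and the URI are hypothetical.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://ident-sketch-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def conn (d/connect uri))

;; An entity whose only assertion is its (hypothetical) ident.
@(d/transact conn [{:db/ident :color/red}])
(def db (d/db conn))

;; The ident alone, and the explicit lookup ref on :db/ident,
;; should resolve to the same system-assigned entity id.
(def eid        (d/entid db :color/red))
(def via-ref    (d/entid db [:db/ident :color/red]))

;; And the entity id maps back to the keyword.
(def round-trip (d/ident db eid))
```

Application code would normally hold on to `:color/red`, not the number bound to `eid`.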

favila 2020-08-31T01:28:56.154700Z

This may be helpful as a primer on the kinds of “identifying” Datomic has: https://docs.datomic.com/on-prem/identity.html

favila 2020-08-31T01:32:01.156700Z

An impl note: at least on on-prem there is a full in-memory map of idents to eids and vice versa, so they are faster forms of reference than normal unique attributes and that is also why you shouldn’t have very large numbers of them

favila 2020-08-31T01:35:53.158700Z

Idents are also valid even after retracted. This is so you can rename an attribute without breaking code—the old name will still work
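
A sketch of that rename behaviour, assuming the on-prem peer library (`datomic.api`) and an in-memory database. The attribute names `:person/nickname` and `:person/alias` are hypothetical.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://rename-sketch-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def conn (d/connect uri))

;; Install a hypothetical attribute and remember its entity id.
@(d/transact conn [{:db/ident       :person/nickname
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}])
(def eid (d/entid (d/db conn) :person/nickname))

;; Rename: assert a new :db/ident on the same entity.
@(d/transact conn [[:db/add :person/nickname :db/ident :person/alias]])
(def db (d/db conn))

(def old-name-eid (d/entid db :person/nickname)) ; old name still resolves
(def new-name-eid (d/entid db :person/alias))    ; new name resolves too
(def current-name (d/ident db eid))              ; reports the new name
```

Code that still says `:person/nickname` keeps working, which is the point of the old ident staying valid.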

kennytilton 2020-08-31T02:55:15.159200Z

Thx! I had indeed seen all that. I had not picked up, tho, that idents would work if stored outside the system even when entity-ids had changed. Interesting. But in that same section we see the sentence that actually, I recall, made me stop reading and look for a different tutorial:

When an entity has an ident, you can use that ident in place of the numeric identifier, e.g.
The syllables "id" and "ent" are doing the flight of the bumblebees in there. 🐝 The sentence before it was better. A little. The first half was fine. Then it got weird.
Idents associate a programmatic name (a keyword) with an entity id, by setting a value for the :db/ident attribute
That's great like NYC street signs are great if we already know where we are going. But yeah, "name" by itself would have its own challenges. I will close by noting that, if a db/ident is a name for an entity, then per Gertrude Stein:
The :db/ident of :db/ident is :db/ident.
I'd give that a 10.
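
The self-reference can be checked at a REPL. This is a sketch assuming the on-prem peer library (`datomic.api`) and a fresh in-memory database with an arbitrary URI; it only reads the built-in schema.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://self-ident-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def db (d/db (d/connect uri)))

;; Entity id of the :db/ident attribute itself.
(def ident-eid (d/entid db :db/ident))

;; The ident of that entity is, again, :db/ident.
(def ident-of-ident (d/ident db ident-eid))

;; The datom doing the work: [?e :db/ident :db/ident].
(def self-eid (d/q '[:find ?e . :where [?e :db/ident :db/ident]] db))
```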

v 2020-08-31T03:12:37.161100Z

There is a really good video on this by folks at Nubank. I believe it’s called 4 super powers of Datomic, where they talk about querying across multiple databases. Highly recommended.

Aron 2020-08-31T09:54:04.161700Z

fwiw, ident is also in the dictionary; it just means identification. I certainly see this choice as a much better name than "name", which is such an overloaded name for names that trying to google anything with it would be extremely frustrating. Definition of identification: a: an act of identifying : the state of being identified b: evidence of identity

kennytilton 2020-08-31T12:11:12.161900Z

Ah, but the identity of :db/ident is 10. :db/ident is just an enum, an alias, an aka. A good counterargument here is @favila's point that idents survive as external references where the numeric ID does not, but that is just an example of the power of idents as implemented by Datomic and given some operation on a database. (What would alter entity-ids of the "same" entity?) If one looks inside Datomic, one would see that the true identity of :db/ident is 10. 10 gets linked to :db/ident by that :db/ident being the value where the attribute is :db/ident and -- wait for it -- the entity-id is 10. :db/ident, after all, is just a namespaced keyword. This brings us to that other quagmire, tempid. With tempid we see we can have the numeric entity-id absent a :db/ident. And not the other way around. Do we sense the walls closing in on :db/ident? :)

favila 2020-08-31T12:50:23.162200Z

I think this is confusing two different concerns. an entity-id’s only purpose is to join facts asserted about the same “thing”. In that sense it is an “identity”, but a very weak one that isn’t aware of the meaning of the data. It’s also weak because it’s kind of an implementation detail of datomic: the only guarantee is that they will be referentially consistent, not that their values will be stable. In the google dictionary, this is the second meaning of the word “identity”

favila 2020-08-31T12:51:17.162400Z

in the data modeling domain, “identity” is the assertion that makes an entity “be” that thing. so :db/ident’s identity is not 10, 10's identity is :db/ident

favila 2020-08-31T12:51:58.162600Z

note also the attribute schema for these : :db/unique :db.unique/identity, i.e., this is an attribute that, when asserted, gives an entity an identity

favila 2020-08-31T12:55:00.163Z

> What would alter entity-ids of the “same” entity? These are rare or hypothetical in practice, but: 1. cognitect has said in the past that it couldn’t rule out entity renumbering in future versions of datomic. This is actually useful for performance because you can rearrange commonly-accessed values to be together in the datom sort order. (You can do this manually with on-prem using partitions, which are the top 20ish bits of an entity id. cloud stopped exposing partition control.)

favila 2020-08-31T12:56:57.163200Z

2. “decanting”, which is essentially a “git rebase”-like operation. You run through the transaction log of a db, and reapply the transactions with transformations to a second db. At Clubhouse we did this in order to renumber entity ids with partitions for performance. A key property of this operation is that entity ids are not guaranteed the same between the two dbs.

favila 2020-08-31T12:58:37.163400Z

3. base schema version changes. Datomic at some point introduced new base schema attributes (the version that introduced tuples.) To do this you need to install new attributes with the d/administer-system function. The entity id of these new attributes depends on what transaction in your particular db performed the upgrade--they are not the same on all dbs like the older attributes are.

kennytilton 2020-08-31T12:59:49.163600Z

Good points all. I am going to save all these for my Datomic tutorial before they scroll off the Clojurian history.

kennytilton 2020-08-31T13:21:23.163800Z

One perhaps acceptable result I am seeing, where tempids are involved, is that Datomic identities can be identity-less. I just transacted this twice:

[:db/add "Hi, Mom!" :movie/title "Harvey"]
:movie/title is not unique, in the tutorial schema or life, so good. But then the two entity IDs assigned are meaningless, if we agree that nothing that cannot survive the above DB transformations can be considered meaningful. E.g., after, say, a decanting, we can still retrieve those two entities along with their arbitrary eids, but we cannot pair them off before/after. I guess that is OK. The physical DB has identity, but the abstract application DB identity relies on the developer managing identity capably. How'm I doin?

favila 2020-08-31T13:31:24.164Z

sure, but I still think using the word “identity” to think about entity or temp ids is going to be confusing in the long run. Identity is from/for humans, ids are for machines

favila 2020-08-31T13:32:12.164200Z

a tempid is a degenerate case of an entity id: it’s an entity id that is referentially consistent, but scoped to a particular transaction instead of a particular db, so it’s even more short-lived

favila 2020-08-31T13:33:47.164400Z

if necessary to preserve references in the db, the tempid will be “upgraded” to a newly minted entity id that has a longer lifetime; or if it is involved in an assertion about an identity attribute, datomic will ensure that all other facts in the transaction about that tempid will be asserted on the same entity that has the same identity
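
Both outcomes for a tempid can be sketched side by side. This assumes a recent on-prem peer library (`datomic.api`, with string tempids) and an in-memory database. `:movie/title` comes from the thread; `:movie/imdb-id`, its value and the URI are hypothetical.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://tempid-sketch-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def conn (d/connect uri))

@(d/transact conn [{:db/ident       :movie/title
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one}
                   ;; Hypothetical identity attribute.
                   {:db/ident       :movie/imdb-id
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one
                    :db/unique      :db.unique/identity}])

;; No identity attribute involved: each transaction mints a new entity id.
(dotimes [_ 2]
  @(d/transact conn [[:db/add "Hi, Mom!" :movie/title "Harvey"]]))

;; Identity attribute involved: the second transaction's tempid resolves
;; to the entity that already has that identity (an upsert).
(dotimes [_ 2]
  @(d/transact conn [{:db/id         "m"
                      :movie/imdb-id "tt-hypothetical"
                      :movie/title   "Vertigo"}]))

(def db (d/db conn))
(def harvey-count
  (d/q '[:find (count ?e) . :where [?e :movie/title "Harvey"]] db))
(def vertigo-count
  (d/q '[:find (count ?e) . :where [?e :movie/title "Vertigo"]] db))
```

Two "Harvey" entities end up in the database but only one "Vertigo", because only the latter carried an identity assertion.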

favila 2020-08-31T13:34:51.164600Z

in an attribute/assertion-centric data model, entities are very intangible

favila 2020-08-31T13:35:20.164800Z

an entity id is even less than an autoincrement column in a relational db, which is already not very much

favila 2020-08-31T13:38:00.165Z

maybe another angle on this: an entity is a map-projection of all the facts asserted about one thing. that thing may not have an identifier--the only thing that identifies it is the collection of assertions about it

favila 2020-08-31T13:38:14.165200Z

(or the entity id--if you peek under the covers)

favila 2020-08-31T13:39:04.165400Z

this is another way of saying “entity ids are only for joining facts” and don’t really grant an entity identity

favila 2020-08-31T13:39:49.165600Z

If this were pure mathematics I’m sure they’d find a way to do this without an entity id

souenzzo 2020-08-31T14:19:00.166900Z

How do I get an older version of Datomic dev-tools? My project uses com.datomic/dev-local {:mvn/version "0.9.184"} and I can't find how to download it.

souenzzo 2020-09-01T14:27:42.173Z

@stuarthalloway a bit off-topic, but which maven repository do you use/recommend for small corps/personal experimentation?

stuarthalloway 2020-09-01T19:47:56.177Z

I tend to use an S3 bucket -- a maven repo is just a convention about files.

kennytilton 2020-08-31T14:21:41.167Z

"that thing may not have an identifier--the only thing that identifies it is the collection of assertions about it" and "this is another way of saying “entity ids are only for joining facts” and don’t really grant an entity identity". But how do you have the "it" in "collection of assertions about it" without sth designating object identity? Put another way, the "only" in "only for joining facts" seems unjust. 🙂 Joining the facts is where object identity begins, it seems.

favila 2020-08-31T14:30:41.167200Z

yes, that’s the epistemological model. there is no “there” there but what is said of it

favila 2020-08-31T14:33:25.167400Z

here’s a thought question which may clarify: when does an entity exist?

favila 2020-08-31T14:34:05.167600Z

can an entity id be said to exist or not exist? If so, what makes it exist or not exist?

Aron 2020-08-31T14:57:34.167800Z

If you can show it, it exists. Also, ident is not identity nor id. It's evidence of identification (or encoding for a process of identification). If you have an ident, you still need a db and additional work to get an identity. (hopefully I didn't stretch the analogy too far :D)

2020-08-31T16:32:05.168Z

Here is that video: https://www.youtube.com/watch?v=7lm3K8zVOdY

2020-08-31T17:03:57.168200Z

Thanks, I'll give it a watch.

stuarthalloway 2020-08-31T19:16:33.168600Z

I will look into that -- in the meantime, can you just update the dep to latest? They are all compatible.

kennytilton 2020-08-31T19:55:44.168800Z

"when does an entity exist?" An entity exists by definition. "When" depends on the frame of reference. Our frame of reference is a datomic database. There, when we assert a fact we get an entity ID, whether we like it or not. And we get an entity, but perhaps a ghost if we use a tempid with no :db/ident attribute. Ghosts thus are on the user. So I agree that ghosts and the edge cases of re-partitioning and decanting mean entity IDs are not part of object identity: that must be arranged by the user, via db/ident, a great, well, name for it!

favila 2020-08-31T20:29:17.169Z

I think the question is a trick question. Entities (and entity ids) don’t meaningfully exist or not-exist the same way a row in a sql table exists. Only assertions about entities (i.e. datoms) exist. If nothing is said about an entity, you can still pull that entity id and see nothing. I think you can even do this if the entity id has never been used before. You can also use an entity id only as the object (ref value) of a datom--so now you have something which is the object of assertions but never the subject of any assertions. When you write an application, “does this record exist” isn’t usefully answered by looking at entity ids but by looking for a datom with a unique-value or unique-identity attribute and matching value.
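
A hedged sketch of that kind of existence check, assuming the on-prem peer library (`datomic.api`) and an in-memory database. The `:user/email` attribute, the `user-exists?` helper, the sample data and the URI are all hypothetical.

```clojure
(require '[datomic.api :as d])

;; Assumption: throwaway in-memory database; the URI name is arbitrary.
(def uri (str "datomic:mem://exists-sketch-" (java.util.UUID/randomUUID)))
(d/create-database uri)
(def conn (d/connect uri))

;; Hypothetical unique-identity attribute.
@(d/transact conn [{:db/ident       :user/email
                    :db/valueType   :db.type/string
                    :db/cardinality :db.cardinality/one
                    :db/unique      :db.unique/identity}])
@(d/transact conn [{:user/email "ann@example.com"}])
(def db (d/db conn))

(defn user-exists?
  "True when some datom asserts the given email under :user/email.
  Asks about an assertion, not about an entity id."
  [db email]
  (some? (d/q '[:find ?e .
                :in $ ?email
                :where [?e :user/email ?email]]
              db email)))

(def known   (user-exists? db "ann@example.com"))
(def unknown (user-exists? db "nobody@example.com"))
```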

Aron 2020-08-31T20:34:20.169300Z

'by definition'? where is the definition stored then? 🙂 how do you access it?

kennytilton 2020-08-31T22:35:47.169700Z

"You can also use an entity id only as the object (ref value) of a datom--so now you have something which is the object of assertions but never the subject of any assertions." I think I tried using 42 as an entity-id (not a ref) and got yelled at. I should think I would at least have to have created a :db/entity with a nice name and used the resulting eid. In which case there would be one fact: the :db/ident value, yes?