datomic

Ask questions on the official Q&A site at https://ask.datomic.com!

arohner 2020-07-29T12:24:13.491400Z

Are there any recommendations on the size of a cardinality-many attribute? Is there a problem with storing a million UUIDs under a single entity/attribute pair?

stuarthalloway 2020-07-29T12:53:26.491800Z

There is no particular limit, but you should keep in mind the memory implications of future use.

stuarthalloway 2020-07-29T12:54:41.492Z

For example, if you gradually build up 1 million EAs, and then retract the entire entity, the transaction that does the retraction will have 1 million datoms in it.

stuarthalloway 2020-07-29T12:55:56.492200Z

Also consider pull expressions, which might have been written (or displayed in a UI) with the presumption that their results are smallish and don't need to be e.g. paginated.

stuarthalloway 2020-07-29T12:58:06.492400Z

Programs consuming a high cardinality attribute may want to use https://docs.datomic.com/cloud/query/query-index-pull.html#aevt to consume in chunks.
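To make the chunking idea concrete, here is a minimal sketch of paging through a large value set. It is not from the thread: `chunked-values` and `fetch-page` are hypothetical names, and the commented wiring uses `d/datoms` with `:offset`/`:limit` from the Datomic client API instead of the linked index-pull, with `db`, `eid`, and `:user/tags` as placeholders.

```clojure
(defn chunked-values
  "Lazily concatenates pages returned by `fetch-page`, a hypothetical
  function of [offset limit] that returns a (possibly empty) collection.
  Stops at the first empty or short page."
  [fetch-page page-size]
  (letfn [(step [offset]
            (lazy-seq
              (let [page (fetch-page offset page-size)]
                (when (seq page)
                  (concat page
                          ;; a full page means there may be more to fetch
                          (when (= (count page) page-size)
                            (step (+ offset page-size))))))))]
    (step 0)))

;; Assumed wiring against a live database (not run here):
;; (require '[datomic.client.api :as d])
;; (chunked-values
;;   (fn [offset limit]
;;     (map :v (d/datoms db {:index      :eavt
;;                           :components [eid :user/tags]
;;                           :offset     offset
;;                           :limit      limit})))
;;   1000)
```

Because the pages are fetched lazily, a consumer that stops early never requests the remaining chunks.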

arohner 2020-07-29T13:14:11.492600Z

Thanks

souenzzo 2020-07-29T14:28:44.496800Z

Reminder: by default, pull returns only 1000 values for a cardinality-many attribute: https://docs.datomic.com/on-prem/pull.html#limit-option
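For reference, a short sketch of how the limit option appears in a pull pattern. `:order/items` is a hypothetical cardinality-many attribute used only for illustration; the `:limit` syntax is the one described on the linked page.

```clojure
;; :order/items is a hypothetical cardinality-many attribute.

;; Plain attribute name: the default limit of 1000 values applies.
(def default-pattern '[:order/items])

;; Explicit limit: return up to 5000 values.
(def bigger-pattern '[(:order/items :limit 5000)])

;; nil limit: return all values (can be very large for a big attribute).
(def unlimited-pattern '[(:order/items :limit nil)])
```

Any of these would be passed as the pattern (selector) argument to pull.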

kschltz 2020-07-29T14:37:12.003Z

Hi there. My current scenario is that we're using Datomic Cloud in one of our major services, at around 60M entities/3.5B datoms, and some particular queries are underperforming. As we plan to grow by some orders of magnitude, I was exploring alternatives to scale both our writes and reads. From my understanding so far, given that I'm able to scale the number of processors serving my dbs, and transactors don't compete for resources among those dbs, I started experimenting with the following:
1. Have my service write in parallel to multiple dbs (let's say db0, db1, db2, all with the same schema), ensuring that the same entity always ends up in the correct db so I don't end up with partial data split across my databases.
2. When querying, I issue the queries in parallel, then merge the results in my application, something like

(pcalls query-for-satellites0 query-for-satellites1 query-for-satellites2)

So far, this parallel read/write scenario has proven to be really performant. Now my question to you guys is whether I'm missing something, or whether there are any architectural gotchas that would make this a bad idea?
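The two steps described above could be sketched roughly as follows. This is an illustration under assumptions, not kschltz's actual code: `db-names`, `shard-for`, and `query-all` are hypothetical names, routing uses Clojure's `hash` on some stable entity key, and the query functions are stand-ins for real Datomic queries.

```clojure
;; Hypothetical database names, all assumed to share one schema.
(def db-names ["db0" "db1" "db2"])

(defn shard-for
  "Deterministically maps an entity's stable key (e.g. a UUID or external
  id) to one database name, so the same entity always lands in the same db.
  Note: changing (count db-names) remaps existing keys."
  [entity-key]
  (nth db-names (mod (hash entity-key) (count db-names))))

(defn query-all
  "Calls each no-arg query fn in parallel via pcalls and concatenates the
  per-database results into one vector, in the order the fns were given."
  [query-fns]
  (into [] cat (apply pcalls query-fns)))

;; Stand-in query fns; real ones would run the same query against each db.
(comment
  (query-all [(fn [] [:sat-a])
              (fn [] [:sat-b :sat-c])
              (fn [] [])]))
```

A plain concatenation only works when each entity lives in exactly one database; aggregates, sorting, and limits would need to be re-applied in the application after the merge.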

Nassin 2020-07-29T15:41:09.003100Z

For example, if you gradually build up 1 million EAs, and then retract the entire entity, the transaction that does the retraction will have 1 million datoms in it.

Nassin 2020-07-29T15:41:22.003300Z

Only if isComponent is true, correct?

favila 2020-07-29T17:07:59.004700Z

@kaxaw75836 no, isComponent will propagate the delete to other entities

favila 2020-07-29T17:09:21.004900Z

[E A 1millionV] is going to delete one million E datoms regardless of whether A is an isComponent attr.
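If the single huge retraction transaction is a concern, one possible approach is to retract the values in smaller batches first. The sketch below only builds the transaction data; `retraction-batches` is a hypothetical helper, not something suggested in the thread, and it trades away the atomicity of a single transaction.

```clojure
(defn retraction-batches
  "Returns a lazy seq of tx-data vectors, each holding at most `batch-size`
  [:db/retract e a v] forms for the given entity id and attribute."
  [eid attr values batch-size]
  (->> values
       (map (fn [v] [:db/retract eid attr v]))
       (partition-all batch-size)
       (map vec)))

;; Example with small numbers: three values, batches of two.
(comment
  (retraction-batches 1 :a [10 20 30] 2)
  ;; two batches: the first with two retractions, the second with one
  )
```

Each batch would then be submitted as the `:tx-data` of its own transaction, with a final `[:db/retractEntity eid]` once the large attribute has been emptied.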

Nassin 2020-07-29T17:14:04.005100Z

ah true, was thinking it was of type :db.type/ref 👍

kschltz 2020-07-29T20:03:20.005900Z

Yup