datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
joshkh 2020-11-03T12:46:21.282800Z

should i be able to upsert an entity via an attribute which is a reference and also unique-by-identity?

joshkh 2020-11-03T12:50:57.283400Z

for example

(d/transact conn
            {:tx-data
             [; upsert a school entity where :school/president is a reference and unique-by-identity
              {:school/president {:president/id 12345}
               :school/name      "Bowling Academy of the Sciences"}]})

favila 2020-11-03T13:06:51.283500Z

yes

favila 2020-11-03T13:07:13.283700Z

This is actually two upserts isn’t it? :president/id also?

vncz 2020-11-03T14:15:52.284600Z

Is there any specific reason why some kind of selection can only be done using the Peer Server?

favila 2020-11-03T14:16:35.284700Z

What do you mean by “selection”?

vncz 2020-11-03T14:16:48.284900Z

Let me give you an example

vncz 2020-11-03T14:17:49.285100Z

:find [?name ?surname] :in $ :where [?e :p/name ?name] [?e :p/surname ?surname]

vncz 2020-11-03T14:17:59.285300Z

This query cannot be executed by the peer library

vncz 2020-11-03T14:18:38.285600Z

This one can :find ?name ?surname :in $ :where [?e :p/name ?name] [?e :p/surname ?surname]

vncz 2020-11-03T14:18:39.285800Z

@favila

favila 2020-11-03T14:19:08.286Z

ah, ok, those are called “find specifications”

vncz 2020-11-03T14:20:03.286200Z

Yes, these ones. It seems like the Peer Library can only execute the "Collection of List" one

favila 2020-11-03T14:20:11.286400Z

and it’s the opposite: only the peer API supports these; the client API does not (the peer server just provides an endpoint for the client API)
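For reference, the four find specifications differ only in the `:find` clause; the queries below are illustrative fragments over the same hypothetical `:p/name` / `:p/surname` attributes used above:

```clojure
;; relation (the only form the client API supports): a set of tuples
[:find ?name ?surname :where [?e :p/name ?name] [?e :p/surname ?surname]]

;; single tuple: one [?name ?surname] vector
[:find [?name ?surname] :where [?e :p/name ?name] [?e :p/surname ?surname]]

;; collection: a vector of all ?name values
[:find [?name ...] :where [?e :p/name ?name]]

;; scalar: a single ?name value
[:find ?name . :where [?e :p/name ?name]]
```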

vncz 2020-11-03T14:20:38.286600Z

This is weird, I'm using Datomic-dev (which I guess is using the peer library?!) and I can't execute such queries

favila 2020-11-03T14:21:02.286800Z

dev-local?

vncz 2020-11-03T14:21:13.287Z

Yes

favila 2020-11-03T14:21:32.287200Z

that uses the client api. (require '[datomic.client.api])

favila 2020-11-03T14:21:43.287400Z

the peer api is datomic.api

vncz 2020-11-03T14:22:28.287600Z

favila 2020-11-03T14:22:47.288Z

correct

favila 2020-11-03T14:22:53.288200Z

but you are using a client api

favila 2020-11-03T14:23:00.288400Z

the client api does not support these

vncz 2020-11-03T14:23:07.288600Z

Hmm :thinking_face:

vncz 2020-11-03T14:23:22.288800Z

Ok so in theory I should just change the namespace requirement?

favila 2020-11-03T14:23:35.289Z

no, datomic.api is not supported by dev-local

vncz 2020-11-03T14:23:47.289200Z

Ah ok so there's no way around it basically

favila 2020-11-03T14:24:08.289400Z

Maybe historical background would help: in the beginning was datomic on-prem and the peer (`datomic.api` ), then came cloud and the client-api, and the peer-server as a bridge from clients to on-prem peers.

favila 2020-11-03T14:24:20.289800Z

dev-local is “local cloud”

favila 2020-11-03T14:24:24.290Z

that came even later

favila 2020-11-03T14:24:33.290200Z

(like, less than two months ago?)

vncz 2020-11-03T14:24:41.290400Z

Oh ok, so it's a simulation of a cloud environment. I guess I was confused by the fact that it's all in the same process

favila 2020-11-03T14:25:28.290600Z

the client-api is designed to be networked or in-process; in dev-local or inside an ion, it’s actually in-process

vncz 2020-11-03T14:25:56.290800Z

Got it. So to keep it short I should either move to Datomic Free on-prem or work around the limitation in the code

favila 2020-11-03T14:26:48.291Z

as to why they dropped the find specifications, I don’t know. My guess would be that people incorrectly thought that it actually changed the query performance characteristics, but actually it’s just a convenience for `first`, `map first`, etc

favila 2020-11-03T14:27:06.291200Z

the query does just as much work and produces a full result in either case
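Since the specifications are just post-processing sugar, they are easy to emulate on the client API. A minimal sketch over a made-up relation result:

```clojure
;; A relation result as the client API would return it: a collection of tuples.
(def result [["Ada" "Lovelace"] ["Alan" "Turing"]])

(def names    (mapv first result))  ; ≈ collection spec [?name ...]
(def one-row  (first result))       ; ≈ tuple spec [?name ?surname]
(def one-name (ffirst result))      ; ≈ scalar spec ?name .
```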

vncz 2020-11-03T14:27:18.291400Z

I could see these conveniences being useful though. Having to do that manually every time is annoying.

vncz 2020-11-03T14:27:22.291600Z

Not the end of the world, but still

joshkh 2020-11-03T15:26:42.292400Z

yes, you are correct and that is indeed the problem. it seems that you cannot upsert two entities that reference each other within the same transaction. for example, running this transaction twice causes a datom conflict

(d/transact conn
            {:tx-data
             [
              ; a president
              {:president/id "The Dude" :db/id "temp-president"}

              ; a school with a unique-by-identity 
              ; :school/president reference to the president
              {:school/president "temp-president"
               :school/name      "Bowling Academy of Sciences"}
              ]})
whereas both of these transactions upsert as expected
(d/transact conn
            {:tx-data
             [; a president
              {:president/id "The Dude" :db/id "temp-president"}
              ]})

(d/transact conn
            {:tx-data
             [; a school with a unique-by-identity 
              ; :school/president reference to the president
              {:school/president 101155069755476  ;<- known dbid
               :school/name      "Bowling Academy of Sciences"}]})

joshkh 2020-11-03T15:27:54.292600Z

(note the known eid in the second transaction)
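One way around the conflict (an untested sketch, assuming the president entity already exists from an earlier transaction) is to reference it with a lookup ref instead of a tempid, so the school transaction never re-asserts `:president/id`:

```clojure
(d/transact conn
            {:tx-data
             [; resolve the president by lookup ref rather than tempid;
              ; requires that an entity with :president/id "The Dude"
              ; already exists in the db
              {:school/president [:president/id "The Dude"]
               :school/name      "Bowling Academy of Sciences"}]})
```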

kschltz 2020-11-03T17:09:33.298300Z

Hi there. We've been facing an awkward situation with our Cloud system. From what I've seen of the Datomic Cloud architecture, it seemed like I could have several databases in the same system, as long as there are transactor machines available in my transactor group. With that in mind, we scaled our compute group to 20 machines to serve our 19 dbs. All went well for a few months, until 3-4 days ago, when we started having trouble transacting data, with "Busy Indexing" errors. If I'm not wrong this is due to our transactors being unable to ingest data at the same pace we are transacting it, or is there something else I'm missing here? Thanks :D

kschltz 2020-11-03T17:37:53.298800Z

@marciol

kschltz 2020-11-03T21:18:49.299400Z

Another odd thing is that my Dynamo Write Actual is really low, despite my IndexMemDb metric being really high

kschltz 2020-11-03T21:19:22.299600Z

I have 130 write units provisioned, but only 2 are used

tony.kay 2020-11-03T22:07:55.300300Z

are you running your application on the compute group? Or are you carefully directing clients to query groups that service a narrow number of dbs? If you hit the compute group randomly for app stuff, then you’re going to really stress the object cache on those nodes.

tony.kay 2020-11-03T22:08:52.300500Z

which will lead to segment thrashing and all manner of badness

kschltz 2020-11-03T22:09:14.300700Z

I'm pointing my client directly at the compute group

tony.kay 2020-11-03T22:10:43.301100Z

yeah, I don’t work for Cognitect, but my understanding of how it works leads me to the very strong belief that doing what you’re doing will not scale. Remember that each db needs its own RAM cache space for queries. The compute group has no db affinity, so with 20 dbs you end up causing every compute node to cache stuff for all 20 dbs.

kschltz 2020-11-03T22:11:06.301300Z

@tony.kay would you say it would be best if I transacted to a query group fed by a specific set of databases?

tony.kay 2020-11-03T22:11:40.301500Z

right, so a given user goes with a given db?

tony.kay 2020-11-03T22:12:11.301700Z

(a given user won’t need to query across all dbs?)

kschltz 2020-11-03T22:12:22.301900Z

From what I've read, transactions to query groups end up in the compute group

tony.kay 2020-11-03T22:12:31.302100Z

yes, but that is writes, not memory pressure

kschltz 2020-11-03T22:12:32.302300Z

this application is write only

tony.kay 2020-11-03T22:12:44.302500Z

writes always go to a primary compute node for the db in question. no way around that

tony.kay 2020-11-03T22:13:03.302700Z

the problem is probably that you’re also causing high memory and CPU pressure on those nodes for queries

tony.kay 2020-11-03T22:13:30.302900Z

you could also just be ingesting things faster than datomic can handle…that is also possible

tony.kay 2020-11-03T22:13:50.303100Z

but 20dbs on compute sounds like a recipe for trouble if you’re using that for general application traffic

kschltz 2020-11-03T22:14:37.303300Z

I tried shutting my services down and giving Datomic time to ingest, but to no avail. IndexMemDB is just a flat line

kschltz 2020-11-03T22:15:38.303500Z

I will give your suggestion a try, thanks in advance

tony.kay 2020-11-03T22:15:43.303700Z

there’s also the possibility that the txes themselves need to read enough of the 20 diff dbs to be causing mem problems. I’d contact support with a high prio ticket and see what they say.

tony.kay 2020-11-03T22:15:56.303900Z

could be something broke 🙂

kschltz 2020-11-03T22:17:39.304100Z

The way things are built, there is a client connection for each one of the databases; depending on the body of a tx, it is transacted to a specific db
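That routing might look roughly like the sketch below. All names here (`client`, the db list, `db-name-for`, the `:org` key) are made up for illustration, not the actual code:

```clojure
(require '[datomic.client.api :as d])

;; `client` is assumed to be an existing Datomic client, e.g. (d/client {...}).
;; One connection per database, built up front.
(def conns
  (into {}
        (for [db-name ["orders" "billing"]]  ; hypothetical db names
          [db-name (d/connect client {:db-name db-name})])))

;; Hypothetical dispatch: the tx body names its target db.
(defn db-name-for [tx] (:org tx))

(defn transact-routed [conn-map tx]
  (d/transact (conn-map (db-name-for tx)) {:tx-data [tx]}))
```

With this shape, pinning a query group to a subset of dbs would mean partitioning `conns` across client endpoints rather than changing the dispatch itself.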

tony.kay 2020-11-03T22:18:25.304400Z

the tx determines the db?

kschltz 2020-11-03T22:18:30.304600Z

yes

tony.kay 2020-11-03T22:19:29.304800Z

ooof. much harder to pin a limited set of dbs to a query group then.

tony.kay 2020-11-03T22:19:41.305Z

good luck

kschltz 2020-11-03T22:19:57.305200Z

Thanks