datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
zilti 2020-10-08T01:21:15.310100Z

Is it a known bug that when there's a bunch of datums that get transacted simultaneously, it can randomly cause a :db.error/tempid-not-an-entity tempid '17503138' used only as value in transaction error?

zilti 2020-10-08T01:22:20.311Z

It is often caused by one single entry that has the same structure as many others. Everything is fine, but for some reason Datomic doesn't like it. Removing that one entry solves the problem.

marshall 2020-10-08T12:55:41.336500Z

why are both of those entity maps in separate vectors? If you’re adding them with d/transact, all of the entity maps and/or datoms passed under the :tx-data key need to be in the same collection

marshall 2020-10-08T12:56:17.336700Z

based on the problem you described, I would expect that error if you transacted the first of those, and then tried the second of those in a separate transaction

marshall 2020-10-08T12:56:26.336900Z

if they’re asserted in the same single transaction it should be fine

zilti 2020-10-08T01:23:18.311300Z

Ordering of the entries in the transaction vector doesn't seem to matter either

zilti 2020-10-08T01:26:53.311600Z

The two datums causing problems:

[{:account/photo
   "REDACTED",
   :account/first-name "REDACTED",
   :account/bio
   "REDACTED",
   :account/email-verified? false,
   :account/location 2643743,
   :account/vendor-skills [17592186045491],
   :account/id #uuid "dd33747e-5c13-4779-8c23-9042460eb3f3",
   :account/vendor-industry-experiences [],
   :account/languages [17592186045618 17592186045620],
   :account/vendor-specialism 17592186045640,
   :account/links
   [{:db/id "REDACTED",
     :link/id #uuid "ea51184c-d027-44d0-8f20-df222e58daf3",
     :link/type :link-type/twitter,
     :link/url "REDACTED"}
    {:db/id
     "REDACTED",
     :link/id #uuid "c9577ca4-332d-41f0-b617-c00e89fc94b4",
     :link/type :link-type/linkedin,
     :link/url
     "REDACTED"}],
   :account/last-name "REDACTED",
   :account/email "REDACTED",
   :account/vendor-geo-expertises
   [17592186045655 17592186045740 17592186045648],
   :db/id "17503138",
   :account/vendor-type 17592186045484,
   :account/roles [:account.role/vendor-admin],
   :account/job-title "Investor"}]
and
[{:account/primary-account "17503138",
   :company/headline "REDACTED",
   :account/accounts ["17503138"],
   :tenant/tenants [[:tenant/name "REDACTED"]],
   :company/name "REDACTED",
   :company/types [:company.type/contact],
   :db/id "REDACTED",
   :company/id #uuid "ee26b11f-53ba-43f9-a59b-f7ad1a408d41",
   :company/domain "REDACTED"}]

favila 2020-10-08T01:36:56.314900Z

The meaning of this error is that the string “17503138” is used as a tempid in the value position of an assertion, but nowhere as the entity id of an assertion; the latter is necessary for Datomic to decide whether to mint a new entity id or resolve the tempid to an existing one
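
A minimal sketch of that failure mode, assuming the on-prem peer API (d/transact on a conn) and reusing attribute names from the maps above; the email value is a placeholder:

;; The tempid "17503138" appears only in value position, so this fails:
@(d/transact conn [{:db/id "t2", :account/primary-account "17503138"}])
;; => :db.error/tempid-not-an-entity

;; Any assertion with the tempid in entity position lets Datomic resolve it:
@(d/transact conn [{:db/id "17503138", :account/email "placeholder@example.com"}
                   {:db/id "t2", :account/primary-account "17503138"}])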

zilti 2020-10-08T01:37:46.315500Z

Well, as you can see in the actual datums I posted, it clearly is being used as :db/id.

zilti 2020-10-08T01:38:29.316600Z

I had my program dump all datums into a file before transacting, and I copied the two that refer to this string over into here

favila 2020-10-08T01:38:52.317300Z

In your example, I see the second item says :account/accounts “17503138”. Are both these maps together in the same transaction?

favila 2020-10-08T01:39:43.318400Z

(Btw a map is not a datum but syntax sugar for many assertions—it’s a bit confusing to call it that)
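
For reference, the expansion favila is describing: an entity map is shorthand for one assertion per attribute/value pair. Using values from the map above:

{:db/id "17503138", :account/job-title "Investor", :account/email-verified? false}
;; is sugar for
[:db/add "17503138" :account/job-title "Investor"]
[:db/add "17503138" :account/email-verified? false]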

zilti 2020-10-08T01:40:21.318800Z

Yes, they are both together in the same transaction. True, I mixed up the terminology... Entity would be more fitting

favila 2020-10-08T01:43:48.320800Z

If they are indeed both in the same tx I would call that a bug. Can you reproduce?

favila 2020-10-08T01:44:10.321500Z

Why is each map in its own vector?

zilti 2020-10-08T01:44:26.321700Z

Yes, reliably, every time with the same dataset. Both locally with a dev database as well as on our staging server using PostgreSQL.

zilti 2020-10-08T01:45:08.322500Z

Conformity wants it that way, for some reason

favila 2020-10-08T01:45:27.322900Z

Conformity for data?

zilti 2020-10-08T01:45:32.323300Z

I had that same issue a while back in a normal transaction without conformity as well though

favila 2020-10-08T01:45:45.323800Z

Separate vectors in conformity means separate transactions...

zilti 2020-10-08T01:45:46.324Z

The migration library called conformity

favila 2020-10-08T01:48:01.327300Z

I’ve only ever used conformity for schema migrations; using it for data seems novel; but I’m suspicious that these are really not in the same transaction

favila 2020-10-08T01:49:51.328900Z

See if you can get it to dump the full transaction that fails and make sure both maps mentioning that tempid are in the same transaction
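
For context on the vectors question: in a conformity norm map, :txes is a sequence of transactions, each itself a vector of tx data, so separate inner vectors run as separate transactions. A sketch, with account-map and company-map standing for the two maps posted above and a hypothetical norm name:

;; two transactions: the string tempid cannot resolve across them
{:accounts/import {:txes [[account-map]
                          [company-map]]}}

;; one transaction: both maps see the same tempid "17503138"
{:accounts/import {:txes [[account-map company-map]]}}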

Adrian Smith 2020-10-08T09:02:00.330100Z

During a meetup recording that I haven't uploaded yet, I recorded my own private Maven token from https://cognitect.com/dev-tools/view-creds.html. Is there a way I can regenerate that token?

Black 2020-10-08T11:38:00.334300Z

Hey, I'm just missing something and can't figure out what. I am calling a transaction on Datomic:

(defn add-source [conn {:keys [id name]
                        :or {id (d/squuid)}}]
  (let [tx {;; Source initial state
            :db/id                    (d/tempid :db.part/user)
            :source/id                id
            :source/storage-type      :source.storage-type/disk
            :source/job-status        :source.job-status/dispatched
            :source/created           (java.util.Date.)
            :source/name              name}]
    @(d/transact conn [tx])))

;; and then later API will call
(add-source conn entity-data)
After I call add-source an entity is created, but after another call is made the old entity is overwritten. Only if I call transact with multiple transactions can I create multiple entities; otherwise the old entity is overwritten. I am new to Datomic, and I can't find any resources about this. Can anyone help?

favila 2020-10-08T12:19:50.334500Z

tempids resolve to existing entities if you assert a :db.unique/identity attribute value on them that already exists. Are any of these attributes :db.unique/identity? Are you sure you are not supplying an id argument to your function?
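
A hedged sketch of that upsert behavior, using the :source/name attribute that turns out below to carry :db.unique/identity; the name value is a placeholder:

;; first call mints a new entity
@(d/transact conn [{:db/id (d/tempid :db.part/user)
                    :source/name "my-source"}])

;; second call: the tempid resolves to the EXISTING entity whose
;; :source/name is "my-source", so these assertions update that entity
;; instead of creating a new one
@(d/transact conn [{:db/id (d/tempid :db.part/user)
                    :source/name "my-source"
                    :source/job-status :source.job-status/dispatched}])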

favila 2020-10-08T12:20:25.334700Z

(btw I would separate transaction data creation into a separate function so it’s easier to inspect)

Black 2020-10-08T12:23:18.334900Z

{:db/doc                "Source ID"
 :db/ident              :source/id
 :db/valueType          :db.type/uuid
 :db/cardinality        :db.cardinality/one
 :db/id                 #db/id [:db.part/db]
 :db.install/_attribute :db.part/db}

Black 2020-10-08T12:23:45.335100Z

this is the schema for :source/id; I am not using :db.unique/identity

Black 2020-10-08T12:24:51.335300Z

And I agree with separating tx creation out, but first I would like to get it working

Black 2020-10-08T12:25:43.335500Z

If I removed :db/id from the transaction, I should still be able to create a new entity, right? But every time the first one is overwritten

favila 2020-10-08T12:26:13.335700Z

can you give a clearer got/expected case? maybe a repl console session?

favila 2020-10-08T12:28:10.335900Z

something that shows you calling add-source twice with the returned tx data, and pointing out what you think is wrong with the result of the second call?

Black 2020-10-08T12:42:43.336100Z

Ok, I had unique on another attribute:

{:db/doc "Source name"
 :db/ident :source/name
 :db/unique :db.unique/identity
 :db/valueType :db.type/string
 :db/cardinality :db.cardinality/one
 :db/id #db/id [:db.part/db]
 :db.install/_attribute :db.part/db}
If I remove it, all entities are created and it works how I expected. So I will read more about unique attributes. Thanks @favila, I would not have noticed it without your help!

marshall 2020-10-08T13:10:39.337100Z

Can you send an email to support@cognitect.com and we will help with this?

zilti 2020-10-08T14:45:41.338100Z

Well, I guess I am going to do my migrations using a home-made solution now. I just lost all trust in Conformity; I've noticed it doesn't write anything to the database most of the time.

zilti 2020-10-08T14:46:22.338300Z

Or are there alternatives?

ghadi 2020-10-08T14:50:23.339100Z

can you describe your problem with conformity in more detail?

Filipe Silva 2020-10-08T15:09:01.344600Z

heya, coming here for a question about datomic cloud. I've noticed that while developing on a repl, I get exceptions as described in the datomic.client.api docstring:

All errors are reported via ex-info exceptions, with map contents
as specified by cognitect.anomalies.
See <https://github.com/cognitect-labs/anomalies>.
But on the live system, these exceptions don't seem to be ex-info exceptions, just normal errors. At any rate, ex-data returns nil for them. Does anyone know if this is intended? I couldn't find information about this differing behaviour. A good example of these exceptions is a malformed query for q. On the repl, connected via the datomic binary, I get this return from ex-data:
{:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Query is referencing unbound variables: #{?string}", :variables #{?string}, :db/error :db.error/unbound-query-variables, :dbs [{:database-id "48e8dd4d-84bb-4216-a9d7-4b4d17867050", :t 97901, :next-t 97902, :history false}]}
But on the live system, I get nil.

marshall 2020-10-08T15:09:56.345100Z

@filipematossilva are you using the same API (sync or async) in both cases?

Filipe Silva 2020-10-08T15:11:01.345400Z

think so, yeah

Filipe Silva 2020-10-08T15:11:32.346200Z

have a ion handling http requests directly, and the repl is calling the handler that's registered on the ion

Filipe Silva 2020-10-08T15:11:48.346400Z

so it should be the same code running

Filipe Silva 2020-10-08T15:12:23.346900Z

we can see on the aws logs that the error is of a different shape

Filipe Silva 2020-10-08T15:12:28.347100Z

let me dig it up

Filipe Silva 2020-10-08T15:13:14.347400Z

on the aws logs, logging the exception, shows this

Filipe Silva 2020-10-08T15:13:16.347800Z

{
    "Msg": "Alpha API Failed",
    "Ex": {
        "Via": [
            {
                "Type": "com.google.common.util.concurrent.UncheckedExecutionException",
                "Message": "clojure.lang.ExceptionInfo: :db.error/not-a-binding-form Invalid binding form: :entity/graph {:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message \"Invalid binding form: :entity/graph\", :db/error :db.error/not-a-binding-form}",
                "At": [
                    "com.google.common.cache.LocalCache$Segment",
                    "get",
                    "LocalCache.java",
                    2051
                ]
            },
            {
                "Type": "clojure.lang.ExceptionInfo",
                "Message": ":db.error/not-a-binding-form Invalid binding form: :entity/graph",
                "Data": {
                    "CognitectAnomaliesCategory": "CognitectAnomaliesIncorrect",
                    "CognitectAnomaliesMessage": "Invalid binding form: :entity/graph",
                    "DbError": "DbErrorNotABindingForm"
                },
                "At": [
                    "datomic.core.error$raise",
                    "invokeStatic",
                    "error.clj",
                    55
                ]
            }
        ],

Filipe Silva 2020-10-08T15:13:47.348400Z

(note: this was not the same unbound var query as above)

Filipe Silva 2020-10-08T15:14:20.348800Z

printing the error on the repl, we see this instead

#error {
 :cause "Invalid binding form: :entity/graph"
 :data {:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Invalid binding form: :entity/graph", :db/error :db.error/not-a-binding-form, :dbs [{:database-id "48e8dd4d-84bb-4216-a9d7-4b4d17867050", :t 97058, :next-t 97059, :history false}]}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Invalid binding form: :entity/graph"
   :data {:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Invalid binding form: :entity/graph", :db/error :db.error/not-a-binding-form, :dbs [{:database-id "48e8dd4d-84bb-4216-a9d7-4b4d17867050", :t 97058, :next-t 97059, :history false}]}
   :at [datomic.client.api.async$ares invokeStatic "async.clj" 58]}]

marshall 2020-10-08T15:14:51.349400Z

that ^ is an anomaly

marshall 2020-10-08T15:14:54.349600Z

which is a data map

Filipe Silva 2020-10-08T15:15:49.350400Z

more precisely, (ex-data e) returns the anomaly inside that exception

marshall 2020-10-08T15:16:23.350700Z

ah, instead of ex-info ?

Filipe Silva 2020-10-08T15:17:48.351500Z

I imagine the datomic client wraps the exception doing something like (ex-info e anomaly cause)

Filipe Silva 2020-10-08T15:18:08.352Z

we're not wrapping it on our end, just calling ex-data over it to get the anomaly

Filipe Silva 2020-10-08T15:18:27.352500Z

but on the live system, ex-data over the exception returns nil

Filipe Silva 2020-10-08T15:18:58.353100Z

which I think means it wasn't created with ex-info

Filipe Silva 2020-10-08T15:20:34.353600Z

I mean, I wouldn't be surprised if this is indeed intended to not leak information on the live system

Filipe Silva 2020-10-08T15:20:54.354100Z

that anomaly contains database ids, time info, and history info

zilti 2020-10-08T15:21:15.354200Z

I have a migration that is in a function. Conformity runs the function normally, but instead of transacting the data returned from it, it just discards it. The data is definitely valid; I made my migration so it also dumps the data into a file. I can load that file as EDN and transact it to the db using d/transact perfectly fine.

Filipe Silva 2020-10-08T15:21:42.354700Z

just wanted to make sure if it was intended or not before working around it

zilti 2020-10-08T15:23:55.354800Z

Conformity doesn't even give an error, it just silently discards it.

ghadi 2020-10-08T15:28:12.355100Z

is this cloud or on prem?

ghadi 2020-10-08T15:29:13.355700Z

@filipematossilva are you saying that you are not able to get a :cognitect.anomalies/incorrect from your failing query on the client side?

zilti 2020-10-08T15:30:02.355800Z

On prem, both for the dev backend and the postgresql one

ghadi 2020-10-08T15:33:19.356100Z

not sure what to tell you. you need to analyze this further before throwing up your hands

Filipe Silva 2020-10-08T15:34:20.356700Z

if by client side you mean "what calls the live datomic cloud system", then yes, that's it

ghadi 2020-10-08T15:35:27.357600Z

@filipematossilva so what's different about your "live system" vs. the repl?

ghadi 2020-10-08T15:35:45.358100Z

clearly it's an ex-info at the repl

Filipe Silva 2020-10-08T15:36:01.358800Z

I really don't know, that's what prompted this question

ghadi 2020-10-08T15:36:10.359100Z

perhaps print (class e) and (supers (class e)) in your live system when you get the error

ghadi 2020-10-08T15:36:14.359300Z

or (Throwable->map e)

ghadi 2020-10-08T15:36:53.360100Z

sync api or async api?

Filipe Silva 2020-10-08T15:37:23.360300Z

sync

favila 2020-10-08T15:37:45.360400Z

Conformity does bookkeeping to decide whether a “conform” was already run on that database. If you’re running the same key name against the same database a second time, it won’t run again. Is that what you are doing?

favila 2020-10-08T15:38:12.360800Z

Conformity is really for schema management, not data imports

Filipe Silva 2020-10-08T15:38:35.361400Z

regarding printing the error

Filipe Silva 2020-10-08T15:39:02.362300Z

I'm printing the exception proper like this:

(cast/alert {:msg "Alpha API Failed"
             :ex  e})

ghadi 2020-10-08T15:39:14.362800Z

do you have wrappers/helpers around your query? running it in a future?

Filipe Silva 2020-10-08T15:39:15.362900Z

on the live system the cast prints this

ghadi 2020-10-08T15:39:52.363600Z

oh, yeah that's a com.google.common.util.concurrent.UncheckedExecutionException at the outermost layer

ghadi 2020-10-08T15:40:04.364100Z

then the inner exception is an ex-info

Filipe Silva 2020-10-08T15:40:05.364300Z

on the repl, when cast is redirected to stderr, the datomic binary shows this

ghadi 2020-10-08T15:40:09.364700Z

thanks. @marshall ^

zilti 2020-10-08T15:40:45.365300Z

No, that is not what I am doing.

zilti 2020-10-08T15:41:13.365800Z

Well, the transaction is changing the schema, and then transforming the data that is in there.

zilti 2020-10-08T15:41:23.366200Z

Or at least, that is what it is supposed to be doing.

ghadi 2020-10-08T15:42:18.367Z

jinx

favila 2020-10-08T15:42:20.367200Z

jinx

favila 2020-10-08T15:42:52.367400Z

We’re pointing out a case where it may evaluate the function but not transact

favila 2020-10-08T15:44:09.367600Z

you can use conforms-to? to test whether conformity thinks the db already has the norm you are trying to transact

favila 2020-10-08T15:44:19.367900Z

that may help you debug

Filipe Silva 2020-10-08T15:44:42.368600Z

just realized that the logged response there on the live system wasn't complete, let me fetch the full thing

Filipe Silva 2020-10-08T15:46:15.368900Z

ok this is the full casted thing on aws logs

ghadi 2020-10-08T15:47:03.369400Z

understood

zilti 2020-10-08T15:47:47.369600Z

Well, what is the second argument to conforms-to? ? It's neither the file name nor the output of c/read-resource

zilti 2020-10-08T15:49:40.370500Z

It wants a keyword, but what keyword?

Filipe Silva 2020-10-08T15:49:41.370700Z

now that I look at the full cast on live, I can definitely see the cause and data fields there

Filipe Silva 2020-10-08T15:50:02.371100Z

which leaves me extra confused 😐

ghadi 2020-10-08T15:50:11.371400Z

let me clarify:

ghadi 2020-10-08T15:51:47.373100Z

in your REPL, you are getting an exception that is:
* clojure.lang.ExceptionInfo + anomaly data
in your live system you are getting:
* com.google.common.util.concurrent.UncheckedExecutionException
* clojure.lang.ExceptionInfo + anomaly data

ghadi 2020-10-08T15:52:09.373600Z

where the Ion has the ex-info as the cause (chained to the UEE)

ghadi 2020-10-08T15:52:43.373900Z

make sense? seems like a bug @marshall

ghadi 2020-10-08T15:53:07.374500Z

to work around temporarily, you can do (-> e ex-cause ex-data) to unwrap the outer layer

ghadi 2020-10-08T15:53:17.374900Z

and access the data

Filipe Silva 2020-10-08T15:53:54.375300Z

I can see that via indeed shows different things, as you say

Filipe Silva 2020-10-08T15:54:50.376Z

but the toplevel still shows data and cause for both situations

Filipe Silva 2020-10-08T15:55:27.376800Z

I imagine that data would be returned from ex-data

favila 2020-10-08T15:55:55.377Z

the keyword in the conform map

Filipe Silva 2020-10-08T15:56:13.377500Z

let me edit those code blocks to remove the trace, I think it's adding a lot of noise and not helping

favila 2020-10-08T15:56:32.377600Z

{:name-of-norm {:txes [[…]] :requires […] :tx-fn …}}

favila 2020-10-08T15:56:42.377800Z

the :name-of-norm part

favila 2020-10-08T15:57:12.378100Z

that’s the “norm”
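
A sketch of how the pieces fit, assuming the usual io.rkn.conformity require and the on-prem peer API; the norm contents here are hypothetical, and conforms-to? arities vary a little across conformity versions:

(require '[io.rkn.conformity :as c]
         '[datomic.api :as d])

(def norms
  {:name-of-norm
   {:txes [[{:db/ident       :demo/attr
             :db/valueType   :db.type/string
             :db/cardinality :db.cardinality/one}]]}})

(c/ensure-conforms conn norms)              ; transacts any :txes not yet recorded
(c/conforms-to? (d/db conn) :name-of-norm)  ; => true once the norm is recorded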

Filipe Silva 2020-10-08T15:57:56.378600Z

done

alexmiller 2020-10-08T15:59:05.380200Z

I think it's important to separate the exception object chain from the data that represents it (which may pull data from the root exception, not from the top exception)

alexmiller 2020-10-08T16:00:10.380900Z

Throwable->map for example pulls :cause, :data, :via from the root exception (deepest in the chain)

Filipe Silva 2020-10-08T16:02:57.381800Z

@alexmiller it's not clear to me what you mean by that in the current context

Filipe Silva 2020-10-08T16:03:22.382200Z

(besides the factual observation)

Filipe Silva 2020-10-08T16:04:26.383400Z

is it that you also think that the different behaviour between the repl+datomic binary and live system should be overcome by calling Throwable->map prior to extracting the data via ex-data?

ghadi 2020-10-08T16:05:27.383800Z

root exception is the wrapped ex-info

ghadi 2020-10-08T16:06:08.384700Z

you could do (-> e Throwable->map :data) to get at the :incorrect piece

alexmiller 2020-10-08T16:06:38.385300Z

I’m just saying that the data you’re seeing is consistent with what Ghadi is saying

alexmiller 2020-10-08T16:06:50.385600Z

Even though that may be confusing

Filipe Silva 2020-10-08T16:07:14.385900Z

ok I think I understand what you mean now

Filipe Silva 2020-10-08T16:07:22.386100Z

thank you for explaining

ghadi 2020-10-08T16:07:52.386400Z

but the inconsistency is a bug 🙂

Filipe Silva 2020-10-08T16:19:29.386900Z

currently deploying your workaround, and testing

marshall 2020-10-08T16:20:49.387200Z

@filipematossilva this is in an Ion correct?

Filipe Silva 2020-10-08T16:34:45.387500Z

@marshall correct

Filipe Silva 2020-10-08T16:35:41.387800Z

in a handler-fn for http-direct

Filipe Silva 2020-10-08T16:36:31.388300Z

@ghadi I replaced my (ex-data e) with this fn

(defn error->error-data [e]
  ;; Workaround for a difference in the live datomic system where clojure exceptions
  ;; are wrapped in a com.google.common.util.concurrent.UncheckedExecutionException.
  ;; To get the ex-data on live, we must convert it to a map and access :data directly.
  (or (ex-data e)
      (-> e Throwable->map :data)))

Filipe Silva 2020-10-08T16:36:49.388700Z

I can confirm this gets me the anomaly for the live system

Filipe Silva 2020-10-08T16:37:09.389300Z

slightly different than on the repl still

Filipe Silva 2020-10-08T16:37:44.389800Z

live:

{:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Invalid binding form: :entity/graph", :db/error :db.error/not-a-binding-form}
repl:
{:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Invalid binding form: :entity/graph", :db/error :db.error/not-a-binding-form, :dbs [{:database-id "48e8dd4d-84bb-4216-a9d7-4b4d17867050", :t 97901, :next-t 97902, :history false}]}

Filipe Silva 2020-10-08T16:38:32.390800Z

which makes sense, because in the live exception the :dbs property just isn't there

Filipe Silva 2020-10-08T16:38:41.391100Z

but tbh that's the one that really shouldn't be exposed

Filipe Silva 2020-10-08T16:38:49.391300Z

so that's fine enough for me

Filipe Silva 2020-10-08T16:38:54.391600Z

thank you

Nassin 2020-10-08T16:41:20.392300Z

is there are an official method to move data from dev-local to cloud?

Filipe Silva 2020-10-08T16:49:37.392400Z

the workaround is fine enough for me, but maybe you'd like more information about this?

marshall 2020-10-08T17:40:27.392600Z

nope, that’s enough thanks; we’ll investigate

Chicão 2020-10-08T19:54:07.394500Z

Does anyone know how I can get the t from a tx with (d/tx->t tx)? My tx is a map, and I get this error in the conversion:

{:db-before datomic.db.Db@ad41827d, :db-after datomic.db.Db@d8231b67, :tx-data [#datom[13194139534369 50 #inst "2020-10-08T19:30:19.852-00:00" 13194139534369 true] #datom[277076930200610 169 #inst "2020-10-08T06:00:59.275-00:00" 13194139534369 true] #datom[277076930200610 163 17592186045452 13194139534369 true] #datom[277076930200610 165 277076930200584 13194139534369 true] #datom[277076930200610 170 17592186045454 13194139534369 true] #datom[277076930200610 162 277076930200581 13194139534369 true] #datom[277076930200610 167 #inst "2020-10-08T19:30:19.850-00:00" 13194139534369 true] #datom[277076930200610 168 17592186045432 13194139534369 true] #datom[277076930200610 166 #uuid "5f7f68cb-08f0-4cb2-964b-4e811a34a949" 13194139534369 true]], :tempids {-9223090561879066169 277076930200610}}
java.lang.ClassCastException: clojure.lang.PersistentArrayMap cannot be cast to java.lang.Number

csm 2020-10-08T19:59:48.396100Z

You need to grab the tx from a datom in :tx-data, in your case 13194139534369. I think something like (-> result :tx-data first :tx) will give you it

csm 2020-10-08T20:03:29.396700Z

I think also (-> result :db-after :basisT) will give you your new t directly
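
Putting both suggestions together, a sketch assuming the on-prem peer API, with result bound to the transaction map above:

(let [tx-eid (-> result :tx-data first :tx)] ; 13194139534369 in the output above
  (d/tx->t tx-eid))                          ; => the t of that transaction

;; or take the basis t of the resulting db value
(d/basis-t (:db-after result))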

Chicão 2020-10-08T20:09:40.396800Z

thks

marshall 2020-10-08T20:28:07.397Z

I’ve reproduced this behavior and will report it to the dev team

Adrian Smith 2020-10-08T21:14:55.397200Z

thank you, I've just sent an email over

steveb8n 2020-10-08T23:07:45.399300Z

Q: I want to store 3rd party oauth tokens in Datomic. Storing them as cleartext is not secure enough so I plan to use KMS to symmetrically encrypt them before storage. Has anyone done something like this before? If so, any advice? Or is there an alternative you would recommend?

steveb8n 2020-10-08T23:11:00.399400Z

One alternative I am considering is DynamoDB

ghadi 2020-10-08T23:11:32.399600Z

how many oauth keys? how often they come in/change/expire?

steveb8n 2020-10-08T23:13:54.399800Z

I provide a multi-tenant SAAS so at least 1 set per tenant

steveb8n 2020-10-08T23:14:50.400Z

Also looking at AWS Secrets Manager for this. Clearly I’m in the discovery phase 🙂

steveb8n 2020-10-08T23:14:57.400200Z

but appreciate any advice

ghadi 2020-10-08T23:29:57.400400Z

the intended interaction pattern with KMS is not encryption/decryption of fine-granularity items

ghadi 2020-10-08T23:30:24.400600Z

usually you generate key material known as a "DEK" (Data Encryption Key) using KMS

ghadi 2020-10-08T23:30:42.400800Z

then you use the DEK to encrypt/decrypt a bunch of data

steveb8n 2020-10-08T23:30:47.401Z

ok. I can see I’m going down the wrong path with Datomic for this data

ghadi 2020-10-08T23:31:01.401200Z

that's not the conclusion for me

steveb8n 2020-10-08T23:31:06.401400Z

it looks like Secrets Manager with a local/client cache is the way to go

ghadi 2020-10-08T23:31:11.401600Z

you talk to KMS when you want to encrypt/decrypt the DEK

ghadi 2020-10-08T23:31:45.401800Z

so when you boot up, you ask KMS to decrypt the DEK, then you use the DEK to decrypt fine-grained things in the application
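
A hedged sketch of that flow using the cognitect aws-api KMS client (the key alias is hypothetical; assumes the com.cognitect.aws/kms dependency):

(require '[cognitect.aws.client.api :as aws])

(def kms (aws/client {:api :kms}))

;; once, at provisioning time: generate a DEK under the root key;
;; persist only :CiphertextBlob, never the plaintext
(def dek-resp
  (aws/invoke kms {:op      :GenerateDataKey
                   :request {:KeyId   "alias/my-root-key"
                             :KeySpec "AES_256"}}))
(def stored-blob (:CiphertextBlob dek-resp))

;; at boot: one KMS call recovers the plaintext DEK, which then
;; encrypts/decrypts all the fine-grained tokens locally
(def plaintext-dek
  (:Plaintext (aws/invoke kms {:op      :Decrypt
                               :request {:CiphertextBlob stored-blob}})))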

ghadi 2020-10-08T23:32:12.402Z

where to store it (Datomic / wherever) is orthogonal to how you manage keys

ghadi 2020-10-08T23:32:49.402200Z

if you talk to KMS every time you want to decrypt a token, you'll pay a fortune and add a ton of latency

ghadi 2020-10-08T23:33:17.402400Z

the oauth ciphertexts could very well be in datomic

steveb8n 2020-10-08T23:33:31.402600Z

if I am weighing pros/cons of DEK/Datomic vs Secrets Manager, what are the advantages of using Datomic?

ghadi 2020-10-08T23:34:05.402800Z

secrets manager is for service level secrets

steveb8n 2020-10-08T23:34:05.402900Z

it seems like the same design i.e. cached DEK to read/write from Datomic

ghadi 2020-10-08T23:34:22.403200Z

you could store your DEK in Secrets manager

steveb8n 2020-10-08T23:34:28.403400Z

the downside would be no excision c.f. Secrets Manager

ghadi 2020-10-08T23:34:58.403600Z

you cannot put thousands of oauth tokens in secrets manager

steveb8n 2020-10-08T23:35:10.403800Z

excision is desirable for this kind of data

ghadi 2020-10-08T23:35:12.404Z

well, depending on how rich you are

steveb8n 2020-10-08T23:35:26.404200Z

I’m not rolling in money 🙂

ghadi 2020-10-08T23:35:27.404400Z

if you need to excise, you can throw away a DEK

steveb8n 2020-10-08T23:35:54.404600Z

hmm. is 1 DEK per tenant practical?

ghadi 2020-10-08T23:36:02.404800Z

I would google key stretching, HMAC, hierarchical keys

steveb8n 2020-10-08T23:36:16.405Z

seems like same scale problem

ghadi 2020-10-08T23:36:19.405200Z

you can have a root DEK, then create per tenant DEKs using HMAC

ghadi 2020-10-08T23:36:27.405400Z

deterministically

steveb8n 2020-10-08T23:36:40.405600Z

ok. that’s an interesting idea. a mini DEK chain

ghadi 2020-10-08T23:37:15.405800Z

tenantDEK = HMAC(rootDEK, tenantID)
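
A sketch of that derivation with javax.crypto, assuming HmacSHA256 as the PRF:

(import '(javax.crypto Mac)
        '(javax.crypto.spec SecretKeySpec))

(defn tenant-dek
  "Derive a per-tenant DEK from the root DEK and a tenant id."
  [^bytes root-dek ^String tenant-id]
  (let [mac (Mac/getInstance "HmacSHA256")]
    (.init mac (SecretKeySpec. root-dek "HmacSHA256"))
    (.doFinal mac (.getBytes tenant-id "UTF-8"))))
;; => 32 bytes of key material, re-derivable on the fly for any tenant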

steveb8n 2020-10-08T23:37:26.406100Z

then the root is stored in Secrets Manager

ghadi 2020-10-08T23:37:30.406300Z

right

steveb8n 2020-10-08T23:37:39.406500Z

where would the tenant DEKs be stored?

ghadi 2020-10-08T23:37:42.406700Z

need to store an identifier so that you can rotate the DEK periodically

ghadi 2020-10-08T23:37:50.406900Z

you don't store the tenant DEKs

ghadi 2020-10-08T23:37:56.407100Z

you derive them on the fly with HMAC

steveb8n 2020-10-08T23:38:09.407400Z

ok. I’ll start reading up on this. thank you!

ghadi 2020-10-08T23:38:28.407600Z

sure. with HMAC you'll have to figure out a different excision scheme

ghadi 2020-10-08T23:38:38.407800Z

you could throw away the ciphertext instead of the DEK

ghadi 2020-10-08T23:38:49.408Z

because you can't throw away the DEK (you can re-gen it!)

ghadi 2020-10-08T23:38:54.408200Z

etc.

ghadi 2020-10-08T23:39:12.408400Z

but yeah db storage isn't your issue :)

ghadi 2020-10-08T23:39:17.408600Z

key mgmt is

steveb8n 2020-10-08T23:39:28.408800Z

interesting. that means Datomic is no good for this i.e. no excision

steveb8n 2020-10-08T23:39:55.409Z

or am I missing a step?

ghadi 2020-10-08T23:39:59.409200Z

are you using cloud or onprem?

steveb8n 2020-10-08T23:40:08.409400Z

cloud / prod topo

ghadi 2020-10-08T23:40:31.409600Z

stay tuned

steveb8n 2020-10-08T23:40:45.409800Z

now that’s just not fair 🙂

steveb8n 2020-10-08T23:41:00.410Z

I will indeed

ghadi 2020-10-08T23:41:15.410200Z

how often does a tenant's 3p oauth token change?

steveb8n 2020-10-08T23:41:48.410400Z

It’s a Salesforce OAuth so the refresh period is configurable I believe. would need to check

steveb8n 2020-10-08T23:42:21.410600Z

i.e. enterprise SAAS is why good design matters here

steveb8n 2020-10-08T23:42:40.410800Z

I’ll need to build a v1 of this in the coming weeks

steveb8n 2020-10-08T23:50:01.411Z

now that I think about it, I could deliver an interim solution without this for a couple of months and “stay tuned” for a better solution

steveb8n 2020-10-08T23:50:12.411200Z

I’ll hammock this…

steveb8n 2020-10-08T23:50:31.411400Z

🙏