datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
lambdam 2021-04-13T10:51:12.260Z

Hello, I submitted a new ident to production that I declared as a float. I never used it (no datom submitted for this field yet) but now want it to be a double instead. I tried to excise and alter it but couldn't get either to work. Am I forced to find a new name (~ new ident) for this field? Thanks

lambdam 2021-04-13T11:05:56.264600Z

Ok, I found a way (every step is a chronological migration):
1. I submitted the wrong type for an ident.
2. I renamed this ident (`:domain/field` > `:domain/old-float-field`).
3. I declared a new ident with the same name as the old one (`:domain/field`) but with a different type (double in my case).
I can see that those idents end up with different ids. Since I never submitted any datom that uses the old ident, is there any downside on the technical side or on the domain modelling side?

jaret 2021-04-13T11:51:50.267100Z

@dam if you didn't use the "old attribute" then you don't have to migrate values to the new ident. However, you should be aware that d/history will still point to the previous entry and :db/ident is not t-aware. But since you didn't transact against it there won't be much of a downside at all.

lambdam 2021-04-13T12:02:11.267200Z

Thanks!

2021-04-13T13:31:18.267400Z

@eagonmeng wow that was a blog post level reply. Thanks so much for the pointers.

1👍
stuarthalloway 2021-04-13T14:03:14.268200Z

Would love any feedback folks have on https://docs.datomic.com/cloud/tech-notes/writing-a-problem-report.html

stuarthalloway 2021-04-13T14:04:18.269100Z

I also unpinned a bunch of stuff, not sure about the etiquette but when everything is important, nothing is.

1❤️ 14🙌
futuro 2021-04-13T16:01:25.269600Z

This is a good write-up; specifically, it outlines what makes a good report, roughly describes how one goes from experiencing a problem to creating a good report, talks briefly about the tradeoffs surrounding making a good report and how you might iterate from a not-so-good report to a good one (helping reduce the inertia barrier to submitting good reports), and reiterates in multiple places the concrete things that make a report good or not-so-good which helps reinforce the point.

kenny 2021-04-13T19:33:33.271700Z

Our Datomic Cloud workloads are very bursty. I am evaluating a switch to ddb on-demand mode instead of provisioned. Is there anything to know about Datomic Cloud & ddb on-demand mode?

Joe Lane 2021-04-13T19:39:13.272100Z

What is your hypothesis kenny?

kenny 2021-04-14T15:28:21.287500Z

Oh, cool. So 4 in the first hour and 1/hour until the day ends. It's an interesting possibility. Not exactly trivial since each query group points to the same ddb table. You'd need to understand when the load would occur for each compute group in the system.

ghadi 2021-04-14T15:30:20.287700Z

ddb default scaling policy is reactive -- if you know when the load is going to arrive ahead of time, you could make a cron-based policy

ghadi 2021-04-14T15:31:32.288300Z

or you can take fate into your own hands and have a lambda fire periodically that controls scaling

ghadi 2021-04-14T15:31:51.288500Z

but -- should only have one controller in charge. Policies don't compose
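
A minimal sketch of the cron-style idea above, assuming the load window is known in advance: a pure function maps the current time to a target read capacity, and a single periodic job (the "one controller") would apply that value to the table. The schedule, the capacity numbers, and the name `target_read_capacity` are hypothetical, and the AWS call that would actually apply the value is deliberately left out.

```python
from datetime import datetime, timezone

# Hypothetical schedule: (start_hour, end_hour, read_capacity_units), in UTC.
# The window around 16:00 stands in for the batch job mentioned in this thread.
SCHEDULE = [
    (15, 18, 2000),
]
BASELINE_RCU = 100  # hypothetical capacity outside any scheduled window


def target_read_capacity(now: datetime) -> int:
    """Return the capacity the single controller should set at `now`."""
    hour = now.astimezone(timezone.utc).hour
    for start, end, rcu in SCHEDULE:
        if start <= hour < end:
            return rcu
    return BASELINE_RCU
```

Keeping the decision in one function makes it easier to guarantee that only one policy is in charge of the table.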

adamtait 2021-04-16T23:24:54.319400Z

Could you use cognitect-labs/aws-api and fire the DDB scaler from the process that starts the batch job? Maybe you could even wait for the scaling to complete. I do something similar to start up an expensive EC2 instance running a web driver just before my crawler starts. The aws-api call to start the EC2 instance blocks, waiting for the instance to finish starting.

1☝️
kenny 2021-04-25T21:26:52.399300Z

For those interested, we switched to on-demand mode 4 days ago and all the DDB provisioned throughput problems have gone away 🙂 As an added bonus, our ddb costs dropped ~30%.

1👍
Joe Lane 2021-04-25T21:44:34.400300Z

Great to hear, Kenny!

kenny 2021-04-13T19:53:35.272200Z

It will just work.

kenny 2021-04-13T19:54:35.272400Z

It'd surprise me if there was anything baked in about the particular ddb capacity mode. I'd prefer not to be surprised 🙂

Joe Lane 2021-04-13T19:56:14.272600Z

No, I mean, why do you think on-demand will make X better? What is X?

kenny 2021-04-13T20:03:19.272800Z

DDB autoscaling is too slow. By the time DDB scales up to meet our load, the event is over. We could increase the ddb provisioned read or write to meet the max utilization but then we need to pay for peak periods 100% of the time.

Joe Lane 2021-04-13T20:05:26.273Z

You know what I'm going to ask, don't you?

kenny 2021-04-13T20:06:32.273200Z

DDB read usage example. Our reads spike very high for a short period of time. DDB auto scales read up. By the time it has scaled up, we no longer need the capacity. We're now paying for a whole bunch of extra capacity until ~17:27 when it scales back down to where it started. Also, scaling provisioned capacity down is limited to 4 times per day.

kenny 2021-04-13T20:07:32.273600Z

So if that event happened more than 4 times (it does), we're stuck with the ddb scaled up bill for the remainder of the day.

kenny 2021-04-13T20:07:45.273800Z

Perhaps how I know this is happening? 🙂

Joe Lane 2021-04-13T20:09:22.274Z

What happens at 16:00 that causes reads to spike?

kenny 2021-04-13T20:09:42.274200Z

Batch job.

kenny 2021-04-13T20:17:44.274400Z

Example showing ddb failing to scale down due to hitting max scale down events per day.

Joe Lane 2021-04-13T20:21:50.274800Z

You should measure the cost differences between on-demand and provisioned.

Joe Lane 2021-04-13T20:24:02.275Z

I was worried you were assuming that increasing the ddb capacity was going to improve your throughput, but it sounds like you're just trying to optimize cost. FWIW, on-demand is more expensive than provisioned throughput, so you should measure very carefully to make sure you don't end up losing money instead of saving it.
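
One way to "measure very carefully" is to put both billing models side by side. The sketch below assumes provisioned capacity is held constant for the whole month and billed per capacity-unit-hour, while on-demand is billed per million request units. Every number is a placeholder, not a real AWS price or a real workload figure, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # approximate hours in a month


def provisioned_monthly_cost(capacity_units: float, price_per_unit_hour: float) -> float:
    """Cost of holding `capacity_units` provisioned for a whole month."""
    return capacity_units * price_per_unit_hour * HOURS_PER_MONTH


def on_demand_monthly_cost(requests: float, price_per_million: float) -> float:
    """Cost of `requests` request units billed per request."""
    return requests / 1_000_000 * price_per_million


# Placeholder inputs only: substitute the prices for your region and the
# request counts from your own metrics.
provisioned = provisioned_monthly_cost(capacity_units=1000, price_per_unit_hour=0.0001)
on_demand = on_demand_monthly_cost(requests=200_000_000, price_per_million=0.25)
cheaper = "on-demand" if on_demand < provisioned else "provisioned"
```

For a bursty workload the interesting input is how much capacity has to stay provisioned to absorb the spikes, since that is what drives the provisioned side of the comparison.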

kenny 2021-04-13T20:24:36.275200Z

Yep. Sounds like you haven't heard of anyone having issues with switching provisioning modes?

Joe Lane 2021-04-13T20:33:05.275400Z

It always depends 🙂

Joe Lane 2021-04-13T20:35:36.275600Z

Whatever you do, just pay attention to the bill and the perf and make sure it was worth it.

1✔️
kenny 2021-04-13T20:36:37.275800Z

What does it depend on?

Joe Lane 2021-04-13T20:37:45.276100Z

What problem that customer actually had.

kenny 2021-04-13T20:47:45.276300Z

> increasing the ddb capacity was going to improve your throughput
Does this not hold when a compute group is already saturated with ops?

Joe Lane 2021-04-13T20:49:32.276500Z

Recur, see above 🙂

kenny 2021-04-13T20:52:18.276800Z

Right, it depends. What is an example situation in which increasing ddb capacity does not improve throughput?

Joe Lane 2021-04-13T20:53:37.277Z

If you're issuing transactions at a faster rate than Datomic can index, you'll get an anomaly back, no matter how much ddb you provision. That sustainable throughput rate is specific to your system though, and can vary between different systems / customers.

kenny 2021-04-13T20:57:14.277300Z

Makes sense. Is this the anomaly to which you are referring?

{:cognitect.anomalies/category :cognitect.anomalies/busy,
 :cognitect.anomalies/message "Busy indexing",
 :dbs [{:database-id "f3253b1f-f5d1-4abd-8c8e-91f50033f6d9",
        :t 83491833,
        :next-t 83491834,
        :history false}]}

Joe Lane 2021-04-13T20:57:20.277500Z

yep
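
The `busy` anomaly category is one that callers are generally expected to back off from and retry. The sketch below shows that pattern under stated assumptions: a plain Python dict stands in for the Clojure anomaly map, `transact` is a hypothetical callable that returns either a result or such a dict, and the attempt count and delays are arbitrary.

```python
import time

BUSY = "cognitect.anomalies/busy"  # category from the anomaly shown above


def transact_with_retry(transact, tx_data, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `transact(tx_data)`, backing off and retrying on a busy anomaly."""
    for attempt in range(max_attempts):
        result = transact(tx_data)
        if result.get("cognitect.anomalies/category") != BUSY:
            return result  # success, or an anomaly that retrying will not fix
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff between attempts
    return result  # still busy after the last attempt
```

Retrying only helps with short bursts; if writes consistently outpace indexing, the sustainable rate itself has to come down.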

kenny 2021-04-13T20:58:00.277700Z

So until indexing is finished, all writes will return that anomaly?

Joe Lane 2021-04-13T20:58:21.277900Z

for that database, yes

1
kenny 2021-04-13T21:01:22.278200Z

No effects for other databases? (totally different topic at this point 🙂 I just encountered this exact anomaly a day ago so it's of particular interest)

kenny 2021-04-13T21:02:44.278400Z

Also, I'm assuming "that database" means all databases listed under the :dbs key in the anomaly?

Joe Lane 2021-04-13T21:10:58.278700Z

Well, your primary group nodes are likely under pretty high load (and CPU utilization) at that point, so, yes there are effects on other databases, because it's allocating resources away to do this big indexing job and process transactions.

kenny 2021-04-13T21:19:30.278900Z

Hmm, I guess I'm confused by your "for that database, yes." It sounds like one of these things is true when that anomaly is returned:
1. Writes to any database will fail if a Busy indexing anomaly is returned.
2. All writes to the database currently being indexed (the databases listed under the :dbs key) will fail, and writes to other databases may or may not succeed.
3. Writes to any database may or may not succeed.

2021-04-13T22:19:30.279500Z

are there any issues with using both memcached and valcache in the same peer (on-prem) deployment at the same time?

2021-04-15T13:26:14.304900Z

Great thanks, that makes a lot of sense

ghadi 2021-04-13T22:26:11.280700Z

If you understand when your load is going to occur, you could write a lambda that imposes a “perfect” autoscaling policy

ghadi 2021-04-13T22:26:53.282Z

In other words you could take the policy into your own hands, rather than rely on ddb’s scaler, which is reactive

favila 2021-04-13T22:33:33.282800Z

Same process? AFAIK you can’t do this at all

kenny 2021-04-13T22:47:47.283600Z

While that is true, you’re still limited by DDB’s maximum of 4 downsizing events per day.

ghadi 2021-04-13T22:54:01.284Z

27, not 4

ghadi 2021-04-13T22:54:41.285Z

You accumulate an extra event each hour that elapses

2🤯
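
The arithmetic behind "27, not 4" can be sketched as follows, assuming the rule exactly as described in this thread: 4 decreases are available in the first hour, and one more becomes available for each later hour of the day. The function name and parameters are hypothetical.

```python
def max_decreases_per_day(initial_allowance: int = 4, hours_in_day: int = 24) -> int:
    """Upper bound on provisioned-capacity decreases in one day.

    Assumes `initial_allowance` decreases in the first hour plus one
    additional decrease for each of the remaining hours.
    """
    return initial_allowance + (hours_in_day - 1)
```

With the defaults this gives 4 + 23 = 27.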