Hello, I submitted a new ident to production that I declared as a float. I never used it (no datom has been submitted for this field yet) but now I want it to be a double instead. I tried to excise and alter it but couldn't make it work. Am I forced to find a new name (~ a new ident) for this field? Thanks
---
OK, I found a way (each step is a separate migration, applied in order):
1. I submitted the wrong type for an ident.
2. I renamed this ident (`:domain/field` -> `:domain/old-float-field`).
3. I declared a new ident with the same name as the old one (`:domain/field`) but with a different type (double in my case); see the sketch right after this list.
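Roughly what steps 2 and 3 look like as transactions. A minimal sketch assuming the Datomic Cloud client API, a connection bound to `conn`, and cardinality-one (the cardinality wasn't part of the original question):
```clojure
(require '[datomic.client.api :as d])

;; Migration 2: rename the mistyped ident, freeing up the original name.
;; :db/id resolves the existing ident to its entity id.
(d/transact conn
            {:tx-data [{:db/id    :domain/field
                        :db/ident :domain/old-float-field}]})

;; Migration 3: declare a brand-new attribute under the original name,
;; this time as a double. It gets its own, different entity id.
(d/transact conn
            {:tx-data [{:db/ident       :domain/field
                        :db/valueType   :db.type/double
                        :db/cardinality :db.cardinality/one}]})
```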
I can see that those idents end up with different ids.
Since I never submitted any datom that uses the old ident, is there any downside on the technical side or on the domain-modelling side?
@dam if you didn't use the "old attribute" then you don't have to migrate values to the new ident. However, you should be aware that d/history will still point to the previous entry and :db/ident is not t-aware. But since you didn't transact against it there won't be much of a downside at all.
Thanks!
Would love any feedback folks have on https://docs.datomic.com/cloud/tech-notes/writing-a-problem-report.html
I also unpinned a bunch of stuff, not sure about the etiquette but when everything is important, nothing is.
This is a good write-up. Specifically, it outlines what makes a good report, roughly describes how one goes from experiencing a problem to creating a good report, briefly covers the tradeoffs around making a good report and how you might iterate from a not-so-good report to a good one (helping reduce the inertia barrier to submitting good reports), and reiterates in multiple places the concrete things that make a report good or not-so-good, which helps reinforce the point.
Our Datomic Cloud workloads are very bursty. I am evaluating a switch to ddb on-demand mode instead of provisioned. Is there anything to know about Datomic Cloud & ddb on-demand mode?
What is your hypothesis, kenny?
Oh, cool. So 4 in the first hour and 1/hour until the day ends. It's an interesting possibility. Not exactly trivial since each query group points to the same ddb table. You'd need to understand when the load would occur for each compute group in the system.
ddb default scaling policy is reactive -- if you know when the load is going to arrive ahead of time, you could make a cron-based policy
or you can take fate into your own hands and have a lambda fire periodically that controls scaling
but -- should only have one controller in charge. Policies don't compose
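To make the cron-based idea concrete, a hedged sketch using cognitect-labs/aws-api and an Application Auto Scaling scheduled action. The table name, capacities, schedule, and action name are all illustrative assumptions, and this presumes a scalable target is already registered for the table's read capacity:
```clojure
(require '[cognitect.aws.client.api :as aws])

(def app-autoscaling (aws/client {:api :application-autoscaling}))

;; Raise the read-capacity floor a few minutes before the 16:00 batch job.
;; A second scheduled action (not shown) would lower it again afterwards.
(aws/invoke app-autoscaling
            {:op :PutScheduledAction
             :request {:ServiceNamespace     "dynamodb"
                       :ScheduledActionName  "pre-batch-read-scale-up"
                       :ResourceId           "table/my-datomic-table"
                       :ScalableDimension    "dynamodb:table:ReadCapacityUnits"
                       :Schedule             "cron(50 15 * * ? *)"
                       :ScalableTargetAction {:MinCapacity 2000
                                              :MaxCapacity 4000}}})
```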
Could you use the cognitect-labs/aws-api and fire the DDB scaler from the process that starts the batch job? Maybe you could even wait on the scaling completed. I do something similar to start up an expensive EC2 instance running a web driver just before my crawler starts. The aws-api call to start the EC2 instance blocks waiting for the instance to finish starting.
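Roughly what I have in mind, as a sketch: the helper and table names are hypothetical, the capacity numbers are illustrative, and it assumes the table stays in provisioned mode:
```clojure
(require '[cognitect.aws.client.api :as aws])

(def ddb (aws/client {:api :dynamodb}))

(defn scale-up-and-wait!
  "Raise the table's provisioned throughput, then block until DynamoDB
  reports the table as ACTIVE again."
  [table read-units write-units]
  (aws/invoke ddb {:op :UpdateTable
                   :request {:TableName table
                             :ProvisionedThroughput
                             {:ReadCapacityUnits  read-units
                              :WriteCapacityUnits write-units}}})
  (loop []
    (let [status (get-in (aws/invoke ddb {:op :DescribeTable
                                          :request {:TableName table}})
                         [:Table :TableStatus])]
      (when (not= "ACTIVE" status)
        (Thread/sleep 5000)
        (recur)))))

;; Called from whatever kicks off the batch job, e.g.
;; (scale-up-and-wait! "my-datomic-table" 2000 500)
```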
For those interested, we switched to on-demand mode 4 days ago and all the DDB provisioned throughput problems have gone away. As an added bonus, our ddb costs dropped ~30%.
Great to hear, Kenny!
It will just work.
It'd surprise me if there was anything baked in about the particular ddb capacity mode. I'd prefer not to be surprised.
No, I mean, why do you think on-demand will make X better?
What is X?
DDB autoscaling is too slow. By the time DDB scales up to meet our load, the event is over. We could increase the ddb provisioned read or write to meet the max utilization but then we need to pay for peak periods 100% of the time.
You know what I'm going to ask, don't you?
DDB read usage example. Our reads spike very high for a short period of time. DDB auto scales read up. By the time it scaled up, we no longer needed the capacity. We're now paying for a whole bunch of extra capacity until ~17:27 when it scales back down to where it started. Also, scaling provisioned capacity down is limited to 4 times per day.
So if that event happened more than 4 times (it does), we're stuck with the ddb scaled up bill for the remainder of the day.
Perhaps how I know this is happening?
What happens at 16:00 that causes reads to spike?
Batch job.
Example showing ddb failing to scale down due to hitting max scale down events per day.
You should measure the cost differences between on-demand and provisioned.
I was worried you were assuming that increasing the ddb capacity was going to improve your throughput, but it sounds like you're just trying to optimize cost. FWIW, on-demand is more expensive than provisioned throughput, so you should measure very carefully to make sure you don't end up losing money instead of saving it.
Yep. Sounds like you haven't heard of anyone having issues with switching provisioning modes?
It always depends.
Whatever you do, just pay attention to the bill and the perf and make sure it was worth it.
What does it depend on?
What problem that customer actually had.
> increasing the ddb capacity was going to improve your throughput
Does this not hold when a compute group is already saturated with ops?
Recur, see above.
Right, it depends. What is an example situation in which increasing ddb capacity does not improve throughput?
If you're issuing transactions at a faster rate than Datomic can index, you'll get an anomaly back, no matter how much ddb you provision. That sustainable throughput rate is specific to your system, though, and can vary between different systems/customers.
Makes sense. Is this the anomaly to which you are referring?
{:cognitect.anomalies/category :cognitect.anomalies/busy, :cognitect.anomalies/message "Busy indexing", :dbs [{:database-id "f3253b1f-f5d1-4abd-8c8e-91f50033f6d9", :t 83491833, :next-t 83491834, :history false}]}
yep
So until indexing is finished, all writes will return that anomaly?
for that database, yes
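As an aside, a minimal sketch of how a client might back off and retry on that busy anomaly, assuming the synchronous Cloud client API (where `d/transact` throws an ex-info carrying the anomaly map); the helper name and backoff numbers are made up:
```clojure
(require '[datomic.client.api :as d])

(defn transact-with-backoff
  "Retry a transaction while the system is busy indexing, backing off
  between attempts; rethrow any other anomaly immediately."
  [conn tx-data]
  (loop [attempt 1]
    (let [result (try
                   (d/transact conn {:tx-data tx-data})
                   (catch clojure.lang.ExceptionInfo e
                     (if (= :cognitect.anomalies/busy
                            (:cognitect.anomalies/category (ex-data e)))
                       ::busy
                       (throw e))))]
      (cond
        (not= ::busy result) result
        (>= attempt 10)      (throw (ex-info "Still busy indexing after retries"
                                             {:attempts attempt}))
        :else                (do (Thread/sleep (min 30000 (* 1000 attempt)))
                                 (recur (inc attempt)))))))
```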
No effects for other databases? (Totally different topic at this point; I just encountered this exact anomaly a day ago, so it's of particular interest.)
Also, I'm assuming "that database" means all databases listed under the :dbs key in the anomaly?
Well, your primary group nodes are likely under pretty high load (and CPU utilization) at that point, so, yes there are effects on other databases, because it's allocating resources away to do this big indexing job and process transactions.
Hmm, I guess I'm confused by your "for that database, yes." It sounds like one of these things is true when that anomaly is returned:
1. Writes to any database will fail if a Busy indexing anomaly is returned.
2. All writes to the database currently being indexed (the databases listed under the :dbs key) will fail, and writes to other databases may or may not succeed.
3. Writes to any database may or may not succeed.
are there any issues with using both memcached and valcache in the same peer (on-prem) deployment at the same time?
Great thanks, that makes a lot of sense
If you understand when your load is going to occur, you could write a lambda that imposes a "perfect" autoscaling policy.
In other words you could take the policy into your own hands, rather than rely on ddb's scaler, which is reactive.
Same process? AFAIK you can't do this at all.
While that is true, you're still limited by DDB's maximum of 4 downsizing events per day.
27, not 4
You accumulate an extra event each hour that elapses