@lanejo01 I was able to deploy on beanstalk, but when I use the following line:
(d/client {:server-type :ion
           :region "us-east-1" ;; e.g. us-east-1
           :system "humboi-march-2021"
           :creds-profile "humboi"
           :endpoint "http://entry.humboi-march-2021.us-east-1.datomic.net:8182"
           :proxy-port 8182})
I get “Unable to connect to localhost:8182” in the beanstalk logs.
Here are the logs. Notice line 363.
I’m guessing that’s because the following command isn’t run on beanstalk:
./datomic client access humboi-march-2021 -p humboi -r us-east-1
But then how can one run this command on beanstalk?
datomic solo up <system name> --wait
fails with
Upping <system name>-TxAutoScalingGroup-89CAXXFQ591C
Upping <system name>-BastionAutoScalingGroup-128DD1GX31OU7
Waiting for gateway to start.
Execution error (ExceptionInfo) at datomic.tools.ops.process/sh! (process.clj:64).
Shell command failed
it doesn't throw this error without the --wait option though, so at least it's usable.
anyone with the same error or anyone who is using it successfully?
couldn't find any posts on http://forum.datomic.com about this issue, so i will make one eventually.
ehhhh, it's shelling out to the aws cli command, which i don't have in my specific environment:
[{:type clojure.lang.ExceptionInfo,
:message "Shell command failed",
:data
{:args
("aws"
"ec2"
"wait"
"instance-running"
"--filters"
"Name=tag:Name,Values=xxx-datomic-system-bastion"
"Name=tag:datomic:system,Values=xxx-datomic-system"
"Name=instance-state-name,Values=running"),
i got this error from the stack trace which was saved into /var/folders/dm/bjgtcwgx7nqfh3flbpq7m0qc0000gn/T/clojure-927549602744154670.edn
such trace file paths are always printed at the end of clojure cli errors, but i've noticed that ppl often forget to look inside them.
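Since the failure above came down to the `aws` executable not being available, a quick pre-flight check before running `datomic solo up <system name> --wait` could surface that earlier. This is only a minimal sketch in Python: it assumes the tool shells out to a binary literally named `aws` (as the `:args` in the trace suggest), and `missing_tools` is a hypothetical helper name, not part of any Datomic tooling.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "aws" is taken from the :args vector in the stack trace above.
    missing = missing_tools(["aws"])
    if missing:
        print("not on PATH:", ", ".join(missing))
    else:
        print("all required tools found")
```

If `aws` shows up as missing, installing the AWS CLI (or dropping `--wait`, as noted above) avoids the "Shell command failed" error.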
anyone using divert-system? I’m not really sure what it means actually… I’ve only used local dev with locally created database in file storage. Now I’m looking into having test envs that have a copy of a cloud database as basis
what does divert actually do? do queries copy data from the diverted system?
you use import-cloud to import (a subset of your) data from your cloud system to your local storage, then divert-system will direct queries to be answered via the local storage instead of prod
yeah, I read that page but it wasn’t clear to me what it does… ok so import-cloud is the one I need
The DesiredCapacity knob is a bit strange when deploying a query group. We have our query groups deployed with auto scaling, so the ASG is "managing" the desired count. If I change a parameter (e.g., MaxSize) and have DesiredCapacity set, CloudFormation will actually set the DesiredCapacity to the value passed in. This is pretty nasty, particularly in the situation where you'd be increasing the MaxSize (e.g., DesiredCapacity is set to 2 and MaxCapacity is set to 4. You're currently running at MaxCapacity and need to immediately increase MaxCapacity to meet current demand. You set MaxCapacity to 6 and update the CF script. CF will set MaxCapacity and DesiredCapacity to 2. This makes the high demand situation worse!). The safest workaround seems to be to always have DesiredCapacity set to MaxCapacity. Does the Datomic team have any advice on how to handle a situation like this?
FWIW, we deploy our services onto Fargate via Pulumi. Pulumi has the ability to set "ignoreChanges" on certain properties (e.g., ignoreChanges: ["desiredCount"]). This lets me set an initial desiredCount for my Fargate service and ignore any changes that have occurred since initialization. It would seem a similar knob for query groups would solve this issue.
@kenny are you making use of drift detection?
No. I've seen it in the console before but never used it and I am not familiar with it. Does it help in this situation?
yes it helps identify drift between reality and the CF template
CF manages the ASG which manages the instances
the ASG is not the head honcho
I would recommend consulting the drift detection before making manual changes to resources
In the case of DesiredCapacity, why do I care about "drift"?
A change in DesiredCapacity doesn't seem like drift.
Drift sounds like something unexpected. I fully expect DesiredCapacity to change 🙂
desired capacity is an ASG parameter
it's part of the ASG API, right?
I get that it's the thing most likely to be manually tweaked outside of source control
bbiab
I read https://aws.amazon.com/blogs/aws/new-cloudformation-drift-detection/ on drift detection. It sounds like a heck of a lot of extra work. When updating an ASG, you don't need to set desired capacity again.
Drift detection also doesn't solve the problem. Say I run the drift detection and see the actual ASG desired capacity is different than the capacity set in the CF parameters. What action am I supposed to take? Change my CF DesiredCapacity param update to match the current state? That's not what I actually want. I want to update MaxCapacity and leave DesiredCapacity unchanged. I don't want all future updates to set DesiredCapacity to the value I am forced to set it to for this particular update. Worse, DesiredCapacity could have changed from the time I ran the drift detect and the time I run the CF update.
drift detection is just a tool that exposes changes made to resources under management by a cloudformation stack
Is there a particular reason you don't want to change Min/Max/Desired via CloudFormation?
just trying to understand, not suggest anything specific
If you are importing an existing db. If you just want to create a local test db, divert-system will direct calls to local.
@kenny this is happening in what context?
• A datomic version upgrade
• A datomic parameter update
• Something else?
Kenny, I think it's required because we are setting up an ASG with the CFT (per AWS requirements) when you launch the CFT. I will double check that with the team.
Looks like it's optional: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html#cfn-as-group-desiredcapacity
From the doc ^
> If you do not specify a desired capacity when creating the stack, the default is the minimum size of the group.
Since this conversation is likely to fall off of the Slack retention window, I have added a question here: https://ask.datomic.com/index.php/603/desiredcapacity-optional-parameter-query-group-template
I am changing the Max via CF. I'm saying, perhaps poorly 🙂, that changing via CF has undesirable side effects.
Changing MaxCount from 4 -> 6.
The example I gave is the exact thing that happened to us 🙂
What does this have to do with datomic cloud? Are you reporting a bug or bemoaning how CF works?
With the exception of DesiredCapacity, all query group CF parameters are managed (updated/changed) via a CF update. The DesiredCapacity is only ever controlled by the ASG scaling policy.
(I'm not saying it doesn't have anything to do with Datomic cloud, I just don't understand in what usage scenario you're running into this.)
So, is this a feature request?
I think Datomic's CF implementation leads to undesirable results (though, I am not a CF expert so it could be a problem with CF itself). If DesiredCount were an optional parameter, I think this would not be an issue.
Or bug report. Was trying to start from the problem to ensure that the problem was actually a problem.
I thought Desired was part of the cf stack params
That would explain our misunderstandings
So you're saying that the QueryGroup CF template always sets the DesiredCapacity and the MaxCapacity equal to the same value?
• When upgrading a stack to a new version?
• When updating the stack for some reason?
• Something else is happening?
The problem is that CF is trying to manage the DesiredCount parameter (ensure actual DesiredCount matches the DesiredCount set in the params). This is problematic because the DesiredCount is entirely managed by the ASG.
Not exactly. The problem is that the CF template is always setting the DesiredCount param.
It is.
It’s a required param.
Besides the very first run, I never want CF to touch the desired count parameter.
And it shouldn't because....?
Ok
And so it's "Always setting the DesiredCount param"
• When you're upgrading a version?
• Updating a stack with some new config value (like an env-var)?
• Something else?
This is the situation: DesiredCapacity is set to 2 and MaxCapacity is set to 4. You're currently running at MaxCapacity and need to immediately increase MaxCapacity to meet current demand. You set MaxCapacity to 6 and update the CF script. CF will set MaxCapacity and DesiredCapacity to 2. This makes the high demand situation worse
Updating MaxCount.
I'm not sure on the other situations.
Intuitively I would expect the same behavior (i.e., DesiredCount gets set to the CF param). Would need to test it to verify ofc.
> DesiredCapacity is set to 2 and MaxCapacity is set to 4.
In the ASG or CF?
> You set MaxCapacity to 6 and update the CF script
Again, ASG or CF?
What is "CF script"? Are you referring to your scripts or the QG CF template parameters? This is an update right?
Oh, sorry. In all cases I mean query group CF template.
And yes, an update.
So, In the template you have the DesiredCount at 2, MaxCount at 4. You're at your max, because you manually changed the ASG DesiredCount to 4 (to match your load). The problem is that when you update your MaxCount from 4 to 6 but leave the DesiredCount at 2, you think it should keep the 4 you manually set in the ASG console? Am I close? If I'm wrong, can you copy the above prose and edit it, then paste it back here in this thread?
So, In the template you have the DesiredCount at 2, MaxCount at 4. You're at your max, because the ASG scaling policy scaled up the group DesiredCount to 4 (to match your load). The problem is that when you update your MaxCount from 4 to 6 via a CF template update but leave the DesiredCount at 2, I think it should keep the 4 the ASG scaling policy set.
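To make the scenario concrete, here is a toy model in Python of the behavior being described. It is a sketch of the conversation's claims, not of CloudFormation internals: `AsgState`, `scale_out`, and `cf_update` are hypothetical names, the minimum size of 2 is an assumption, and the "parameter omitted" branch models the behavior being asked for rather than anything confirmed in this thread.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AsgState:
    min_size: int
    max_size: int
    desired: int


def scale_out(asg: AsgState, new_desired: int) -> AsgState:
    """Scaling policy: move the desired count, clamped to [min_size, max_size]."""
    clamped = max(asg.min_size, min(asg.max_size, new_desired))
    return replace(asg, desired=clamped)


def cf_update(asg: AsgState, max_size: int, desired: Optional[int] = None) -> AsgState:
    """Toy stack update.

    If `desired` is passed (a required template parameter), it overwrites the
    live value. If it is None (parameter omitted), the live value is kept.
    """
    new_desired = asg.desired if desired is None else desired
    return AsgState(asg.min_size, max_size, new_desired)


if __name__ == "__main__":
    asg = AsgState(min_size=2, max_size=4, desired=2)  # template values
    asg = scale_out(asg, 4)                            # policy scales up under load
    print(cf_update(asg, max_size=6, desired=2))       # desired drops back to 2
    print(cf_update(asg, max_size=6))                  # desired stays at 4
```

In this model the first update shrinks the group from 4 to 2 instances while load is high, and the second leaves the policy's value alone, which is the outcome argued for above.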
What is the ASG scaling policy based upon? Why has it scaled up to 4 machines? CPU, mem?
Cpu
Target track 50%.
Worth opening an ask.Datomic topic on this?
> You set MaxCapacity to 6 and update the CF script. CF will set MaxCapacity and DesiredCapacity to 2. This makes the high demand situation worse
Curious, if you set MaxCapacity to 6, why would CF set MaxCapacity to 2? The problem makes sense though, as a dynamic parameter DesiredCapacity is modified by ASG to control behavior. But CF template updates want to set it as a default param. Maybe an optional tick box in the template would solve the issue without breaking changes?
@kenny how are you updating? from CFT? CloudFormation does not change your ASG settings when you update or upgrade. It reads your ASG settings. Maybe I am missing something here, but adding machines is going to potentially bounce the process monitoring CPU or lower the CPU average, right?
Yes, the objective would be lower cpu. Since I’m not fiddling things in the console directly, there could be something in our deployment process affecting this. Let me create a minimal repro and get back to you in a couple hours.
@kenny what do you have set for your warmup time?
300
So the default, then
and did you disable scale in?
And you're using i3.xlarges?
m5
Have you timed how long they take to come up and start accepting traffic? If you start reporting metrics before that, they may start reporting CPU metrics against your ASG policy, lowering the utilization, and killing instances.
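The averaging effect described here is easy to see with a little arithmetic. The sketch below uses made-up CPU numbers purely for illustration (they are not measurements from this system), and `average_cpu` is a hypothetical helper; it only shows how nearly idle, still-booting instances pull a group average below a 50% target.

```python
def average_cpu(per_instance_cpu):
    """Plain mean of per-instance CPU utilization percentages."""
    return sum(per_instance_cpu) / len(per_instance_cpu)


if __name__ == "__main__":
    busy = [60.0] * 4               # four loaded instances (illustrative)
    with_new = busy + [0.0] * 2     # two instances still starting up
    print(average_cpu(busy))        # average across the loaded instances
    print(average_cpu(with_new))    # average once the idle ones are counted
```

With these numbers the group average falls from 60% to 40%, under the 50% target, even though the four serving instances are exactly as busy as before.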
I have not. I do not think that was what happened. I recall checking the ASG activity log and seeing a DesiredCount change due to CF (I think). Need to double check. Not at a computer atm and AWS mobile console is hard to navigate.
@kenny I think I understand your issue, let me know if this clarifies it for you. I think the missing piece is that AWS CF only looks at parameters it has set itself. If your ASG settings changed outside of CF (i.e. your scaling policy changed them), CF defaults the parameter to the last CF-set value. Does that make sense? So you need to manually set these parameters, or update your script to query for the currently set "DesiredCount" in your ASG and inject that into your DesiredCount CloudFormation parameter.
Yes, I think that’s exactly what happened. I don’t think those extra steps are necessary though. Why is DesiredCount a required parameter?
(Your solution is inherently racy 🙂)
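For reference, the "query the live value and inject it" workaround suggested above might look roughly like the following. This is a sketch under assumptions: the parameter keys `DesiredCapacity`, `MaxSize`, and `InstanceType` are placeholders for whatever the query group template actually exposes, the response dict mimics the shape returned by the Auto Scaling `DescribeAutoScalingGroups` API, and both helper names are hypothetical. The actual AWS calls are left out so the logic stays self-contained.

```python
def live_desired_capacity(describe_response):
    """Pull DesiredCapacity out of a DescribeAutoScalingGroups-shaped response."""
    groups = describe_response["AutoScalingGroups"]
    if len(groups) != 1:
        raise ValueError(f"expected exactly one ASG, got {len(groups)}")
    return groups[0]["DesiredCapacity"]


def stack_update_parameters(all_keys, overrides):
    """Build an UpdateStack-style Parameters list.

    Keys in `overrides` get an explicit value (as a string); every other key
    is marked to reuse its previous value.
    """
    params = []
    for key in all_keys:
        if key in overrides:
            params.append({"ParameterKey": key, "ParameterValue": str(overrides[key])})
        else:
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    return params


if __name__ == "__main__":
    # Stand-in for a real API response; the group name is made up.
    response = {"AutoScalingGroups": [
        {"AutoScalingGroupName": "example-query-group", "DesiredCapacity": 4},
    ]}
    desired = live_desired_capacity(response)
    print(stack_update_parameters(
        ["DesiredCapacity", "MaxSize", "InstanceType"],   # assumed parameter keys
        {"DesiredCapacity": desired, "MaxSize": 6},
    ))
```

As noted, this stays racy: the scaling policy can change the desired capacity between the read and the stack update, so it narrows the window rather than removing the problem.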