An interesting discussion on Twitter about Clojure deployment on AWS https://twitter.com/pesterhazy/status/1304131064835772416?s=20
I’m doing a deployment now and I’m seeing these metrics:
• Env. update is starting -> registering first batch with the load balancer and waiting for it to be healthy: 40s
• First batch passed health checks: 5m
• Deploying to second batch and waiting for health checks: 36s
• Second batch passed health checks: 5m
@pesterhazy ^^
@orestis thanks for bringing this up here
what deployment config are you using?
Just two instances, JDK 11 / Corretto, Rolling updates, 1 instance at a time.
I've written up my findings and experiments, with some alternatives sketched (but I haven't figured this out yet by any means): https://gist.github.com/pesterhazy/d0030f559f600d0ce1b3a090173c9c9c
Any comments appreciated
“Currently we use Rolling policy with BatchSize=1” -> that means you’re doing the update one instance at a time. Have you tried using a percent-based batch size? 25% would do two instances at a time, so it would halve your deployment time.
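For reference, the percentage batch size would look roughly like this in an .ebextensions config (a sketch; the file name is illustrative, and the same options can be set from the console or the CLI instead):
```yaml
# .ebextensions/deploy.config (illustrative file name)
option_settings:
  aws:elasticbeanstalk:command:
    DeploymentPolicy: Rolling
    BatchSizeType: Percentage
    BatchSize: 25
```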
I feel your pain though. We were hosted on rackspace before, using plain old VMs and updates took seconds.
There are a few things I mean to try, but it’s low priority for us ATM:
1. There’s a new traffic-splitting update mechanism: https://aws.amazon.com/about-aws/whats-new/2020/05/aws-elastic-beanstalk-traffic-splitting-deployment-policy/
2. There’s an ability (announced literally yesterday) to share a non-EB load balancer between different EB environments: https://aws.amazon.com/blogs/containers/amazon-elastic-beanstalk-introduces-support-shared-load-balancers/
Elastic Beanstalk was nice for getting us some peace of mind with minimal ops investment as we migrated to AWS, but it feels creaky. OTOH, it is Dockerless, and there has been some movement lately that makes me hopeful it’s being actively developed and improved.
> have you tried using a percent-based batch size
Yeah, that's definitely something we'll try, along with RollingWithAdditionalBatch. I figured we'd try Immutable first, on the assumption that it'd be faster in principle, because it spins up all 8 instances concurrently. But that doesn't seem quite true.
Do you think these have the potential of speeding up deployments?
Another thought I've had is this:
• In the normal day-to-day deployment case, 26 min is probably acceptable, so we can keep using Rolling (or Immutable) deployments.
• When there's a problem, however, and you need to deploy a hotfix or roll back a change, 26 min is unacceptable. In that case we could manually switch to AllAtOnce before uploading the new app version.
The Immutable policy spins up new instances from scratch, which takes some time.
Don’t you have to do a configuration deployment first in order to specify AllAtOnce?
Ah wait, the configuration policy is different from the new application version policy. So you could keep the configuration policy at AllAtOnce indefinitely.
Just tested this. You can manually switch to AllAtOnce in the console. It takes about a minute. Then deploying a new application version takes less than a minute. The downside is that you have downtime, in our case 12 minutes
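For reference, the same switch can be scripted with the AWS CLI (a sketch; the environment name and version label are illustrative):
```sh
# Configuration update: switch the deployment policy (took ~1 min in the test above)
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --option-settings Namespace=aws:elasticbeanstalk:command,OptionName=DeploymentPolicy,Value=AllAtOnce

# Then deploy the hotfix application version
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --version-label my-hotfix-version
```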
This tradeoff may be acceptable when hotfixing a customer facing bug
Traffic splitting will probably fix the “mixed results” problem. I’m using session stickiness to work around it, but I don’t like session stickiness in general.
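If we try it, the traffic-splitting policy would presumably be configured along these lines (a sketch; the values are illustrative, and it requires an Application Load Balancer):
```yaml
option_settings:
  aws:elasticbeanstalk:command:
    DeploymentPolicy: TrafficSplitting
  aws:elasticbeanstalk:trafficsplitting:
    NewVersionPercent: 10   # share of traffic sent to the new version
    EvaluationTime: 5       # minutes to watch the canary before shifting the rest
```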
The shared load balancer gives you great flexibility, since you can mix and match environments, but I don’t think you get faster deployments without significant engineering investment in automation, so it’s probably not that relevant here.
I use Fargate to deploy services
ask me anything
@ghadi how long does it take for traffic to hit a new version of code, and how fast is rolling back? Also, does Fargate necessitate Docker? Does it have a nice console to get started? Is it a good fit for “monoliths”? How do you develop locally? What about monitoring things like traffic, memory use, etc. etc.? Do you pick instance sizes, e.g. if you want large machines with lots of memory? What about static assets, do you bundle nginx together with a JVM or do you make separate containers?
I’m clueless and suspicious about containers and beanstalk gave me an entry point where many things are reasonably automated, but we’re outgrowing it :)
traffic: 2-5 minutes
yes fargate necessitates docker
Fargate is "hostless" docker, where AWS manages scheduling your containers magically
monolith is broad, so I can't assess if it's a good fit, but if you can containerize your app, it's a start
I use an ALB as a load balancer, connecting to Pedestal/Jetty on the containers
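For a concrete picture of the container side, a minimal Pedestal/Jetty service exposing a health-check route for the ALB target group might look something like this (a sketch; the namespace, route, and port are illustrative, not taken from the actual service):
```clojure
(ns example.server
  (:require [io.pedestal.http :as http]))

;; Table-syntax routes: a health-check endpoint for the ALB target group.
;; Real application routes would live alongside it.
(def routes
  #{["/healthz" :get (fn [_request] {:status 200 :body "ok"}) :route-name ::healthz]})

(defn -main [& _args]
  ;; Jetty listens on the container port that the ALB target group forwards to.
  (-> {::http/routes routes
       ::http/type   :jetty
       ::http/host   "0.0.0.0"
       ::http/port   8080
       ::http/join?  true}
      http/create-server
      http/start))
```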
A few more questions: do you use the AWS CLI to deploy new versions? Is a new version a fresh Docker image that you push to a registry, then notify Fargate to run it? How does autoscaling work? Is it also suitable for “background” jobs, e.g. if I have some cron jobs, can I keep a Fargate container running forever?
I think I just need to sit down and read the Fargate docs :)
What about the subjective stuff instead, are you happy with it? Would you choose it for a new project?
I do everything through cloudformation... I haven't taken up the CDK yet.
upstream build job makes a docker image, downstream deployment job updates cloudformation stack
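so the deployment job boils down to something like this (a sketch; the stack name, template, and ImageTag parameter are illustrative and assume the template exposes the image tag as a parameter):
```sh
# Downstream deployment job: point the ECS task definition at the freshly built image
aws cloudformation deploy \
  --stack-name my-service \
  --template-file service.yml \
  --parameter-overrides ImageTag=abc123 \
  --capabilities CAPABILITY_IAM
```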
since java is so easy to deploy, sometimes I use a stock container that downloads the jar upon startup
other times I bake the jar into the container
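the two container variants look roughly like this (sketches; the base image, jar path, and JAR_URL variable are illustrative, and the download variant assumes curl is available in the base image):
```dockerfile
# Variant A: bake the uberjar into the image
FROM amazoncorretto:11
COPY target/app-standalone.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```
```dockerfile
# Variant B: a stock image whose startup command downloads the jar
# (JAR_URL is supplied at run time, e.g. via the task definition)
FROM amazoncorretto:11
CMD ["sh", "-c", "curl -fsSL \"$JAR_URL\" -o /app.jar && exec java -jar /app.jar"]
```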
autoscaling in ECS/Fargate https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
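one way to set it up is two calls against the Application Auto Scaling API (a sketch; the cluster, service, and thresholds are illustrative):
```sh
# Register the service's desired count as a scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-service \
  --min-capacity 2 \
  --max-capacity 8

# Attach a target-tracking policy that keeps average CPU around 50%
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-service \
  --policy-name cpu-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
    '{"TargetValue": 50.0,
      "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"}}'
```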
yes Fargate containers can run indefinitely
e.g. we have some queue pollers that are in fargate
Subjectively, I like Fargate, but I've used it for several years now and have some expertise
It used to be really expensive compared to EC2, but now it's not as much of a premium
I despise managing machines, and prefer to only manage my application
don't have to worry about security updates, or ssh access with a container
well, not as much
We used to use Fargate, and I quite liked it. Had to switch due to lack of persistent storage at the time. It's gotten some very nice features since, however. They added capacity providers, so using the spot market is easy. You can now attach EFS volumes to a task for persistent storage. We use Datadog for metrics, logs, and APM; if you use something similar, every task needs a sidecar container running, which is a small additional cost per task replica. We deploy everything using Pulumi. Overall the Fargate experience has been fantastic. Would also recommend.
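For the spot part, the key bit is the capacity provider strategy on the ECS service; expressed as CloudFormation (the Pulumi equivalent is analogous) it looks roughly like this, as a partial sketch with networking and other required properties omitted, assuming the FARGATE/FARGATE_SPOT capacity providers are associated with the cluster:
```yaml
Service:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref Cluster
    TaskDefinition: !Ref TaskDefinition
    DesiredCount: 3
    CapacityProviderStrategy:
      # Keep at least one task on regular Fargate, run the rest on spot
      - CapacityProvider: FARGATE
        Base: 1
        Weight: 1
      - CapacityProvider: FARGATE_SPOT
        Weight: 3
```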
Ah, right, so different vendors. Interesting POV, I thought that using AWS for everything was the norm but it seems not.
To be honest I don’t want to do anything with this kind of thing so the more hands off the better :)
I don't know how folks can possibly stand using CW logs. It's terrible compared to Datadog's offering.
It’s effectively free ;)
People time is usually far more expensive tho