aws

http://status.aws.amazon.com/
https://www.expeditedssl.com/aws-in-plain-english
orestis 2020-09-11T09:14:17.029900Z

An interesting discussion on Twitter about Clojure deployment on AWS https://twitter.com/pesterhazy/status/1304131064835772416?s=20

orestis 2020-09-11T09:27:03.032200Z

I’m doing a deployment now and I’m seeing these metrics:
• Env. update is starting -> registering first batch to load balancer and waiting to be healthy: 40s
• First batch passed health checks: 5m
• Deploying to second batch and waiting for health checks: 36s
• Second batch passed health checks: 5m

orestis 2020-09-11T09:27:12.032500Z

@pesterhazy ^^

pesterhazy 2020-09-11T11:12:57.033200Z

@orestis thanks for bringing this up here

pesterhazy 2020-09-11T11:13:15.033500Z

what deployment config are you using?

orestis 2020-09-11T11:19:05.034900Z

Just two instances, JDK 11 / Corretto, rolling updates, 1 instance at a time.

pesterhazy 2020-09-11T11:19:07.035Z

I've written up my findings and experiments, with some alternatives sketched (but I haven't figured this out yet by any means): https://gist.github.com/pesterhazy/d0030f559f600d0ce1b3a090173c9c9c

pesterhazy 2020-09-11T11:20:05.035500Z

Any comments appreciated

orestis 2020-09-11T11:20:50.036400Z

“Currently we use Rolling policy with BatchSize=1” -> that means you’re doing the update one instance at a time; have you tried using a percent-based batch size? 25% would do two instances at a time, so it would halve your deployment time.
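
For reference, a percent-based batch size can be set with the AWS CLI via the `aws:elasticbeanstalk:command` option namespace. This is a sketch, untested here; the environment name `my-env` is a placeholder:

```shell
# Switch an EB environment to rolling deployments in 25% batches
# (two instances at a time on an eight-instance environment).
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --option-settings \
    Namespace=aws:elasticbeanstalk:command,OptionName=DeploymentPolicy,Value=Rolling \
    Namespace=aws:elasticbeanstalk:command,OptionName=BatchSizeType,Value=Percentage \
    Namespace=aws:elasticbeanstalk:command,OptionName=BatchSize,Value=25
```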

orestis 2020-09-11T11:22:19.037Z

I feel your pain though. We were hosted on Rackspace before, using plain old VMs, and updates took seconds.

orestis 2020-09-11T11:24:55.038200Z

There are a few things I mean to try, but it’s low priority for us ATM:
1. There’s a new split-traffic update mechanism: https://aws.amazon.com/about-aws/whats-new/2020/05/aws-elastic-beanstalk-traffic-splitting-deployment-policy/
2. There’s a brand-new (announced literally yesterday) ability to share a non-EB load balancer between different EB environments: https://aws.amazon.com/blogs/containers/amazon-elastic-beanstalk-introduces-support-shared-load-balancers/

orestis 2020-09-11T11:32:58.039800Z

Elastic Beanstalk was nice to get us some peace of mind with minimal ops investment, as we migrated to AWS. But it feels creaky. OTOH, it is Dockerless, and there was some movement lately which makes me hopeful that it’s actively developed and improved.

pesterhazy 2020-09-11T12:41:07.041300Z

> have you tried using a percent-based batch size
Yeah, that’s definitely something we’ll try, along with RollingWithAdditionalBatch. I figured we’d try Immutable first, on the assumption that it’d be faster in principle, because it spins up all 8 instances concurrently. But that doesn’t seem quite true.

pesterhazy 2020-09-11T12:41:44.041400Z

Do you think these have the potential of speeding up deployments?

pesterhazy 2020-09-11T12:43:59.043900Z

Another thought I’ve had is this:
• in the normal day-to-day deployment case, 26 min is probably acceptable; so we can keep using Rolling deployments (or Immutable deployments)
• when there’s a problem, however, and you need to deploy a hotfix or roll back a change, 26 min is unacceptable. In this case we could manually switch to AllAtOnce before uploading the new app version

orestis 2020-09-11T13:27:01.044500Z

The Immutable policy spins up new instances from scratch, which takes some time.

orestis 2020-09-11T13:27:28.045100Z

Don’t you have to wait to do a configuration deployment to specify AllAtOnce?

orestis 2020-09-11T13:28:01.045700Z

Ah wait, the configuration deployment policy is different from the application-version deployment policy. So you could keep the configuration policy at AllAtOnce indefinitely.

pesterhazy 2020-09-12T12:43:54.080800Z

Just tested this. You can manually switch to AllAtOnce in the console; it takes about a minute. Then deploying a new application version takes less than a minute. The downside is that you have downtime, in our case 12 minutes.
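
The hotfix path described above can be scripted instead of clicked through in the console. A sketch with the AWS CLI; environment and version names are placeholders:

```shell
# 1. Temporarily switch the deployment policy to AllAtOnce:
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --option-settings Namespace=aws:elasticbeanstalk:command,OptionName=DeploymentPolicy,Value=AllAtOnce

# 2. Deploy the already-uploaded hotfix version (accepting brief downtime):
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --version-label hotfix-2020-09-12

# 3. Switch back to Rolling once the fire is out:
aws elasticbeanstalk update-environment \
  --environment-name my-env \
  --option-settings Namespace=aws:elasticbeanstalk:command,OptionName=DeploymentPolicy,Value=Rolling
```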

pesterhazy 2020-09-12T12:44:49.082Z

This tradeoff may be acceptable when hotfixing a customer facing bug

orestis 2020-09-11T13:28:41.045800Z

The split traffic will probably fix the “mixed results” problem. I’m using session stickiness to overcome it, but I don’t like session stickiness in general.

orestis 2020-09-11T13:29:34.046Z

The shared ELB gives you great flexibility, since you can mix-and-match environments — but I don’t think you can get faster deployments without significant engineering investment in automation, so it’s probably not that relevant.

ghadi 2020-09-11T13:58:44.046400Z

I use Fargate to deploy services

ghadi 2020-09-11T13:58:51.046700Z

ask me anything

orestis 2020-09-11T14:15:25.051400Z

@ghadi how long does it take for traffic to hit a new version of code, and how fast is rolling back? Also:
• does Fargate necessitate Docker?
• does it have a nice console to get started?
• is it a good fit for “monoliths”?
• how do you develop locally?
• what about monitoring things like traffic, memory use, etc.?
• do you pick instance sizes, e.g. if you want large machines with lots of memory?
• what about static assets, do you bundle nginx together with a JVM or do you make separate containers?

orestis 2020-09-11T14:16:07.053100Z

I’m clueless and suspicious about containers, and Beanstalk gave me an entry point where many things are reasonably automated, but we’re outgrowing it :)

ghadi 2020-09-11T14:16:08.053200Z

traffic: 2-5 minutes

ghadi 2020-09-11T14:16:13.053400Z

yes fargate necessitates docker

ghadi 2020-09-11T14:16:40.054Z

Fargate is "hostless" docker, where AWS manages scheduling your containers magically

ghadi 2020-09-11T14:17:11.054500Z

monolith is broad, so I can't assess if it's a good fit, but if you can containerize your app, it's a start

ghadi 2020-09-11T14:17:56.055200Z

I use an ALB as a load balancer, connecting to Pedestal/Jetty on the containers

orestis 2020-09-11T14:23:12.058400Z

A few more questions: do you use the AWS CLI to deploy new versions? Is a new version a fresh Docker image that you push to a registry, then notify Fargate to run it? How does auto scaling work? Is it also suitable for “background” jobs, e.g. if I have some cron jobs, can I keep a Fargate container running forever?

orestis 2020-09-11T14:24:09.059Z

I think I just need to sit down and read the Fargate docs :)

orestis 2020-09-11T14:25:15.059900Z

What about the subjective stuff instead, are you happy with it? Would you choose it for a new project?

ghadi 2020-09-11T14:52:58.061300Z

I do everything through CloudFormation... I haven't taken up the CDK yet.

ghadi 2020-09-11T14:53:35.062100Z

upstream build job makes a Docker image, downstream deployment job updates the CloudFormation stack
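
The two-job pipeline described here might look roughly like the following sketch. The account ID, region, repository, tag, stack, and template names are all placeholders:

```shell
# Upstream build job: build the image and push it to ECR.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker build -t 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:git-abc123 .
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:git-abc123

# Downstream deployment job: point the CloudFormation stack at the new tag;
# ECS then rolls the service over to the new task definition.
aws cloudformation deploy \
  --stack-name my-app-service \
  --template-file service.yml \
  --parameter-overrides ImageTag=git-abc123
```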

ghadi 2020-09-11T14:54:20.063100Z

since java is so easy to deploy, sometimes I use a stock container that downloads the jar upon startup

ghadi 2020-09-11T14:54:32.063600Z

other times I bake the jar into the container
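
The "stock container" approach can be as simple as a generic JVM image whose entrypoint fetches the jar at startup. A sketch; the bucket name and `APP_VERSION` variable are made up for illustration:

```shell
#!/bin/sh
# Entrypoint for a stock JVM container: download the application jar
# from S3 at startup, so the same image can run any version.
set -eu
aws s3 cp "s3://my-artifacts/app-${APP_VERSION}.jar" /tmp/app.jar
exec java -jar /tmp/app.jar
```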

ghadi 2020-09-11T14:55:06.063800Z

autoscaling in ECS/Fargate https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
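
As a sketch of what those docs cover, ECS service auto scaling is configured through Application Auto Scaling; cluster and service names here are placeholders:

```shell
# Register the Fargate service's desired count as a scalable target.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 --max-capacity 10

# Target-tracking policy: keep average CPU around 60%.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
    '{"TargetValue":60.0,"PredefinedMetricSpecification":{"PredefinedMetricType":"ECSServiceAverageCPUUtilization"}}'
```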

ghadi 2020-09-11T14:55:44.064700Z

yes Fargate containers can run indefinitely

ghadi 2020-09-11T14:56:00.065100Z

e.g. we have some queue pollers that are in fargate

ghadi 2020-09-11T14:56:47.066100Z

Subjectively, I like Fargate, but I've used it for several years now and have some expertise

ghadi 2020-09-11T14:57:07.066600Z

It used to be really expensive compared to EC2, but now it's not as much of a premium

ghadi 2020-09-11T14:57:30.067100Z

I despise managing machines, and prefer to only manage my application

ghadi 2020-09-11T14:57:41.067400Z

don't have to worry about security updates, or ssh access with a container

ghadi 2020-09-11T14:57:53.067600Z

well, not as much

kenny 2020-09-11T15:35:27.071100Z

We used to use Fargate, and I quite liked it. Had to switch due to lack of persistent storage at the time. It's gotten some very nice features since, however. They added capacity providers so using the spot market is easy. You can now attach EFS volumes to a task for persistent storage. We use Datadog for metrics, logs, APM. If you use something similar, all tasks need a sidecar container running which is a small additional added cost per task replica. Deploy everything using Pulumi. Overall Fargate experience has been fantastic. Would also recommend.

orestis 2020-09-11T15:48:16.073200Z

Ah, right, so different vendors. Interesting POV, I thought that using AWS for everything was the norm but it seems not.

orestis 2020-09-11T15:48:46.074100Z

To be honest I don’t want to do anything with this kind of thing so the more hands off the better :)

kenny 2020-09-11T16:26:36.074300Z

I don't know how folks can possibly stand using CW logs. It's terrible compared to Datadog's offering.

orestis 2020-09-11T16:43:30.075900Z

It’s effectively free ;)

kenny 2020-09-11T16:44:13.076500Z

People time is usually far more expensive tho