I understand one usually puts their bash script in User Data,
but I can't seem to find a Clojure library that allows me to do this? Not sure if this is how to automate the spinning up of an EC2 cluster
I'm not allowed to use managed services such as EMR, RDS, or Kubernetes (it's a school project)
I understand this exists https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html - but how do I do this automatically, instead of having to go through the process via the website?
@zackteo did you check this recipe from the Spark docs already? https://spark.apache.org/docs/1.6.2/ec2-scripts.html
@valtteri I actually don't understand how this works. Is this a preconfigured OS image?
ohhh I'm not sure if I can expect Spark to be locally installed - if that's what it requires
It's a script that you run locally with your AWS credentials and it will spin up the EC2 cluster for you.
Where is Spark's ec2 directory supposed to be located?
Probably inside the Spark distribution
I've never used it myself but I'd guess it does all the steps you need. 🙂
Thanks @valtteri 🙂 Let me look into it - I'm not sure if I'm allowed to do it this way
Ahh, if the goal is to learn how to set it up from scratch manually, then that might be too automated. 🙂 However, you can look at the script for clues about what kind of things are involved and try to reproduce them using Amazonica or whatever
I guess there are many steps involved (firing up the instances, installing the libs, setting up networking, security groups, autoscaling etc.)
I guess my issue is that I understand how I might use Amazonica to initialise the instance. But how do I access that instance in an automated way to run scripts - to install dependencies etc.?
Yeah, Amazonica just delegates stuff to the AWS Java SDK afaik.. So you need to check how that's done on the Java side and then figure out how to tell Amazonica to do that. You can provide the 'user data' script with the runInstances request, if I remember correctly.
It's been a while since I've been bootstrapping EC2 instances..
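Something like this might work with Amazonica (an untested sketch - the AMI, key pair and security group ids are placeholders, and note that EC2 expects the user data base64-encoded):

(require '[amazonica.aws.ec2 :as ec2])

;; user-data script that runs on first boot
(def user-data-script
  (str "#!/bin/bash\n"
       "yum update -y\n"
       "# install Hadoop/Spark dependencies here\n"))

(defn base64 [^String s]
  (.encodeToString (java.util.Base64/getEncoder) (.getBytes s "UTF-8")))

(ec2/run-instances
  {:image-id           "ami-abcd1234"      ; placeholder base AMI
   :instance-type      "t2.micro"
   :min-count          1
   :max-count          1
   :key-name           "my-key-pair"       ; placeholder key pair
   :security-group-ids ["sg-abcd1234"]     ; placeholder security group
   :user-data          (base64 user-data-script)})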
Perhaps you can shortcut by setting up one of the instances manually, saving it as a machine image (AMI), and then just telling Amazonica to launch instances that use that AMI
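That could look roughly like this (again just a sketch - create-image maps to EC2's CreateImage call, and the instance id is a placeholder):

(require '[amazonica.aws.ec2 :as ec2])

;; bake an AMI from a manually configured instance
(def image
  (ec2/create-image {:instance-id "i-0123456789abcdef0"
                     :name        "prebaked-spark-node"}))

;; later, launch instances from the baked image
(ec2/run-instances {:image-id  (:image-id image)
                    :min-count 1
                    :max-count 1})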
unfortunately we need to start off from a base ami :x
aws ec2 run-instances --image-id ami-abcd1234 --count 1 --instance-type m3.medium \
    --key-name my-key-pair --subnet-id subnet-abcd1234 --security-group-ids sg-abcd1234 \
    --user-data file://my_script.txt
The user-data script will be run when the instance launches
Think they expect us to use boto3 from Python, but that's also just the AWS SDK
That's bash right?
Yes
Was trying to find that in the AWS SDK but maybe I'll just have to use bash
☝️ there's how to do it with boto
ooo! Thanks 🙂
I tried to find the equivalent in Amazonica but to no avail - also considered using babashka with https://github.com/tzzh/pod-tzzh-aws but think it isn't worth the trouble ><. I'm just gonna approach this with boto3 after all 🙂 it makes more sense to revisit the project and rewrite this part in Clojure if I'm interested
I'm thinking that it might be easier to do the setup with Terraform, for example - it has great support for the AWS APIs and makes it easier to destroy resources after creation. There is some learning curve, but for a school project one could use a local state file, for example.
that said, I'm too biased, since I like Terraform too much 🙂
@zackteo Shelling out to the AWS CLI is pretty common with babashka
@vlesti unfortunately I'm only allowed to use CloudFormation at max ... would like to learn Terraform at some point!
@borkdude ohhh. Are there any AWS and/or Spark examples? Though I think I'll likely end up sticking to bash, given my use case is mainly installing dependencies and doing the Hadoop and Spark setup
@zackteo For example: https://github.com/borkdude/babashka/issues/575#issuecomment-712460781 I think @lukaszkorecki might also be doing something like this.
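A minimal sketch of that approach (assuming the aws CLI is installed and configured; the ids and script path are placeholders):

(require '[clojure.java.shell :refer [sh]])

;; shell out to the AWS CLI from babashka
(let [{:keys [exit out err]}
      (sh "aws" "ec2" "run-instances"
          "--image-id" "ami-abcd1234"
          "--count" "1"
          "--instance-type" "t2.micro"
          "--user-data" "file://my_script.txt")]
  (if (zero? exit)
    (println out)                          ; JSON description of the launched instances
    (binding [*out* *err*] (println err))))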
okay!
Btw, could someone explain how CloudFormation might fit into all this? I will also need to create a web app that links to MongoDB and MySQL, so probably separate servers for each. I understand that CloudFormation just makes things more declarative?
Yes, CloudFormation is a JSON declaration of the AWS resources that you want to set up. It's pretty hard to grasp in the beginning, but I've learned to like it a lot over the years.
If I set it up with CloudFormation, how do I provide the instance with the scripts I want executed :o Is there also a user-data part in the declaration?
Or do I need to access the instances in another way?
You embed the script into the CloudFormation template JSON. Yes it's ugly. 🙂
"UserData": { "Fn::Base64": { "Fn::Join": [ "",
"#!/bin/bash\n",
"echo 'doing stuff'\n"
] } }
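If hand-writing that JSON gets too painful, you could also generate the template from Clojure data - a sketch, assuming cheshire for the JSON encoding (the resource name and AMI id are made up):

(require '[cheshire.core :as json])

(def template
  {:AWSTemplateFormatVersion "2010-09-09"
   :Resources
   {:MyInstance
    {:Type "AWS::EC2::Instance"
     :Properties
     {:ImageId      "ami-abcd1234"
      :InstanceType "t2.micro"
      :UserData     {"Fn::Base64"
                     {"Fn::Join" ["" ["#!/bin/bash\n"
                                      "echo 'doing stuff'\n"]]}}}}}})

;; write the template out for the aws cloudformation CLI to deploy
(spit "template.json" (json/generate-string template {:pretty true}))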
@borkdude yes, that's my current approach.
FWIW I'd recommend Terraform over CloudFormation as it covers 99% of the use cases with much nicer syntax and better primitives. And yes, you can specify user-data scripts there. Or use Packer to pre-bake machine images
When calling :GetExport with the Cognitect aws-api for API Gateway I get the following response:
{:logref "4b35cdac-1783-4c1d-bdc2-ef84bc1fa68e",
:message "Unable to build exporter: null",
:cognitect.anomalies/category :cognitect.anomalies/incorrect}
Here is my invocation:
(aws-api/invoke
client
{:op :GetExport
:request {:restApiId api-id
:stageName stage-name
:exportType "swagger"}})
When I use the exact same arguments with boto3 it works, so I think the arguments are right. Anything seem wrong that I am missing?
@markbastian this will work:
(aws-api/invoke
client
{:op :GetExport
:request {:restApiId api-id
:stageName stage-name
:exportType "swagger"
:accepts "application/json"}})
I posted an issue https://github.com/cognitect-labs/aws-api/issues/158. Please keep the convo going there if you're interested.
Thanks!