aws-lambda

valtteri 2017-10-21T06:55:48.000047Z

Read/write units really are confusing. By default the UI proposes 5 units for both but for dev and testing you can happily drop them to minimum (1) to save costs.

valtteri 2017-10-21T06:58:30.000001Z

Another confusing thing is indexes. You need those if you want to query efficiently using any other attribute than the primary key.

valtteri 2017-10-21T07:02:27.000037Z

The best thing about DynamoDB is that it’s fully managed (like S3). If you can live with the limitations, it’s a very carefree solution.

qqq 2017-10-21T07:23:16.000007Z

I read that DynamoDB shards tables in 10GB chunks

qqq 2017-10-21T07:23:23.000002Z

and then splits the read/write units over the chunks.

qqq 2017-10-21T07:23:53.000040Z

So if you have a 100 GB table and 10 WRU, what happens is that there are 10 shards and each shard gets only 1 WRU. Is that true? If so, it sounds really limiting.
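
To make the arithmetic in the question concrete, here is a minimal sketch. It assumes the 10 GB-per-shard figure quoted above and an even split of capacity across shards; both are assumptions taken from this discussion, not verified constants, and the function names are made up for illustration.

```python
import math

# Shard size quoted in the discussion; an assumption, not an official constant.
PARTITION_SIZE_GB = 10


def partition_count(table_size_gb: float) -> int:
    """Number of shards if the table is split into 10 GB chunks (at least one)."""
    return max(1, math.ceil(table_size_gb / PARTITION_SIZE_GB))


def per_partition_units(table_size_gb: float, provisioned_units: int) -> float:
    """Capacity units each shard gets if the provisioned total is split evenly."""
    return provisioned_units / partition_count(table_size_gb)


print(partition_count(100))          # 10
print(per_partition_units(100, 10))  # 1.0
```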

valtteri 2017-10-21T08:49:41.000045Z

That’s true. If your dataset is really huge you also need to worry about the underlying partitioning. You need to choose your hash keys so that your data gets evenly distributed between the shards.
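
A rough illustration of why the hash key choice matters. DynamoDB's real hash function is internal, so this sketch substitutes MD5 and a simple modulo purely as an assumption; the key names and the partition count of 10 are hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(hash_key: str, partitions: int) -> int:
    """Map a hash key to a partition (MD5 + modulo stands in for the real hash)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def spread(hash_keys, partitions: int) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(key, partitions) for key in hash_keys)


# High-cardinality key (e.g. a user id): items can land on any partition.
by_user = spread((f"user-{i}" for i in range(10_000)), partitions=10)

# Low-cardinality key (e.g. a status flag): only two distinct values exist,
# so at most two partitions ever receive items.
by_status = spread(
    ("active" if i % 2 else "inactive" for i in range(10_000)), partitions=10
)

print(len(by_user), "partitions used with user ids")
print(len(by_status), "partitions used with status flags")
```

With a key like the status flag, most of the provisioned capacity sits on partitions that never see traffic.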

valtteri 2017-10-21T08:52:46.000001Z

Also, if you need to access data by several keys (indexes), the indexes consume their own capacity rather than the table’s, which in practice means you pay extra for each index. To be precise, there are two kinds of secondary indexes, ‘local’ and ‘global’: global indexes consume their own capacity, while local indexes consume the table’s capacity.
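
A sketch of how that difference shows up in a table definition. The dict follows the shape of the DynamoDB CreateTable request; the table, attribute, and index names and all capacity numbers are hypothetical, and `total_provisioned` is a helper made up for this example.

```python
# Request parameters in the shape of the CreateTable API (names are hypothetical).
create_table_params = {
    "TableName": "orders",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "S"},
        {"AttributeName": "status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},
        {"AttributeName": "order_id", "KeyType": "RANGE"},
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    # Local index: same hash key as the table, different range key,
    # and no throughput of its own (it draws on the table's capacity).
    "LocalSecondaryIndexes": [
        {
            "IndexName": "by_created_at",
            "KeySchema": [
                {"AttributeName": "customer_id", "KeyType": "HASH"},
                {"AttributeName": "created_at", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # Global index: its own key schema and its own provisioned throughput.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "by_status",
            "KeySchema": [{"AttributeName": "status", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 2, "WriteCapacityUnits": 2},
        }
    ],
}


def total_provisioned(params: dict) -> dict:
    """Sum the capacity you provision: the table plus every global index."""
    read = params["ProvisionedThroughput"]["ReadCapacityUnits"]
    write = params["ProvisionedThroughput"]["WriteCapacityUnits"]
    for index in params.get("GlobalSecondaryIndexes", []):
        read += index["ProvisionedThroughput"]["ReadCapacityUnits"]
        write += index["ProvisionedThroughput"]["WriteCapacityUnits"]
    return {"read": read, "write": write}


print(total_provisioned(create_table_params))  # {'read': 7, 'write': 7}
```

If you use boto3, a dict like this should be usable as `boto3.client("dynamodb").create_table(**create_table_params)`, but treat that as untested here.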

valtteri 2017-10-21T09:01:13.000013Z

My experience is that if you have complex data access patterns you need to plan really carefully how you’re going to query the data, considering all of DynamoDB’s limitations. If you can’t predict how you need to access the data, then it might be simpler to go with SQL and RDS. Another common solution is to use a search engine (e.g. ElasticSearch or CloudSearch) alongside DynamoDB. With DynamoDB Streams you can invoke a Lambda function for each operation on your table, and that function can do the indexing to the search engine. CloudSearch also provides some kind of out-of-the-box solution to set up indexing automagically from DynamoDB, but at least a while ago it could deal only with flat data structures.
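
A minimal sketch of the Streams-to-search-engine idea as a Python Lambda handler. The event layout (`Records`, `eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) follows the DynamoDB Streams record format; `index_document` and `delete_document` are hypothetical stand-ins for whatever client your search engine provides, and the sketch assumes the stream is configured to include new images.

```python
def index_document(doc_id, document):
    """Hypothetical stand-in for the search engine's index call."""
    print("index", doc_id, document)


def delete_document(doc_id):
    """Hypothetical stand-in for the search engine's delete call."""
    print("delete", doc_id)


def flatten(image):
    """Unwrap DynamoDB's typed values, e.g. {"S": "x"} -> "x".

    Only string, number and boolean attributes are handled; nested or set
    types are skipped to keep the sketch short.
    """
    out = {}
    for name, typed in image.items():
        type_tag, value = next(iter(typed.items()))
        if type_tag == "S":
            out[name] = value
        elif type_tag == "N":
            out[name] = float(value)
        elif type_tag == "BOOL":
            out[name] = value
    return out


def handler(event, context):
    """Mirror each table change into the search index."""
    indexed = deleted = 0
    for record in event["Records"]:
        keys = flatten(record["dynamodb"]["Keys"])
        # Build a stable document id from the key attributes.
        doc_id = "|".join(str(keys[name]) for name in sorted(keys))
        if record["eventName"] in ("INSERT", "MODIFY"):
            # NewImage is only present when the stream view type includes it.
            index_document(doc_id, flatten(record["dynamodb"]["NewImage"]))
            indexed += 1
        elif record["eventName"] == "REMOVE":
            delete_document(doc_id)
            deleted += 1
    return {"indexed": indexed, "deleted": deleted}
```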

valtteri 2017-10-21T09:10:37.000017Z

Oh, one more DynamoDB limitation I should mention is the lack of support for server-side encryption. The Java SDK supports client-side encryption, though.

valtteri 2017-10-21T09:35:45.000025Z

Does anyone know if there’s some util/tool/framework to write CLJS, output it to Lambda-compatible JS, and package & deploy to AWS using Serverless Framework or AWS SAM? Or would it be too trivial to create such a setup myself? Or does it sound like a bad idea?

valtteri 2017-10-23T05:07:28.000045Z

https://github.com/nervous-systems/serverless-cljs-plugin this is what I was after.

qqq 2017-10-21T10:19:33.000002Z

@valtteri (re last question): I have been using https://github.com/portkey-cloud/portkey , which except for being clj instead of cljs, is exactly what you are asking for, and it's amazing

qqq 2017-10-21T10:20:28.000033Z

@valtteri: thanks for your detailed analysis of dynamodb: as weird as it sounds, I don't need SQL / relational queries, I can tolerate eventual consistency -- but it's the provisioned IO that's turning me away (afaik, Google DataStore / BigTable has no such limitation)

qqq 2017-10-21T10:20:55.000089Z

For truly serverless, if there were no provisioned IO and it was just "pay for bandwidth / # of read/write ops", DynamoDB would be perfect.

valtteri 2017-10-21T10:43:44.000030Z

Yes, I watched the presentation about Portkey and it seems really cool! However, I’d like to leverage Serverless Framework’s power to set up the infra as well (using CloudFormation). Also, I’m a bit worried about Java’s slower startup in Lambda compared to Node. I was wondering if there’s some easy way to write Lambda code using cljs instead of JavaScript and then use the Serverless Framework to manifest the infra and manage deploys, logs etc. I’m currently using Serverless Framework mostly with Node but I’d really enjoy writing the code in clj/cljs instead.

qqq 2017-10-21T12:23:30.000024Z

The way I intend to get around the JVM startup time is: 1. keep a 512MB machine active at all times (and just pay for it) 2. auto scaling takes care of the rest ... because if it starts spinning up at 70% capacity (or whatever the number), the new nodes should be up by the time the old ones get overwhelmed (except in cases of flash traffic -- in which case all existing tech is going to stall anyway)