portkey

Portkey: from REPL to Serverless in one call
qqq 2017-10-21T14:06:46.000003Z

What is good for storing state for portkey?

qqq 2017-10-21T14:07:04.000042Z

S3, SimpleDB, RDS, DynamoDB ?

baptiste-from-paris 2017-10-21T14:07:21.000064Z

Use case ?

qqq 2017-10-21T14:08:16.000002Z

I just need a KV store. DynamoDB's provisioned IO spread out over shards makes me uncomfortable.

qqq 2017-10-21T14:08:26.000006Z

The RDS GUI seems much more complex than SimpleDB / DynamoDB.

qqq 2017-10-21T14:08:33.000070Z

SimpleDB's 10 GB limit per domain, plus the cap of 250 domains, seems annoying.

qqq 2017-10-21T14:08:50.000058Z

Serverless is supposed to be 'infinitely scalable', but I can't seem to pick the right state storage.

chris_johnson 2017-10-21T14:45:59.000057Z

> I just need a KV store. DynamoDB's provisioned IO spread out over shards makes me uncomfortable.

Yet, DynamoDB is literally the AWS service that provides “a KV store”. Compare with running your own Mongo instance on EC2 for a sense of how much Dynamo does “for free”

tatut 2017-10-21T16:17:20.000086Z

:thinking_face: I guess datomic and the like would be a really poor fit for Lambda

qqq 2017-10-21T16:40:32.000051Z

@tatut: why would datomic be a bad fit for Lambda ?

tatut 2017-10-21T16:41:06.000057Z

from what I understand, datomic peers need potentially large amounts of data pulled from storage

tatut 2017-10-21T16:41:12.000024Z

as queries are run on the peers

tatut 2017-10-21T16:41:37.000015Z

and with Lambda your servers are ephemeral

qqq 2017-10-21T16:42:46.000018Z

I didn't even realize that the peers were "machines doing the calling" rather than "dedicated ec2 machines"

qqq 2017-10-21T16:43:50.000017Z

here's an insane idea : 1. assume that assoc-in can also do deletes (via nil) 2. the 'serverless' equiv of storing state would be -- I don't care if it's s3, rds, dynamodb, or whatever else ... 3. you store state for me, I send you a bunch of assoc-ins ... and you charge me based on network traffic + size of data structure
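[Editor's sketch] The "bunch of assoc-ins, with nil meaning delete" idea can be illustrated with plain Clojure maps. This is only a local, in-memory sketch under stated assumptions: `apply-op` and `apply-ops` are hypothetical names, each operation is assumed to be a `[path value]` pair, and nothing here talks to S3, DynamoDB, or any other service.

```clojure
;; Hypothetical helpers, not part of portkey or any AWS library.
;; An op is a pair [path value]; a nil value means "delete the key at path".

(defn apply-op
  "Applies one [path value] op to the nested map `state`."
  [state [path v]]
  (cond
    ;; Non-nil value: an ordinary assoc-in.
    (some? v) (assoc-in state path v)
    ;; nil at a top-level key: remove that key.
    (= 1 (count path)) (dissoc state (first path))
    ;; nil at a nested key: remove it from its parent map, if the parent exists.
    :else (let [parent (butlast path)]
            (if (map? (get-in state parent))
              (update-in state parent dissoc (last path))
              state))))

(defn apply-ops
  "Applies a sequence of ops in order, returning the new state."
  [state ops]
  (reduce apply-op state ops))

(def example-state
  (apply-ops {}
             [[[:users "ann" :email] "ann@example.com"]
              [[:users "bob" :email] "bob@example.com"]
              [[:users "bob"] nil]]))
```

One consequence of this design is that a stored nil cannot be distinguished from a missing key, which is the trade-off of overloading nil as the delete marker.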

qqq 2017-10-21T16:45:10.000039Z

@chris_johnson: does the auto spreading of provisioned IO across shards not bother you? I find this hard to accept

qqq 2017-10-21T16:45:44.000040Z

the other insane thing: provisioned IO is NOT shared across tables, it's per table; which I also find annoying

chris_johnson 2017-10-21T17:05:22.000022Z

@qqq sharding never has bothered me, but then I use Dynamo primarily as an opaque KV store for Datomic. Without understanding your use case and what difficulty shards cause for you (are you doing BI queries across all your data? Multi-stage data processing where data locality matters across steps? Just don’t like the idea of sharding happening outside your control?) it’s very difficult to share any insight or experience that you would find illuminating.

qqq 2017-10-21T17:09:43.000059Z

@chris_johnson: I agree with your analysis. I read a few blog posts on Google complaining about dynamodb sharding -- and now, whether it's a real issue or not, it is deeply bothering me.

qqq 2017-10-21T17:10:41.000015Z

With portkey / lambda, is it possible to create "private" lambda functions? By default, (pk/mount!) gives me a URL -- and anyone who has that URL can execute the function. I'm interested in something with a bit more security.

chris_johnson 2017-10-21T17:25:14.000054Z

I have just barely begun reading about portkey, but it’s definitely possible to create a Lambda function that isn’t tied to API Gateway at all, or one that exposes an endpoint into a VPC for example

chris_johnson 2017-10-21T17:26:26.000058Z

that might not be the use case pk/mount! is meant to support, of course. Analogous to how the Serverless framework assumes by default that what you want to do is build a REST API without spinning up an EC2 instance.

viesti 2017-10-21T18:09:28.000006Z

@tatut nowadays Datomic has a Client library (http://docs.datomic.com/clients-and-peers.html), queries are run on the server (so the object cache survives between redeploys), which makes Datomic usable for serverless (and other thin microservices)

tatut 2017-10-21T18:10:07.000003Z

Nice

viesti 2017-10-21T18:10:30.000025Z

for state, there’s a very tersely described issue here: https://github.com/portkey-cloud/portkey/issues/23

viesti 2017-10-21T18:12:45.000033Z

just before ClojuTre, Christophe presented an idea of having a “state” atom, with bindings for local and Lambda runtime use

viesti 2017-10-21T18:24:09.000055Z

thinking that at the repl, one might want to connect to a local DB (like PostgreSQL or say a local DynamoDB), to a dev DB running in the cloud and then at Lambda runtime to the DB in the cloud

viesti 2017-10-21T18:32:30.000085Z

manual connection configuration might become awkward. For example, Lambda has a way to pass configuration as environment variables for the process (jvm) running the Lambda. Passing these environment variables happens with the UpdateFunctionConfiguration API, reading with say (System/getenv "db-passwd"), which is fast compared to an HTTP call to say the parameter store service (http://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-paramstore.html). But (System/getenv "db-passwd") at the repl isn’t too neat…
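[Editor's sketch] One way to soften the repl awkwardness is to read configuration through a small function that falls back to a default when the variable is not set. This is a minimal sketch, not something portkey provides: `config-value` and the variable name `DB_PASSWD` are hypothetical, and the environment is passed in as a map so the lookup can be exercised at the repl without touching the real process environment.

```clojure
;; Hypothetical helper: looks up a setting in an environment map,
;; falling back to `default` when the key is absent.
(defn config-value
  ;; Two-arg form reads the real process environment, which
  ;; (System/getenv) returns as a java.util.Map of strings.
  ([k default]
   (config-value (System/getenv) k default))
  ;; Three-arg form takes any map, handy at the repl and in tests.
  ([env k default]
   (get env k default)))

;; At Lambda runtime the variable would be set on the function;
;; at the repl the default is used instead.
(def repl-passwd   (config-value {} "DB_PASSWD" "local-dev-passwd"))
(def lambda-passwd (config-value {"DB_PASSWD" "from-lambda-env"}
                                 "DB_PASSWD" "local-dev-passwd"))
```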

viesti 2017-10-21T18:36:24.000038Z

maybe I’m overthinking this, with (System/getenv "db-passwd") example I had RDS in my mind, for which a connection (pool) could be established at Lambda handler startup with a def, like this: https://github.com/viesti/poll/blob/master/src/poll/core.clj#L6-L12

viesti 2017-10-21T18:39:03.000056Z

with DynamoDB or other DBs behind HTTP with AWS IAM auth (should write an example of it) one would operate relying on IAM roles anyway (the Lambda would have a role which permits writes to say DynamoDB)

viesti 2017-10-21T18:39:31.000006Z

hum, but I think Christophe’s initial idea was to provide a super simple DB with atom semantics
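[Editor's sketch] To make "a DB with atom semantics" concrete, here is one possible shape, assuming a store that offers a read and a compare-and-set. Everything named here (`StateStore`, `LocalStore`, `swap-state!`) is hypothetical and is not portkey's design; only the local, in-memory backend is shown, and a Lambda-side backend would need its own implementation of the same two operations.

```clojure
;; Hypothetical protocol: the two operations needed for swap!-like updates.
(defprotocol StateStore
  (read-state [store] "Returns the current value.")
  (cas-state! [store old-val new-val]
    "Sets the value to new-val only if it is still old-val; returns true on success."))

;; Local backend for repl use, backed by an ordinary Clojure atom.
(defrecord LocalStore [a]
  StateStore
  (read-state [_] @a)
  (cas-state! [_ old-val new-val] (compare-and-set! a old-val new-val)))

(defn swap-state!
  "Like swap!, but against any StateStore: retries until the
  compare-and-set succeeds, then returns the new value."
  [store f & args]
  (loop []
    (let [old-val (read-state store)
          new-val (apply f old-val args)]
      (if (cas-state! store old-val new-val)
        new-val
        (recur)))))

(def store (->LocalStore (atom {})))
(swap-state! store assoc :visits 1)
(swap-state! store update :visits inc)
```

As with `swap!`, the update function may run more than once under contention, so it should be free of side effects.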

viesti 2017-10-21T18:42:16.000079Z

pk/deploy! deploys the Lambda, I think we would have other functions for other services, for example pk/schedule for a CloudWatch scheduled Lambda