portkey

Portkey: from REPL to Serverless in one call
baptiste-from-paris 2018-03-26T14:28:37.000830Z

and if someone has 10 min to help me out of my regex hell, I would be a happy man

cgrand 2018-03-26T14:29:14.000082Z

🖐️

baptiste-from-paris 2018-03-26T14:29:18.000840Z

lol

baptiste-from-paris 2018-03-26T14:29:31.000075Z

give me 1min to create a snippet

baptiste-from-paris 2018-03-26T14:31:12.000009Z

So I am working on the AWS test suite for v4 signing, as I found out that some cases were not handled. Anyway, they give a raw text file representing a request, and I need to write a req-text->req-map that captures elements from the file

baptiste-from-paris 2018-03-26T14:31:20.000667Z

some of which are optional

baptiste-from-paris 2018-03-26T14:31:55.000606Z

this is my (wrong) regex =>

baptiste-from-paris 2018-03-26T14:32:09.000209Z

(defn req-text->req-map
  "Given  a request  from AWS  test*.req, returns  a clj-http  request
  map."
  [input]
  (let [[_ verb uri host date]
        (re-find #"([A-Z]+)\s(\S+).+\nHost:(\S+)\nX-Amz-Date:(\S+)" input)]
    {:request-method verb
     :uri uri
     :host host
     :date date}))

baptiste-from-paris 2018-03-26T14:32:16.000735Z

here are the results =>

baptiste-from-paris 2018-03-26T14:33:11.000003Z

you’ll find nil because I don’t handle My-Header and params yet

baptiste-from-paris 2018-03-26T14:33:19.000750Z

I tried this one without success

baptiste-from-paris 2018-03-26T14:33:47.000735Z

(def input "GET / HTTP/1.1\nHost:example.amazonaws.com\nMy-Header1:value2\nMy-Header1:value2\nMy-Header1:value1\nX-Amz-Date:20150830T123600Z")

  (let [[_ & a]
        (re-find #"([A-Z]+)\s(\S+).+\n(My-Header\d:value\d\n)/" input)]
    a)

baptiste-from-paris 2018-03-26T14:34:51.000365Z

and I can’t figure out how to capture the optional, repeated My-Header lines

baptiste-from-paris 2018-03-26T14:35:02.000904Z

hint : I really suxx at regex

cgrand 2018-03-26T14:46:28.000005Z

That’s the last missing case or are there more tests with more headers? I’m not sure regexes are the answer

cgrand 2018-03-26T14:48:26.000582Z

Is GET / HTTP/1.1\nHost:example.amazonaws.com\nMy-Header1:value1\n value2\n value3\nX-Amz-Date:20150830T123600Z even valid?

cgrand 2018-03-26T14:53:30.000331Z

>>> Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT.
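
The folding rule quoted above can be undone before any further parsing. A minimal sketch, assuming only `clojure.string`; the name `unfold-headers` is hypothetical, and replacing each fold with a single space is one reasonable choice rather than something the thread settles on:

```clojure
(require '[clojure.string :as str])

(defn unfold-headers
  "Joins continuation lines (lines starting with a space or tab) onto the
  preceding header line, replacing each fold with a single space."
  [header-text]
  (str/replace header-text #"\r?\n[ \t]+" " "))

(unfold-headers "My-Header1:value1\n  value2\n     value3\nX-Amz-Date:20150830T123600Z")
;; => "My-Header1:value1 value2 value3\nX-Amz-Date:20150830T123600Z"
```

After unfolding, every header occupies exactly one line, so a simple per-line split on the first `:` becomes enough.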

baptiste-from-paris 2018-03-26T14:54:05.000214Z

@cgrand I also don’t capture Param1=value1

baptiste-from-paris 2018-03-26T14:54:14.000411Z

how else than regex

baptiste-from-paris 2018-03-26T14:54:15.000168Z

?

cgrand 2018-03-26T14:54:55.000248Z

manual parsing or several regex stages

cgrand 2018-03-26T14:58:52.000421Z

#"([A-Z]+)\s(\S+).+\nHost:(\S+)\n((?:My-Header\d:.*\n(?:[ \t].*\n)*)*)X-Amz-Date:(\S+)"

cgrand 2018-03-26T14:59:34.000585Z

But seriously, don’t do that

cgrand 2018-03-26T15:03:21.000171Z

An HTTP parsing lib?

baptiste-from-paris 2018-03-26T15:05:22.000661Z

lol

baptiste-from-paris 2018-03-26T15:05:45.000125Z

Ok, I’ll look at some HTTP Parsing lib

baptiste-from-paris 2018-03-26T15:08:09.000375Z

But just for information: if you really had to do it with regex, it’s possible, right?

cgrand 2018-03-26T15:09:42.000282Z

I would read the file as lines, parse the 1st line as method path protocol

cgrand 2018-03-26T15:10:19.000840Z

hmmm no

cgrand 2018-03-26T15:11:50.000282Z

so you consume the 1st line and then re-seq on headers

cgrand 2018-03-26T15:19:58.001008Z

(let [req "GET / HTTP/1.1\nHost:example.amazonaws.com\nMy-Header1:value1\n  value2\n     value3\nX-Amz-Date:20150830T123600Z"
      [_ method path headers] (re-matches #"(?s)([A-Z]+)\s+(\S+).*?\n(.*)" req)
      headers (for [[_ header value] (re-seq #"(?s)(\S+):(.*?\n(?:[\t ].*?\n)*)" (str headers "\n"))]
                [header value])]
  [method path headers])

cgrand 2018-03-26T15:20:18.000450Z

yields

["GET"
 "/"
 (["Host" "example.amazonaws.com\n"]
  ["My-Header1" "value1\n  value2\n     value3\n"]
  ["X-Amz-Date" "20150830T123600Z\n"])]

baptiste-from-paris 2018-03-26T19:11:13.000639Z

I can’t find a lib that does the job. I tried with org.apache.httpclient, but I can’t get the request body for a POST request

cgrand 2018-03-26T19:21:14.000189Z

And my snippet above?

baptiste-from-paris 2018-03-26T19:22:24.000089Z

let me try, I was focusing on parsing raw HTTP with apache httpclient ^^

baptiste-from-paris 2018-03-26T19:49:25.000212Z

aren’t headers supposed to be unique?

baptiste-from-paris 2018-03-26T19:49:35.000705Z

key => unique ;

cgrand 2018-03-26T20:18:42.000306Z

No. Some may be multi-valued, and repeating the header is a way to encode that.
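
Repeated headers can be folded into one comma-separated value per name, which is the form the signing test cases in this thread end up with. A minimal sketch using only `clojure.core`; `merge-headers` is a hypothetical name, and header names are compared case-sensitively here, which is a simplification:

```clojure
(defn merge-headers
  "Takes a sequence of [name value] pairs and returns a map in which the
  values of repeated names are joined with commas, in order of appearance."
  [pairs]
  (reduce (fn [m [k v]]
            ;; first occurrence stores v, later ones append \",v\"
            (update m k #(if % (str % "," v) v)))
          {}
          pairs))

(merge-headers [["Host" "example.amazonaws.com"]
                ["My-Header1" "value2"]
                ["My-Header1" "value2"]
                ["My-Header1" "value1"]])
;; => {"Host" "example.amazonaws.com", "My-Header1" "value2,value2,value1"}
```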

baptiste-from-paris 2018-03-26T20:46:03.000126Z

a first draft that works well for headers but not for a POST param=value body

baptiste-from-paris 2018-03-26T20:46:16.000490Z

;; Assumes ByteArrayInputStream, StandardCharsets and the Apache HttpCore classes
;; are imported, and that x is an alias for net.cgrand.xforms.
(defn req-text->req-map-revisited [req-text]
  (let [is (ByteArrayInputStream. (.getBytes req-text StandardCharsets/UTF_8))
        session-input-buffer (doto (SessionInputBufferImpl. (HttpTransportMetricsImpl.) (* 8 2048))
                               (.bind is))
        basic-http-request (.parse (DefaultHttpRequestParser. session-input-buffer))
        headers (for [h (.getAllHeaders basic-http-request)]
                  [(.getName h) (.getValue h)])
        ;; join the values of repeated headers with "," under a single key
        headers (into {}
                      (x/by-key (comp (interpose ",")
                                      x/str))
                      headers)
        request-line (.getRequestLine basic-http-request)]
    (cond->
     {:uri (.getUri request-line)
      :request-method (.getMethod request-line)}
      (seq headers) (assoc :headers headers))))

baptiste-from-paris 2018-03-26T20:46:21.000208Z

which returns

baptiste-from-paris 2018-03-26T20:47:24.000391Z

{:uri "/", :request-method "GET", :headers {"Host" "example.amazonaws.com", "My-Header1" "value2,value2,value1", "X-Amz-Date" "20150830T123600Z"}}
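
For the POST case that the draft does not cover yet, one option is to split the raw text at the first blank line before handing the head to any parser. A minimal sketch, assuming the request file separates headers from body with an empty line; `split-head-and-body` is a hypothetical name and the input below is illustrative, not copied from the AWS test suite:

```clojure
(require '[clojure.string :as str])

(defn split-head-and-body
  "Splits raw request text at the first blank line.
  Returns [head body]; body is nil when there is no blank line."
  [req-text]
  (let [[head body] (str/split req-text #"\r?\n\r?\n" 2)]
    [head body]))

(split-head-and-body "POST / HTTP/1.1\nHost:example.amazonaws.com\n\nParam1=value1")
;; => ["POST / HTTP/1.1\nHost:example.amazonaws.com" "Param1=value1"]
```

The head can then go through the existing header parsing unchanged, and the body can be attached to the request map separately.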

baptiste-from-paris 2018-03-26T21:10:44.000591Z

FYI, found in the tests:

A note about signing requests to Amazon S3:

In exception to this, you do not normalize URI paths for requests to Amazon S3. For example, if you have a bucket with an object named my-object//example//photo.user, use that path. Normalizing the path to my-object/example/photo.user will cause the request to fail. For more information, see Task 1: Create a Canonical Request in the Amazon Simple Storage Service API Reference: <http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html#canonical-request>
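
The S3 exception in the note above could be modelled as a switch around path normalization. A minimal sketch: `canonical-path` and the `:s3` service keyword are hypothetical, and only the collapsing of repeated slashes is shown (full canonicalization, such as dot-segment removal and URI encoding, is left out):

```clojure
(require '[clojure.string :as str])

(defn canonical-path
  "Collapses runs of slashes into one, except for S3, where the path
  must be used exactly as given."
  [service path]
  (if (= service :s3)
    path
    (str/replace path #"/{2,}" "/")))

(canonical-path :s3 "/my-object//example//photo.user")
;; => "/my-object//example//photo.user"
(canonical-path :execute-api "/my-object//example//photo.user")
;; => "/my-object/example/photo.user"
```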