immutant

http://immutant.org Note: dev discussion happens in #immutant on FreeNode IRC.
2016-02-27T01:47:23.000081Z

By the way, neither the :uri nor the :path-info should be url-decoded.

2016-02-27T01:47:41.000082Z

:uri is always the raw URI. And (str context path-info) == uri

2016-02-27T01:47:51.000083Z

Just to clear things up. :simple_smile:

2016-02-27T13:39:17.000084Z

@weavejester: in the servlet spec, path-info is always returned decoded from HttpServletRequest.getPathInfo()

2016-02-27T13:40:09.000085Z

so (str context (encode path-info)) == uri

2016-02-27T14:30:14.000086Z

@tcrawley: Right, but one of the nice things about Ring is that we don’t need to make the same mistakes as servlets. Because URL encode/decode is lossy, particularly since Java changes “+” into “ “, it makes more sense to keep path-info raw.

2016-02-27T14:31:54.000087Z

ah, so you're saying immutant should instead calculate the raw path-info when constructing the request map instead of using getPathInfo? That might work

2016-02-27T14:32:57.000088Z

Yep. I actually had assumed that the servlet spec didn’t decode PathInfo, which is why there isn’t a huge amount of documentation for it.

2016-02-27T14:33:59.000089Z

getContextPath isn’t decoded, strangely.

2016-02-27T14:34:31.000090Z

So you can just use (ring.util.request/set-context request (.getContextPath servlet-request))

2016-02-27T14:36:28.000091Z

Great, thanks for your input. I'll take a look

2016-02-27T15:01:18.000092Z

@weavejester: i take your point, but since context and path-info are strictly servlet things (not ring), why not follow their spec? someone familiar with getPathInfo isn't going to expect to see that "+" in :path-info

2016-02-27T15:02:13.000093Z

jcrossley3: They’re not strictly servlet things. Other libraries, like Compojure, make use of them.

2016-02-27T15:02:41.000094Z

I also don’t see the point of adopting mistakes made in earlier libraries just for compatibility.

2016-02-27T15:02:45.000095Z

well, i meant that they're not http things, and therefore not ring things

2016-02-27T15:03:11.000096Z

They’re not in the Ring spec, but they are in the Ring library, like :params or :session.

2016-02-27T15:04:18.000097Z

i guess it's only context that is servlet-y. there is a CGI variable PATH_INFO

2016-02-27T15:04:56.000098Z

Sure, but again, why adopt a faulty convention?

2016-02-27T15:04:58.000099Z

still, it does make me wonder why they would explicitly spec it to be decoded

2016-02-27T15:05:11.000100Z

that's a pretty explicit mistake :simple_smile:

2016-02-27T15:05:38.000101Z

I don’t mean that it was accidental; just that it was a mistake in the design.

2016-02-27T15:06:18.000102Z

It looks like the full URI of servlets is: contextPath + servletPath + pathInfo

2016-02-27T15:06:29.000103Z

But servletPath and pathInfo are decoded

2016-02-27T15:06:40.000104Z

and contextPath and the URI isn't

2016-02-27T15:07:10.000105Z

And because URL decodes are lossy, it’s not a direct mapping.

2016-02-27T15:10:15.000106Z

do you know if the http rfc says anything about pathinfo?

2016-02-27T15:10:34.000107Z

It doesn’t.

2016-02-27T15:10:48.000108Z

It’s not something that’s part of the HTTP request.

2016-02-27T15:11:04.000109Z

It’s just something made up originally by CGI scripts.

2016-02-27T15:13:58.000110Z

https://groups.google.com/forum/#!topic/rack-devel/3fRiLmm-pT8 some background, i guess

2016-02-27T15:15:48.000111Z

:path-info might have been a poorly chosen name, considering that it’s slightly different to the PATH_INFO in CGI, and getPathInfo in servlets.

2016-02-27T15:16:07.000112Z

However, at the time I didn’t realise that

2016-02-27T15:16:50.000113Z

I think that's where my confusion came from - since it's not in the ring spec, and named the same as the servlet value, I assumed it was the servlet value

2016-02-27T15:17:10.000115Z

That’s a reasonable assumption.

2016-02-27T15:17:24.000116Z

I’ll make the documentation clearer in the next version of Ring.

2016-02-27T15:18:04.000117Z

I had assumed that :path-info in Ring was the same as the servlet version, but it turns out that’s not the case.

2016-02-27T15:18:57.000118Z

I might use :path in future and deprecate :path-info… but that likely won’t happen for a while.

2016-02-27T15:19:31.000119Z

I was surprised that is was decoded by servlets, I only discovered that yesterday

2016-02-27T15:19:48.000120Z

Yeah. My guess is that it’s to follow the CGI definition

2016-02-27T15:19:54.000121Z

Which makes sense for shell scripts

2016-02-27T15:20:06.000122Z

But doesn’t really make sense for modern web applications.

2016-02-27T15:20:45.000123Z

Reading that Rack thread talks about lossy decoding as well

2016-02-27T15:20:56.000124Z

But because Rack follow the CGI spec they have no choice

2016-02-27T15:21:51.000125Z

Fortunately Ring doesn’t need to, so we can break with tradition a little. Ideally it would be named something different, but for now it can just be documented better.

2016-02-27T15:23:47.000126Z

I know you've stated that it shouldn't be part of the ring spec, but if it was part of the spec, it would be a stronger stance than documentation of the implementation's behavior would be

2016-02-27T15:25:19.000127Z

Well, the Ring spec just covers HTTP. However, it might be worth putting up something else for the keys introduced by the library itself, like :context, :path-info, :params, :session and so forth.

2016-02-27T15:25:26.000128Z

@weavejester: not to dissuade you from fixing the web, but do you see no benefit in keeping ring at least a little in-line with wsgi, rack, servlets, etc?

2016-02-27T15:26:19.000129Z

e.g. after introducing :path, you might then decode :path-info rather than deprecate it

2016-02-27T15:26:43.000130Z

jcrossley3: I do see a benefit in being consistent. However (a) it’s too late now, and (b) the problems a decoded :path-info provides outweigh the advantages of compatibility, IMO.

2016-02-27T15:27:19.000131Z

problems with existing apps/middleware you mean?

2016-02-27T15:28:14.000132Z

Yes. I’d prefer not to change the behaviour of an existing key.

2016-02-27T15:28:39.000133Z

I’m careful about backward compatibility in Ring.

2016-02-27T15:29:42.000134Z

i can respect that, but like you say, it'd be good to be clear about where you diverge with CGI, et. al.

2016-02-27T15:29:53.000135Z

jcrossley3: I don't see users getting confused about :path-info being raw in ring, as long as it is clearly documented

2016-02-27T15:30:05.000136Z

Yes, that I’ll agree on :simple_smile:

2016-02-27T15:30:18.000137Z

The only reason there isn’t documentation is because I didn’t realise that it was different

2016-02-27T15:30:37.000138Z

and clearly no one has complained about it until now, and the complaint was "this shouldn't be decoded"

2016-02-27T15:32:44.000139Z

Yeah

2016-02-27T15:32:54.000140Z

@tcrawley: it confused both of us, duh :simple_smile:

2016-02-27T15:33:06.000141Z

Thanks for bringing this to my attention. I’ll make the documentation clearer.

2016-02-27T15:33:15.000142Z

Gotta go for now. Bye!

2016-02-27T15:33:24.000143Z

@weavejester: thanks!

2016-02-27T15:33:34.000144Z

jcrossley3: sure, but we're not users :)

2016-02-27T15:33:40.000145Z

pb

2016-02-27T20:29:20.000146Z

I just caught this conversation but I’m a bit confused what the outcome was

2016-02-27T20:31:28.000147Z

what is actually going to be documented in the ring documentation?

2016-02-27T20:31:35.000148Z

@weavejester ^

2016-02-27T20:33:01.000151Z

It sounds like nothing is really going to change, and this “bug” such as it is will remain.

2016-02-27T20:34:46.000152Z

It’s certainly not a huge deal to pass in a builder to Undertow to make sure that the url and path info is not decoded, so as to let Ring and Compojure do their thing. But seems like we may need better documentation for both Immutant (in terms of what Undertow is doing) and Ring/Compojure (in terms of where stuff is actually decoded) to help users not flail about when they run into what I did.

2016-02-27T20:35:03.000153Z

Apologies if I’m misunderstanding what the above discussion concluded; I just don’t see where it addresses my particular issue.

2016-02-27T21:08:19.000155Z

@ddellacosta: I’ll add some documentation to the path-info and set-context functions, and probably add a page to the wiki as well documenting the keys that the Ring library adds above and beyond the Ring specification. It’s also a possibility that I can add this into the servlet library as well.

2016-02-27T21:08:55.000156Z

I see—so one will still have to toggle decoding off in Undertow. (I realize this is not your particular concern @weavejester, just clarifying)

2016-02-27T21:09:15.000157Z

That’s right.

2016-02-27T21:09:17.000158Z

Please note also that where I first encountered this was in Compojure’s handling of params extracted from :path-info

2016-02-27T21:09:51.000159Z

Compojure doesn’t decode the path-info as far as I’m aware.

2016-02-27T21:10:12.000160Z

Very deliberately so, since otherwise context wouldn’t work correctly.

2016-02-27T21:10:35.000165Z

whoops

2016-02-27T21:10:39.000166Z

Right, it decodes the parameters once they’re matched.

2016-02-27T21:10:43.000167Z

But not the path-info

2016-02-27T21:11:10.000168Z

hmm, those are extracted from path-info I thought

2016-02-27T21:11:26.000169Z

Right, but it grabs the parameters, and then decodes them.

2016-02-27T21:11:50.000170Z

yes—but if path-info is already decoded, it’s a problem

2016-02-27T21:12:02.000171Z

Sure.

2016-02-27T21:12:09.000172Z

sorry, not sure if I’m simply missing your point here

2016-02-27T21:12:14.000173Z

I’m not sure what the correct behavior is

2016-02-27T21:12:29.000174Z

/should be rather

2016-02-27T21:12:33.000175Z

The correct behaviour is that the :path-info key should not be decoded.

2016-02-27T21:12:37.000176Z

@ddellacosta: I think we will fix this in Immutant by populating the request map with a :path-info that isn't decoded

2016-02-27T21:13:02.000177Z

extracted from the raw uri

2016-02-27T21:13:13.000178Z

okay, that helps clarify things @tcrawley

2016-02-27T21:13:33.000179Z

and in the meantime I will continue to pass in the builder w/DECODE_URL toggled off to handle this, as a workaround

2016-02-27T21:14:09.000180Z

cool, I'm glad you have a workaround for now. I'd love to get an Immutant release out in the next couple of weeks

2016-02-27T21:14:21.000181Z

(and thanks @weavejester too for jumping in and explaining this)

2016-02-27T21:14:33.000182Z

I'll hijack your existing jira for this fix

2016-02-27T21:14:47.000183Z

No problem. Sorry for the confusion.

2016-02-27T21:15:01.000184Z

sure thing—and if I can contribute anything else (happy to write the patch if you’d like) just let me know.

2016-02-27T21:15:13.000185Z

np @weavejester, it all makes sense now re-reading what you wrote initially

2016-02-27T21:17:16.000186Z

if you want to write a patch, that would be swell - you would have to fix it in two places

2016-02-27T21:17:30.000187Z

definitely, would love to contribute

2016-02-27T21:18:06.000190Z

the former is used when inside WildFly or EAP, the latter when using undertow directly

2016-02-27T21:18:15.000191Z

I see, simply ensure that those are not url-encoded?

2016-02-27T21:20:02.000192Z

(I assume I’ll have to possibly extract that info from somewhere else if .getPathInfo/.getRelativePath returns the decoded value by default)

2016-02-27T21:20:12.000193Z

you won't be able to prevent the decoding of the values returned by getPathInfo and getRelativePath - you'll have to ignore those methods and extract the path-info from the uri by stripping off the context, yeah

2016-02-27T21:20:19.000195Z

right, gotcha

2016-02-27T21:20:46.000196Z

okay, think we’re on the same page—will try to get something ASAP

2016-02-27T21:21:01.000197Z

will let you know if I’m going to be slower than by Monday @tcrawley, if that works for you

2016-02-27T21:35:19.000198Z

@ddellacosta: well, this is embarrassing - I did know about path-info being escaped, and how that was the wrong behavior, because I fixed it in Immutant 1: https://github.com/immutant/immutant/tree/1.x/modules/web/src/main/clojure/immutant/web/servlet.clj#L31

2016-02-27T21:35:44.000200Z

haha, whoops!

2016-02-27T21:35:56.000201Z

In fact, I remember it being that way in Immutant 2, but without the comment to explain why

2016-02-27T21:36:04.000202Z

you won’t be the first package maintainer to have forgotten about a fix...

2016-02-27T21:36:27.000203Z

so at some point we said "why aren't we just calling getPathInfo?" and "fixed" it

2016-02-27T21:36:32.000204Z

@jcrossley3: ^

2016-02-27T21:36:56.000205Z

that looks like the right fix too; was trying to find some utils already available to pull that info out of

2016-02-27T21:37:54.000206Z

(sorry, I suppose you would know that better than I)

2016-02-27T21:38:08.000207Z

no problem!

2016-02-27T21:38:16.000208Z

yeah, I think that's the right way to calculate it

2016-02-27T21:38:46.000209Z

in any case, thanks for the heads up on that, I can simply try to port that over

2016-02-27T21:40:52.000210Z

great, thanks! and no rush

2016-02-27T21:41:06.000211Z

and feel free to ping me if you have any questions

2016-02-27T21:41:24.000212Z

will do, thanks @tcrawley !

2016-02-27T21:41:59.000213Z

I did plumb through the commits for Immutant 2, but can't find anywhere where we had it setting a raw path-info, so I feel less dumb

2016-02-27T21:42:37.000214Z

okay, good to know. And you should definitely not feel dumb about this, easy thing to understand

2016-02-27T21:42:49.000215Z

heh, thanks :)