clojure

New to Clojure? Try the #beginners channel. Official docs: https://clojure.org/ Searchable message archives: https://clojurians-log.clojureverse.org/
2021-01-05T01:49:43.362200Z

Anybody has a trick for discarding an argument when using #() ?

borkdude 2021-01-05T08:00:37.380700Z

@didibus I remember we have been over this on Twitter before where @alexmiller said this was undefined behavior

borkdude 2021-01-05T08:06:54.382400Z

If core says this should be supported I would be happy to look into it.

borkdude 2021-01-05T08:18:10.382600Z

Having said that, I will take a look, if it's not an invasive change, then might fix.

alexmiller 2021-01-05T12:58:02.385800Z

I do not believe this is behavior you should rely on. We’ve even looked at changes to the discard reader recently re tagged literals or reader conditionals and it’s not clear to me that things would still work this way after those changes. So, please don’t do this.

alexmiller 2021-01-05T12:58:54.386300Z

It’s so much clearer to just use fn

💯 1
borkdude 2021-01-05T13:05:52.386500Z

:thumbsup: I agree, and thanks for confirming.

seancorfield 2021-01-05T16:53:07.426800Z

Ah, yeah, I remember the tagged literal discussions around the discard reader... true, if you're going to do that much rework on that reader, you might as well also ensure that the side-effecting parts of arg reader are discarded. It occurred to me overnight that people may well expect this to be a single-argument function: #(* 2 %1 #_%2) -- (map #(* 2 %1 #_%2) (range 10)) throws "Wrong number of args (1) passed"

alexmiller 2021-01-05T16:59:21.427500Z

there are no defined semantics for the combination of discard reader and anonymous function arguments

alexmiller 2021-01-05T16:59:33.427700Z

so you should not have any expectations imo

seancorfield 2021-01-05T17:46:45.428300Z

This might be a good thing for a linter to check then, I guess: if % args appear inside a discarded form inside #( .. ), squawk! 🙂 /cc @borkdude

borkdude 2021-01-05T17:49:37.428700Z

So the discussion has gone from: babashka doesn't support this, to: clj-kondo should forbid this? 😆

seancorfield 2021-01-05T18:33:44.430200Z

New evidence was brought to the table: a likely rework of the DiscardReader which may invalidate this construct 🙂

alexmiller 2021-01-05T18:58:03.430500Z

this is regardless of any potential change in the discard reader (that's just one example of a way in which false expectations could come back to haunt you)

2021-01-05T19:23:05.430700Z

That Twitter stream would imply some other people might be using this trick as well hehe. Personally, I think the intuitive behavior is that the discard reader would discard the form, even if used in other reader macros. Thus (mapv #(rand-int #_%) (range 10)) should say Wrong number of args (1) passed to: user/eval8628/fn--8629. I feel this is what most people would assume if you quizzed them on it. And I'd be happy actually if that became a guarantee of the reader, and a formal semantic.

borkdude 2021-01-05T19:25:10.430900Z

This is how bb does it right now

seancorfield 2021-01-05T19:49:47.432700Z

I am now officially sorry that my devious mind came up with this idea in the first place... but I'll blame @didibus for asking that intriguing question yesterday... 🙂

😅 2
2021-01-05T19:55:43.434900Z

(defn &-tag-reader
  [[f & args]]
  `(fn [~'& ~'args] (~f ~@args)))

(set! *data-readers* (assoc *data-readers* '& user/&-tag-reader))

(mapv #&(rand-int 5) (range 10))
;;=> [3 0 4 0 2 0 4 4 4 2]
Maybe this is a more sane approach if I want such convenience. Yes, yes, except for the fact that unqualified tagged literals are reserved for Clojure 😛

2021-01-05T01:50:10.362600Z

Say I'm doing: (mapv #(rand-int 10) (range 100))

2021-01-05T01:50:57.363100Z

I guess I can do: (mapv #(do % (rand-int 10)) (range 100)) hum... :thinking_face:. But that's the same as using fn[_]

raspasov 2021-01-05T02:01:43.363600Z

Perhaps not what you’re looking for but:

(vec (repeatedly 100 #(rand-int 10)))
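For reference, the options mentioned so far can be put side by side. This is only a sketch, assuming JVM Clojure; the var names (`failure`, `with-fn`, `with-do`, `with-repeatedly`) are made up for illustration and are not from the chat.

```clojure
;; #(rand-int 10) takes no arguments, so mapv's one-argument call fails on the JVM.
(def failure
  (try
    (mapv #(rand-int 10) (range 100))
    (catch clojure.lang.ArityException _ :arity-exception)))

;; Explicitly ignore the element with fn and the conventional _ name.
(def with-fn (mapv (fn [_] (rand-int 10)) (range 100)))

;; Mention % inside do so that #() becomes a one-argument function.
(def with-do (mapv #(do % (rand-int 10)) (range 100)))

;; Skip the input collection entirely when only the count matters.
(def with-repeatedly (vec (repeatedly 100 #(rand-int 10))))
```

All three working variants produce a vector of 100 random integers below 10; they differ only in how visibly they say "the input is ignored".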

2021-01-05T02:11:59.363800Z

Good answer 😛, but ya I was just using this as an example. I kind of often stumble on scenarios where I don't care about the input, and would like a short way to wrap my thing in a side-effect. Like when I use agents for example, sometimes I don't actually care what the current agent value is.

raspasov 2021-01-05T02:26:54.364200Z

Right… yea I don’t know of a way to ignore it in those cases; accidentally, on ClojureScript your example works, but that’s very much by accident because of the different way JS works 🙂

👍 1
seancorfield 2021-01-05T03:08:33.364500Z

@didibus (mapv #(#_% rand-int 10) (range 100))

seancorfield 2021-01-05T03:12:11.364700Z

You can use it to ignore multiple anonymous arguments too:

user=> (mapv #(#_%1 #_%2 rand-int 100) (range 10) (range 10))
[47 57 65 16 15 28 56 72 10 82]

seancorfield 2021-01-05T03:18:18.365200Z

(I know this is a bit of a repeat, but I wanted the channel to see it, in a single coherent response 🙂 )

2021-01-05T03:42:28.366100Z

Wow, neat, haha, that's the exact kind of trick I was looking for.

2021-01-05T03:44:17.366300Z

Ya, I'm a bit worried about this maybe being too dependent on accidental implementation details of the Clojure reader though

2021-01-05T03:44:45.366600Z

Like it seems the reader first processes the forms with #(), and then with #_

2021-01-05T03:44:55.366800Z

But is this order guaranteed?

2021-01-05T03:45:50.367Z

Nice, it works in different orders too: (mapv #(rand-int 10 #_%) (range 100)) which I find a bit more readable

p-himik 2021-01-05T04:02:57.368100Z

Wouldn't simple fn be even more readable?

☝️ 3
dpsutton 2021-01-05T04:05:47.368500Z

that's super neat and crazy and phenomenal to know, but if this is for real code and not playing around, that is absolutely not the trick you are looking for 🙂

2021-01-05T04:07:17.368700Z

Well, yes and no. Yes cause it's ugly to put #_% at the end, and most people might be thrown off by it. But no, for the same reason Rich Hickey added `#()` in the first place :stuck_out_tongue:

seancorfield 2021-01-05T04:10:14.368900Z

Re: different orders -- yeah, it's not going to make any difference where the ignored arg form is. I originally stuck it in the middle: #(rand-int #_% 10)

2021-01-05T04:10:35.369100Z

Well, I don't fully know why he did, but for me, there's something visually nice about the parenthesis not repeating.

2021-01-05T04:11:07.369300Z

Looks like it works in ClojureScript as well

2021-01-05T04:11:27.369500Z

But not in Babashka 😞

seancorfield 2021-01-05T04:11:49.369700Z

File a GitHub issue! @borkdude will be thrilled by this weirdness 🙂

2021-01-05T04:12:10.369900Z

Haha, he does like that kind of stuff

2021-01-05T04:14:56.370100Z

Yes, I've still not decided if I'm that kind of madman or not haha

dpsutton 2021-01-05T04:18:14.370300Z

its 100% madman. and anyone reading it will be quite confused

2021-01-05T04:18:23.370500Z

But I will def use it when messing around at the REPL

seancorfield 2021-01-05T04:21:18.370700Z

I just looked over the source of the Clojure reader and I think you can rely on this behavior: the ArgReader (which processes % and %<n>) is what tracks the highest arg count in an expression, and the DiscardReader (for #_) has to read the next form in order to discard it. The ArgReader is a reader macro so it will be triggered just by reading. So the arguments are always going to be read (and tracked), even if the resulting form containing them is subsequently discarded.
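One way to see the part of this that is well defined is to ask the reader what `#()` expands to. The sketch below assumes JVM Clojure and uses made-up var names; it only covers plain `%` arguments and deliberately says nothing about the `#_` combination, which has no defined semantics according to the core team.

```clojure
;; read-string runs the same reader the REPL uses, so it shows what #() expands to.
(def no-arg  (read-string "#(rand-int 10)"))
(def one-arg (read-string "#(* 2 %)"))
(def only-%2 (read-string "#(str %2)"))

;; Each result is a (fn* [params] body) form; the second element is the
;; parameter vector, sized by the highest % argument the reader saw.
(def arities (mapv (comp count second) [no-arg one-arg only-%2]))
;; => [0 1 2]
```

Note that `#(str %2)` gets two parameters even though `%1` never appears, which is the "highest arg count" tracking described above.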

seancorfield 2021-01-05T04:24:50.370900Z

@dpsutton Have you seen my "trick" with discard forms in deps.edn so you can embed code forms that can be eval'd from an editor?

dpsutton 2021-01-05T04:26:07.371100Z

no. is it in your deps.edn repo?

seancorfield 2021-01-05T04:26:49.371300Z

No, I showed it in my RDD talk to Clojure Provo. I'll show it in my London talk too.

dpsutton 2021-01-05T04:28:41.371500Z

i couldn't make the provo one. if london is convenient for my time zone i'm gonna definitely be there

seancorfield 2021-01-05T04:28:55.371700Z

Because I use add-libs from t.d.a to add deps to a running REPL, and I don't want deps.edn to get out of sync, I add an ns with a :require of t.d.a's repl namespace and then put a repl/add-libs call between :deps and the hash map. Then with just a minor edit, you can run add-libs with all your deps, and then with a minor edit, turn it back into valid EDN.

seancorfield 2021-01-05T04:29:16.371900Z

@dpsutton January 12th, 10:30 am Pacific time.

dpsutton 2021-01-05T04:29:48.372100Z

that's not so terrible. 8:30 here in central. i'll be there with some coffee

seancorfield 2021-01-05T04:31:35.372300Z

Surely it's 12:30 Central?

2021-01-05T04:32:03.372500Z

Isn't it possible that the DiscardReader runs first, modifies the form, and then the ArgReader runs, no longer seeing the discarded code?

2021-01-05T04:32:28.372700Z

Or do reader macros all run on the same original forms?

2021-01-05T04:38:27.373500Z

Nice, it also works in Clojerl

seancorfield 2021-01-05T04:43:55.374Z

DiscardReader invokes the reader on the following form: The form to be discarded has to be read. The ArgReader is how the % forms are read. Could the code be changed so that any reader side-effects could also be erased by the DiscardReader? I guess it could but it would be a fair bit of effort to do it correctly without breaking anything I suspect. That said, I'm sure Alex and Rich would say, absolutely don't do this 🙂

seancorfield 2021-01-05T04:46:49.374400Z

@p-himik Yeah, if ever I find myself reaching for %2 I take a step back and think about readability!

p-himik 2021-01-05T04:48:49.374600Z

I guess it goes to show how little excitement I have in my life, but seeing #_% within a # function riles me up just as much as reading political news does. And honestly, I'm a bit surprised by that myself. :)

2021-01-05T04:54:52.374900Z

Ya, but:

(let [agents (repeatedly 5 #(agent []))]
  (run! #(call-api client) agents))
So now its between:
(let [agents (repeatedly 5 #(agent []))]
  (run! #(call-api client #_%) agents))
and:
(let [agents (repeatedly 5 #(agent []))]
  (run! (fn [_] (call-api client)) agents))

p-himik 2021-01-05T04:57:45.375300Z

Maybe I'm just waking up, but how does having agents affect anything in your code? It seems like it can be just

(dotimes [_ 5]
  (call-api client))

2021-01-05T04:58:03.375500Z

Oh sorry

p-himik 2021-01-05T04:58:03.375600Z

(and I absolutely without a shadow of a doubt prefer the fn version)

2021-01-05T04:58:53.375900Z

Ya, bad example, well. I can't remember what the situation was, but when you do lots of side effects inside loops, there are a few times it comes in handy

2021-01-05T05:01:07.376500Z

Oh, now I remember, it was:

(send-off agnt #(call-api client))

p-himik 2021-01-05T05:04:06.376700Z

"Handy" does not mean "worth it". :) So many more things come in handy in other languages - and all of those are almost exclusively the reason why I stopped using them. Or rather, over-reliance on such things by colleagues and overall language community. Oh, I just figured out why I feel so strongly about #_% - it brings back the memories of having to write "clever" C++ code that heavily relied on macros, templates, and quirks of a particular version of MSVC.

2021-01-05T05:12:35.376900Z

I don't disagree, but some things are handy and worth it. Now this particular one, I don't think is worth it, cause it does still seem like an accident that it works.

👍 1
seancorfield 2021-01-05T05:12:43.377100Z

Hahaha... Ah, yes, that brings back memories of being on the ANSI C++ Committee for eight years and having several discussions with the MS rep about VC++

😄 1
seancorfield 2021-01-05T05:13:24.377300Z

(send-off agnt (fn [_] (call-api client))) ?
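A runnable version of that suggestion might look like the following. `call-api`, `api-calls` and `:demo-client` are hypothetical stand-ins (the real `call-api` and `client` were never shown in the chat), so treat this as a sketch of the shape only.

```clojure
;; Hypothetical stand-in for the call-api function mentioned in the chat.
(def api-calls (atom 0))

(defn call-api [client]
  (swap! api-calls inc)
  {:client client :status :ok})

(def agnt (agent nil))

;; send-off calls the function with the agent's current state;
;; (fn [_] ...) names that argument _ and ignores it.
(send-off agnt (fn [_] (call-api :demo-client)))

;; Block until the action dispatched above has run.
(await agnt)
```

After `await` returns, the agent's state is whatever `call-api` returned, so `@agnt` holds the response map.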

2021-01-05T05:14:43.377500Z

That said, in clojure, not all scenarios have the same level of needing to be worthy. Sometimes I code Clojure on my phone for example, and on such a device, edits are really hard lol, so this is a nice trick to know. Same thing, sometimes I do things on a command line REPL with terrible read-line support, so you can't move the cursor back, you have to delete everything, etc. So I can see scenarios where this is useful

2021-01-05T05:19:17.377700Z

But... honestly this syntax is growing on me, I feel some things like this as well are a matter of idiom, people could pretty easily get used to it. #(rand-int 1 #_%) If I read it as: "call rand-int with arg 1 and discard passed in argument" it's not that bad actually. A bit like how using _ is an idiom when you discard an arg in fn

2021-01-05T05:20:31.378Z

I won't send a PR with it though I swear 😋

dpsutton 2021-01-05T05:36:19.378700Z

You are totally correct. Had that backwards. Thanks

emccue 2021-01-05T07:53:27.379200Z

This feels like clojure behavior that needs to be silently smothered in 1.11 before anyone realizes it exists

zendevil 2021-01-05T15:27:21.389300Z

I have the following code that I’m trying to access a webpage with on the route about/something, but I’m getting a 405 error:

(ns humboiserver.routes.home
  (:require
   [humboiserver.layout :as layout]
   [clojure.java.io :as io]
   [humboiserver.middleware :as middleware]
   [ring.util.response]
   [ring.util.http-response :as response]))

(defn home-page [request]
  (layout/render request "home.html" {:docs (-> "docs/docs.md" io/resource slurp)}))

(defn about-page [request]
  (layout/render request "about.html"))

(defn home-routes []
  [""
   {:middleware [middleware/wrap-csrf
                 middleware/wrap-formats]}
   ["/" home-page]
   ["/about"
    ["/something"
     (ring.util.response/response {:something "something else"})]]])
Home page is rendered, but I expect to see the map returned on accessing localhost:3000/about/something. How to fix this error?

2021-01-05T15:36:32.390600Z

@ps pardon my ignorance, but I can't tell from that snippet what your router is - you have a function that returns a data structure that is clearly meant to describe routes, but no indication of what program is using that structure

zendevil 2021-01-05T15:38:18.391200Z

reitit

2021-01-05T15:38:20.391400Z

405 indicates that the request method is wrong, but nothing in your route description indicates what methods are valid

2021-01-05T15:39:37.392200Z

@ps the reitit examples I see don't use data structure nesting for child routes, they use it for enumerating request methods

2021-01-05T15:40:18.393Z

they would have ["about/something" ...] and [about/something-else] as separate entries

zendevil 2021-01-05T15:40:47.393400Z

but it works if I have a layout/render with that route

2021-01-05T15:41:08.393900Z

OK - I'll let someone that knows reitit help

zendevil 2021-01-05T15:43:59.394400Z

I was thinking that it had something to do with incorrectly using ring.util.response

2021-01-05T15:45:44.395600Z

I would be very surprised if that caused a 405, a 405 has a precise meaning, and to me that points to giving something else where reitit thinks it's getting data describing request methods

zendevil 2021-01-05T15:48:47.396200Z

what should I do to diagnose and fix this?

zendevil 2021-01-05T15:50:28.396800Z

i think that the response map has to be wrapped in something, but I don’t know what specifically

lukasz 2021-01-05T15:54:31.397500Z

Just a guess - but your home-page and about-page need to return {:status 200 :body <html>}

2021-01-05T15:54:35.397800Z

well, the other route is taking as an argument a function that takes a request and renders a response

2021-01-05T15:54:51.398200Z

your broken route is rendering a response inline with the data, before seeing a request

2021-01-05T15:55:39.398900Z

@lukaszkorecki that's what ring.util.response/response is doing - it doesn't do much else actually

zendevil 2021-01-05T15:56:12.399800Z

I made it an anonymous function

2021-01-05T15:56:25.400300Z

@ps a hunch - it doesn't error since a response map is a callable, but it just fubars when it gets passed a request

zendevil 2021-01-05T15:56:41.400800Z

but that gives wrong number of arguments

2021-01-05T15:56:52.401200Z

@ps well it should take a request

lukasz 2021-01-05T15:57:00.401500Z

@noisesmith Right, but the root route ("/") is using home-page function directly, in the snippet, ring.util is used in only one place. That said, I'm just guessing here - not sure what that router is

2021-01-05T15:57:12.401700Z

it's reitit

2021-01-05T15:57:48.402500Z

ring is very smart about coercing results, the error here is happening on the layer of route dispatch

zendevil 2021-01-05T15:59:14.403200Z

it works when wrapped in (fn [req] …)

2021-01-05T15:59:29.403500Z

that's what I'd expect, cheers :D
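The difference between "a response" and "a function that returns a response" can be shown without any router at all. This is a plain-Clojure sketch: the `response` map below is written out by hand in the shape Ring responses take, `request` is a made-up example, and nothing here reproduces how reitit itself interprets route data.

```clojure
;; A Ring-style response map, built by hand for illustration.
(def response
  {:status 200 :headers {} :body {:something "something else"}})

;; A made-up request map.
(def request {:request-method :get :uri "/about/something"})

;; A map is callable, but calling it is a key lookup, not request handling:
;; the request is not a key in the response map, so the result is nil.
(def map-as-handler-result (response request))

;; A handler is a function from request to response.
(def handler (fn [_request] response))
(def handler-result (handler request))
```

In the route table from the question, the working entries pass functions such as `home-page`, while the failing entry passed the already-built map.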

souenzzo 2021-01-05T16:24:57.410200Z

Hello. My task is to process an indefinitely long reader. My first approach was a simple loop/recur, but then I had the idea of implementing it as a lazy-seq. I have some questions:
1. read-all uses non-tail-call recursion. Can it run into a "stack overflow" problem?
2. Will read-all cause some GC issue? Just clean nodes at end or something like that?
3. Is there any advantage to the loop/recur approach?

(letfn [(read-all
          [rdr]
          ;; [clojure.data.json :as json]
          (let [v (json/read rdr
                             :eof-error? false
                             :eof-value rdr)]
            (when-not (identical? rdr v)
              (cons v (lazy-seq
                        (read-all rdr))))))
        (proc-all-loop [rdr]
          (loop []
            (let [v (json/read rdr
                               :eof-error? false
                               :eof-value rdr)]
              (when-not (identical? rdr v)
                (my-proc v)
                (recur)))))
        (proc-all-lazy [rdr]
          (run! my-proc (read-all rdr)))]
  ;; which is "better"
  (proc-all-loop *in*)
  (proc-all-lazy *in*))

2021-01-05T16:28:23.411600Z

@souenzzo the classic problem with laziness is resource usage, here you can't really know when to close the reader / the stream the reader is built on

2021-01-05T16:28:35.411900Z

(in the lazy version that is)

2021-01-05T16:29:07.412600Z

if your intent is to eagerly consume, and you throw away the produced values (via run!), I don't know why you are using laziness

👍 1
emccue 2021-01-05T16:30:55.413600Z

proc-all-loop

emccue 2021-01-05T16:31:37.414200Z

semantics are clear, no laziness other than waiting on the reader

2021-01-05T16:32:06.415Z

lazy-seqs don't cause stack overflow unless you nest large numbers of unrealized lazy transforms, which is caused by mixing lazy and eager code sloppily

2021-01-05T16:33:46.416100Z

(usually - I mean you could have (->> coll (map a) (map b) (map c) ...) until the stack blows up but your code would be a huge mess before that happened...)

emccue 2021-01-05T16:38:19.417100Z

If you want to be able to handle the whole thing in sequence, you can make an IReduceInit from the reader that will be invalid when the reader is closed

emccue 2021-01-05T16:38:58.417900Z

Since your semantics really aren't the same as a lazy-seq - 32 elements at a time will block and maybe deadlock your program

emccue 2021-01-05T16:39:39.419100Z

or, sorry

2021-01-05T16:39:43.419400Z

right, but using lazy-seq directly won't impose that chunking

emccue 2021-01-05T16:39:45.419500Z

just an iterator

emccue 2021-01-05T16:39:54.419700Z

oh it wont?

emccue 2021-01-05T16:40:00.420Z

nvm ignore me

2021-01-05T16:40:28.420700Z

other ops that take multiple collections could take that lazy-seq and return a chunking one, but that's more convoluted

2021-01-05T16:41:08.421600Z

@emccue and the root point is a good one - lazy-seqs are bad for situations where realizing an element blocks or changes the state of some resource

emccue 2021-01-05T16:47:24.424100Z

(defn reducible-json-rdr [rdr]
  (reify IReduceInit
    (reduce [f start]
      (let [v (json/read rdr :eof-error? false :eof-value rdr)]
        (if (identical? rdr v)
          start
          (recur f (f start v))))))

emccue 2021-01-05T16:47:48.424600Z

^ just because it always feels neat to write out

dpsutton 2021-01-05T16:48:08.424900Z

IReduceInit needs an init value

emccue 2021-01-05T16:49:01.425500Z

public interface IReduceInit{
Object reduce(IFn f, Object start) ;
}

dpsutton 2021-01-05T16:49:22.425800Z

(reduce [_ f start] ...)

emccue 2021-01-05T16:49:34.426Z

oh yeah this

emccue 2021-01-05T16:50:18.426500Z

(defn reducible-json-rdr [rdr]
  (reify IReduceInit
    (reduce [_ f start]
      (loop [value start]
        (let [v (json/read rdr :eof-error? false :eof-value rdr)]
          (if (identical? rdr v)
            value
            (recur (f value v))))))))

souenzzo 2021-01-05T16:56:50.427400Z

it's the simple loop/recur wrapped on reify reduceinit wrapped on a function

ghadi 2021-01-05T18:24:39.430100Z

Don’t forget IReduceInit implementations need reduced? handling
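A sketch of what that handling could look like follows. To stay dependency-free it reads lines from a `BufferedReader` instead of calling `json/read`, so `reducible-lines` is a hypothetical cousin of `reducible-json-rdr`, not the code from the chat; the loop structure is the part that matters.

```clojure
(import '(clojure.lang IReduceInit)
        '(java.io BufferedReader StringReader))

(defn reducible-lines
  "Hypothetical line-based variant of reducible-json-rdr that honours reduced."
  [^BufferedReader rdr]
  (reify IReduceInit
    (reduce [_ f start]
      (loop [acc start]
        (if (reduced? acc)
          ;; The reducing function asked to stop: unwrap and stop reading.
          @acc
          (if-some [line (.readLine rdr)]
            (recur (f acc line))
            ;; End of input: return what has been accumulated.
            acc))))))

;; take wraps its result in reduced after two items, so the third line stays unread.
(def source (BufferedReader. (StringReader. "a\nb\nc")))
(def first-two (into [] (take 2) (reducible-lines source)))
(def leftover (.readLine source))
```

Without the `reduced?` check, early-terminating transducers such as `take` would keep pulling from the reader and hand back a still-wrapped value.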

Célio 2021-01-05T19:36:07.431200Z

Hi all. I need some sort of bounded queue that automatically drops elements older than X (seconds, minutes, hours, whatever). In general terms, every time I add a new element to the queue, I want it to remove elements that don’t satisfy a certain predicate. I was trying to accomplish that using sorted-set and disj but I’m not sure if this is optimal, something roughly like this:

(let [queue (sorted-set 5 3 4 1 2 9 6 8 7 0)]
  (println queue)
  (println (apply disj queue (take-while #(< % 3) queue))))
The console output is this:
#{0 1 2 3 4 5 6 7 8 9}
#{3 4 5 6 7 8 9}
So what’s the best approach to this problem? Is there anything like that available in Clojure?

2021-01-05T19:43:49.431700Z

There are many many options for this kind of thing, but you may need to work on your requirements to figure out what you actually want

2021-01-05T19:44:22.431900Z

e.g. doing stuff on a time limit is much easier than doing stuff for an arbitrary function

2021-01-05T19:45:06.432100Z

do you need immutable data structures? is this building some kind of cache? etc

Célio 2021-01-05T19:48:37.432300Z

@hiredman Imagine an in-memory collection of maps, each containing a :timestamp field whose value is a ZonedDateTime. Every time a new element (ie: a new map) is added to the collection, I need to remove all elements older than, say, 24 hours.

2021-01-05T19:49:24.432500Z

that is a bounded cache with ttl eviction
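For the evict-on-add behavior described above, a minimal sketch with a persistent queue is shown here. It assumes entries arrive in chronological order (so only the front of the queue can be expired), and the names `evict-expired`, `add-entry` and `empty-queue` are invented for illustration; it is not a replacement for a real cache library.

```clojure
(import '(java.time Duration ZonedDateTime))

(def empty-queue clojure.lang.PersistentQueue/EMPTY)

(defn evict-expired
  "Pops entries from the front of queue while their :timestamp is before cutoff."
  [queue ^ZonedDateTime cutoff]
  (loop [q queue]
    (let [oldest (peek q)]
      (if (and oldest (.isBefore ^ZonedDateTime (:timestamp oldest) cutoff))
        (recur (pop q))
        q))))

(defn add-entry
  "Adds entry at the back, first evicting entries older than ttl
  relative to the new entry's :timestamp."
  [queue ^Duration ttl entry]
  (let [cutoff (.minus ^ZonedDateTime (:timestamp entry) ttl)]
    (conj (evict-expired queue cutoff) entry)))

;; Usage: a 25-hour-old entry is dropped once an entry stamped "now" is added.
(def now (ZonedDateTime/now))
(def ttl (Duration/ofHours 24))
(def queue
  (-> empty-queue
      (add-entry ttl {:id 1 :timestamp (.minusHours now 25)})
      (add-entry ttl {:id 2 :timestamp (.minusHours now 1)})
      (add-entry ttl {:id 3 :timestamp now})))
(def remaining-ids (mapv :id queue))
;; => [2 3]
```

Because eviction only happens in `add-entry`, a queue that receives no new entries keeps its stale ones, which is the trade-off raised later in the thread.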

Célio 2021-01-05T19:50:22.432900Z

Correct (Thanks, I was also looking for the terminology 🙂)

dpsutton 2021-01-05T19:51:03.433100Z

strange to see the ttl requirement enforced solely on addition and not retrieval

Célio 2021-01-05T19:52:44.433400Z

@dpsutton In my case the eviction on retrieval would be a nice bonus.

dpsutton 2021-01-05T19:52:58.433700Z

seems a necessity

Célio 2021-01-05T19:53:24.433900Z

For my purposes, not strictly necessary.

dpsutton 2021-01-05T19:53:28.434100Z

if you don't add anything for 7 days, everything is evicted, but if that's only enforced on addition you'll get bad data

2021-01-05T19:53:45.434300Z

the reason arbitrary function vs. time matters is you can build an index based on a known field, but not on an arbitrary function

Célio 2021-01-05T19:54:59.434500Z

@dpsutton Not a problem for my application.

👍 1
dpsutton 2021-01-05T19:56:18.435100Z

then i think you can use clojure.core.cache. the caches let you get a seq or iterator of the underlying hashmap that keeps the values. and when getting the iterator or seq, the cache invalidation is not respected (ie could have expired things in there)

Célio 2021-01-05T20:01:34.436Z

Thanks @dpsutton I think that’s what I need.

dpsutton 2021-01-05T20:01:54.436400Z

i think they are composable. ie, you can wrap a ttl around a bounded queue one.

Célio 2021-01-05T20:05:02.438300Z

that’s awesome

unbalanced 2021-01-05T22:08:53.441300Z

Does anyone have any recommended best practices for doing remote REPL work in a sensitive data environment (HR/accounting etc)? Permissions, policies, technologies, ACL? I think auditability, monitoring, and permissions are the primary concerns here.

2021-01-05T22:10:30.441500Z

a good baseline is ssh access, with the same user as the app runs under - and don't provide access to anyone you wouldn't provide a root shell on that machine to

👍 2
2021-01-05T22:11:19.441800Z

I think going finer grained would just be a mess - there's too many ways to get permissions in a jvm, and no way to truly hide data once you have vm access

unbalanced 2021-01-05T22:11:58.442Z

How about something like auditability or monitoring of REPL sessions?

unbalanced 2021-01-05T22:12:04.442200Z

Ever worked with anything like that?

2021-01-05T22:12:05.442400Z

(by ssh access, I mean tunneling on an ssh connection, and the standard logging of ssh access)

unbalanced 2021-01-05T22:12:20.442600Z

would ssh access log remote REPL stuff?

2021-01-05T22:12:36.442800Z

beyond the layer logging of when connection happens, I think there's too many ways to undermine it

unbalanced 2021-01-05T22:12:59.443Z

true 😕

2021-01-05T22:13:29.443200Z

of course you could take clojure.main and make a logged version, then make a policy of "always use the logged repl"

unbalanced 2021-01-05T22:13:52.443400Z

interesting, I didn't know that was a possibility. Makes total sense!

2021-01-05T22:14:19.443600Z

but that's not very easy to enforce - it's so easy to get a repl once you have a connection

unbalanced 2021-01-05T22:14:36.443800Z

true :thinking-face:

2021-01-05T22:14:54.444Z

@goomba yeah, at the root REPL is just a loop, you could say "only use this specific repl", and then check the logs, but there's still some layer of honor system there surely

unbalanced 2021-01-05T22:15:21.444200Z

do you suppose most folks just use the honor system? I mean, surely someone out there uses the REPL on sensitive systems

dpsutton 2021-01-05T22:15:48.444400Z

if its nrepl you could have a middleware that logs all messages back and forth

2021-01-05T22:15:57.444600Z

sure - I guess I'm no expert, I'm just reasoning first principles on what one can do once you have a repl for the most part

2021-01-05T22:16:18.444800Z

@dpsutton right, sure - that's easy to attach to any repl, the hard part is actually enforcing that that repl is used and not modified

2021-01-05T22:16:41.445Z

clojure.main is not a lot of code, a logged version is an afternoon project at most
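As a rough idea of how small such a thing can be, here is a sketch that reuses `clojure.main/repl` with a custom `:eval` hook rather than copying clojure.main. `audit-log`, `logged-eval` and `logged-repl` are invented names, and an in-memory atom stands in for whatever durable log a real deployment would need. As discussed above, this records only what goes through this particular REPL and does nothing to stop someone from starting an unlogged one.

```clojure
(require '[clojure.main :as main])

;; In-memory audit trail; a real setup would presumably write somewhere durable.
(def audit-log (atom []))

(defn logged-eval
  "Records the form and a timestamp, then evaluates it as the default REPL would."
  [form]
  (swap! audit-log conj {:at (java.util.Date.) :form form})
  (eval form))

(defn logged-repl
  "Starts a clojure.main REPL on *in* and *out* whose evaluations are logged."
  []
  (main/repl :eval logged-eval))
```

Driving it from a string instead of a terminal, `(with-in-str "(+ 1 2)" (logged-repl))` prints the usual prompt and `3`, and leaves one entry for `(+ 1 2)` in `audit-log`.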

dpsutton 2021-01-05T22:17:27.445200Z

remote repl sounds like something else manages the process. you have whatever repls it exposes. seems like a logging nrepl server exposed on that would be the easiest

🔥 1
2021-01-05T22:17:58.445400Z

I guess there's always "some things are logged, if it looks like you are doing something shady you better have a good explanation", but that's nearly implicit on remote hardware

dpsutton 2021-01-05T22:18:06.445600Z

if its socket repl it might be even easier to have the functions spit their ins and outs to a file. dunno. just thinking of ease of use tooling wise. nrepl is pretty standard to work with

2021-01-05T22:18:15.445800Z

@dpsutton that kind of sandboxing is fragile and illusory

2021-01-05T22:18:20.446Z

with clojure that is

2021-01-05T22:19:24.446300Z

"something else manages the process" - until you run a single line of code that starts a new unlogged repl

2021-01-05T22:20:15.446500Z

for example, someone on #clojure IRC found a one liner that turned the number 3 into the number 5

unbalanced 2021-01-05T22:20:28.446700Z

no way! 😮

unbalanced 2021-01-05T22:20:38.446900Z

it's like that Jimi Hendrix song

2021-01-05T22:20:42.447100Z

of course that didn't affect cached unboxed values

2021-01-05T22:21:03.447300Z

but otherwise, the cached Long instance of 3, was changed to now contain 5

2021-01-05T22:21:15.447500Z

it was remarkable that some things didn't break lol

Ben Sless 2021-01-05T22:21:16.447700Z

Why expose a remote repl to begin with? Can it be avoided? If not, what about exposing a sci session instead?

😮 1
2021-01-05T22:21:30.447900Z

sci session?

2021-01-05T22:21:44.448100Z

remote repls are great

Ben Sless 2021-01-05T22:21:44.448300Z

Small Clojure Interpreter

Ben Sless 2021-01-05T22:22:01.448500Z

That way you can't modify the running program

2021-01-05T22:22:17.448700Z

but you have to trust whoever has access to the repl

💯 3
2021-01-05T22:22:18.448900Z

that's a big assertion to be making

Ben Sless 2021-01-05T22:22:23.449100Z

They are, but with great power and all. They're a security nightmare

2021-01-05T22:22:27.449300Z

people can modify running C programs lol

unbalanced 2021-01-05T22:22:53.449600Z

I think "trust but verify" would be an acceptable solution (in my particular situation... not dealing with nuclear missiles or anything)

Ben Sless 2021-01-05T22:23:37.450Z

Well, sci is a better sandbox than just giving someone repl access. You can control which functions are exposed, for example

unbalanced 2021-01-05T22:23:46.450200Z

@ben.sless I'd take sci over nothing! As long as there's a jdbc connector

2021-01-05T22:23:59.450400Z

I would be so annoyed with sci

Ben Sless 2021-01-05T22:24:22.450600Z

There is everything you would expose to the sandboxed environment

2021-01-05T22:24:28.450800Z

the crazy stuff I've done with a repl over the years just would not be possible

unbalanced 2021-01-05T22:25:02.451Z

(that sounds like it should be its own channel... #hiredmanstories 😄 )

Ben Sless 2021-01-05T22:25:16.451300Z

It's a compromise. For every crazy creative programmer you have a stumbling newbie who can break production

unbalanced 2021-01-05T22:26:02.451500Z

So does everyone either just roll the dice or not provide a production REPL...?

Ben Sless 2021-01-05T22:26:41.451700Z

We don't expose production repls

unbalanced 2021-01-05T22:26:51.451900Z

Even for data analysis?

2021-01-05T22:26:52.452100Z

"oh, this doesn't have as much instrumentation as I would like, but I want to monitor it when it gets first deployed" -> write a program to connect the remote repl, execute code that reflective walks some objects and pulls out numbers and sends it back so I can stick it in a locally running graphite

dpsutton 2021-01-05T22:27:56.452300Z

would love to read that blog post

2021-01-05T22:28:16.452500Z

oh, we lost a bunch of customer data and need an error recovery process -> write it up, stick it in a (company controlled) pastebin, use pssh to load it into the repl of every server via slurp

👀 1
😮 2
Ben Sless 2021-01-05T22:28:19.452700Z

A lot can be achieved with running locally with production data. Another compromise is running in a staging environment which mirrors or reads production data but can't change anything

2021-01-05T22:29:39.453200Z

but yeah, repl access is the keys to the kingdom, so if you can't trust people then don't give it to them

unbalanced 2021-01-05T22:29:54.453400Z

Yeah, fair enough. I guess that's the bottom line

2021-01-05T22:30:23.453600Z

I think this is all dev-trust complete. The most sensitive thing (liability wise in particular) is the customer data. Either a dev can be trusted to access it responsibly or not, the rest introduces a lot of work and frustration with little established benefit.

unbalanced 2021-01-05T22:30:31.453800Z

So I guess I need to figure out a way to find a small, read-only, sand castle kingdom 😛

2021-01-05T22:31:10.454Z

@goomba there's a lot you can do with configurable loggers plus an environment where a repl can process those logs

2021-01-05T22:31:36.454200Z

then security obfuscation / monitoring can be introduced as a middleware - you have a lot more control

2021-01-05T22:32:20.454400Z

puts me in mind of https://twitter.com/QuinnyPig/status/1346339906902130689

😂 1
unbalanced 2021-01-05T22:32:30.454700Z

@noisesmith so you're saying something like, use a REPL to consume/transform obfuscated logs (as opposed to directly consuming data?)

2021-01-05T22:35:05.455100Z

some of our production servers don't always run a repl server now, you have to restart them with a special flag to turn on a repl, and even that bums me out

😢 1
2021-01-05T22:36:34.455400Z

@goomba right, logs as data (or even db entries used as if they were logs, ordered by timestamp), plus a separate process (not the production app) to consume and manipulate that data

2021-01-05T22:37:27.455600Z

for example all sensitive customer info (everything identifiable) can be UUIDs / numeric ids, whatever pointing to a separate table you shouldn't need for dev

2021-01-05T22:38:07.455800Z

you can debug app logic / data flow using the id to id correlation without exposing anything especially important (at least not in an easy to extract way...)
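
A minimal sketch of the id-substitution idea described above, assuming events are plain Clojure maps. The function name `pseudonymize`, the `:email` key, and the in-memory lookup map are hypothetical stand-ins for illustration; in practice the lookup would live in the separate table that devs should not need.

```clojure
(defn pseudonymize
  "Replaces the values under sensitive-keys in event with opaque UUIDs.
  Returns [scrubbed-event lookup], where lookup maps each original value
  to its UUID so that a repeated value always gets the same id."
  [lookup sensitive-keys event]
  (reduce (fn [[ev lk] k]
            (if (contains? ev k)
              (let [v  (get ev k)
                    ;; reuse the existing id for this value, else mint one
                    id (get lk v (java.util.UUID/randomUUID))]
                [(assoc ev k id) (assoc lk v id)])
              [ev lk]))
          [event lookup]
          sensitive-keys))

;; Two events for the same (made-up) customer end up with the same id,
;; so data flow can still be correlated without seeing the address.
(def scrubbed
  (let [[e1 lk] (pseudonymize {} [:email] {:email "a@example.com" :action :login})
        [e2 _]  (pseudonymize lk [:email] {:email "a@example.com" :action :purchase})]
    [e1 e2]))
```

The scrubbed events keep their non-sensitive fields untouched, which is what makes the id to id correlation usable for debugging app logic.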

unbalanced 2021-01-05T22:41:48.456Z

Yeah makes sense. Hopefully someone listens!

2021-01-05T23:00:20.458300Z

apropos of repls, for some reason a while ago I wanted the feature I think a number of clojure ide kind of environments have, where you can get a repl running in the context of some other running code, I didn't have that feature so I wrote this code to "throw" a repl over tap https://gist.github.com/a2630ea6153d06840a2723d5b2c9698c

vlaaad 2021-01-06T09:17:20.471800Z

Not sure what this REPL does, can you explain?

vlaaad 2021-01-06T09:22:39.472Z

Ah, so it blocks until repl is tapped, interesting! I'm still not sure what's the purpose of tapping it and waiting on tapped REPL...

vlaaad 2021-01-06T09:23:04.472200Z

Why not just start a repl with lexical eval?

2021-01-05T23:00:40.458500Z

it is kind of neat

seancorfield 2021-01-05T23:07:03.458600Z

hiredman has said most of what I was going to say -- yes, we run socket REPLs in several of our production processes; yes, we ssh tunnel into production and connect a client to production (I connect VS Code on my desktop to production sometimes 🙂 ); since we AOT-compile our uberjars with direct-linking enabled, there's a limit to what we can redefine dynamically -- except in a couple of legacy processes that load Clojure from source at runtime (for reasons) and those can be live-patched all day long. Perhaps one of the most important considerations here is that any process that runs Clojure can be told to start a REPL via JVM properties at startup -- no code is needed inside the Clojure codebase, so anyone who has access to how a Clojure-based process is (re)started can enable a socket REPL in it, and then you have unlimited access, assuming you can get network access to that socket!
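
For reference, a sketch of what the JVM-property route looks like next to its programmatic equivalent in `clojure.core.server`. The port number 5555, the server name "repl", and the `app.jar` file name are arbitrary choices for illustration, not anything from the setup described above.

```clojure
;; Launch-time form: a system property, no application code involved.
;;   java -Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}" -jar app.jar

(require '[clojure.core.server :as server])

;; Programmatic equivalent of the property above.
;; :address is omitted, so the server binds to the loopback address by default.
(def repl-socket
  (server/start-server {:name   "repl"
                        :port   5555
                        :accept 'clojure.core.server/repl}))

;; (server/stop-server "repl") shuts the listener down again.
```

Because the default bind address is loopback, reaching the port from another machine requires something like the ssh tunnel mentioned in the message.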

alexmiller 2021-01-05T23:11:23.458800Z

if I understand what that does, that's cool

alexmiller 2021-01-05T23:11:37.459Z

you might be able to use http://clojure.github.io/clojure/clojure.core-api.html#clojure.core/PrintWriter-on too, not sure

2021-01-05T23:19:12.459300Z

you put a call to start-repl somewhere in your code, then run your code, then in a repl connected to the same process (if your code runs in another thread you can use the same repl that you used to run your code) you call wait-for-repl, and once execution hits start-repl the repl where wait-for-repl is running is taken over and inputs and outputs are forwarded to and from a repl running where the call to start-repl is

2021-01-05T23:20:58.459500Z

yeah, PrintWriter-on looks handy, I need to remember it next time I write one of these
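
A small sketch of how `PrintWriter-on` behaves, assuming Clojure 1.10 or later. It is not the code from the gist; the `chunks` atom is a hypothetical sink standing in for whatever would carry the output to the other side (a tap, a socket, and so on).

```clojure
;; Each flush hands the text written since the previous flush to flush-fn.
(def chunks (atom []))

(def w
  (PrintWriter-on (fn [s] (swap! chunks conj s)) ; flush-fn: receives a string
                  nil))                          ; close-fn: nil, nothing extra on .close

;; Anything printed while *out* is rebound ends up in the sink.
(binding [*out* w]
  (println "hello from the other side")
  (flush))

(def captured (apply str @chunks))
```

The point of interest for a forwarded repl is that output arrives as strings in flush-sized pieces, which are easy to ship as data.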

borkdude 2021-01-05T23:21:42.459700Z

Reminds me a bit of https://github.com/technomancy/limit-break (although it's different)

alexmiller 2021-01-05T23:21:51.460100Z

well it was born out of doing this kind of stuff from prepl :)

2021-01-05T23:21:51.460300Z

I'm no ssh expert, but I'd be surprised if there was no way to log what is transferred through the ssh connection

2021-01-05T23:22:11.460500Z

At the very least, ssh should have access logs.

2021-01-05T23:28:27.460700Z

The thing is, the REPL won't give you more power than the SSH itself. Once I'm SSHed in, I can simply replace the service with another one, change the class files or source files, I can read the computer memory, steal the credential files, etc.

2021-01-05T23:28:42.460900Z

Well root ssh

2021-01-05T23:28:59.461100Z

So if that's allowed, the REPL through SSH isn't any riskier

2021-01-05T23:30:58.461300Z

You could argue the data to steal is made more obscure without a REPL, but :man-shrugging:

2021-01-05T23:35:49.461500Z

@didibus sure, but once I am in a jvm with a clojure process I can open up any method I find convenient to communicate - I'm not limited to the repl I first connected to

2021-01-05T23:36:21.461700Z

access logs are a great start, and maybe even logging what comes across the wire in that first connection - just don't pretend it's especially limiting

2021-01-05T23:37:03.461900Z

Maybe I explained myself wrong. I meant, once ssh with sudo is compromised, you're f***ed REPL or no REPL.

2021-01-05T23:37:42.462100Z

So if your company allows ssh with sudo, and they deem they have secured that to allow it, the REPL doesn't add to the threat vector

2021-01-05T23:37:55.462300Z

right - I don't think sudo / root access is a given (we can and should drop app privileges when running)

2021-01-05T23:38:24.462500Z

but you have at least the privs of the process running the jvm, if you can repl in that jvm

2021-01-05T23:39:28.462700Z

and if there are operating systems in production without local privilege escalations, they aren't used often

2021-01-05T23:41:00.462900Z

I thought most places gave dev ssh sudo access to prod hosts

2021-01-05T23:41:42.463100Z

So what I mean is, if you are already granted that permission, they trust you with a lot, the REPL doesn't let you do more things than ssh + sudo

2021-01-05T23:41:51.463300Z

So I don't see why they'd be against it

2021-01-05T23:42:21.463500Z

Now, if you only get ssh with some restricted user permissions, and those permissions are less than the user of the JVM, that's different

2021-01-05T23:42:58.463700Z

But as long as your ssh user has the same or more permissions as the user of the JVM, the REPL does not expose more things to you, it's just a nicer UX

2021-01-05T23:46:03.463900Z

For example, your DB credentials are going to be stored in some file which the user of the JVM has permission to read, so I can easily ssh, read the file, get the creds, ssh tunnel my SQL Workbench and connect to your DB

seancorfield 2021-01-05T23:46:43.464100Z

If you have ssh access and no permissions, you can still tunnel to the server and connect to a socket REPL.

seancorfield 2021-01-05T23:47:17.464300Z

The socket connection on the loopback isn't restricted to just certain user accounts.

seancorfield 2021-01-05T23:48:25.464500Z

We tunnel in via a low-privilege user and the JVM runs under a separate user to which that tunneling account has pretty much no access, yet it can still connect to the REPL's port.

seancorfield 2021-01-05T23:48:53.464700Z

So a socket REPL is more access than just what ssh allows, in that respect.

2021-01-05T23:52:46.464900Z

Yes, when your ssh user has less permissions. I'm saying, if your InfoSec department lets you have SSH access with equal or more permissions than your JVM user, then the REPL isn't doing anything worse.

2021-01-05T23:54:28.465200Z

I don't know if that's the case for OP, but they should check. If they are already allowed to SSH with a user of similar or more permissions to the user running their app, then they shouldn't need to do anything more to "secure" the use of the REPL, since all data that can be accessed by the REPL, and all commands the REPL can execute on the machine can also be accessed and executed through other means.

2021-01-05T23:55:56.465400Z

Otherwise, something I've done in the past is that our app does not run with a REPL open. Instead, you ssh into the host, and you start a second instance of your app with a REPL in it, that second app instance is thus launched with your ssh user, and is restricted to those permissions, then you can REPL into that.

2021-01-05T23:56:35.465700Z

We also do this to protect ourselves from accidentally reloading some buggy or broken code and causing prod issues