Does anyone have an idea how you could detect that *ns*
has changed? Or more like, that we left the current namespace?
add-watch only works on root bindings
nREPL would send back the value of *ns*
on every response, if I recall correctly
Looks very complicated :thinking_face:
Does anyone here have experience parameterizing a UUID string?
I am not able to get the following query to work in next-jdbc:
["SELECT * FROM sessions WHERE (session_id = ?)" "1594d14d-91f7-4c8e-88cb-b06ca14ada0f"]
It throws the following error:
ERROR in (logout) (QueryExecutorImpl.java:2433)
Uncaught exception, not in assertion.
expected: nil
actual: org.postgresql.util.PSQLException: ERROR: syntax error at or near "WHERE"
Position: 18
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse (QueryExecutorImpl.java:2433)
org.postgresql.core.v3.QueryExecutorImpl.processResults (QueryExecutorImpl.java:2178)
org.postgresql.core.v3.QueryExecutorImpl.execute (QueryExecutorImpl.java:306)
org.postgresql.jdbc.PgStatement.executeInternal (PgStatement.java:441)
org.postgresql.jdbc.PgStatement.execute (PgStatement.java:365)
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags (PgPreparedStatement.java:155)
org.postgresql.jdbc.PgPreparedStatement.execute (PgPreparedStatement.java:144)
I think
["SELECT * FROM sessions WHERE (session_id = ?)" (UUID/fromString "1594d14d-91f7-4c8e-88cb-b06ca14ada0f")]
should work. After importing java.util.UUID
of course.
Curious why we need to change a string to a uuid before querying against a string column. session_id is a varchar in the db.
Trying the above line I get the following error:
Uncaught exception, not in assertion.
expected: nil
actual: org.postgresql.util.PSQLException: ERROR: operator does not exist: character varying = uuid
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Position: 42
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse (QueryExecutorImpl.java:2433)
Ah, it is a varchar? I thought it was a UUID in Postgres; if it isn't, then it is just a normal String parameter, isn't it?
It is strange that it says it is a syntax error. Does it work if you pass in the full SQL statement without the param? i.e. ["SELECT * FROM sessions WHERE session_id = '1594d14d-91f7-4c8e-88cb-b06ca14ada0f'"]
@suren Log all the PG statements (via the server config, not via CLJ) and see what the actual statement is that gets executed.
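For reference, a minimal sketch of enabling that server-side statement logging (assuming you have superuser access; log_statement is the standard PostgreSQL setting):

```sql
-- Log every statement the server actually executes.
-- Requires superuser; takes effect after a config reload.
ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();
```

The statements then show up in the server log (where exactly depends on your log_destination / logging_collector settings).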
Hey guys. In this code:
(defn factors [n]
(filter #(zero? (mod n %)) (range 1 (inc n))))
(defn prime? [n]
(= (factors n) [1 n]))
(def all-primes
(filter prime? (range)))
if you do (time (take 10000 all-primes))
the reported time will print instantly. Why is that, although take executes the first n all-primes? I am familiar with the fact that take returns a lazy sequence, but is there a better way to calculate the execution time? Do I have to do for example (last (take n all-primes))?
take is lazy.
And filter as well.
To realize the lazy seq, you can do doall.
user=> (do (time (filter odd? (range 1000000))) nil)
"Elapsed time: 0.014845 msecs"
nil
user=> (do (time (doall (filter odd? (range 1000000)))) nil)
"Elapsed time: 45.723745 msecs"
nil
do with nil in there just so that the REPL does not realize the seq by printing it all out.
or just use dorun instead of doall?
doall would work, but I would use either (last (take n)) or (first (drop n)), since doall holds the seq in memory and you only want the time.
(time (dorun (take 10000 all-primes)))
dorun is better still :)
Thanks, I keep forgetting about it. :)
but in practice you would probably want the nth prime ... dorun is good for just timing it
Yeah, dorun or doall is what I was actually looking for. Thanks everyone :)
just for completeness (time (nth all-primes 10000))
does the job as well and it is shorter 🙂
One more question, if you do
(defn pinc [n] (println "*") (inc n))
(def arr (map pinc (range)))
(take 10 arr)
The last line will indeed print * 10 times (so the function will execute on the first n elements). Why does time not report that?
It is a lazy seq again that you get back. (take 10 arr) returns the head of a seq. That gets passed to time when you do (time (take 10 arr)). time prints the elapsed time (a few milliseconds) and returns the head of the seq to the REPL.
The REPL prints it out and that's when it gets evaluated.
if you do something larger again, e.g. (time (take 10000 arr)), it might be more obvious that time prints out the Elapsed time first and then the arr is printed out.
Oh, now I get it. Thank you so much! 🙂
no worries, I'm glad I'm making some sense 🙂
@sztamas yup it works if I pass it within the sql statement.
@suren Sorry, was AFK. Next step would be to try it with ["SELECT * FROM sessions WHERE session_id = ?" "1594d14d-91f7-4c8e-88cb-b06ca14ada0f"]
.
If it fails can we see the code that gets executed and the table definition?
Also, what @p-himik suggested (getting the logs of the queries received) makes a lot of sense if it is feasible to get to the DB somehow.
I have a seq of foos that I want to process sequentially, and for each foo I might generate zero, one or “a few” bars. The way I’d usually do this is with a (mapcat identity (for …)) just because I like the way for looks, but I’m not so wild about mapcat identity (and the issues discussed at http://chouser.n01se.net/apply-concat/)… Is this a transducer-shaped hole?
@orestis I think so:
user=> (into [] (comp (map (fn [x] [x x])) cat) [1 2 3])
[1 1 2 2 3 3]
mapcat can also create a transducer.
user=> (into [] (mapcat (fn [x] [x x])) [1 2 3])
[1 1 2 2 3 3]
So I guess as a follow-up: is it common for internal APIs to implement business logic by returning a transducer? In my case, the logic is all in the function that takes a foo and returns some bars. There will be multiple of those functions (probably a multimethod), but I guess I am a little bit concerned about the plumbing since they need to be a bit efficient. I guess what I’m asking is: is it better to return a transducer that can be directly applied to a seq, or a function that can only work with the correct plumbing?
what do you mean by "concerned about plumbing"? do you mean tied to particular transducing contexts, or coupled with the shape of the input or something?
Mainly sources and sinks, laziness vs eagerness etc.
@orestis You can also consider an API that returns an eduction which in loosey-goosey terms is a transducer coupled with its input source
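A tiny sketch of that shape (the function name and logic here are made up for illustration, not from any library): an API fn returns an eduction, i.e. the source coupled with its transducer, and the caller decides how to consume it:

```clojure
;; Hypothetical API fn: returns an eduction instead of a realized coll.
(defn even-squares
  "Squares of the even numbers in xs, as an eduction (not yet realized)."
  [xs]
  (eduction (filter even?) (map #(* % %)) xs))

;; The caller can layer on more transforms before consuming;
;; everything runs in a single pass when `into` reduces it.
(into [] (take 2) (even-squares (range 10)))
;; => [0 4]
```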
this was a good excuse to look a bit at next.jdbc
source code, but I don't think anything "special" is happening in there that might cause this error, so I suspect there is something wrong with that SQL and param vector from your initial post.
Anyways, you could do the following to see the SQL statement that will be sent to Postgresql for execution:
(import org.postgresql.jdbc.PgPreparedStatement)
(def ps (.prepareStatement (.getConnection YOUR_DATASOURCE) "SELECT * FROM sessions WHERE (session_id = ?)"))
(.setObject ps 1 "1594d14d-91f7-4c8e-88cb-b06ca14ada0f")
(str ps)
"SELECT * FROM sessions WHERE (session_id = '1594d14d-91f7-4c8e-88cb-b06ca14ada0f')"
Can eduction be further composed into bigger transducers?
Like can you add more transforms to it after the fact?
yes
I see, so they seem the closest thing to lazy-seq.
I'd say it depends, are they going to process the returned collection further? If so, and you care about performance, probably best you return a transducer or an eduction.
Just in case - what PostgreSQL ends up doing can be infinitely complex. It all depends on your schema. E.g. you can have a bazillion INSTEAD OF rules and some triggers that end up trying to execute broken SQL that has absolutely nothing to do with the original query.
What makes me especially suspicious is that the error says Position: 18. But the 18th character of that query is in the middle of the sessions word. It doesn't make any sense.
I just want to make Clojure fans aware that in #startup-in-a-month you will be able to follow @andyfry01 picking up Clojure for his project of starting 12 apps in 12 months. As I understand things, he is not completely new to functional programming, but there sure will be Clojure things he needs to figure out. So maybe consider being part of his line of support and helping him create as many success stories for Clojure as possible over the following year? ❤️
I guess the semantics I’m debating is whether the API itself should be concerned with transforming a single element or transforming streams/sequences
Anyone know any good examples of eduction being used like this? I feel like it's the part of transducers i never fully "got". I've re-read the official transducers page and the clojuredocs page for eduction a few times but it's not quite clicking
couldn't you do both with a separate arity in the same way clojure core does?
@jjttjj we use a couple of functions that return an eduction as the result from a sql query. you can do transformations on this, but only until you consume the eduction will the SQL query be executed and the results streamed
I've implemented things as transducers, and then just have another arity that just adds an xs arg and does (into [] (this-fn xf-args) xs) for easier use at the REPL with sequences
by transformations you mean like (map my-fn eduction-result), right? like the regular sequence functions?
yes, good points!
anyways, the code above does about what next.jdbc is doing, so if the SELECT string looks legit (and I don't see why it wouldn't) the OP will know that the problem isn't in next.jdbc as originally suspected
@jjttjj that’s an interesting approach, I’ll have a think about it. I need a REPL to try things out :) but it helps discussing here in the abstract too
A random observation (or two): I’m converting an important CI (bash) script, which mainly contains a series of small inline awk scripts, to Clojure (babashka). It is such bliss being equipped with a REPL! :clojure-spin: Interestingly, the LOC count more than doubles. I hadn’t expected that.
Use fewer newlines?
As concise as Clojure is, hard to beat awk
Indeed. Back in the day I wrote whole systems in awk. Actually, that is a thing I like about Clojure: it somehow reminds me of when I was so productive with awk.
I don’t mind the extra lines at all. For the record. 😃
No, I think with a further transduce:
(into [] (map my-fn) eduction-result)
though this is what I'm not sure of: whether this will create a single transducer pass over the contained eduction collection or if it will be two passes
Until someone puts awk inside Clojure 😛
@didibus eductions don't do passes until you consume them. you can compose them with other transducers and they will be "merged"
think of an eduction just being a pair of input collection and a transducer
Hey guys, I am seeing strange behavior with tolitius/mount. Is anyone here an expert?
That would be lovely, I think. Crazy, but lovely.
Ok, as long as they are further merged prior to looping
I also wasn't sure if they'd be merged in all context. Like maybe you can:
(->> (eduction (map inc) [1 2 3 4])
(eduction (filter even?)))
But can you:
(->> (eduction (map inc) [1 2 3 4])
(into [] (filter even?)))
@doubleagent it's probably more productive to just post the question - I've solved people's mount problems in the past based on clojure knowledge despite never having used mount
awk is an incredible tool when you have an awk-shaped problem
I would normally do that but taking a different approach since I can't post the code.
And both perform a loop fusion?
#mount ?
This is one such problem, plus the fact that I tend to shape my problems like that. However, I am so happy for babashka helping me to write it in Clojure that I can hardly word it. The requirements list for this pipeline used to stress me out, but now I feel how I am smiling looking at it.
There's probably some utils you could build up to imitate a more awk like workflow, in terms of the read line, process, repeat
that would be a cool library
yes
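A quick REPL sketch (mine, not from the thread) suggesting both compositions work and agree:

```clojure
;; Wrap an eduction in another eduction...
(->> (eduction (map inc) [1 2 3 4])
     (eduction (filter even?))
     (into []))
;; => [2 4]

;; ...or feed it straight into a transducing context like `into`.
;; The eduction is reduced directly, so no intermediate collection
;; is built between the two transforms.
(->> (eduction (map inc) [1 2 3 4])
     (into [] (filter even?)))
;; => [2 4]
```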
can you give a clojure (pseudo-code) example of what you would normally do in awk? I am lacking awk knowledge
I fixed the issue but I don't have any explanation for why the issue was happening in the first place.
when faced with this decision (which pops up from time to time) I tend to choose vanilla functions. My reasoning is - which of these is more agnostic and reusable?
(defn logic1 []
(fn []
...))
(defn logic2 []
(map (fn []
...)))
(defn logic3 [xs]
(map (fn []
...)
xs))
(fn, transducer and call to a coll-processing function, respectively)
#mount for details
In a way, a vanilla fn is a superset of the other two options. By picking the vanilla fn, I can choose laziness or a transducer, map or filter, à la carte
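To make that concrete (a sketch with a made-up shout fn standing in for the logic): starting from the vanilla fn, the other two variants fall out mechanically:

```clojure
(require '[clojure.string :as str])

;; logic as a vanilla fn over one element (the "superset" option)
(defn shout [s] (str/upper-case s))

;; transducer variant: just wrap it in `map`
(def shout-xf (map shout))

;; coll-processing variant: choose laziness or transduction at the call site
(map shout ["a" "b"])         ; lazy seq => ("A" "B")
(into [] shout-xf ["a" "b"])  ; eager, transduced => ["A" "B"]
```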
Neato, used like this eduction is pretty nice actually. Never really thought of using it.
@sztamas @p-himik You guys are right. The issue was with one of my queries. Apparently I was focusing on the wrong query. The query below works fine.
(db/query! (sql "select *"
"from sessions s"
"inner join users on users.id = s.user_id"
(where {:s.session_id session_id})))
The issue was with following query
(db/query! (sql "UPDATE sessions"
(set {:idle_timeout 0})
(where {:session-id session-id})))
Instead of set I should have used set-values as per this doc here: https://github.com/ludbek/sql-compose#set-values
It's funny cos I wrote that little package.
Anyways thanks guys. As suggested, watching the db log was helpful.
So, off the top of my head, when I think about what awk is… It has this pattern -> action structure that it applies to all rows of the input. The pattern is a predicate of sorts, having access to the current row as input. The action is code executed when the pattern matches. It also has access to the current row as input. Both the pattern and the action “automatically” have the current row split up into columns. Both also have access to any state set up by actions. Patterns can be a special FROM, TO variant, which I think of as two patterns: it will match from when FROM is true until TO is true. (Both the pattern and the action have syntactic convenience access to the current row, but I don’t think that is what makes so much difference.) Pseudocode… I have to think about it a bit…
Sounds like an Advent of Code problem. I bet there's people who did AOC in awk ;)
Haha, yes, Some of the problems would lend themselves to awk. It is a quite full featured language as such, but a bit like early javascript, where sourcing in existing code bases is not really solved. (But there is cat
and stuff in the shell that bridges this to some extent.)
If you have experience from Perl you might know about some awk idioms/powers.
Here’s an example which displays some of what I tried to described, and also highlights some things I forgot to mention. It’s from https://stackoverflow.com/a/33110779/44639 and an answer to a question about how to implement tail
in awk:
awk -v tail=10 '
{
output[NR % tail] = $0
}
END {
if(NR < tail) {
i = 0
} else {
i = NR
}
do {
i = (i + 1) % tail;
print output[i]
} while (i != NR % tail)
}'
The things it shows that I forgot to mention are that the default pattern is to match all rows, and that there are special patterns, BEGIN and END, that match before and after the input. I somehow think that in a Clojure context BEGIN/END is not so important, but I might be overlooking something.
> should print the last 10 lines of a text file
cat README.md | bb -io -e '(drop (- (count *input*) 10) *input*)'
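For what it's worth, take-last does the same thing a bit more directly (though, like count, it has to realize the whole input first), so something like bb -io -e '(take-last 10 *input*)' should be equivalent:

```clojure
;; take-last keeps only the final n elements (after realizing the input)
(take-last 3 (range 10))
;; => (7 8 9)
```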
Indeed. Babashka has a lot of the awk feeling in my book. Even if awk works on each row at a time, so slurping up the whole input is not usually what you do.
*input* is lazy, but getting the count means that it will be realized fully. How does awk do this?
AWK has three phases:
> do something at the beginning
> for each line: <process-line>
> do something at the end
And you can set variables in the begin phase, set them in the process phase, and use them in the end phase.
Yeah. The awk script gets called before the input, once for each input row (where “row” is defined by a separator regex) , and then after the input.
So how does it know the amount of lines at the start, it reads the entire file probably?
It can’t know the amount of lines in the start. This is known only in the action matching the END pattern.
So what does:
{
output[NR % tail] = $0
}
do?
tail is the variable fed to this particular script. NR keeps track of the current row number. (Which then in the END pattern is the total number of lines.)
output is an array. An associative array as it happens, because all arrays in awk are like that.
ah, ok, so it assigns the NR mod tail-th element to the current line, so in the end you will have the last 10 lines in the array
Exactly.
Print a random line from a file:
awk 'rand() * NR < 1 { line = $0 } END { print line }' file.name
rand() * NR < 1 is a pattern causing line to be set to the current row ($0).
This seems to do what awk does then, sort of:
$ cat README.md | bb -io '(first (reduce (fn [[tail nr] line] [(assoc tail (mod nr 10) line) (inc nr)]) [(vec (repeat 10 nil)) 0] *input*))'
Print a random line:
$ cat README.md | bb -io '(rand-nth *input*)'
but yeah, I get awk syntax/semantics more now, thanks, TIL
Cool. The awk script for printing a random line uses a Knuth trick to not realize the whole file. So a reduce might be in order again. Maybe a reduce context is what we can imagine any awk script to “live” in.
Not realize the whole file at once that is.
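That Knuth trick is reservoir sampling with a reservoir of one, and it does translate directly into a reduce over the lines (my sketch; with line-seq over a reader it never holds the whole file):

```clojure
;; One-pass uniform random choice, same idea as awk's
;; `rand() * NR < 1 { line = $0 }`: the nth row replaces the
;; kept line with probability 1/n.
(defn random-line [lines]
  (:line (reduce (fn [{:keys [nr] :as st} line]
                   (let [nr (inc nr)]
                     (cond-> (assoc st :nr nr)
                       (< (rand) (/ 1.0 nr)) (assoc :line line))))
                 {:nr 0 :line nil}
                 lines)))

(random-line ["alpha" "beta" "gamma"])
;; returns one of the three lines, uniformly at random
```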