I'd like to use babashka to process a file line by line reading from standard input, but I could not produce the processed lines on the standard output. I've simplified my code to the following to figure out the problem:
(ns convert.drop-bart-and-uppercase)
(defn clean-location
  [x]
  x)
(defn clean [lines]
  (->> lines
       (map clean-location)))
(clean *in*)
and use the following to execute:
cat samples.dat | bb -i -o -f ../convert/src/convert/drop_bart_and_uppercase.clj
and here is samples.dat:
Time=Thu Oct 1 15:27:15 PDT 2020, Value=75.7, Location=L16-tempmon.xxx.yyy, Device_Type=tempSensor, Value_Type=Temperature
Time=Thu Oct 1 15:28:12 PDT 2020, Value=91.4, Location=a40-tc-ups, Device_Type=UPS, Value_Type=Temperature
To my understanding, it should output the same two lines unchanged, but I see nothing.
Please help me to figure out my problem. Thanks!
------------
I found the following command works as expected:
< samples.dat bb -io '(load-file "/home/yshen/data/temperature-data-archive/convert/src/convert/drop_bart_and_uppercase.clj") (convert.drop-bart-and-uppercase/clean *input*)'
but I find it too clumsy to load the file and then call the function
@yubrshen In the first piece of code you use *in* (not *input*), which is not a seq of lines, but just the stdin stream from Clojure.
What is the idiomatic way to process each line of input with Babashka?
@yubrshen You can use *input*, but this is honestly more for one-liners on the command line.
For scripts you might want to use:
$ ls | bb -e "(first (line-seq (io/reader *in*)))"
"CHANGELOG.md"
io/reader is coming from clojure.java.io
ok :)
Finally, this is what works for my need.
< samples.dat bb -i -o '(->> *input* (map (fn [line] (clojure.string/replace-first line #"Location=([^.,]+)[^,]+" #(str "Location=" (clojure.string/upper-case (last %1)))))))'
I can use user/*input* inside my script file to access stdin as a list of lines, but I have not figured out how to output lines to stdout from inside my script.
The above one-liner works, but it's getting hard to maintain. Is there an equivalent mechanism that lets babashka output lines to stdout from a script?
I can improve the readability, but only by leaving the Clojure ecosystem:
#!/usr/bin/env bash
< $1 bb -i -o '(->> *input*
(map (fn [line]
(clojure.string/replace-first line
#"Location=([^.,]+)[^,]+" #(str "Location="
(clojure.string/upper-case (last %1)))))))'
Is there a better approach that keeps my code mostly in the Clojure development environment?
@yubrshen I'm not sure what you mean. You can put this code in a file and that should work? https://clojurians.slack.com/archives/CLX41ASCS/p1603213831030700?thread_ts=1603176112.006000&cid=CLX41ASCS
Without babashka in/output flags:
(ns my-script
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))
(defn lines []
  (line-seq (io/reader *in*)))
(->> (lines)
     (map
      (fn [line]
        (str/replace-first line #"Location=([^.,]+)[^,]+"
                           #(str "Location=" (str/upper-case (last %1))))))
     (run! println))
This also works with Clojure on the JVM
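One way to keep this testable from a namespace is to split the per-line transformation from the stdin wrapper. The following is only a sketch: `upper-case-location` and `convert-stdin!` are hypothetical names, and stdin is simulated with `with-in-str`. It also assumes that a location without a dot should be kept whole. With the trailing `[^,]+` of the pattern above, `a40-tc-ups` backtracks by one character and comes out as `A40-TC-UP`, so the sketch uses `[^,]*` instead.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn upper-case-location
  "Hypothetical helper: upper-cases the first label of Location= and drops
  anything after the first dot (up to the next comma)."
  [line]
  (str/replace-first line
                     ;; [^,]* (not [^,]+) so dot-free locations stay intact
                     #"Location=([^.,]+)[^,]*"
                     #(str "Location=" (str/upper-case (last %)))))

(defn convert-stdin!
  "Reads lines from *in*, converts each one, prints the result to *out*."
  []
  (->> (line-seq (io/reader *in*))
       (map upper-case-location)
       (run! println)))

;; Simulated stdin, using shortened versions of the two sample lines.
(def demo-output
  (with-out-str
    (with-in-str (str "Value=75.7, Location=L16-tempmon.xxx.yyy, Device_Type=tempSensor\n"
                      "Value=91.4, Location=a40-tc-ups, Device_Type=UPS\n")
      (convert-stdin!))))

(print demo-output)
```

In a real script the last two forms would be replaced by a plain `(convert-stdin!)` call, and the pure function could be covered by clojure.test without touching stdin.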
Yes, exactly. This is what I was looking for: a script that runs with both Clojure and Babashka. Thanks a million!
:thumbsup:
How's this for passing options to the nifty $ macro?
user=> (def sw (java.io.StringWriter.))
#'user/sw
user=> (-> ($ ls -la Dockerfile) ^{:out sw} ($ cat) check :exit)
0
user=> (str sw)
"-rw-r--r--@ 1 borkdude staff 729 Oct 15 17:25 Dockerfile\n"
I don't use metadata that much so I'm not sure how to read it 😬
The metadata preceding the ($ ...) form are the options for that form
So in this case the metadata is attached to the return value of ($ ls -la Dockerfile) and used by ($ cat)?
no, the metadata is only attached to ($ cat), this is how metadata works.
it is the same as writing (process '[cat] {:out sw})
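To make the mechanism concrete: reader metadata written before a list is attached to that list form, and a macro can read it from `&form` at expansion time. A toy sketch in plain Clojure, where `show-opts` is a hypothetical macro and not part of babashka.process:

```clojure
(defmacro show-opts
  "Hypothetical macro: expands to the :out entry of the metadata that the
  reader attached to this call form. The arguments are ignored."
  [& _args]
  ;; &form is the unevaluated call; the reader also adds :line/:column,
  ;; so only the key of interest is selected.
  `(quote ~(select-keys (meta &form) [:out])))

;; The metadata belongs to the (show-opts cat) form itself.
(def opts ^{:out :string} (show-opts cat))

;; A call without metadata sees no options.
(def no-opts (show-opts cat))

(prn opts no-opts)
```

The metadata never travels with a return value here; it is only visible to the macro whose call form it precedes.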
I'm not sure if my clojure knowledge is helping me here or actually making it more complex. macroexpand-1 is not helping here (:exit (check ($ ($ ls -la Dockerfile) cat)))
ah so the metadata is attached to the form and this is something https://github.com/babashka/process/commit/182d08bcdc6f75f5cb6e59fb2808a125f263dac3#diff-b5668fe5b029db7465ab915136e20ef98b6ddf9585094c7ebbf325cbf9d0fa04R167
yes :)
ok so I think there are two people who will have no issue using this. beginners and more advanced clojure users
But maybe it's just a valuable lesson about Clojure. Thank you
I have updated my mental model
@yubrshen If you change *in* to *input* in your top program, that should maybe work. If you want to get lines from stdin yourself, you can use (clojure.string/split-lines (slurp *in*)) or (line-seq (clojure.java.io/reader *in*))
ah, I see. yes. *input* is only defined in the user namespace, so you have to use user/*input* in your top program or get rid of the ns declaration
@borkdude I see. Just use user/*input*. Thanks! I may need to keep the ns declaration in order to use Clojure's test framework.
Love the $ macro, and I like the metadata use, it just took me a little bit of time to figure out how to adopt it for my use. I've run into another weird issue: when I use (check) it gets stuck if there is no error, but goes through if there is an error. I will try to reproduce it with a smaller use case and report later.
hmm ok, thanks!
https://github.com/borkdude/babashka/issues/575#issuecomment-713105955 this is pretty cool
Yeah! Please let me know about the bug. There's still time to fix before it goes into 0.2.3
Sorry, very busy this week, I will try to isolate it at some point. Just need to try it with something simpler than the aws command line. If I uncomment check above it gets stuck on success, but not on error.
what does stuck mean?
no output, like it is waiting for something
and I have ^C it
ah, this explains it. yes, check will wait for the process to exit, else it can't inspect the exit code.
so the process is maybe waiting for something?
check = deref + throw on non-zero
hmm, strange, it definitely exits w/o check
is there a way to dump stack, like SIGQUIT or something?
I've tried some https://www.graalvm.org/reference-manual/native-image/NativeImageHeapdump/ but no luck
How big is the JSON it's trying to write to stdout?
@i.slack Can you try with e.g.:
{:out (io/file "out.json")}
to see if the process is maybe waiting for stdout to be consumed?
when i added it to $ it writes out a 148k file and exits
if i put #_ in front of it, it gets stuck again
some kind of buffering thing, try maybe with big .json file?
ah so that may be it
yeah, so:
(-> (process ["cat"] {:out (io/file "/tmp/foo.csv") :in (io/file "/Users/borkdude/Downloads/1mb-test_csv.csv")}) check)
works, but if I remove :out it has nowhere to write, so cat is going to wait until it can
@i.slack A solution:
user=> (def csv (with-out-str (-> (process ["cat"] {:out *out* :in (io/file "/Users/borkdude/Downloads/1mb-test_csv.csv")}) check)))
#'user/csv
user=> (count csv)
1000448
whereas
(def csv (with-out-str (-> (process ["cat" "foo"] {:out *out*}) check)))
would give an error
I'll write a note about this in the docs
This is probably also a good option:
(def sw (java.io.StringWriter.))
(-> (process ["cat"] {:in (slurp "https://datahub.io/datahq/1mb-test/r/1mb-test.csv") :out sw}) check)
(count (str sw)) ;; 1043005
as long as it has a way to write the stream somewhere
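The underlying behaviour is not specific to babashka.process. A child process writes into an OS pipe with a limited buffer, and if nothing reads from it the child blocks, so waiting for its exit never returns. A sketch with plain Java interop, assuming a Unix-like system with `seq` on the PATH; `run-and-capture` is a hypothetical helper:

```clojure
(require '[clojure.string :as str])

;; Assumption: `seq 1 200000` prints roughly 1.3 MB, far more than a
;; typical OS pipe buffer, so an unread pipe would stall the child.
(defn run-and-capture
  "Hypothetical helper: starts cmd, drains its stdout, then waits for exit."
  [cmd]
  (let [proc (.start (ProcessBuilder. ^java.util.List cmd))
        ;; Read stdout to the end BEFORE waiting. Waiting first could block
        ;; forever once the child has filled the pipe.
        out  (slurp (.getInputStream proc))
        exit (.waitFor proc)]
    {:exit exit :out out}))

(def result (run-and-capture ["seq" "1" "200000"]))

(println (:exit result) (count (str/split-lines (:out result))))
```

Giving the process a file, a writer, or a string to write into, as in the examples above, serves the same purpose: something keeps consuming the output while the caller waits.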
maybe it would be convenient to have an :out :string for this use case
yep, that is the case, so it works w/ w-o-s
yes, :string is a good idea, since check followed by :out slurp is a common case
@i.slack Are you testing with bb or directly on the JVM?
bb from builds
that is why i could not dump stack to see where it is stuck
This should now work in the JVM lib:
(testing "output to string"
(is (string? (-> (process ["ls"] {:out :string})
check
:out))))
I'll push it to bb master
c00l, i will test it on my use case once it is built
ok, just pushed it. should be a few minutes
Should be there now. With this enhancement the following now also works:
user=> (count (-> (process ["cat"] {:in (slurp "https://datahub.io/datahq/1mb-test/r/1mb-test.csv") :out (io/file "/tmp/download.csv")}) check :out slurp))
1043005
i.e. :out contains the same value as was put in