data-science

Data science, data analysis, and machine learning in Clojure. See https://scicloj.github.io/pages/chat_streams/ for additional discussions.
2020-03-22T14:19:56.070100Z

Hi @dmarjenburgh! Welcome 🙂

2020-03-22T14:24:45.073900Z

I'll second the call for Vega-Lite/Vega for interactive visualizations. They are based on the ideas from the Grammar of Graphics, a wonderful treatise outlining a principled approach to data visualization design and composability. It's all very declarative and data-centric, and fits in really nicely with the rest of the Clojure world. I gave a talk about it and its relation to Clojure at a Seattle meetup a couple of years ago, which may provide some insight: https://www.youtube.com/watch?v=hXq5Bb40zZY

2020-03-22T14:25:36.074900Z

Another great video from the authors of Vega-Lite/Vega to help you get a better sense for the language itself: https://www.youtube.com/watch?v=9uaHRWj04D4

2020-03-22T14:29:27.079200Z

Thanks, looks interesting. Is this something that can be built in gorilla-repl, since it also uses Vega?

2020-03-22T14:30:12.080Z

Last I'll say that while you should totally use Saite if it feels like the right tool for you, I also have a library for working with Vega-Lite & Vega from Clojure: https://github.com/metasoarous/oz This includes:
• Support for creating Vega-Lite/Vega visualizations from Jupyter notebooks, via either Clojupyter or IClojure kernels
• A live-reload! function which lets you treat a Clojure file as a live-reloadable notebook, so that every time you change the file, the code is re-evaluated (starting from the first form of code that changed) and the display is updated.

❤️ 1
2020-03-22T14:31:58.080900Z

@dmarjenburgh Yeah, you should be able to use Vega-Lite from Gorilla REPL by compiling it to Vega and then plugging it into the Gorilla view machinery. I've had half a mind to add a function for doing this to Oz, alongside the other oz.notebook.* namespaces.

2020-03-22T14:32:49.081900Z

Can someone also remind me whether nextjournal has Vega-Lite/Vega support yet?

2020-03-22T14:33:15.082300Z

It's a pretty nice notebook environment with some really neat tricks up its sleeve.

2020-03-22T14:33:58.083100Z

Jupyter is definitely worth checking out if you aren't familiar with it already. It's a pretty standard Python data science tool, but it supports "kernels" which let you create notebooks in other languages, such as Clojure.

jsa-aerial 2020-03-22T18:59:44.084900Z

For Saite, you can very quickly get a sense of what you can do and how you do it by doing the following:
1. wget http://bioinformatics.bc.edu/~jsa/aerial.aerosaite-0.7.0-standalone.jar
2. java -jar aerial.aerosaite-0.7.0-standalone.jar --install (take the default location)
3. Move or link the jar to the ~/.saite install directory
4. Use mac-runserver or linux-runserver to run the server
5. Navigate to localhost:3000
6. Select Upload Document (up arrow key at top left)
7. Select SciCloj session and BosCljMeetup
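The command-line steps above (1 to 4) could be scripted roughly as follows. This is only a sketch: it assumes the default install location is ~/.saite and that the runserver scripts end up inside that directory, neither of which is spelled out above. The `run` helper and the `DRY_RUN` switch are hypothetical conveniences, not part of Saite; by default the script only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch of the Saite quick-start steps (1-4).
# By default each command is only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

JAR="aerial.aerosaite-0.7.0-standalone.jar"
JAR_URL="http://bioinformatics.bc.edu/~jsa/${JAR}"
# Assumption: the installer's default location is ~/.saite
SAITE_DIR="${HOME}/.saite"
# Assumption: the runserver scripts live in the install directory.
# Use mac-runserver instead of linux-runserver on macOS.
RUNSERVER="${SAITE_DIR}/linux-runserver"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

saite_quickstart() {
  run wget "$JAR_URL"                        # 1. download the standalone jar
  run java -jar "$JAR" --install             # 2. install, taking the default location
  run ln -s "$PWD/$JAR" "$SAITE_DIR/$JAR"    # 3. link the jar into the install directory
  run "$RUNSERVER"                           # 4. start the server
}

saite_quickstart
```

Once the server is up, steps 5 to 7 happen in the browser at localhost:3000.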

jsa-aerial 2020-03-22T19:15:24.092500Z

You will be dumped into a 'gallery' tab where you can see some of the things you can simply do. Then you can start with the Overview tab and work your way through the others to the right. The Templates tab is a tutorial on the template system and how you can use templates to abstract visualizations. In any tab you can select the Open Editor Panel to see the code that created the markdown / LaTeX / visualizations in the body. The Tabs and Gallery tabs both have walk-through tutorials in them for the results in the tab body. Other example documents are in the Test session: select cm-example-picframes to see how you can have editors (static or live) in the tab body mixed seamlessly with markdown and LaTeX; and also in Test, select bulk-vis-save to see an example of how you can simply and easily create bulk sets of visualizations and save them all with a couple of clicks.

jsa-aerial 2020-03-22T19:18:55.095300Z

If you are on Win10, you can look at the mac or linux scripts to see the java command to run the server. The Win10 bat/cmd script is not yet done, as I haven't had access to a Windows machine to do the work (yet). Another 'freebie' you get with Saite is that it will download and install the MKL libraries for Neanderthal. To try that, you can (from under Test again) load up neanderthal-et-al and walk through the document.

jsa-aerial 2020-03-22T19:28:13.098300Z

Some new things that are coming: R- and Python-aware editors (for both main editors and those in tab bodies) which will interop with Clojure(Script) via https://github.com/scicloj/clojisr and https://github.com/clj-python/libpython-clj and https://github.com/alanmarazzi/panthera. The really neat thing with this is that they all work together (via Clojure), and you can visualize the results using the same VGL/VG capabilities as with straight Clojure(Script).

😮 3
🎉 3
2020-03-23T08:00:24.100300Z

The only benchmarks I have seen are from @justalanm here: https://gitlab.com/alanmarazzi/numpy-vs-neanderthal

👍 1
2020-03-23T18:38:17.101900Z

Depends on the size of the matrix and the device (CPU vs GPU). It's best to try both and see.