Hi everyone, what's your favorite library for working with tabular data (column/row slicing, joining, grouping, etc.)? For context, I am working with a lot of time-series data that I would like to join, group, and filter on dates.
Additionally, if you like a dplyr-style API, there is [tablecloth](https://github.com/scicloj/tablecloth), which is very cool.
@ronny463 this is a great walkthrough of TC with many examples from data.table and dplyr: https://scicloj.github.io/tablecloth/index.html#Introduction
thank you @jsa-aerial! Yeah, I checked out tech.ml.dataset and tablecloth and thought they looked interesting. I found it strange that most of the tablecloth features weren't already in dataset, so it kind of turned me away from those libraries. How have you found your experience with dataset so far?
I'll check out Zulip, thank you for the recommendation!
@ronny463 It is great - nothing else really compares. Most of the tablecloth features are already in TMD (tech.ml.dataset). TC is mostly a thin layer that wraps TMD in a dplyr-like API. TMD is extremely fast and scalable: https://github.com/zero-one-group/geni/blob/develop/docs/simple_performance_benchmark.md#results
@ronny463 @jsa-aerial it seems like great timing for bringing the time-series aspect into the story. tech.ml.dataset does already have good support for time-typed columns, but additional layers for time-series indexing, processing, and analysis are still missing there, afaik.
It would be great to learn from this use case and use it to push the stack forward and add some of the missing pieces (but as @jsa-aerial suggested, it may be better to bring that discussion to Zulip).
great, thanks for the feedback everyone! I'll move the convo to Zulip 🙂
@ronny463 You could also consider just using xsv for stuff like this: https://github.com/BurntSushi/xsv