observability

o11y, monitoring, logging, tracing, alerting and higher level discussions
plexus 2020-07-17T08:44:27.107900Z

Anyone know of a good alternative to librato? I used to use their free plan off and on whenever I wanted to collect some ad-hoc metrics as it was just so easy. Create an API token, and start sending in measurements with curl. But it seems they've gotten rid of their free plan.

plexus 2020-07-17T08:45:33.108900Z

self-hosted open source stuff is also welcome if it's trivial to set up, or there's a hosted version that doesn't break the bank and is easy to cancel again

plexus 2020-07-17T08:46:28.109400Z

I guess I should look into Graphite or Prometheus but I'm afraid it's going to turn into a rabbit hole

2020-07-17T09:01:26.114600Z

This question can be a rabbit hole in itself I guess 😅 If you are looking for a hosted solution with a free plan, I would definitely have a look at https://www.honeycomb.io/ It's simple to get started and it's more than just metrics. I find having only metrics (as in numbers) restricting pretty quickly, especially at small scale where you can afford to keep more data around. But if you really want only metrics, we have quite a few Prometheus instances running and it's basically zero maintenance to run Prometheus plus Grafana. When going down the Prometheus route, you just need to be sure up front that a pull-based solution is actually what you want.

plexus 2020-07-17T09:02:34.115700Z

I spun up a local Graphite through docker. The UI is very ugly but apart from that it seems to be exactly what I wanted, just feed in numbers and put them on a graph

plexus 2020-07-17T09:03:43.116600Z

context is that I need to collect cpu and memory usage from firefox to help track down a performance issue, so it's really just a local ad-hoc short term thing where I need a little bit of visibility

2020-07-17T09:06:15.119100Z

For machine metrics I only have experience with the Prometheus node_exporter, which does a good job at this. It's straightforward to set up. If you really want to just query your data and not build dashboards or alerts, then you also don't need Grafana. The Prometheus built-in web interface is good enough for that. But I am sure there are other solutions that are potentially even easier to set up than node_exporter+prometheus. I am happy to learn more 🙂

2020-07-17T09:08:54.119200Z

Just to clarify: Setting up node_exporter+prometheus would mean running two binaries and you are done. But there might be other tools that do this even simpler 🙂
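
For reference, the "two binaries" setup can be sketched roughly as follows. This is a minimal sketch, not the exact setup used above: it assumes node_exporter is listening on its default port 9100, and the function name, job name and 15s scrape interval are arbitrary choices for illustration.

```shell
# Print a minimal Prometheus config that scrapes a local node_exporter.
# Assumptions: node_exporter on its default port 9100; the job name "node"
# and the 15s interval are illustrative, not taken from the conversation.
minimal_prometheus_config() {
  cat <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
EOF
}

# Possible usage, with both binaries unpacked in the current directory:
#   minimal_prometheus_config > prometheus.yml
#   ./node_exporter &
#   ./prometheus --config.file=prometheus.yml
```

With both processes running, the built-in Prometheus web interface should be reachable on port 9090 by default.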

plexus 2020-07-17T09:10:48.120900Z

it's all trivial once you know how to do it 🙂 sometimes the steps are easy but finding the relevant bits in the docs is not. In this case I needed two snippets.

docker run \
 --name graphite \
 --restart=always \
 -p 1080:80 \
 -p 2003-2004:2003-2004 \
 -p 2023-2024:2023-2024 \
 -p 8125:8125/udp \
 -p 8126:8126 \
 graphiteapp/graphite-statsd

echo "$metric_name $metric_value $(date +%s)" | nc -q 0 localhost 2003

went to the web interface and I could see my data
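
The second snippet uses Graphite's plaintext protocol: one line per data point, in the form "path value unix-timestamp", sent to port 2003. A small wrapper might look like the sketch below. The function names and the GRAPHITE_HOST / GRAPHITE_PORT variables are hypothetical, and `nc -q 0` is carried over from the snippet above; some netcat variants lack `-q` and need a different flag to close the connection.

```shell
# Format one data point in Graphite's plaintext protocol:
#   <metric path> <value> <unix timestamp>
# $1 = metric path, $2 = value, $3 = optional timestamp (defaults to now).
graphite_line() {
  printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Send one data point to the plaintext port published by the container.
# GRAPHITE_HOST and GRAPHITE_PORT are hypothetical overrides for illustration.
send_metric() {
  graphite_line "$@" | nc -q 0 "${GRAPHITE_HOST:-localhost}" "${GRAPHITE_PORT:-2003}"
}
```

On Linux, something like `while true; do send_metric local.test.load "$(cut -d' ' -f1 /proc/loadavg)"; sleep 10; done` would then feed in one value every ten seconds.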

2020-07-17T09:13:57.121100Z

Nice! So Graphite has some machine metrics already built-in?

plexus 2020-07-17T09:27:41.121300Z

you just feed it numbers, I hacked up some shell functions to gather up memory/cpu
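
The actual shell functions were not posted, so the following is only a guess at what such helpers could look like. It assumes a Linux system with procps `ps` (for `-C`), and the metric names are made up. Note that `ps` reports %cpu averaged over each process's lifetime, not an instantaneous reading, and depending on the Firefox version some child processes may run under a different name and be missed by `-C firefox`.

```shell
# Sum "rss pcpu" rows from stdin into one line: "<total RSS in KiB> <total %CPU>".
# LC_ALL=C keeps the decimal separator a dot, which Graphite expects.
sum_rss_cpu() {
  LC_ALL=C awk '{ rss += $1; cpu += $2 } END { printf "%d %.1f\n", rss, cpu }'
}

# Totals across all processes whose command name is "firefox".
# Assumption: procps ps on Linux; prints "0 0.0" if nothing matches.
firefox_usage() {
  ps -C firefox -o rss=,pcpu= | sum_rss_cpu
}

# Emit two Graphite plaintext lines (hypothetical metric names).
firefox_metrics() {
  ts=$(date +%s)
  firefox_usage | {
    read -r rss cpu
    printf 'firefox.memory.rss_kb %s %s\n' "$rss" "$ts"
    printf 'firefox.cpu.percent %s %s\n' "$cpu" "$ts"
  }
}
```

Piping the result into the Graphite port, for example `firefox_metrics | nc -q 0 localhost 2003` in a loop, would put both series on a graph.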

plexus 2020-07-17T09:28:20.121500Z

I'm sure there are ready made scripts/daemons/agents that collect specific things, but that's already more than I needed

2020-07-17T09:30:32.121700Z

Ah I see 🙂 That's why I mentioned node_exporter as a thing to use in addition to prometheus. But if you are looking for very specific things you might be quicker doing your own thing than understanding the tons of metrics node_exporter is gathering for you 😅

lukasz 2020-07-17T14:46:46.122700Z

@plexus me and my team have been really happy with Grafana Cloud - very reasonable pricing for both logs and metrics, you can use either Graphite or Prometheus as the metrics backend and Loki is pretty decent for log aggregation

lukasz 2020-07-17T14:47:40.123200Z

and for running locally I always use this image: graphiteapp/graphite-statsd