Does anyone here know of a tool for plotting 'confidence regions' of 2D probability distributions?
More precisely, I'd like to draw (posterior) probability densities as 2D heat maps, with 'contour lines' delineating regions of probability mass 95%, 99% etc.
Does that make sense, and does it have a name?
I think you can achieve something similar with ggplot2: https://ggplot2.tidyverse.org/reference/geom_contour.html
Might need to do something with stat_contour to get the specific regions you’re interested in.
No idea about clj, I’m afraid
Thanks. Thinking out loud, I guess I could also find the appropriate density levels, either by numerical integration + dichotomic search, or by filling a 2D array with densities, sorting the values and searching for quantiles. Then draw the contours at the appropriate level lines.
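For reference, the "fill a 2D array, sort, search for the quantile" idea could look roughly like the sketch below. It is only a sketch under assumptions: the density can be evaluated pointwise, the grid bounds cover essentially all of the mass, and the names (`density_level_for_mass`, `std_normal_pdf`) are made up for illustration. Cells are visited from highest to lowest density and their mass is accumulated until the target is reached; the density of the last cell visited is the level at which to draw the contour.

```python
import math


def density_level_for_mass(pdf, x_range, y_range, mass=0.95, n=200):
    """Approximate the density level whose region {pdf >= level} holds `mass`.

    pdf      -- callable (x, y) -> density; assumed cheap to evaluate
    x_range  -- (min, max) grid bounds, assumed to cover nearly all the mass
    y_range  -- (min, max) grid bounds, same assumption
    n        -- number of grid cells per axis
    """
    (x0, x1), (y0, y1) = x_range, y_range
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    cell_area = dx * dy

    # Evaluate the density at the centre of every grid cell.
    values = [
        pdf(x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy)
        for i in range(n)
        for j in range(n)
    ]
    values.sort(reverse=True)

    # Normalise by the mass actually captured on the grid, so a slightly
    # truncated or unnormalised density still gives a sensible level.
    total = sum(values) * cell_area

    accumulated = 0.0
    for v in values:
        accumulated += v * cell_area
        if accumulated >= mass * total:
            return v
    return values[-1]


def std_normal_pdf(x, y):
    """Standard bivariate normal, used only as a sanity check."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)


level_95 = density_level_for_mass(std_normal_pdf, (-6.0, 6.0), (-6.0, 6.0), 0.95)
# For this particular density the exact level is (1 - 0.95) / (2 * pi).
print(level_95, 0.05 / (2.0 * math.pi))
```

The returned levels (one per target mass) can then be passed as contour breaks to whatever plotting tool draws the heat map.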
I'm also wondering about the relevance of this approach for data analysis - are there alternative approaches to choosing / viewing 2D confidence regions that make this one uninteresting?
@val_waeselynck Like so? https://vega.github.io/vega/examples/contour-plot/
This is not trivial. Contours are usually drawn from a kernel density estimate, which is typically just a Gaussian blur (in 2D) or a specific kernel function (in 1D). I don't see an easy way to estimate the inverse CDF with that approach.
@tsulej In this case, I can evaluate the density at any point, so it seems doable: https://clojurians.slack.com/archives/C0BQDEJ8M/p1574352117165200
Still, integrating over an area is much trickier than integrating over a 1D range for a symmetric distribution.
@val_waeselynck > by filling a 2D array with densities, sorting the values and searching for quantiles
To find quantiles you want to use the icdf (inverse cumulative distribution), not the pdf (density). For 2D you want to find the area that covers, say, 95% of the total volume under the density.
For distributions like the multivariate normal some numerical algorithms exist, but I suppose they can't be applied to the general case and to any distribution (especially a multidimensional empirical one).
> to find quantiles you want to use icdf (cumulative density) not pdf (density).
Yes of course, just forgot to mention it :)
> For distributions like multivariate normal some numerical algorithms exist but I suppose they can't be applied to general case and any distribution (especially multidimensional empirical)
Yes, for 2D Gaussians this can be solved analytically: once you have an eigendecomposition of the covariance matrix you're good, and even that may not be mandatory.
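To make the Gaussian case concrete, here is a minimal sketch (the function name and return convention are assumptions for illustration). For a 2D Gaussian the squared Mahalanobis distance follows a chi-square distribution with 2 degrees of freedom, whose CDF is 1 - exp(-r²/2), so the region of mass p is the ellipse with r² = -2 ln(1 - p). The eigenvalues of the covariance matrix give the squared axis scales, and a symmetric 2x2 matrix has a closed-form eigendecomposition.

```python
import math


def gaussian_confidence_ellipse(cov, mass=0.95):
    """Ellipse containing `mass` of a 2D Gaussian with covariance `cov`.

    cov -- ((sxx, sxy), (sxy, syy)), assumed symmetric positive semi-definite
    Returns (a, b, theta): semi-major axis, semi-minor axis, and the angle
    of the major axis in radians, measured from the x axis. The ellipse is
    centred on the mean of the distribution.
    """
    (sxx, sxy), (_, syy) = cov

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = 0.5 * (sxx + syy)
    spread = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = half_trace + spread
    lam_minor = max(half_trace - spread, 0.0)  # guard against rounding

    # Squared Mahalanobis radius enclosing `mass`: invert 1 - exp(-r2 / 2).
    r2 = -2.0 * math.log(1.0 - mass)

    # Orientation of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    return math.sqrt(r2 * lam_major), math.sqrt(r2 * lam_minor), theta


a, b, theta = gaussian_confidence_ellipse(((4.0, 0.0), (0.0, 1.0)), mass=0.95)
print(a, b, theta)
```

The density on the boundary of that ellipse is (1 - p) / (2π · sqrt(det Σ)), which is the value to use if the plotting tool wants a contour level instead of an ellipse.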