As I've spent more time with flame graphs, I've come to realize they are, once you get down to brass tacks, the wrong tool for perf analysis: it's the width of the flame that tells you the most, not the height, and height is what we usually worry about when thinking about actual flames.
However, there are all sorts of subtle costs in your system that most of these tools don't capture, either for lack of resolution (I haven't spent a lot of time with Intel's hardware solution) or because they are asynchronous, like memory defragmentation. Depth and frequency of calls are a useful proxy for figuring out which rocks to look under next after you've exhausted the first dozen. For this reason flame graphs are a useful fiction, so I don't poo-poo them where anyone can hear. I can barely get people to look at perf data as it is.
But then I think about how some fictions turn me off and make me avoid certain fields, like parser writing, and I wonder whether a more accurate model would get more people to engage.