Bear,
I took your numbers and looked at them a bit differently, and it seems to me that there is a potential correlation between AWS sequential results and DDOG sequential results.
Please keep in mind that I am an engineer, not a statistician, so my terminology may not be 100% accurate, but here’s what I see. Just looking at the dataset as provided, the correlation coefficient between AWS sequential growth and DDOG sequential growth is 0.59. (The correlation coefficient is what it sounds like: a measure of how closely two datasets move together. It ranges from -1 to 1, with -1 being a perfect negative correlation, 0 being no linear correlation, and 1 being a perfect positive correlation. The datasets move in the same direction if it is positive and in opposite directions if it is negative.) A correlation of 0.59 is decent, but not especially strong.
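For anyone who wants to try this at home, here is a rough sketch of how a correlation coefficient can be computed in Python. The function name pearson_r and the numbers in the two lists are made up for illustration; they are not the actual AWS or DDOG growth figures.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder numbers for illustration only, NOT the real AWS/DDOG data.
aws_growth = [1.0, 2.0, 3.0, 4.0]
ddog_growth = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(aws_growth, ddog_growth))
```

With the real quarterly growth figures in place of the placeholders, this is the kind of calculation that gives the 0.59 above.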
However, if you look a little closer at the data, you may see that the relationship between AWS and DDOG results looks a bit strange in Q1 2020. Using a standard test for outliers (flagging any value more than 1.5 times the interquartile range below the first quartile or above the third quartile), the difference between DDOG and AWS results (DDOG - AWS) is an outlier in Q1 2020. This makes sense both visually, looking at a plot of the data, and thinking back to that timeframe: COVID had a varying effect on many companies when it appeared.
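The outlier test can be sketched roughly like this in Python. The function name iqr_outliers and the list of differences are hypothetical, not the real DDOG - AWS numbers, and I'm assuming the "inclusive" quartile method from the standard library; other quartile conventions can move the fences slightly.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the indexes of values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Placeholder differences (DDOG growth minus AWS growth), for illustration only.
diffs = [1.0, 2.0, 1.5, 2.5, 2.0, 15.0, 1.8]
print(iqr_outliers(diffs))  # the 15.0 at index 5 is flagged
```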
If you agree Q1 2020 is an outlier and remove the data, the correlation coefficient increases tremendously, to 0.89. I would consider this to be a strong correlation (as always, correlation isn’t causation, but we’re not necessarily looking for why, just what may happen). I would suggest based on this that there is a correlation between AWS and DDOG sequential results.
Another way that I enjoy looking at statistics is simply visually. As humans we’re very good at seeing patterns (often when they aren’t even there), and statistics is essentially a mathematical way of confirming (or refuting) the patterns we see in data. A simple scatter plot shows the evident correlation (shown here with the outlier removed).
If you perform a simple linear curve fit on the data, you get an r-squared value of 0.65. (R-squared is a measure of how close your actual data is to a line drawn through the data. It ranges from 0 to 1, with 1 indicating the data falls exactly on the line, and 0 meaning the line explains none of the variation in the data.) A value of 0.65 is decent, again supporting a relationship between AWS and DDOG. Ultimately, the linear curve fit is valuable because we can use it to predict DDOG's sequential results based on its historical relationship with AWS sequential results.
If this relationship holds true, we could expect DDOG to report sequential growth of 8.4% this quarter.
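Here is a minimal sketch of the fit-and-predict step in Python, assuming an ordinary least-squares line. The function name linear_fit and all of the numbers are placeholders I made up, not the real data, so the output will not match the 8.4% above.

```python
def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Placeholder numbers for illustration only, NOT the real AWS/DDOG data.
aws_growth = [1.0, 2.0, 3.0, 4.0]
ddog_growth = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r_squared = linear_fit(aws_growth, ddog_growth)

# Plug the latest AWS sequential growth (hypothetical value here) into the line.
latest_aws = 5.0
predicted_ddog = slope * latest_aws + intercept
print(slope, intercept, r_squared, predicted_ddog)
```

The prediction is only as good as the assumption that the historical relationship continues to hold.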
As an aside, there’s definitely further you can go with this sort of analysis, such as looking at confidence intervals, etc., but I’m afraid I’ve already strayed fairly far from the purpose of this board. Hopefully some folks found this interesting.
Thanks,
Mdm