There are plenty of articles out there motivating why we should use Bayesian statistics in product analytics. I’m not going to reinvent the wheel here, so I will provide only a brief explanation and link to some articles that I found helpful while learning. I will also explain why I created this package.
Bayesian statistics is a completely different way to think about statistics from what you likely learned in high school and college (typically referred to as frequentist statistics). It lets us answer, much more easily, a wider variety of questions that are relevant to the business world. Although most people are used to hearing a p-value for a statistical test, many don’t actually know what it means or how to interpret it beyond the “statistical significance is when p <= 0.05” we were programmed to regurgitate. Bayesian statistics allows a much more intuitive interpretation of the results of a test. Examples of questions Bayesian statistics is purpose-built to answer:
The goal of using frequentist statistics is to limit the probability of being wrong when we pick the variant over the control. P-values are designed to be biased towards the control: the variant wins only when the evidence against “no difference” is strong. In business we often run an experiment because we believe we are making an improvement to the product. There clearly needs to be statistical rigor, but a question I often get when the variant is slightly better than the control is “why can’t we just pick the variant?” Bayesian statistics allows for statistical support, even when picking a variant that is only slightly better.
Frequentist statistics protects us against choosing something new that isn’t actually better. This is important in fields like medicine; it’s far less important in the business world. In business, we want to run lots of tests as quickly as we can in order to make the best decisions we can. Changing the color of a button on the website likely will not result in lives lost, while a new medication could. Bayesian statistics allows us to control the risk we take on with every decision; we can choose to make a decision with less data than a frequentist test would need, while still controlling our risk.
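To make “controlling our risk” concrete, here is a minimal sketch using only the Python standard library. It assumes a Beta-Binomial model for conversion rates with a uniform Beta(1, 1) prior, and estimates two quantities by sampling from each posterior: the probability that the variant beats the control, and the expected loss (the conversion rate we expect to give up if we pick the variant and it is actually worse). The function name `compare_variants` and the example counts are hypothetical illustrations, not this package’s API.

```python
import random


def compare_variants(control, variant, prior=(1.0, 1.0), draws=20_000, seed=0):
    """Compare two variants given (conversions, visitors) tuples.

    Returns (probability variant beats control, expected loss of picking variant).
    Assumes a Beta prior on each conversion rate (hypothetical helper).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    a0, b0 = prior
    c_conv, c_n = control
    v_conv, v_n = variant

    wins = 0
    total_loss = 0.0
    for _ in range(draws):
        # Posterior of a Beta prior with binomial data is Beta(a + successes, b + failures).
        p_control = rng.betavariate(a0 + c_conv, b0 + c_n - c_conv)
        p_variant = rng.betavariate(a0 + v_conv, b0 + v_n - v_conv)
        if p_variant > p_control:
            wins += 1
        # Loss is only incurred in the draws where the variant is actually worse.
        total_loss += max(p_control - p_variant, 0.0)

    return wins / draws, total_loss / draws


# Made-up example: 100/1000 conversions for control, 110/1000 for the variant.
prob_better, expected_loss = compare_variants(control=(100, 1000), variant=(110, 1000))
print(f"P(variant > control) = {prob_better:.3f}")
print(f"Expected loss if we pick the variant = {expected_loss:.5f}")
```

One possible decision rule is to pick the variant once the expected loss drops below a threshold the business is comfortable with. That threshold is a judgment call, not something the model provides.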
Bayesian statistics is designed to use our belief about the world (the prior) to help us make a decision. At first, this was confusing (at least to me) because the prior sounded like an arbitrary choice. And it is; but the key to understanding why Bayesian statistics is so powerful is that when using frequentist statistics, we are making even stronger and more arbitrary assumptions. In business and tech, we often have access to a lot of data and have a pretty good idea about conversion rates. I would argue that not using any of that information is a more egregious mistake than using the wrong prior with Bayesian statistics.
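As a rough illustration of how a prior enters the calculation, the sketch below assumes the same Beta-Binomial model: a Beta(a, b) prior updated with observed conversions has posterior mean (a + conversions) / (a + b + visitors). The helper `posterior_mean`, the Beta(10, 90) prior (standing in for “we usually see about 10% conversion”), and all counts are hypothetical.

```python
def posterior_mean(conversions, visitors, prior_a, prior_b):
    """Posterior mean conversion rate under a Beta(prior_a, prior_b) prior."""
    return (prior_a + conversions) / (prior_a + prior_b + visitors)


# Small sample: 3 conversions out of 20 visitors.
flat_small = posterior_mean(3, 20, prior_a=1, prior_b=1)        # uniform prior
informed_small = posterior_mean(3, 20, prior_a=10, prior_b=90)  # prior centered near 10%

# Large sample: 1500 conversions out of 10000 visitors.
flat_large = posterior_mean(1500, 10_000, prior_a=1, prior_b=1)
informed_large = posterior_mean(1500, 10_000, prior_a=10, prior_b=90)

print(f"small sample: flat={flat_small:.4f}, informed={informed_small:.4f}")
print(f"large sample: flat={flat_large:.4f}, informed={informed_large:.4f}")
```

With little data, the informed prior pulls the estimate towards what we already believed; with a lot of data, the two estimates nearly coincide, which is one reason a somewhat wrong prior tends to matter less as evidence accumulates.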
I created this package originally out of necessity and a desire to learn.