Visualizing BG/NBD Model Distributions

Posted on Tuesday, 10 September 2024

Recently, I learned about the BG/NBD (beta-geometric/negative binomial distribution) model for forecasting customer transactions (as part of a data science course by pacmann.io). It involves several probability distributions that I wasn’t familiar with, so I created several graphs using the Desmos graphing calculator to visualize them. I was particularly interested in seeing how changing each distribution’s parameters affects the shape of its probability density function (PDF). I hope these visualizations help you too.

I won’t go into detail about the BG/NBD model or about each probability distribution. You can read the original paper here.

Poisson Distribution for Transaction Rates

For a particular customer, the probability that they will make $x$ transactions in a period, assuming that they make $\lambda$ transactions per period on average, is described by the Poisson probability distribution:

\[f(x|\lambda) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \dots\]

Note: the chart should really be a bar chart, since we’re plotting a probability mass function (PMF) over discrete values of $x$.
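As a quick numerical check of the formula above, here is a minimal Python sketch that evaluates the Poisson PMF at integer values only, which is what a bar chart would show. The function name `poisson_pmf` and the example value $\lambda = 2$ are my own choices for illustration and are not part of the BG/NBD model itself.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x transactions when the average rate is lam."""
    if x < 0:
        return 0.0  # the PMF is zero for negative counts
    return math.exp(-lam) * lam**x / math.factorial(x)


# Illustrative rate (an assumption for this example): 2 transactions per period.
lam = 2.0
for x in range(8):
    # Crude text "bar chart": one '#' per 0.01 of probability.
    bar = "#" * round(poisson_pmf(x, lam) * 100)
    print(f"x={x}  p={poisson_pmf(x, lam):.4f}  {bar}")
```

Trying a few different values of `lam` shows the same behaviour as the sliders in the graph: the mass shifts to the right and spreads out as $\lambda$ grows.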

Gamma Distribution for Heterogeneity of Transaction Rate

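The heterogeneity of $\lambda$ among all customers is described by the Gamma distribution, parameterized by the shape parameter $r$ and the scale parameter $\alpha$ (in the form below, $\alpha$ multiplies $\lambda$ in the exponent, so it acts as a rate, i.e. an inverse scale):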

\[f(\lambda | r, \alpha) = \frac{\alpha^r \lambda^{r-1} e^{-\lambda \alpha}}{\Gamma(r)}, \lambda > 0\]

In Desmos, the Gamma function is accessed via the factorial, where $\Gamma(r) = (r - 1)!$.
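Outside Desmos, the same density can be sketched in a few lines of Python using `math.gamma` from the standard library. The function name `gamma_pdf` and the parameter values below are assumptions picked only for illustration.

```python
import math


def gamma_pdf(lam: float, r: float, alpha: float) -> float:
    """Gamma density of the transaction rate lam, in the form used above."""
    if lam <= 0:
        return 0.0  # the density is only defined for lam > 0
    return alpha**r * lam ** (r - 1) * math.exp(-lam * alpha) / math.gamma(r)


# The identity used in Desmos, Gamma(r) = (r - 1)!, checked for integer r.
for r in range(1, 6):
    assert math.isclose(math.gamma(r), math.factorial(r - 1))

# Illustrative parameters (assumptions): compare two shapes at a few rates.
for r, alpha in [(1.0, 1.0), (3.0, 2.0)]:
    values = [round(gamma_pdf(lam, r, alpha), 4) for lam in (0.5, 1.0, 2.0, 4.0)]
    print(f"r={r}, alpha={alpha}: {values}")
```

With this parameterization the mean of $\lambda$ is $r / \alpha$, so increasing $\alpha$ while holding $r$ fixed pulls the density toward zero.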

Beta Distribution for Heterogeneity of Drop-off Probability

Every time a customer makes a transaction, there is a probability $p$ that they will stop making transactions altogether; $p$ is called the drop-off probability. This probability is assumed to be independent across time and across customers, and the heterogeneity of $p$ among all customers is assumed to follow the Beta distribution, which is parameterized by the shape parameters $a$ and $b$:

\[f(p | a, b) = \frac{p^{a-1} (1-p)^{b-1}}{B(a, b)}, 0 \leq p \leq 1\]

where $B(a, b)$ is the beta function, which can be expressed as $B(a, b) = \Gamma(a) \Gamma(b) / \Gamma(a + b)$.
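The Beta density can be sketched in Python in the same way, building $B(a, b)$ from `math.gamma` exactly as in the expression above. The names `beta_fn` and `beta_pdf` and the parameter pairs are hypothetical, chosen for illustration. To keep the sketch simple, it only evaluates the density on the open interval $0 < p < 1$, which avoids a division by zero at the endpoints when $a < 1$ or $b < 1$.

```python
import math


def beta_fn(a: float, b: float) -> float:
    """Beta function via B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_pdf(p: float, a: float, b: float) -> float:
    """Beta density of the drop-off probability p, for 0 < p < 1 only."""
    if not 0.0 < p < 1.0:
        # Simplification for this sketch: endpoints are not handled.
        raise ValueError("p must lie strictly between 0 and 1")
    return p ** (a - 1) * (1 - p) ** (b - 1) / beta_fn(a, b)


# Illustrative shape parameters (assumptions) evaluated on a small grid of p.
grid = (0.1, 0.3, 0.5, 0.7, 0.9)
for a, b in [(1.0, 1.0), (2.0, 5.0), (5.0, 2.0), (0.5, 0.5)]:
    values = [round(beta_pdf(p, a, b), 3) for p in grid]
    print(f"a={a}, b={b}: {values}")
```

Printing the density on a small grid like this is a rough stand-in for dragging the $a$ and $b$ sliders in the graph.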

Notice the following properties of the Beta distribution: