Model of Maximum Load on a Distributed Cache

Model of Maximum Load on a Distributed Cache


Given gaps distributed by Exp(n)Exp\Big(n), we replace 1p1 - p and λ\lambda with 1n/(n+1)1 - n/\Big(n + 1) and nn respectively.

I have previously written about the gap between consecutive order statistics of U(0,1)U\Big(0, 1). In this post I’m just going to note several ways to estimate the expectation of the maximum gap from a sample of U(0,1)U\Big(0, 1) draws. This isn’t a purely academic exercise, estimating gaps of the uniform is of particuar interest when analyzing load patterns in distributed load balancing algorithms (e.g. consistent hashing).

𝐅𝐢𝐠𝐮𝐫𝐞 𝟏.𝟎\textbf{Figure 1.0} — Estimates for Expected Max Gap

Adjacent order statistics of nn draws of U(0,1)U\Big(0, 1) are distributed as Beta(1,n)Beta\Big(1, n). We’ll approximate Beta(1,n)Beta\Big(1, n) as Exp(n)Exp\Big(n) and use the inverse CDF of the exponential to estimate the expectation of the maximum gap as a function of nn.

Fλ1(p)=log(1p)/λE[X]log(n+1)/n\begin{equation} F_{\lambda}^{-1}\Big(p) = -\log\Big(1-p)/\lambda \implies E\Big[X] \approx \log\Big(n + 1)/n \end{equation}

We can still arrive at Exp(n)Exp\Big(n) without this handy fact about order statistics memorized. Consider a point a(0,1)a \in \Big(0, 1) and a ball centered at aa with radius rr. The probability that any of nn other points fall inside Ball(a,r)\operatorname{Ball}\Big(a, r) is 1(12r)n1e2rn1 - \Big(1 - 2r)^n \approx 1 - e^{-2rn}. This is the CDF for Exp(n)Exp\Big(n) and the argument proceeds as above. \blacksquare


If we want to be more precise we can calculate the CDF of the maximum of nn instances of Beta(1,n)Beta\Big(1, n). This more complicated calculation yields a result that is on the order of log(n)/nlog\Big(n)/n.

01n(1x)n1dx=1(1x)n(1(1x)n)n\begin{equation} \int_0^1 n\Big(1 - x)^{n - 1} dx = 1 - \Big(1 - x)^{n} \implies \Big(1 - \Big(1 - x)^{n})^{n} \end{equation}

101(1(1x)n)ndx=101(1xn)ndx\begin{equation} 1 - \int_0^1 \Big(1 - \Big(1 - x)^{n})^{n} dx = 1 - \int_0^1 \Big(1 - x^{n})^{n} dx \end{equation}

Now we can calculate the expectation of the largest gap from it’s CDF. This is made easier because Wolfram Alpha catches that 01(1(x)n)ndx=Γ(1+1/n)Γ(1+n)Γ(1+n+1/n)\int_0^1 \Big(1 - \Big(x)^{n})^{n} dx = \frac{\Gamma\Big(1 + 1/n)\Gamma\Big(1 + n)}{\Gamma\Big(1 +n + 1/n)}. From there we get the following result by noticing this is 𝑎𝑙𝑚𝑜𝑠𝑡\textit{almost} the Beta function.

E[X](1(1+n+1n)01(1x)1nxndx)=1(1+n+1n)𝐁(1+1n,1+n)\begin{equation} E\Big[X] \sim \left(1 - \Big(1 + n + \frac{1}{n}) \int_0^1 \Big(1 - x)^{\frac{1}{n}} x^{n} dx \right) = 1 - \Big(1 + n + \frac{1}{n}) \cdot \textbf{B}\Big(1 + \frac{1}{n}, 1 + n) \end{equation}