A Comment on Fast Computation of the Median by Successive Binning

I read a paper wherein the author proposes an algorithm for approximating the median of a distribution. The only fact required to justify this method is that a distribution’s median is bounded by $\mu \pm \sigma$ (provided both exist!). Under what conditions can we find a tighter bound using higher moments? Consider a non-negative RV; using Markov’s Inequality and letting $Z = \lvert E[X] - X \rvert$ and $\varphi_k(x) = x^k$:

\begin{equation} P(Z \geq t) = P\left(\varphi_k(Z) \geq t^k\right) \leq t^{-k}E[\varphi_k(Z)] \end{equation}
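As a quick numerical sanity check (my own sketch, not from the paper; the choice of Exp(1) and of $t$ are arbitrary illustrations), the $k$-th-moment form of Markov’s Inequality can be verified on a Monte Carlo sample:

```python
import random

random.seed(0)

# Sample a non-negative RV; Exp(1) is chosen purely for illustration.
xs = [random.expovariate(1.0) for _ in range(100_000)]
mu = sum(xs) / len(xs)

# Z = |E[X] - X|, phi_k(z) = z^k
zs = [abs(mu - x) for x in xs]

t = 1.5
for k in (1, 2, 3):
    tail = sum(z >= t for z in zs) / len(zs)        # P(Z >= t)
    bound = sum(z**k for z in zs) / len(zs) / t**k  # t^{-k} E[phi_k(Z)]
    # Markov's Inequality holds exactly for the empirical distribution,
    # so the bound must dominate the empirical tail probability.
    assert tail <= bound
    print(f"k={k}: P(Z>=t) = {tail:.4f} <= {bound:.4f}")
```

Note that the assertion holds with no slack term: Markov’s Inequality applied to the empirical measure of the sample is itself exact.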

After some algebra, we can see that the median lies within the following range. However, the bound is most generally applicable in its original form (i.e. $k = 2$).

\begin{equation} \mu \pm \inf_{k \in \mathbb{N}} \left(\int_{\mathbb{R}^+} f(w)\,\lvert w - \mu \rvert^{k}\ dw\right)^{\frac{1}{k}} \end{equation}
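To make the range concrete, here is a sketch evaluating $\mu \pm \left(E\lvert X - \mu\rvert^{k}\right)^{1/k}$ for a few $k$ via Monte Carlo and checking that the median lands inside the tightest interval. The Exp(1) distribution (true median $\ln 2 \approx 0.693$) is my own choice of example, not from the paper:

```python
import random

random.seed(1)

# Illustration only: Exp(1), with mean 1 and median ln 2 ~ 0.693.
xs = sorted(random.expovariate(1.0) for _ in range(200_000))
mu = sum(xs) / len(xs)
median = xs[len(xs) // 2]

# (E|X - mu|^k)^(1/k) for k = 1..5; the infimum over k gives the
# tightest version of the bound mu +/- inf_k (E|X - mu|^k)^(1/k).
norms = []
for k in range(1, 6):
    mk = sum(abs(x - mu) ** k for x in xs) / len(xs)
    norms.append(mk ** (1 / k))

half_width = min(norms)
assert mu - half_width <= median <= mu + half_width
print(f"median={median:.3f} in [{mu - half_width:.3f}, {mu + half_width:.3f}]")
```

By Jensen’s Inequality the $L^k$ norms are non-decreasing in $k$, so `min(norms)` is (up to sampling noise) attained at $k = 1$; for Exp(1) this gives a half-width of about $2/e \approx 0.736$, tighter than $\sigma = 1$.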

Notice that the bounds derived from the higher moments are useless when the moments are increasing. Unfortunately, the following (not-hard-to-achieve) conditions are sufficient for increasing moments. Unless you’ve got a compelling reason, you may as well K.I.S.S. and use the original bound.

\begin{eqnarray} E[X^k] & = & \int_0^1 f(x)x^k \ dx + \int_1^\infty f(x)x^k \ dx \\ \int_0^1 f(x) \ dx \ \leq \ \int_1^\infty f(x)x^k \ dx & \implies & \textit{Inc. Moments on} \ [k+1, \infty) \\ \int_0^1 f(x) \ dx \ \leq \ \int_1^\infty f(x)x^{k}(x - 1) \ dx & \implies & \textit{" "} \end{eqnarray}
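The sufficient condition is easy to check numerically for a given density. A sketch, using Exp(1) as a hypothetical example (where $E[X^k] = k!$ is indeed increasing) and a simple midpoint-rule quadrature of my own rather than any particular library:

```python
import math

# Hypothetical example density: f(x) = e^{-x} for x >= 0 (Exp(1)).
f = lambda x: math.exp(-x)

def integrate(g, a, b, n=100_000):
    """Midpoint-rule quadrature of g on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

lhs = integrate(f, 0.0, 1.0)  # integral_0^1 f(x) dx
for k in (1, 2, 3):
    # integral_1^inf f(x) x^k (x - 1) dx, truncated at 50 (tail ~ e^{-50})
    rhs = integrate(lambda x: f(x) * x**k * (x - 1), 1.0, 50.0)
    # Condition met => E[X^{k+1}] >= E[X^k]; for Exp(1), E[X^k] = k! grows.
    print(f"k={k}: {lhs:.3f} <= {rhs:.3f} -> {lhs <= rhs}")
```

For Exp(1) the condition holds at every $k$ checked, matching the factorial growth of its moments.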