Using Harris Inequality to Bound Intractable Integrals

Using Harris Inequality to Bound Intractable Integrals

When modelling the properties of a system, you may find yourself needing to integrate over an impractically large set. For example, f:{0,1}n{0,1}f\colon \{0, 1\}^n \to \{0, 1\} may map the status of all nodes in a system to a single “up” or “down” signal, and we aim to compute uptime by integrating over {0,1}n\{0,1\}^n with respect to a measure μ:{0,1}n[0,1]\mu\colon \{0, 1\}^n \to [0, 1].

To avoid this, we can find alternatives to ff and μ\mu which allow us to compute uptime with a set of size nn rather than a set of size 2n2^n. The new functions f*(m)f*\big(m) and μ*(m)\mu*\big(m) approximate the probability the system is “up” given mm “up” nodes and the probability that mm nodes are up.

sSf(s)dμ(s)0nf*(m)dμ*(m)\begin{equation} \int_{s \in S} f(s) \ d\mu\big(s) \ \ \to \ \ \int_0^n f*(m) \ d\mu*\big(m) \end{equation}

Computationally, this should be a massive improvement, but this strategy doesn’t guarantee it’s “easy” to integrate f*f* with respect to μ*\mu*. I recently encountered such a situation and found that Harris Inequality was a useful tool to get bounds on the property I was looking for. For non-decreasing functions, ff and gg:

f(x)g(x)dμ(x)f(x)dμ(x)g(x)dμ(x)\begin{equation} \int_{\mathbb{R}} f(x)g(x) \, d\mu(x) \geq \int_{\mathbb{R}} f(x) \, d\mu(x) \int_{\mathbb{R}} g(x) \, d\mu(x) \end{equation}

Applying this to our setting, we obtain the following useful bounds. In the following, I’m assuming that μ*\mu* is increasing on [0,cn)[0, cn) and decreasing on (cn,n](cn, n].

0cnf*(x)dμ*(x)1cn(0cnf*(x)dx)(0cn1μ*(x)dx)\begin{equation} \int_{0}^{cn}f*\left(x\right)\ d\mu*\left(x\right) \ \geq \ \frac{1}{cn}\left(\int_{0}^{cn}f*\left(x\right)\ dx\right)\left(\int_{0}^{cn}1\ \cdot\mu*\left(x\right)\ dx\right) \end{equation}

cnnf*(x)dμ*(x)1ncn(cnnf*(x)dx)(cnn1μ*(x)dx)\begin{equation} \int_{cn}^{n}f*\left(x\right)\ d\mu*\left(x\right) \ \leq \ \frac{1}{n - cn}\left(\int_{cn}^{n}f*\left(x\right)\ dx\right)\left(\int_{cn}^{n}1\ \cdot\mu*\left(x\right)\ dx\right) \end{equation}

  1. For illustration’s sake (Desmos), let’s consider a model of uptime that we’ve approximated with the following equations (notice that both are increasing!). Do not worry where they come from, I am pulling them out of thin air. For a kk node system, the probability that xx nodes are up depends on some constant cc. To check the “uptime” conditional on k/2k/2 or more nodes being up, we can apply the above result to our uptime function and probability measure.
  2. With all of this said, it’s unlikely that f\int f would be significantly less challenging for the computer than fg\int fg. I don’t really know, maybe one (1)\big(1) dot product is just too many.