A Problem About Integer Combinations

A Problem About Integer Combinations

We’re given a,b,c+a, b, c \in \mathbb{R}^{+}, ZZ \in \mathbb{N}, and we must choose za,zb,zc0z_a, z_b, z_c \in \mathbb{N}_0 to minimize ff. This problem is NP-Hard and (except in special cases) requires integer programming to get an optimal answer. I’m interested in the expectation of the absolute error when the optimal values are provided.

f(za,zb,zc)=|aza+bzb+czcZ|\begin{equation} f\big(z_a, z_b, z_c) = \lvert a z_a + b z_b + c z_c - Z\rvert \end{equation}

Let’s first consider what is achievable with very little computational effort. Using just the smallest of the three variables, we get a result which bound by the better of the following:

m:=min(a,b,c)f(z*a,z*b,z*c)Z/mmf(z*a,z*b,z*c)Z/mm\begin{eqnarray} m := \operatorname{min}\big(a, b, c) \\ f\big(z*_a, z*_b, z*_c) & \leq & \lceil Z/ m \rceil \cdot m \\ f(z*_a, z*_b, z*_c) & \leq & \lfloor Z/ m \rfloor \cdot m \end{eqnarray}

Clearly we can do better. No coefficient should be large enough so that a single component exceeds zz, thus we can restrict our search space to “just” z/az/bz/c\lceil z/a \rceil \cdot \lceil z/b \rceil \cdot \lceil z/c \rceil options. If these options produce unique*^{*} values, we’ll have at least z3/abcz^3/abc possible values for the error function spread over the range [z,2z]\Big[-z, 2z]. This gives us an average density and an approximate expected error:

E[f(z*a,z*b,z*c)]12(z3abc13z)1=𝒪(abcz2)\begin{equation} E\Big[f\big(z*_a, z*_b, z*_c)] \approx \frac{1}{2}\cdot\left(\frac{z^{3}}{abc}\ \cdot\ \frac{1}{3z}\right)^{-1} = \mathcal{O}\left(\frac{abc}{z^2}\right) \end{equation}

Using 𝙿𝚞𝙻𝙿\texttt{PuLP} we can find the optimal values for za,zb,zcz_a, z_b, z_c given ZZ. I ran a small experiment using a few hundred simulations with ZZ drawn uniformly from Zn±1Z_n \pm 1. Results are below and aren’t too bad! Looks like I’m still missing a little something, but we’re correct asymptotically.

+-------+-------------------+----------------------+---------------+
|   Z_n |   error (sampled) |   error (asymptotic) |   error ratio |
+=======+===================+======================+===============+
|    32 |       0.00632345  |          0.0202406   |       3.20088 |
+-------+-------------------+----------------------+---------------+
|    64 |       0.00176678  |          0.00506015  |       2.86406 |
+-------+-------------------+----------------------+---------------+
|   128 |       0.000517157 |          0.00126504  |       2.44614 |
+-------+-------------------+----------------------+---------------+
|   256 |       0.000190304 |          0.000316259 |       1.66187 |
+-------+-------------------+----------------------+---------------+
|   512 |       5.44808e-05 |          7.90648e-05 |       1.45124 |
+-------+-------------------+----------------------+---------------+
|  1024 |       1.36526e-05 |          1.97662e-05 |       1.4478  |
+-------+-------------------+----------------------+---------------+

𝐍.𝐁\textbf{N.B} — This problem was originally presented to me using the special case where a,b,c=π,ϕ,ea, b, c = \pi, \phi, e and Z=2025Z = 2025. As mentioned before, this is simple enough to solve with integer programming.

import pulp
import math
import numpy as np

def best_lin_comb(a, b, c, Z):
    prob = pulp.LpProblem("Closest_Lin_Combination", pulp.LpMinimize)
    z_a = pulp.LpVariable("z_a", lowBound=0, upBound=math.ceil(Z/a), cat='Integer')
    z_b = pulp.LpVariable("z_b", lowBound=0, upBound=math.ceil(Z/b), cat='Integer')
    z_c = pulp.LpVariable("z_c", lowBound=0, upBound=math.ceil(Z/c), cat='Integer')
    z = pulp.LpVariable('z', lowBound=0)
    prob += z
    prob += z >= z_a*a + z_b*b + z_c*c - Z
    prob += z >= -(z_a*a + z_b*b + z_c*c - Z)
    prob.solve()
    return z_a.varValue, z_b.varValue, z_c.varValue, pulp.value(prob.objective)

a, b, c, Z = np.pi, 1/2 + math.sqrt(5)/2, np.e, 2025
k_a, k_b, k_c, delta = best_lin_comb(a, b, c, Z)

# displays ::: 50.0 * 3.14159 + 847.0 * 1.61803 + 183.0 * 2.71828 = 2025.0000042412528
# for what it's worth - 1.5 * a * b * c * z**-2 ≈ 0.000005
print(f'{k_a} * {a:0.6} + {k_b} * {b:0.6} + {k_c} * {c:0.6} = {Z + delta}')