A Method for Encoding Permutations

This is a quick one. Let’s imagine that you wanted to store a permutation of the first

n

elements of

\mathbb{N}_0

on disk. Can we store it in fewer than

n \cdot \operatorname{log_2}\Big(n)

bits?

Because there are

n!

permutations of

n

, we can encode any permutation into a single

\operatorname{log_2}\Big(n!)

bit value. Using

n=3

as an example,

\operatorname{Enc}\Big([0, 1, 2]) = 0

and

\operatorname{Enc}\Big([2, 1, 0]) = 3! - 1 = 5

. Generally, for any permutation of

[n]

, we can encode it to it’s lexicographical rank among all permutations of

n

items. That said, is this any good? We can approximate how many bits this scheme uses by applying Stirling’s Approximation.

\begin{equation} \operatorname{log_2}\left(n!\right) \approx \operatorname{log_2}\left(\sqrt{2\pi n}n^{n}e^{-n}\right) = n\operatorname{log_2}\left(n\right) + \operatorname{log_2}\left(\sqrt{2\pi n}\right) - n\operatorname{log_2}\left(e\right) \end{equation}

After doing the algebra, you’ll notice that the savings are quite poor asymptotically compared to a baseline of

n \cdot \operatorname{log_2}\Big(n)

bits. However, when

n

is small we can get an OK improvement over the naive representation. For example, when

n \leq20

, we can encode any permutation into a single

\texttt{uint64}

. Compare that with the

n

bytes required to store the permutation in

\texttt{[]uint8}

Finally, a short example of a potential application of this method. In the code below,

\texttt{RandWithNBitsSet}

uniformly samples from the subset of

\texttt{uint32}

with with exactly

n

bits set. It does this without trial-and-error or by drawing a uniform value on

[0, 32!)

and then extracting the first

n

positions of the corresponding permutation to find the

n

bits to set.

package main

import (
    "math/big"
    "math/rand"
    "slices"
)

var fact32 = BigFactorial(32) // 32!

func RandWithNBitsSet(rng *rand.Rand, n uint8) *big.Int {
    mask := big.NewInt(0).Lsh(big.NewInt(1), 118)
    mask.Sub(mask, big.NewInt(1))              // 2^118 - 1
    u := big.NewInt(0).Lsh(big.NewInt(1), 120) // larger than fact32
    for u.Cmp(fact32) != -1 {
        u = u.SetUint64(rng.Uint64())
        u.Lsh(u, 64)
        u.Add(u, big.NewInt(0).SetUint64(rng.Uint64()))
        u.And(u, mask)
    }
    arr := DecodeBig(u, n)
    u.SetUint64(0)
    for i := 0; i < int(n); i++ {
        u.Add(u, big.NewInt(1<<arr[i]))
    }
    return u
}

func DecodeBig(v *big.Int, n uint8) []uint8 {
    remain, decoded := make([]uint8, 32), make([]uint8, n)
    c, fact := big.NewInt(0), BigFactorial(32)

    for i := uint8(1); i < 32; i++ {
        remain[i] = i
    }

    for i := uint8(0); i < n; i++ {
        fact.Div(fact, big.NewInt(int64(32-i-1)))
        c.Div(v, fact)
        ic := int(c.Int64())
        decoded[i] = remain[ic]
        slices.Delete(remain, ic, ic+1)
        v.Sub(v, c.Mul(c, fact))
    }
    return decoded
}

func BigFactorial(n uint8) *big.Int {
    res := big.NewInt(1)
    for i := uint8(2); i <= n; i++ {
        res.Mul(res, big.NewInt(int64(i)))
    }
    return res
}