bd808.com

FOSS in, FOSS out: software, process and operations

Generating an Apparently Random Unique Sequence

Using a sequentially increasing counter to generate an id token is easy. Database sequences and auto-number columns make it fairly trivial to implement. If those aren’t available, a simple file or shared memory counter can be implemented in minutes. Displaying such a number to a client, however, may give them more information than you would like about the number of ids you are allocating per unit time. We’d really like to obfuscate the id somehow while retaining the uniqueness of the original sequence.

One way to do this is to use a combination of multiplication and modulo arithmetic to map the sequence number into a constrained set. With careful choice of the multiplicative constant and the modulo value the resulting number can be made to wander rather effectively over the entire space of the target set.

The basic math looks like this: f(n) := (n * p) % q

  • n := input sequence value
  • p := step size
  • q := maximum result size
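As a minimal sketch, the formula translates directly into Python. The function name mirrors the PHP example later in this post; note that Python integers never overflow, so the precision concerns discussed below do not show up in this version.

```python
def obfuscate_id(n: int, p: int, q: int) -> int:
    """Map sequence value n into the range 0..q-1 using step size p."""
    return (n * p) % q


# Sequence value 3 with p = 5 and q = 12: (3 * 5) % 12
print(obfuscate_id(3, 5, 12))  # 3
```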

p and q must be chosen such that:

  • p < q
  • p * q < arithmetic limit (2^31, 2^32, 2^63, 2^64, … depending on the precision of the underlying system)
  • p ⊥ q (p and q are coprime, i.e. relatively prime)
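The three constraints can be checked mechanically before a pair of constants is put into use. The sketch below is only illustrative: `check_parameters` is a hypothetical helper, the default limit of 2^63 assumes signed 64-bit arithmetic, and the extra requirement that p be positive is an assumption not stated in the list above.

```python
import math


def check_parameters(p: int, q: int, limit: int = 2**63) -> None:
    """Raise ValueError if p and q break any of the three constraints."""
    # Assumption: p should also be positive, otherwise every output is 0.
    if not 0 < p < q:
        raise ValueError("p must be positive and smaller than q")
    if p * q >= limit:
        raise ValueError("p * q must stay below the arithmetic limit")
    if math.gcd(p, q) != 1:
        raise ValueError("p and q must be coprime")


check_parameters(5, 12)  # passes silently
```

Calling `check_parameters(4, 12)` raises a `ValueError` because 4 and 12 share the factor 4.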

With p := 5 and q := 12 our function will generate this output:

n     1   2   3   4   5   6   7   8   9  10  11
f(n)  5  10   3   8   1   6  11   4   9   2   7

Change p to 7 and you’ll get:

n     1   2   3   4   5   6   7   8   9  10  11
f(n)  7   2   9   4  11   6   1   8   3  10   5
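Both tables can be reproduced with a few lines of Python, which also makes it easy to confirm that every value from 1 to 11 appears exactly once. The helper name `walk` is made up for this sketch.

```python
def walk(p: int, q: int) -> list[int]:
    """Return f(n) = (n * p) % q for n = 1 .. q - 1."""
    return [(n * p) % q for n in range(1, q)]


print(walk(5, 12))  # [5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7]
print(walk(7, 12))  # [7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]

# Each walk is a rearrangement of 1..11, so no id is ever repeated.
print(sorted(walk(5, 12)) == list(range(1, 12)))  # True
```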

The rationale for keeping p * q < limit is that as n approaches q the initial multiplication will approach p * q, and if this calculation overflows the available precision the result will wrap back into a previously traversed space, causing duplication. The same sort of thing will occur if p and q are not coprime. With g as the GCD [1] of p and q, the result of the modulo repeats after only q / g steps and only ever lands on multiples of g, rather than mapping the entire range of q evenly.
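Both failure modes are easy to reproduce on a small scale. In this sketch the overflow is imitated by masking the product down to 8 bits, which behaves like unsigned wraparound; that is an assumption made for illustration, since real platforms differ in how they handle overflow (signed wraparound, promotion to floating point, or an error).

```python
def obfuscate_wrapped(n: int, p: int, q: int, bits: int = 8) -> int:
    """Same formula, but the product is truncated to `bits` bits first."""
    mask = (1 << bits) - 1
    return ((n * p) & mask) % q


# p * q = 3 * 200 = 600 exceeds the 8-bit limit of 256, so ids collide:
# 153 * 3 = 459 wraps to 203, and 203 % 200 = 3, the same as n = 1.
print(obfuscate_wrapped(1, 3, 200), obfuscate_wrapped(153, 3, 200))  # 3 3

# p = 4 and q = 12 share the factor 4, so the output cycles every 12 / 4 = 3 steps.
print([(n * 4) % 12 for n in range(1, 12)])  # [4, 8, 0, 4, 8, 0, 4, 8, 0, 4, 8]
```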

Careful choice of p and q is key to getting a good spread in the output of the function and maintaining the uniqueness of the result. One easy way to ensure that the chosen coefficients are coprime is to make each of them a power of a different prime number (e.g. 9^17, 13^11, 13^15, 19^7, …). Two powers of the same prime, such as 13^11 and 13^15, share a factor and must not be paired with each other.
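One possible way to pick such a pair is sketched below. The helper `largest_power_below`, the primes 13 and 7, the id space of 100000 and the 2^31 limit are all arbitrary choices for illustration, not values recommended by this post.

```python
import math


def largest_power_below(prime: int, bound: int) -> int:
    """Largest power of `prime` that is strictly less than `bound`."""
    value = 1
    while value * prime < bound:
        value *= prime
    return value


limit = 2**31
q = largest_power_below(13, 100_000)            # 13^4 = 28561
# p must stay below q and keep p * q below the limit.
p = largest_power_below(7, min(q, limit // q))  # 7^5 = 16807

print(p, q, math.gcd(p, q), p * q < limit)  # 16807 28561 1 True
```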

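This method is a type of linear congruential generator. Because f(n + 1) = (f(n) + p) % q, it is the simple case with a multiplier of 1 and an increment of p. It is similar in form to the Park–Miller random number generator, which instead multiplies its previous output by a constant.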
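One way to see the linear congruential structure is that each output equals the previous output plus p, reduced modulo q. The sketch below checks that this step-by-step form agrees with the direct formula; `next_value` is a name invented for the example.

```python
def obfuscate_id(n: int, p: int, q: int) -> int:
    """Direct formula: (n * p) % q."""
    return (n * p) % q


def next_value(previous: int, p: int, q: int) -> int:
    """Recurrence form: add the step size and wrap at q."""
    return (previous + p) % q


value = 0  # f(0) is always 0
matches = []
for n in range(1, 12):
    value = next_value(value, 5, 12)
    matches.append(value == obfuscate_id(n, 5, 12))

print(all(matches))  # True
```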

Examples

PHP
<?php
/**
 * Obfuscate an id generated from a linear sequence.
 *
 * @param int $n Input value
 * @param int $p Random walk step size
 * @param int $q Maximum result value
 * @return int Obfuscated result
 */
function obfuscate_id ($n, $p, $q) {
  return ($n * $p) % $q;
}
PL/SQL
FUNCTION obfuscate_id (n NUMBER, p NUMBER, q NUMBER) RETURN NUMBER IS
BEGIN
  RETURN MOD(n * p, q);
END obfuscate_id;

Thanks to Tim for explaining all of this to me several times without becoming annoyed at the parts I wasn’t getting.


  1. Greatest Common Divisor 
