Re: [hackers] [PATCH] libutil/random: fix conflict between versions

From: Elie Le Vaillant <eolien55_AT_disroot.org>
Date: Sat, 21 Dec 2024 00:20:46 +0100

Hi,

On Fri Dec 20, 2024 at 9:44 PM CET, Roberto E. Vargas Caballero wrote:
> Did you read my other mail about considering if we actually need these functions?
> I begin to consider that standard random() is good enough for our use case, and
> we don't need additional functions. What do you think?

I did read your other mail.

Maybe the PCG (rng32()) itself isn't essential, but I think rng32_seed()
and rng32_bounded() are.

It is all too often to seed rand()/random() badly (eg. using time()).
Seeding it correctly is often complicated. Hence having a separate
procedure for correctly initializing.

And none of these APIs provide _bounded(m) versions (to generate in the
range [0, m)), which, too, is non-trivial to implement properly (ie,
without bias).

I don't feel so strongly about having a custom PRNG, but here are my
thoughts:

Unacceptable RNGs:
- rand() and *rand48() are LCGs, and hence unnacceptable for generating
  random numbers in a range (low-entropy on the lower bits, which makes
  for very much non-uniform distributions on a range, which makes
  permutations usually poor quality).
- arc4random is (generally) not available.
- /dev/random and /dev/urandom can be unavailable (eg, in a chroot), and
  system calls used to read aren't standardized (Linux and OpenBSD provide
  their own, but to my knowledge most Unices do not).

The remaining candidates are thus a custom PRNG (eg PCG here), or random().

Arguments for custom PRNG/against random():
- Has better statistical properties, which make it better for generating
  numbers in a range (and better than random()).
- We do not rely on implementation details for correct ("sufficiently
  random") behavior.

Arguments against custom PRNG:
- The state of the PCG is 64 bits, which means that we can produce at
  most 2^63 permutations of any given list. Say we want to shuffle
  a deck of 52 cards. 52! ~= 10^226, so this RNG would only generate
  a very tiny fraction of the permutations.

  random() has 256 bits of state, so it should be able to generate
  all permutations of lists of length up to 57.

  Which is arguably too small to make a real difference (I don't think
  "all permutations _can_ be generated for lists up to 57 elements long,
  as opposed to 21 for PCG" is a convincing argument).

  We could go crazy and implement Mersenne Twister and its 19968 bits of
  state, but again, I do not believe this to be a very good argument.
- It's more code (9 SLOC to be precise).

In the end, I believe consistent behavior is somewhat referrable over
avoiding 9 SLOC in libutil/random.

I have a slight preference for a custom PRNG. I'm open to using random()
too.

Cheers,
Elie Le Vaillant
Received on Sat Dec 21 2024 - 00:20:46 CET

This archive was generated by hypermail 2.3.0 : Sat Dec 21 2024 - 00:24:42 CET