luau/rfcs/function-bit32-clz-ctz.md
2021-10-26 16:46:29 -07:00

bit32.clz/ctz

Summary

Add bit32.clz (count leading zeroes) and bit32.ctz (count trailing zeroes) to accelerate bit scanning.

Motivation

Virtually all modern CPUs have instructions to determine the position of the first/last set bit in an integer. These instructions have a variety of uses, the most popular being:

  • Fast implementation of integer logarithm (essentially allowing floor(log2(value)) to be computed quickly)
  • Scanning set bits in an integer, which allows efficient traversal of compact bitmap representations
  • Allocating bits out of a bitmap quickly

Today it's possible to approximate clz using floor and log, but this approximation is relatively slow; approximating ctz is difficult without iterating through each bit.
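
As a rough illustration of the workarounds described above, the following sketch approximates clz with floor and log and computes ctz by iterating over bits. It is not the proof-of-concept polyfill referenced later in this RFC: the names clz_polyfill and ctz_polyfill are hypothetical, and the code assumes the input is already an integer in [0, 2^32) rather than performing the bit32 conversion.

```lua
-- Polyfill sketch. Assumes n is already an integer in [0, 2^32); the
-- names clz_polyfill and ctz_polyfill are illustrative only.
local function clz_polyfill(n)
    if n == 0 then
        return 32
    end
    -- floor(log2(n)) is the index of the highest set bit.
    local high = math.floor(math.log(n) / math.log(2))
    -- Guard against floating-point rounding in the log quotient.
    if 2 ^ high > n then
        high = high - 1
    elseif 2 ^ (high + 1) <= n then
        high = high + 1
    end
    return 31 - high
end

local function ctz_polyfill(n)
    if n == 0 then
        return 32
    end
    -- No logarithm trick applies here, so test one bit per iteration.
    local count = 0
    while n % 2 == 0 do
        n = n / 2
        count = count + 1
    end
    return count
end
```

The rounding guard in clz_polyfill is there because the quotient of two logarithms is not guaranteed to be exact for powers of two, which is part of why this approach is only an approximation of a dedicated instruction.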

Design

The bit32 library will gain two new functions, clz and ctz:

function bit32.clz(n: number): number
function bit32.ctz(n: number): number

clz takes an integer number (converting the input number to a 32-bit unsigned integer as all other bit32 functions do), and returns the number of leading zero bits - that is, the number of most significant zero bits in a 32-bit number until the first 1. The result is in the [0, 32] range.

For example, when the input number is 0, the result is 32. When the input number is 2^k, the result is 31-k.

ctz takes an integer number (converting the input number to a 32-bit unsigned integer as all other bit32 functions do), and returns the number of trailing zero bits - that is, the number of least significant zero bits in a 32-bit number until the first 1. The result is in the [0, 32] range.

For example, when the input number is 0, the result is 32. When the input number is 2^k, the result is k.
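
The semantics above can be sketched with a bit-by-bit reference implementation, together with two of the uses listed in the motivation. This is an illustration only, not the builtin: ref_clz, ref_ctz, bit_at, ilog2 and set_bit_positions are hypothetical names, and the code assumes the input is already an integer in [0, 2^32).

```lua
-- Reference sketch of the specified semantics. Assumes n is already an
-- integer in [0, 2^32); ref_clz and ref_ctz are hypothetical stand-ins
-- for the proposed bit32.clz and bit32.ctz.
local function bit_at(n, i)
    -- Value (0 or 1) of bit i, where bit 0 is the least significant.
    return math.floor(n / 2 ^ i) % 2
end

local function ref_clz(n)
    local count = 0
    for i = 31, 0, -1 do
        if bit_at(n, i) == 1 then
            break
        end
        count = count + 1
    end
    return count
end

local function ref_ctz(n)
    local count = 0
    for i = 0, 31 do
        if bit_at(n, i) == 1 then
            break
        end
        count = count + 1
    end
    return count
end

-- Integer logarithm: floor(log2(n)) for n > 0.
local function ilog2(n)
    return 31 - ref_clz(n)
end

-- Positions of all set bits, lowest first.
local function set_bit_positions(n)
    local positions = {}
    while n ~= 0 do
        local pos = ref_ctz(n)
        positions[#positions + 1] = pos
        n = n - 2 ^ pos -- clear the lowest set bit
    end
    return positions
end
```

For instance, set_bit_positions(0x29) yields the positions 0, 3 and 5, and ilog2(1000) is 9. Both loops fall through to 32 when the input is 0, matching the stated result range.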

Non-normative: a proof-of-concept implementation shows that a polyfill for clz takes ~34 ns per loop iteration when computing clz for an increasing number sequence, whereas a builtin implementation takes ~4 ns.

Drawbacks

None known.

Alternatives

These functions can alternatively be specified as "find the position of the most/least significant bit set" (e.g. "ffs"/"fls" for "find first set"/"find last set"). This formulation can be more immediately useful, since the bit position is usually more important than the number of bits. However, the bit position is undefined when the input number is zero: returning a sentinel such as -1 seems non-idiomatic, and returning nil seems awkward for calling code. Counting functions don't have this problem.

Of the two functions, clz is vastly more useful than ctz; we could implement just clz, but having both is nice for symmetry.