Create function-bit32-clz-ctz.md

2025-05-04 10:33:46 +01:00 · 2021-10-26 16:46:29 -07:00 · 2021-10-26 16:46:29 -07:00 · 68ec52a17c
commit 68ec52a17c
parent 1ec7be600c
1 changed file with 49 additions and 0 deletions
--- a/rfcs/function-bit32-clz-ctz.md
+++ b/rfcs/function-bit32-clz-ctz.md
@@ -0,0 +1,49 @@
+# bit32.clz/ctz
+
+## Summary
+
+Add `bit32.clz` (count leading zeroes) and `bit32.ctz` (count trailing zeroes) to accelerate bit scanning.
+
+## Motivation
+
+Most modern CPUs have instructions to determine the position of the first or last set bit in an integer. These instructions have a variety of uses, the most popular being:
+
+- Fast implementation of integer logarithm (computing `floor(log2(value))` quickly)
+- Scanning set bits in an integer, which allows efficient traversal of compact bitmap representations
+- Allocating bits out of a bitmap quickly
+
+Today it's possible to approximate `clz` using `floor` and `log`, but this approximation is relatively slow; approximating `ctz` is difficult without iterating through each bit.
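+
+As a rough sketch of that approximation, a `floor`/`log` polyfill might look like the following. The name `clz_approx` and the correction step are illustrative assumptions, not part of this proposal. The sketch assumes the input is already an integer in `[0, 2^32)`, and it re-checks the estimate because a floating-point logarithm can land on the wrong side of an exact power of two:
+
+```lua
+-- Hypothetical polyfill: approximate clz with floor and log.
+-- Assumes n is an integer in [0, 2^32); no bit32-style conversion is done.
+local function clz_approx(n)
+    if n == 0 then
+        return 32 -- no set bit at all: every bit is a leading zero
+    end
+    -- Estimate the index of the highest set bit, floor(log2(n)).
+    local k = math.floor(math.log(n) / math.log(2))
+    -- Correct the estimate in case rounding pushed it off by one.
+    if 2^k > n then
+        k = k - 1
+    elseif 2^(k + 1) <= n then
+        k = k + 1
+    end
+    return 31 - k
+end
+
+print(clz_approx(1))   -- 31
+print(clz_approx(255)) -- 24
+```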
+
+## Design
+
+The `bit32` library will gain two new functions, `clz` and `ctz`:
+
+```
+function bit32.clz(n: number): number
+function bit32.ctz(n: number): number
+```
+
+`clz` takes an integer number (converting the input number to a 32-bit unsigned integer as all other `bit32` functions do) and returns the number of leading zero bits, that is,
+the number of most significant zero bits in a 32-bit number before the first 1. The result is in the `[0, 32]` range.
+
+For example, when the input number is `0`, the result is `32`. When the input number is `2^k`, the result is `31-k`.
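+
+To make these semantics concrete, here is a minimal reference sketch in plain arithmetic. The names `clz_ref` and `ilog2` are hypothetical, and the input is assumed to already be an integer in `[0, 2^32)`; a builtin would perform the usual `bit32` conversion itself and would not need a loop.
+
+```lua
+-- Reference sketch of the clz semantics (not the proposed implementation).
+-- Assumes n is an integer in [0, 2^32).
+local function clz_ref(n)
+    if n == 0 then
+        return 32
+    end
+    local count = 0
+    -- Double n until bit 31 is set; each doubling accounts for one leading zero.
+    while n < 2147483648 do -- 2^31
+        n = n * 2
+        count = count + 1
+    end
+    return count
+end
+
+-- Integer logarithm for n > 0: the index of the highest set bit.
+local function ilog2(n)
+    return 31 - clz_ref(n)
+end
+
+print(clz_ref(0))      -- 32
+print(clz_ref(0xFFFF)) -- 16
+print(ilog2(1000))     -- 9, since 2^9 <= 1000 < 2^10
+```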
+
+`ctz` takes an integer number (converting the input number to a 32-bit unsigned integer as all other `bit32` functions do) and returns the number of trailing zero bits, that is,
+the number of least significant zero bits in a 32-bit number before the first 1. The result is in the `[0, 32]` range.
+
+For example, when the input number is `0`, the result is `32`. When the input number is `2^k`, the result is `k`.
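+
+The sketch below illustrates the `ctz` semantics and the set-bit traversal use case. The names `ctz_ref` and `set_bits` are hypothetical, the input is assumed to already be an integer in `[0, 2^32)`, and the per-bit loop is exactly the slow path a builtin is meant to replace.
+
+```lua
+-- Reference sketch of the ctz semantics (not the proposed implementation).
+-- Assumes n is an integer in [0, 2^32).
+local function ctz_ref(n)
+    if n == 0 then
+        return 32
+    end
+    local count = 0
+    -- Halve n while its lowest bit is clear; each halving is one trailing zero.
+    while n % 2 == 0 do
+        n = n / 2
+        count = count + 1
+    end
+    return count
+end
+
+-- Collect the positions of all set bits, lowest first.
+local function set_bits(mask)
+    local positions = {}
+    local base = 0 -- number of low bits already consumed
+    while mask ~= 0 do
+        local k = ctz_ref(mask)
+        positions[#positions + 1] = base + k
+        -- Drop the reported bit together with the zeros below it.
+        mask = (mask / 2^k - 1) / 2
+        base = base + k + 1
+    end
+    return positions
+end
+
+print(ctz_ref(8))                            -- 3
+print(table.concat(set_bits(0x92), ", "))    -- 1, 4, 7
+```
+
+The arithmetic update keeps the sketch free of other library calls; where `bit32.band` is available, `bit32.band(mask, mask - 1)` clears the lowest set bit in one step instead.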
+
+> Non-normative: a proof-of-concept implementation shows that a polyfill for `clz` takes ~34 ns per loop iteration when computing `clz` for an increasing number sequence, whereas
+> a builtin implementation takes ~4 ns.
+
+## Drawbacks
+
+None known.
+
+## Alternatives
+
+These functions could alternatively be specified as "find the position of the most/least significant set bit" (e.g. "ffs"/"fls" for "find first set"/"find last set"). This formulation
+can be more immediately useful, since the bit position is usually more important than the number of bits. However, the bit position is undefined when the input number is zero:
+returning a sentinel such as -1 seems non-idiomatic, and returning `nil` seems awkward for calling code. Counting functions don't have this problem.
+
+Of the two functions, `clz` is vastly more useful than `ctz`; we could implement just `clz`, but having both is nice for symmetry.