mirror of
https://github.com/luau-lang/luau.git
synced 2025-05-04 10:33:46 +01:00
Bit more draft RFC about type normalization
This commit is contained in:
parent
a48c2e17cd
commit
bfe07ac261
1 changed files with 65 additions and 12 deletions
|
@ -134,7 +134,7 @@ The idea here is to think of types as sets of values, where intersection and uni
|
|||
|
||||
Values can be thought of as trees, for example a table has children given by the properties, so types are sets of trees such as
|
||||
```
|
||||
o o
|
||||
table table
|
||||
/ \ / \
|
||||
{ p q , p q , ... }
|
||||
↓ ↓ ↓ ↓
|
||||
|
@ -145,31 +145,84 @@ is the set of trees corresponding to `{ p : boolean , q : number }`.
|
|||
|
||||
In semantic subtyping, the subtyping replation is interpreted as subset order.
|
||||
|
||||
To make subset inclusion tractable, we can make use of finite state automata.
|
||||
|
||||
In the same way that sets of strings are the languages of automata, sets
|
||||
of trees are the languages of *tree automata*. Many of the same
|
||||
techniques, such as minimization, construction of union and
|
||||
intersection, etc, apply to tree automata. The main difference is that
|
||||
nondeterministic top-down tree automata are strictly more powerful than derministic ones.
|
||||
|
||||
Tree automata are given by rules of the form `q0 -> f(q1, ..., qN)` where
|
||||
Tree automata have:
|
||||
|
||||
* each `qI` is a state of the automaton, and
|
||||
* `f` is a symbol used to label a tree node with `N` children.
|
||||
* a set of states `Q`,
|
||||
* an initial state `q0`, and
|
||||
* transitions of the form `q0 -> f(p1 = q1, ..., pN = qN)` (where `f` is a function symbol, `pI` are distinct property names, and `qI` are states).
|
||||
|
||||
for example, the automaton for `{ p : boolean, q : number }` has initial state `q0` and transitions:
|
||||
(Luau has namned properties rather than positional arguments, but otherwise this is standard).
|
||||
|
||||
Draw automata where the transition `q0 -> f(p1 = q1, ..., pN = qN)` as
|
||||
```
|
||||
q0 -> { p = q1, q = q2 }
|
||||
q1 -> true
|
||||
q1 -> false
|
||||
q2 -> n for any number n
|
||||
◇
|
||||
|
|
||||
f
|
||||
↓
|
||||
.--□--.
|
||||
| |
|
||||
p1 ... pN
|
||||
↓ ↓
|
||||
◇ ◇
|
||||
```
|
||||
|
||||
Tree automata are closed under union and intersection, in the same way that string automata are.
|
||||
For example, the tree automaton for `{ p: boolean, q: number }` is:
|
||||
```
|
||||
◇
|
||||
|
|
||||
table
|
||||
↓
|
||||
.--□--.
|
||||
/ \
|
||||
p q
|
||||
↓ ↓
|
||||
◇ ◇
|
||||
/ \ |
|
||||
true false n
|
||||
↓ ↓ ↓
|
||||
□ □ □
|
||||
```
|
||||
|
||||
A relation `R` on states is a simulation whenever
|
||||
|
||||
* if `(q, r) ∈ R` and `q -> f(p1=q1, ..., pM=qM)'` then `r -> f(p1=r1,...,pN=rN)'` and for every `qI`, `(qI, rI) ∈ R`.
|
||||
|
||||
A relation is a bisimulation if both it and its inverse are simulations.
|
||||
|
||||
[Partition refinement](https://en.wikipedia.org/wiki/Partition_refinement)
|
||||
computes bisimulation on NFAs, which coincides with language
|
||||
equivalence on DFAs. It can be adapted to compute simulation. We can
|
||||
use it to compute a partition of types in the type graph, up to
|
||||
semantic equivalence of types.
|
||||
|
||||
Once we have a partition, there is a question of which representative
|
||||
to pick. We can read off the minimal type from the partitioned type
|
||||
graph, but this has lost any names given to types by type aliases.
|
||||
We could use any named type from the partition if there is one, and otherwise
|
||||
synthesize a minimal type.
|
||||
|
||||
Related work:
|
||||
|
||||
* XDuce: original work on types-as-tree-automata
|
||||
* CDuce: home of semantic subtyping
|
||||
* XML Schema, XPath, JSON Schema...: schema and query languages based on tree automata.
|
||||
|
||||
Advantages: Computes minimal types; has a sound foundation.
|
||||
|
||||
Disadvantages: complex; the size benefits of minimal types may be overwhelmed by the time to run partitioning.
|
||||
|
||||
## Drawbacks
|
||||
|
||||
Why should we *not* do this?
|
||||
The cost of computing minimal types may be too much.
|
||||
|
||||
## Alternatives
|
||||
|
||||
What other designs have been considered? What is the impact of not doing this?
|
||||
This is a draft RFC outlining the alternatives.
|
||||
|
|
Loading…
Add table
Reference in a new issue