mirror of
https://github.com/luau-lang/luau.git
synced 2024-12-12 21:10:37 +00:00
RFC for new non-strict mode (#1037)
This commit is contained in:
parent
36e0e64715
commit
ffc28da331
1 changed files with 340 additions and 0 deletions
340
rfcs/new-nonstrict.md
Normal file
340
rfcs/new-nonstrict.md
Normal file
|
@ -0,0 +1,340 @@
|
||||||
|
# New non-strict mode
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Currently, strict mode and non-strict mode infer different types for
|
||||||
|
the same program. With this feature, strict and non-strict modes will
|
||||||
|
share the [local type inference](local-type-inference.md)
|
||||||
|
engine, and the only difference between the modes will be in which
|
||||||
|
errors are reported.
|
||||||
|
|
||||||
|
## Motivation
|
||||||
|
|
||||||
|
Having two different type inference engines is unnecessarily
|
||||||
|
confusing, and can result in unexpected behaviors such as changing the
|
||||||
|
mode of a module can cause errors in the users of that module.
|
||||||
|
|
||||||
|
The current non-strict mode infers very coarse types (e.g. all local
|
||||||
|
variables have type `any`) and so is not appropriate for type-driven
|
||||||
|
tooling, which results in expensively and inconsistently using
|
||||||
|
different modes for different tools.
|
||||||
|
|
||||||
|
## Design
|
||||||
|
|
||||||
|
### Code defects
|
||||||
|
|
||||||
|
The main goal of non-strict mode is to minimize false positives, that
|
||||||
|
is if non-strict mode reports an error, then we have high confidence
|
||||||
|
that there is a code defect. Example defects are:
|
||||||
|
|
||||||
|
* Run-time errors
|
||||||
|
* Dead code
|
||||||
|
* Using an expression whose only possible value is `nil`
|
||||||
|
* Writing to a table property that is never read
|
||||||
|
|
||||||
|
*Run-time errors*: this is an obvious defect. Examples include:
|
||||||
|
|
||||||
|
* Built-in operators (`"hi" + 5`)
|
||||||
|
* Luau APIs (`math.abs("hi")`)
|
||||||
|
* Function calls from embeddings (`CFrame.new("hi")`)
|
||||||
|
* Missing properties from embeddings (`CFrame.new().nope`)
|
||||||
|
|
||||||
|
Detecting run-time errors is undecidable, for example
|
||||||
|
|
||||||
|
```lua
|
||||||
|
if cond() then
|
||||||
|
math.abs(“hi”)
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
It is undecidable whether this code produces a run-time error, but we
|
||||||
|
do know that if `math.abs("hi")` is executed, it will produce a
|
||||||
|
run-time error, and so report a type error in this case.
|
||||||
|
|
||||||
|
*Expressions guaranteed to be `nil`*: Luau tables do not error when a
|
||||||
|
missing property is accessed (though embeddings may). So something
|
||||||
|
like
|
||||||
|
|
||||||
|
```lua
|
||||||
|
local t = { Foo = 5 }
|
||||||
|
local x = t.Fop
|
||||||
|
```
|
||||||
|
|
||||||
|
won’t produce a run-time error, but is more likely than not a
|
||||||
|
programmer error. In this case, if the programmer intent was to
|
||||||
|
initialize `x` as `nil`, they could have written
|
||||||
|
|
||||||
|
```lua
|
||||||
|
local t = { Foo = 5 }
|
||||||
|
local x = nil
|
||||||
|
```
|
||||||
|
|
||||||
|
For this reason, we consider it a code defect to use a value that the
|
||||||
|
type system guarantees is of type `nil`.
|
||||||
|
|
||||||
|
*Writing properties that are never read*: There is a matching problem
|
||||||
|
with misspelling properties when writing. For example
|
||||||
|
|
||||||
|
```lua
|
||||||
|
function f()
|
||||||
|
local t = {}
|
||||||
|
t.Foo = 5
|
||||||
|
t.Fop = 7
|
||||||
|
print(t.Foo)
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
won’t produce a run-time error, but is more likely than not a
|
||||||
|
programmer error, since `t.Fop` is written but never read. We can use
|
||||||
|
read-only and write-only table properties for this, and make it an
|
||||||
|
error to create a write-only property.
|
||||||
|
|
||||||
|
We have to be careful about this though, because if `f` ended with
|
||||||
|
`return t`, then it would be a perfectly sensible function with type
|
||||||
|
`() -> { Foo: number, Fop: number }`. The only way to detect that `Fop`
|
||||||
|
was never read would be whole-program analysis, which is prohibitively
|
||||||
|
expensive.
|
||||||
|
|
||||||
|
*Implicit coercions*: Luau supports various implicit coercions, such
|
||||||
|
as allowing `math.abs("-12")`. These should be reported as defects.
|
||||||
|
|
||||||
|
### New Non-strict error reporting
|
||||||
|
|
||||||
|
The difficult part of non-strict mode error-reporting is detecting
|
||||||
|
guaranteed run-time errors. We can do this using an error-reporting
|
||||||
|
pass that generates a type context such that if any of the `x : T` in
|
||||||
|
the type context are satisfied, then the program is guaranteed to
|
||||||
|
produce a type error.
|
||||||
|
|
||||||
|
For example in the program
|
||||||
|
|
||||||
|
```lua
|
||||||
|
function h(x, y)
|
||||||
|
math.abs(x)
|
||||||
|
string.lower(y)
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
an error is reported when `x` isn’t a `number`, or `y` isn’t a `string`, so the generated context is
|
||||||
|
|
||||||
|
```
|
||||||
|
x : ~number
|
||||||
|
y : ~string
|
||||||
|
```
|
||||||
|
|
||||||
|
In the function:
|
||||||
|
|
||||||
|
```lua
|
||||||
|
function f(x)
|
||||||
|
math.abs(x)
|
||||||
|
string.lower(x)
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
an error is reported when x isn’t a number or isn’t a string, so the constraint set is
|
||||||
|
|
||||||
|
```
|
||||||
|
x : ~number | ~string
|
||||||
|
```
|
||||||
|
|
||||||
|
Since `~number | ~string` is equivalent to `unknown`, non-strict mode
|
||||||
|
can report a warning, since calling the function is guaranteed to
|
||||||
|
throw a run-time error. In contrast:
|
||||||
|
|
||||||
|
```lua
|
||||||
|
function g(x)
|
||||||
|
if cond() then
|
||||||
|
math.abs(x)
|
||||||
|
else
|
||||||
|
string.lower(x)
|
||||||
|
end
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
generates context
|
||||||
|
|
||||||
|
```
|
||||||
|
x : ~number & ~string
|
||||||
|
```
|
||||||
|
|
||||||
|
Since `~number & ~string` is not equivalent to `unknown`, non-strict mode reports no warning.
|
||||||
|
|
||||||
|
* The disjunction of contexts `C1` and `C2` contains `x : T1|T2`,
|
||||||
|
where `x : T1` is in `C1` and `x : T2` is in `C2`.
|
||||||
|
* The conjunction of contexts `C1` and `C2` contains `x : T1&T2`,
|
||||||
|
where `x : T1` is in `C1` and `x : T2` is in `C2`.
|
||||||
|
|
||||||
|
The context generated by a block is:
|
||||||
|
|
||||||
|
* `x = E` generates the context of `E : never`.
|
||||||
|
* `B1; B2` generates the disjunction of the context of `B1` and the
|
||||||
|
context of `B2`.
|
||||||
|
* `if C then B1 else B2` end generates the conjunction of the context
|
||||||
|
of `B1` and the context of `B2`.
|
||||||
|
* `local x; B` generates the context of `B`, removing the constraint
|
||||||
|
`x : T`. If the type inferred for `x` is a subtype of `T`, then
|
||||||
|
issue a warning.
|
||||||
|
* `function f(x1,...,xN) B end` generates the context for `B`,
|
||||||
|
removing `x1 : T1, ..., xN : TN`. If any of the `Ti` are equivalent to
|
||||||
|
`unknown`, then issue a warning.
|
||||||
|
|
||||||
|
The constraint set generated by a typed expression is:
|
||||||
|
|
||||||
|
* The context generated by `x : T` is `x : T`.
|
||||||
|
* The context generated by `s : T` (where `s` is a scalar) is
|
||||||
|
trivial. Issue a warning if `s` has type `T`.
|
||||||
|
* The context generated by `F(M1, ..., MN) : T` is the disjunction of
|
||||||
|
the contexts generated by `F : ~function`, and by
|
||||||
|
`M1 : T1`, ...,`MN : TN` where for each `i`, `F` has an overload
|
||||||
|
`(unknown^(i-1),Ti,unknown^(N-i)) -> error`. (Pick `Ti` to be
|
||||||
|
`never` if no such overload exists). Issue a warning if `F` has an
|
||||||
|
overload `(unknown^N) -> S` where `S <: (T | error)`.
|
||||||
|
* The context generated by `M.p` is the context generated by `M : ~table`.
|
||||||
|
* The context generated by `M[N]` is the disjunction of the contexts
|
||||||
|
generated by `M : ~table` and `N : never`.
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
* The context generated by `math.abs("hi") : never` is
|
||||||
|
* the context generated by `"hi" : ~number`, since `math.abs` has an
|
||||||
|
overload `(~number) -> error`, which is trivial.
|
||||||
|
* A warning is issued since `"hi"` has type `~number`.
|
||||||
|
* The context generated by `function f(x) math.abs(x); string.lower(x) end` is
|
||||||
|
* the context generated by `math.abs(x); string.lower(x)` which is the disjunction of
|
||||||
|
* the context generated by `math.abs(x)`, which is
|
||||||
|
* the context `x : ~number`, since `math.abs` has an overload `(~number)->error`
|
||||||
|
* the context generated by `string.lower(x)`, which is
|
||||||
|
* the context `x : ~string`, since `string.lower` has an overload `(~string)->error`
|
||||||
|
* remove the binding `x : (~number | ~string)`
|
||||||
|
* A warning is issued since `(~number | ~string)` is equivalent to `unknown`.
|
||||||
|
* The context generated by `math.abs(string.lower(x))` is
|
||||||
|
* the context generated by `string.lower(x) : ~number`, since `math.abs` has an overload `(~number)->error`, which is
|
||||||
|
* the text`x : ~string`, since `string.lower` has an overload `(~string)->error`.
|
||||||
|
* An warning is issued, since `string.lower` has an overload `(unknown) -> (string | error)` and `(string | error) <: (~number | error)`.
|
||||||
|
|
||||||
|
### Ergonomics
|
||||||
|
|
||||||
|
*Error reporting*. A straightforward implementation of this design
|
||||||
|
issues warnings at the point that data flows into a place
|
||||||
|
guaranteed to later produce a run-time error, which may not be perfect
|
||||||
|
ergonomics. For example, in the program:
|
||||||
|
|
||||||
|
```lua
|
||||||
|
local x
|
||||||
|
if cond() then
|
||||||
|
x = 5
|
||||||
|
else
|
||||||
|
x = nil
|
||||||
|
end
|
||||||
|
string.lower(x)
|
||||||
|
```
|
||||||
|
|
||||||
|
the type inferred for `x` is `number?` and the context generated is `x
|
||||||
|
: ~string`. Since `number? <: ~string`, a warning is issued at the
|
||||||
|
declaration `local x`. For ergonomics, we might want to identify
|
||||||
|
either `string.lower(x)` or `x = 5` (or both!) in the error report.
|
||||||
|
|
||||||
|
*Stringifying checked functions*. This design depends on functions
|
||||||
|
having overloads with `error` return type. This integrates with
|
||||||
|
[type error suppression](type-error-suppression.md), but would not be
|
||||||
|
a perfect way to present types to users. A common case is that the
|
||||||
|
checked type is the negation of the function type, for example the
|
||||||
|
type of `math.abs`:
|
||||||
|
|
||||||
|
```
|
||||||
|
(number) -> number & (~number) -> error
|
||||||
|
```
|
||||||
|
|
||||||
|
This might be better presented as an annotation on the argument type, something like:
|
||||||
|
|
||||||
|
```
|
||||||
|
@checked (number) -> number
|
||||||
|
```
|
||||||
|
|
||||||
|
The type
|
||||||
|
|
||||||
|
```
|
||||||
|
@checked (S1,...,SN) -> T
|
||||||
|
```
|
||||||
|
|
||||||
|
is equivalent to
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
(S1,...,SN) -> T
|
||||||
|
& (~S1, unknown^N-1) -> error
|
||||||
|
& (unknown, ~S2, unknown^N-2) -> error
|
||||||
|
...
|
||||||
|
& (unknown^N-1, SN) -> error
|
||||||
|
```
|
||||||
|
|
||||||
|
As a further extension, we might allow users to explicitly provide `@checked` type annotations.
|
||||||
|
|
||||||
|
Checked functions are known as strong functions in Elixir.
|
||||||
|
|
||||||
|
## Drawbacks
|
||||||
|
|
||||||
|
This is a breaking change, since it results in more errors being issued.
|
||||||
|
|
||||||
|
Strict mode infers more precise (and hence more complex) types than
|
||||||
|
current non-strict mode, which are presented by type error messages
|
||||||
|
and tools such as type hover.
|
||||||
|
|
||||||
|
## Alternatives
|
||||||
|
|
||||||
|
Success typing (used in Erlang Dialyzer) is the nearest existing
|
||||||
|
solution. It is similar to this design, except that it only works in
|
||||||
|
(the equivalent of) non-strict mode. The success typing function type
|
||||||
|
`(S)->T` is the equivalent of our
|
||||||
|
`(~S)->error & (unknown)->(T|error)`.
|
||||||
|
|
||||||
|
We could put the `@checked` annotation on individual function argments
|
||||||
|
rather than the function type.
|
||||||
|
|
||||||
|
We could use this design to infer checked functions. In function
|
||||||
|
`f(x1, ..., xN) B end`, we could generate the context
|
||||||
|
`(x1 : T1, ..., xN : TN)` for `B`, and add an overload
|
||||||
|
`(unknown^(i-1),Ti,unknown^(N-i))->error` to the inferred type of `f`. For
|
||||||
|
example, for the function
|
||||||
|
|
||||||
|
```lua
|
||||||
|
function h(x, y)
|
||||||
|
math.abs(x)
|
||||||
|
string.lower(y)
|
||||||
|
end
|
||||||
|
```
|
||||||
|
|
||||||
|
We would infer type
|
||||||
|
|
||||||
|
```
|
||||||
|
(number, string) -> ()
|
||||||
|
& (~number, unknown) -> error
|
||||||
|
& (unknown, ~string) -> error
|
||||||
|
```
|
||||||
|
|
||||||
|
which is the same as
|
||||||
|
|
||||||
|
```
|
||||||
|
@checked (number, string) -> ()
|
||||||
|
```
|
||||||
|
|
||||||
|
The problem with doing this is what to do about recursive functions.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
Lily Brown, Andy Friesen and Alan Jeffrey
|
||||||
|
*Position Paper: Goals of the Luau Type System*,
|
||||||
|
in HATRA '21: Human Aspects of Types and Reasoning Assistants,
|
||||||
|
2021.
|
||||||
|
https://doi.org/10.48550/arXiv.2109.11397
|
||||||
|
|
||||||
|
Giuseppe Castagna, Guillaume Duboc, José Valim
|
||||||
|
*The Design Principles of the Elixir Type System*,
|
||||||
|
2023.
|
||||||
|
https://doi.org/10.48550/arXiv.2306.06391
|
||||||
|
|
||||||
|
Tobias Lindahl and Konstantinos Sagonas,
|
||||||
|
*Practical Type Inference Based on Success Typings*,
|
||||||
|
in PPDP '06: Principles and Practice of Declarative Programming,
|
||||||
|
pp. 167–178, 2006.
|
||||||
|
https://doi.org/10.1145/1140335.1140356
|
Loading…
Reference in a new issue