Implement RFC: 2-component vector constructor. This includes 2-component
overload for `vector.create` and associated fastcall function, and its
type definition. These features are controlled by a new feature flag
`LuauVector2Constructor`. Additionally constant folding now supports two
components when `LuauVector2Constants` feature flag is set.
Note: this work does not include changes to CodeGen. Thus calls to
`vector.create` with only two arguments are not natively compiled
currently. This is left for future work.
To implement math.lerp without branches, we add SELECT_NUM which
selects one of the two inputs based on the comparison condition.
For simplicity, we only support C == D for now; this can be extended to
a more generic version with a IrCondition operand E, but that requires
more work on the SSE side (to flip the comparison for some conditions
like Greater, and expose more generic vcmpsd).
Note: On AArch64 this will effectively result in a change in floating
point
behavior between native code and non-native code: clang synthesizes
fmadd (because floating point contraction is allowed by default, and the
arch always has the instruction), whereas this change will use
fmul+fadd.
I am not sure if this is good or bad, and if this is a problem in C or
not.
Specifically, clang's behavior results in different results between X64
and AArch64 when *not* using codegen, and with this change the behavior
when using codegen is... the same? :)
Fixing this will require either using LERP_NUM instead and hand-coding
lowering, or exposing some sort of "quasi" MADD_NUM (which would
lower to fma on AArch64 and mul+add on X64).
A small benefit to the current approach is `lerp(1, 5, t)`
constant-folds the
subtraction. With LERP_NUM this optimization will need to be implemented
manually as a partial constant-folding for LERP_NUM.
A similar problem exists today for vector.cross & vector.dot. So maybe
this
is not something we need to fix, unsure.
# General
All code has been re-formatted by `clang-format`; this is not
mechanically enforced, so Luau may go out-of-sync over the course of the
year.
# New Solver
* Track free types interior to a block of code on `Scope`, which should
reduce the number of free types that remain un-generalized after type
checking is complete (e.g.: less errors like `'a <: number is
incompatible with number`).
# Autocomplete
* Fragment autocomplete now does *not* provide suggestions within
comments (matching non-fragment autocomplete behavior).
* Autocomplete now respects iteration and recursion limits (some hangs
will now early exit with a "unification too complex error," some crashes
will now become internal complier exceptions).
# Runtime
* Add a limit to how many Luau codegen slot nodes addresses can be in
use at the same time (fixes#1605, fixes#1558).
* Added constant folding for vector arithmetic (fixes#1553).
* Added support for `buffer.readbits` and `buffer.writebits` (see:
https://github.com/luau-lang/rfcs/pull/18).
---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
This change implements math.lerp RFC with C function definition, builtin
function, builtin constant folding and tests.
The tests validate a few lerp properties by providing counter-examples
for popular lerp implementations; the testing is of course not
exhaustive, as exhaustive testing was done offline using fuzzing.
Type definitions will be updated separately.
Codegen support will be implemented separately: it requires new IR for
conditional
selects to represent the desired logic without using a branch.
This PR refactors the CLI folder to use the same project split between
include and src directories that we have for all the other artifacts in
luau. It also includes the require-by-string implementation we already
have as a feature of `Luau.CLI.lib`. Both of these changes are targeted
at making it easier for embedding projects to setup an effective
equivalent to the standalone `luau` executable with whatever runtime
libraries they need attached and without having to unnecessarily
duplicate code from luau itself.
## New Solver
* Type functions should be able to signal whether or not irreducibility
is due to an error
* Do not generate extra expansion constraint for uninvoked user-defined
type functions
* Print in a user-defined type function reports as an error instead of
logging to stdout
* Many e-graphs bugfixes and performance improvements
* Many general bugfixes and improvements to the new solver as a whole
* Fixed issue with used-defined type functions not being able to call
each other
* Infer types of globals under new type solver
## Fragment Autocomplete
* Miscellaneous fixes to make interop with the old solver better
## Runtime
* Support disabling specific built-in functions from being fast-called
or constant-evaluated (Closes#1538)
* New compiler option `disabledBuiltins` accepts a list of library
function names like "tonumber" or "math.cos"
* Added constant folding for vector arithmetic
* Added constant propagation and type inference for vector globals
(Fixes#1511)
* New compiler option `librariesWithKnownMembers` accepts a list of
libraries for members of which a request for constant value and/or type
will be made
* `libraryMemberTypeCb` callback is called to get the type of a global,
return one of the `LuauBytecodeType` values. 'boolean', 'number',
'string' and 'vector' type are supported.
* `libraryMemberConstantCb` callback is called to setup the constant
value of a global. To set a value, C API `luau_set_compile_constant_*`
or C++ API `setCompileConstant*` functions should be used.
---
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Daniel Angel <danielangel@roblox.com>
Co-authored-by: Jonathan Kelaty <jkelaty@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Alexander McCord <amccord@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Lily Brown <lbrown@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Junseo Yoo <jyoo@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <61795485+vrn-sn@users.noreply.github.com>
Co-authored-by: Alexander Youngblood <ayoungblood@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Andrew Miranti <amiranti@roblox.com>
Co-authored-by: Shiqi Ai <sai@roblox.com>
Co-authored-by: Yohoo Lin <yohoo@roblox.com>
Co-authored-by: Daniel Angel <danielangel@roblox.com>
Co-authored-by: Jonathan Kelaty <jkelaty@roblox.com>
* General
- Fix the benchmark require wrapper function to work in Lua
- Fix memory leak in the new Luau C API test
* New Solver
- Luau: type functions should be able to signal whether or not irreducibility is due to an error
- Do not generate extra expansion constraint for uninvoked user-defined type functions
- Print in a user-defined type function should be reported as an error
instead of logging to stdout
- Many e-graphs bugfixes and performance improvements
- Many general bugfixes and improvements to the new solver as a whole
- Fixed issue with Luau used-defined type functions not having all environments initialized
- Infer types of globals under new type solver
* Fragment Autocomplete
- Miscellaneous fixes to make interop with the old solver better
* Runtime
- Support disabling specific Luau built-in functions from being
fast-called or constant-evaluated
- Added constant folding for vector arithmetic
- Added constant propagation and type inference for Vector3 globals
----------------------------------------------------------
9 contributors:
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Daniel Angel <danielangel@roblox.com>
Co-authored-by: Jonathan Kelaty <jkelaty@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
# What's Changed
* Support dead store elimination for `STORE_VECTOR` instruction
* Fix parser hang when a separator is used between Luau class
declaration properties
* Provide properties and metatable for built-in vector type definition
to fix type errors
* Fix Fragment Autocomplete to ensure correct parentheses insertion
behavior.
* Add support for 'thread' and 'buffer' primitive types in user-defined
type functions
---------
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
* Luau: support dead store elimination for STORE_VECTOR instruction
* Fixed hang when Luau class declaration props are incorrectly separated
* Provide properties and a metatable for Luau built-in vector type
* Pick the correct global scope based on the solver
* Conversational AI gets all required scripts as context
* Clip LuauRequireCyclesDontAlwaysReturnAny
* Fix Parentheses in Fragment Autocomplete
* Remove write-only locals in `Luau::getDocumentOffsets`
* The lexer can resume parsing from any arbitrary position
* Added support for 'thread' and 'buffer' primitive types in Luau user-defined type functions
---------
Co-authored-by: Andrew Miranti <amiranti@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Shiqi Ai <sai@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
Co-authored-by: Yohoo Lin <yohoo@roblox.com>
In all other places, L->top is extracted to a local when writing to
stack; this helps compilers without TBAA (MSVC) to not reload L->top
redundantly.
Also assert that we do in fact have 2 slots of stack space (which we
do).
This change folds:
a * 1 => a
a / 1 => a
a * -1 => -a
a / -1 => -a
a * 2 => a + a
a / 2^k => a * 2^-k
a - 0 => a
a + (-0) => a
Note that the following folds are all invalid:
a + 0 => a (breaks for negative zero)
a - (-0) => a (breaks for negative zero)
a - a => 0 (breaks for Inf/NaN)
0 - a => -a (breaks for negative zero)
Various cases of UNM_NUM could be optimized (eg (-a) * (-b) = a * b),
but that doesn't happen in benchmarks.
While it would be possible to also fold inverse multiplications (k * v),
these do not happen in benchmarks and rarely happen in bytecode due
to type based optimizations. Maybe this can be improved with some sort
of
IR canonicalization in the future if necessary.
I've considered moving some of these, like division strength reduction,
to IR translation (as this is where POW is lowered presently) but it
didn't
seem better one way or the other.
This change improves performance on some benchmarks, e.g. trig and
voxelgen,
and should be a strict uplift as it never generates more instructions or
longer
latency chains. On Apple M2, without division->multiplication
optimization, both
benchmarks see 0.1-0.2% uplift. Division optimization makes trig 3%
faster; I expect
the gains on X64 will be more muted, but on Apple this seems to allow
loop iterations
to overlap better by removing the division bottleneck.
## What's Changed?
* Optimized the vector dot product by up to 24%
* Allow for x/y/z/X/Y/Z vector field access by registering a `vector`
metatable
with an `__index` method (Fixes#1521)
* Fixed a bug preventing consistent recovery from parse errors in table
types.
* Optimized `k*n` and `k+n` when types are known
* Allow fragment autocomplete to handle cases like the automatic
insertion of
parens, keywords, strings, etc., while maintaining a correct relative
positioning
### New Solver
* Allow for `nil` assignment to tables and classes with indexers
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
## What's Changed?
* Optimized the vector dot product by up to 24%
* Allow for x/y/z/X/Y/Z vector field access by registering a `vector` metatable
with an `__index` method
* Fixed a bug preventing consistent recovery from parse errors in table types.
* Optimized `k*n` and `k+n` when types are known
* Allow fragment autocomplete to handle cases like the automatic insertion of
parens, keywords, strings, etc., while maintaining a correct relative positioning
### New Solver
* Added support for 'thread' and 'buffer' primitive types in Luau user-defined
type functions
* Allow for `nil` assignment to tables and classes with indexers
---------
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Aviral Goel <agoel@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
When type information is specified, we can compile k*n and k+n into
MULK/ADDK forms that are faster to execute, as long as we think n is a
number. Since we generally restrict type aware optimizations to O2, this
does that as well.
This makes trig benchmark ~4% faster on Apple M2 in VM, and also a tiny
improvement on scimark (~0.1%) can be observed. The optimization only
affects interpreted execution, as NCG already can synthesize optimal
code here.
If the type information is not truthful (e.g. user annotates type as a
number and it's not), the worst case scenario is flipped arguments to
metamethods like __add/__mul for constant left hand side.
Fixes#626 (the fix requires type information or NCG but I doubt any
further work on this is warranted)
---------
Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
### Problem
In release 0.652, `RequireResolver` was refactored to add support for
`luau-analyze`.
As part of this update, `RuntimeRequireContext` introduced a new
convention where a file's chunkname must be prefixed with `@` (e.g.,
`@./some/path.luau`). This change applies to all chunknames generated
within `RuntimeRequireContext`. However, when a `.luau` file is executed
directly from the command line (e.g., `luau ./my/script.luau`), the
chunkname is still generated with the old `=` prefix (e.g.,
`=./some/path.luau`).
Since `RuntimeRequireContext` no longer recognizes chunknames prefixed
with `=`, any attempt to directly execute a `.luau` file from the
command line fails. For example, running `luau ./my/script.luau` results
in an error stating that the context is unsupported. [This issue also
affects tools like the benchmark
runner](https://github.com/luau-lang/luau/pull/1525#issuecomment-2480454018),
which rely on direct file execution.
### Solution
Update `runFile` to replace the `=` prefix in generated chunknames with
`@`.
## What's new?
* Add support for mixed-mode type checking, which allows modules checked
in the old type solver to be checked and autocompleted by the new one.
* Generalize `RequireResolver` to support require-by-string semantics in
`luau-analyze`.
* Fix a bug in incremental autocomplete where `DefId`s associated with
index expressions were not correctly picked up.
* Fix a bug that prevented "complex" types in generic parameters (for
example, `local x: X<(() -> ())?>`).
### Issues fixed
* #1507
* #1518
---
Internal Contributors:
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Fixes https://github.com/luau-lang/luau/issues/1515.
By removing these `noexcept`s, we guarantee that the internal call to
`std::swap` uses move semantics when a `Config` is copy-assigned.
mesh-normal-scalar correctly fills sequential values in the output for
triangle cone function, but mesh-normal-vector accidentally reuses the
loop index, which results in writes to every third index of the array
(1, 4, etc.).
This is both slower (as the table turns into a hash map), and incorrect,
especially as we have a scalar version of the benchmark that does the
right thing.
Note: there's a bunch of inefficiencies in the benchmark code that I
have not fixed (around field access mostly, e.g. writing to `v.n` and
then immediately reading it again). These are not ideal for performance,
but they can be valuable to keep as is because this redundancy is common
in real-world code, and it would be nice to see codegen optimizations
eliminating most of that overhead. This one, however, is a straight up
bug, and sparse arrays should not really be the thing this benchmark
hits.
Instead of doing the dot product related math in scalar IR, we lift the
computation into a dedicated IR instruction.
On x64, we can use VDPPS which was more or less tailor made for this
purpose. This is better than manual scalar lowering that requires
reloading components from memory; it's not always a strict improvement
over the shuffle+add version (which we never had), but this can now be
adjusted in the IR lowering in an optimal fashion (maybe even based on
CPU vendor, although that'd create issues for offline compilation).
On A64, we can either use naive adds or paired adds, as there is no
dedicated vector-wide horizontal instruction until SVE. Both run at
about the same performance on M2, but paired adds require fewer
instructions and temporaries.
I've measured this using mesh-normal-vector benchmark, changing the
benchmark to just report the time of the second loop inside
`calculate_normals`, testing master vs #1504 vs this PR, also increasing
the grid size to 400 for more stable timings.
On Zen 4 (7950X), this PR is comfortably ~8% faster vs master, while I
see neutral to negative results in #1504.
On M2 (base), this PR is ~28% faster vs master, while #1504 is only
about ~10% faster.
If I measure the second loop in `calculate_tangent_space` instead, I
get:
On Zen 4 (7950X), this PR is ~12% faster vs master, while #1504 is ~3%
faster
On M2 (base), this PR is ~24% faster vs master, while #1504 is only
about ~13% faster.
Note that the loops in question are not quite optimal, as they store and
reload various vectors to dictionary values due to inappropriate use of
locals. The underlying gains in individual functions are thus larger
than the numbers above; for example, changing the `calculate_normals`
loop to use a local variable to store the normalized vector (but still
saving the result to dictionary value), I get a ~24% performance
increase from this PR on Zen4 vs master instead of just 8% (#1504 is
~15% slower in this setup).
### What's New?
* Fragment Autocomplete: a new API allows for type checking a small
fragment of code against an existing file, significantly speeding up
autocomplete performance in large files.
### New Solver
* E-Graphs have landed: this is an ongoing approach to make the new type
solver simplify types in a more consistent and principled manner, based
on similar work (see: https://egraphs-good.github.io/).
* Adds support for exporting / local user type functions (previously
they were always exported).
* Fixes a set of bugs in which the new solver will fail to complete
inference for simple expressions with just literals and operators.
### General Updates
* Requiring a path with a ".lua" or ".luau" extension will now have a
bespoke error suggesting to remove said extension.
* Fixes a bug in which whether two `Luau::Symbol`s are equal depends on
whether the new solver is enabled.
---
Internal Contributors:
Co-authored-by: Aaron Weiss <aaronweiss@roblox.com>
Co-authored-by: Andy Friesen <afriesen@roblox.com>
Co-authored-by: David Cope <dcope@roblox.com>
Co-authored-by: Hunter Goldstein <hgoldstein@roblox.com>
Co-authored-by: Varun Saini <vsaini@roblox.com>
Co-authored-by: Vighnesh Vijay <vvijay@roblox.com>
Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
> What's new?
* Fragment Autocomplete: a new API allows for type checking a small
fragment of code against an existing file, significantly speeding up
autocomplete performance in large files.
> New Solver
* E-Graphs have landed: this is an ongoing approach to make the new type solver
simplify types in a more consistent and principled manner, based on
similar work (e.g.: https://egraphs-good.github.io/).
* Adds support for exported / local user type functions.
* Fixes a set of bugs in which the new solver will fail to complete
inference for simple expressions with just literals and operators.
> General
* It is now an explicit runtime error to `require` a path with a ".lua" or
".luau" extension, and the error message will suggest removing the extension.
```
require("path/to/mymodule.lua")
```
* Fixes a bug in which whether two `Symbol`s are equal depends on
whether the new solver is enabled.