luau/CodeGen/src/EmitCommonX64.h

224 lines
8 KiB
C
Raw Normal View History

// This file is part of the Luau programming language and is licensed under MIT License; see LICENSE.txt for details
#pragma once
#include "Luau/AssemblyBuilderX64.h"
Sync to upstream/release/566 (#853) * Fixed incorrect lexeme generated for string parts in the middle of an interpolated string (Fixes https://github.com/Roblox/luau/issues/744) * DeprecatedApi lint can report some issues without type inference information * Fixed performance of autocomplete requests when suggestions have large intersection types (Solves https://github.com/Roblox/luau/discussions/847) * Marked `table.getn`/`foreach`/`foreachi` as deprecated ([RFC: Deprecate table.getn/foreach/foreachi](https://github.com/Roblox/luau/blob/master/rfcs/deprecate-table-getn-foreach.md)) * With -O2 optimization level, we now optimize builtin calls based on known argument/return count. Note that this change can be observable if `getfenv/setfenv` is used to substitute a builtin, especially if arity is different. Fastcall heavy tests show a 1-2% improvement. * Luau can now be built with clang-cl (Fixes https://github.com/Roblox/luau/issues/736) We also made many improvements to our experimental components. For our new type solver: * Overhauled data flow analysis system, fixed issues with 'repeat' loops, global variables and type annotations * Type refinements now work on generic table indexing with a string literal * Type refinements will properly track potentially 'nil' values (like t[x] for a missing key) and their further refinements * Internal top table type is now isomorphic to `{}` which fixes issues when `typeof(v) == 'table'` type refinement is handled * References to non-existent types in type annotations no longer resolve to 'error' type like in old solver * Improved handling of class unions in property access expressions * Fixed default type packs * Unsealed tables can now have metatables * Restored expected types for function arguments And for native code generation: * Added min and max IR instructions mapping to vminsd/vmaxsd on x64 * We now speculatively extract direct execution fast-paths based on expected types of expressions which provides better optimization opportunities inside a single basic block * Translated existing math fastcalls to IR form to improve tag guard removal and constant propagation
2023-03-03 20:21:14 +00:00
#include "EmitCommon.h"
#include "lobject.h"
#include "ltm.h"
// MS x64 ABI reminder:
// Arguments: rcx, rdx, r8, r9 ('overlapped' with xmm0-xmm3)
// Return: rax, xmm0
// Nonvolatile: r12-r15, rdi, rsi, rbx, rbp
// SIMD: only xmm6-xmm15 are non-volatile, all ymm upper parts are volatile
// AMD64 ABI reminder:
// Arguments: rdi, rsi, rdx, rcx, r8, r9 (xmm0-xmm7)
// Return: rax, rdx, xmm0, xmm1
// Nonvolatile: r12-r15, rbx, rbp
// SIMD: all volatile
namespace Luau
{
namespace CodeGen
{
Sync to upstream/release/566 (#853) * Fixed incorrect lexeme generated for string parts in the middle of an interpolated string (Fixes https://github.com/Roblox/luau/issues/744) * DeprecatedApi lint can report some issues without type inference information * Fixed performance of autocomplete requests when suggestions have large intersection types (Solves https://github.com/Roblox/luau/discussions/847) * Marked `table.getn`/`foreach`/`foreachi` as deprecated ([RFC: Deprecate table.getn/foreach/foreachi](https://github.com/Roblox/luau/blob/master/rfcs/deprecate-table-getn-foreach.md)) * With -O2 optimization level, we now optimize builtin calls based on known argument/return count. Note that this change can be observable if `getfenv/setfenv` is used to substitute a builtin, especially if arity is different. Fastcall heavy tests show a 1-2% improvement. * Luau can now be built with clang-cl (Fixes https://github.com/Roblox/luau/issues/736) We also made many improvements to our experimental components. For our new type solver: * Overhauled data flow analysis system, fixed issues with 'repeat' loops, global variables and type annotations * Type refinements now work on generic table indexing with a string literal * Type refinements will properly track potentially 'nil' values (like t[x] for a missing key) and their further refinements * Internal top table type is now isomorphic to `{}` which fixes issues when `typeof(v) == 'table'` type refinement is handled * References to non-existent types in type annotations no longer resolve to 'error' type like in old solver * Improved handling of class unions in property access expressions * Fixed default type packs * Unsealed tables can now have metatables * Restored expected types for function arguments And for native code generation: * Added min and max IR instructions mapping to vminsd/vmaxsd on x64 * We now speculatively extract direct execution fast-paths based on expected types of expressions which provides better optimization opportunities inside a single basic block * Translated existing math fastcalls to IR form to improve tag guard removal and constant propagation
2023-03-03 20:21:14 +00:00
enum class IrCondition : uint8_t;
struct NativeState;
struct IrOp;
Sync to upstream/release/566 (#853) * Fixed incorrect lexeme generated for string parts in the middle of an interpolated string (Fixes https://github.com/Roblox/luau/issues/744) * DeprecatedApi lint can report some issues without type inference information * Fixed performance of autocomplete requests when suggestions have large intersection types (Solves https://github.com/Roblox/luau/discussions/847) * Marked `table.getn`/`foreach`/`foreachi` as deprecated ([RFC: Deprecate table.getn/foreach/foreachi](https://github.com/Roblox/luau/blob/master/rfcs/deprecate-table-getn-foreach.md)) * With -O2 optimization level, we now optimize builtin calls based on known argument/return count. Note that this change can be observable if `getfenv/setfenv` is used to substitute a builtin, especially if arity is different. Fastcall heavy tests show a 1-2% improvement. * Luau can now be built with clang-cl (Fixes https://github.com/Roblox/luau/issues/736) We also made many improvements to our experimental components. For our new type solver: * Overhauled data flow analysis system, fixed issues with 'repeat' loops, global variables and type annotations * Type refinements now work on generic table indexing with a string literal * Type refinements will properly track potentially 'nil' values (like t[x] for a missing key) and their further refinements * Internal top table type is now isomorphic to `{}` which fixes issues when `typeof(v) == 'table'` type refinement is handled * References to non-existent types in type annotations no longer resolve to 'error' type like in old solver * Improved handling of class unions in property access expressions * Fixed default type packs * Unsealed tables can now have metatables * Restored expected types for function arguments And for native code generation: * Added min and max IR instructions mapping to vminsd/vmaxsd on x64 * We now speculatively extract direct execution fast-paths based on expected types of expressions which provides better optimization opportunities inside a single basic block * Translated existing math fastcalls to IR form to improve tag guard removal and constant propagation
2023-03-03 20:21:14 +00:00
namespace X64
{
struct IrRegAllocX64;
Sync to upstream/release/572 (#899) * Fixed exported types not being suggested in autocomplete * `T...` is now convertible to `...any` (Fixes https://github.com/Roblox/luau/issues/767) * Fixed issue with `T?` not being convertible to `T | T` or `T?` (sometimes when internal pointer identity is different) * Fixed potential crash in missing table key error suggestion to use a similar existing key * `lua_topointer` now returns a pointer for strings C++ API Changes: * `prepareModuleScope` callback has moved from TypeChecker to Frontend * For LSPs, AstQuery functions (and `isWithinComment`) can be used without full Frontend data A lot of changes in our two experimental components as well. In our work on the new type-solver, the following issues were fixed: * Fixed table union and intersection indexing * Correct custom type environments are now used * Fixed issue with values of `free & number` type not accepted in numeric operations And these are the changes in native code generation (JIT): * arm64 lowering is almost complete with support for 99% of IR commands and all fastcalls * Fixed x64 assembly encoding for extended byte registers * More external x64 calls are aware of register allocator * `math.min`/`math.max` with more than 2 arguments are now lowered to IR as well * Fixed correctness issues with `math` library calls with multiple results in variadic context and with x64 register conflicts * x64 register allocator learnt to restore values from VM memory instead of always using stack spills * x64 exception unwind information now supports multiple functions and fixes function start offset in Dwarf2 info
2023-04-14 19:06:22 +01:00
constexpr uint32_t kFunctionAlignment = 32;
// Data that is very common to access is placed in non-volatile registers
constexpr RegisterX64 rState = r15; // lua_State* L
constexpr RegisterX64 rBase = r14; // StkId base
constexpr RegisterX64 rNativeContext = r13; // NativeContext* context
constexpr RegisterX64 rConstants = r12; // TValue* k
constexpr unsigned kExtraLocals = 3; // Number of 8 byte slots available for specialized local variables specified below
constexpr unsigned kSpillSlots = 5; // Number of 8 byte slots available for register allocator to spill data into
static_assert((kExtraLocals + kSpillSlots) * 8 % 16 == 0, "locals have to preserve 16 byte alignment");
constexpr uint8_t kWindowsFirstNonVolXmmReg = 6;
constexpr uint8_t kWindowsUsableXmmRegs = 10; // Some xmm regs are non-volatile, we have to balance how many we want to use/preserve
constexpr uint8_t kSystemVUsableXmmRegs = 16; // All xmm regs are volatile
inline uint8_t getXmmRegisterCount(ABIX64 abi)
{
return abi == ABIX64::SystemV ? kSystemVUsableXmmRegs : kWindowsUsableXmmRegs;
}
// Native code is as stackless as the interpreter, so we can place some data on the stack once and have it accessible at any point
// Stack is separated into sections for different data. See CodeGenX64.cpp for layout overview
constexpr unsigned kStackAlign = 8; // Bytes we need to align the stack for non-vol xmm register storage
constexpr unsigned kStackLocalStorage = 8 * kExtraLocals;
constexpr unsigned kStackSpillStorage = 8 * kSpillSlots;
constexpr unsigned kStackExtraArgumentStorage = 2 * 8; // Bytes for 5th and 6th function call arguments used under Windows ABI
constexpr unsigned kStackRegHomeStorage = 4 * 8; // Register 'home' locations that can be used by callees under Windows ABI
inline unsigned getNonVolXmmStorageSize(ABIX64 abi, uint8_t xmmRegCount)
{
if (abi == ABIX64::SystemV)
return 0;
// First 6 are volatile
if (xmmRegCount <= kWindowsFirstNonVolXmmReg)
return 0;
LUAU_ASSERT(xmmRegCount <= 16);
return (xmmRegCount - kWindowsFirstNonVolXmmReg) * 16;
}
// Useful offsets to specific parts
constexpr unsigned kStackOffsetToLocals = kStackExtraArgumentStorage + kStackRegHomeStorage;
constexpr unsigned kStackOffsetToSpillSlots = kStackOffsetToLocals + kStackLocalStorage;
inline unsigned getFullStackSize(ABIX64 abi, uint8_t xmmRegCount)
{
return kStackOffsetToSpillSlots + kStackSpillStorage + getNonVolXmmStorageSize(abi, xmmRegCount) + kStackAlign;
}
constexpr OperandX64 sClosure = qword[rsp + kStackOffsetToLocals + 0]; // Closure* cl
constexpr OperandX64 sCode = qword[rsp + kStackOffsetToLocals + 8]; // Instruction* code
constexpr OperandX64 sTemporarySlot = addr[rsp + kStackOffsetToLocals + 16];
constexpr OperandX64 sSpillArea = addr[rsp + kStackOffsetToSpillSlots];
inline OperandX64 luauReg(int ri)
{
return xmmword[rBase + ri * sizeof(TValue)];
}
inline OperandX64 luauRegAddress(int ri)
{
return addr[rBase + ri * sizeof(TValue)];
}
inline OperandX64 luauRegValue(int ri)
{
return qword[rBase + ri * sizeof(TValue) + offsetof(TValue, value)];
}
inline OperandX64 luauRegTag(int ri)
{
return dword[rBase + ri * sizeof(TValue) + offsetof(TValue, tt)];
}
inline OperandX64 luauRegValueInt(int ri)
{
return dword[rBase + ri * sizeof(TValue) + offsetof(TValue, value)];
}
inline OperandX64 luauRegValueVector(int ri, int index)
{
return dword[rBase + ri * sizeof(TValue) + offsetof(TValue, value) + (sizeof(float) * index)];
}
inline OperandX64 luauConstant(int ki)
{
return xmmword[rConstants + ki * sizeof(TValue)];
}
inline OperandX64 luauConstantAddress(int ki)
{
return addr[rConstants + ki * sizeof(TValue)];
}
inline OperandX64 luauConstantTag(int ki)
{
return dword[rConstants + ki * sizeof(TValue) + offsetof(TValue, tt)];
}
inline OperandX64 luauConstantValue(int ki)
{
return qword[rConstants + ki * sizeof(TValue) + offsetof(TValue, value)];
}
inline OperandX64 luauNodeKeyValue(RegisterX64 node)
{
return qword[node + offsetof(LuaNode, key) + offsetof(TKey, value)];
}
// Note: tag has dirty upper bits
inline OperandX64 luauNodeKeyTag(RegisterX64 node)
{
Sync to upstream/release/577 (#934) Lots of things going on this week: * Fix a crash that could occur in the presence of a cyclic union. We shouldn't be creating cyclic unions, but we shouldn't be crashing when they arise either. * Minor cleanup of `luau_precall` * Internal change to make L->top handling slightly more uniform * Optimize SETGLOBAL & GETGLOBAL fallback C functions. * https://github.com/Roblox/luau/pull/929 * The syntax to the `luau-reduce` commandline tool has changed. It now accepts a script, a command to execute, and an error to search for. It no longer automatically passes the script to the command which makes it a lot more flexible. Also be warned that it edits the script it is passed **in place**. Do not point it at something that is not in source control! New solver * Switch to a greedier but more fallible algorithm for simplifying union and intersection types that are created as part of refinement calculation. This has much better and more predictable performance. * Fix a constraint cycle in recursive function calls. * Much improved inference of binary addition. Functions like `function add(x, y) return x + y end` can now be inferred without annotations. We also accurately typecheck calls to functions like this. * Many small bugfixes surrounding things like table indexers * Add support for indexers on class types. This was previously added to the old solver; we now add it to the new one for feature parity. JIT * https://github.com/Roblox/luau/pull/931 * Fuse key.value and key.tt loads for CEHCK_SLOT_MATCH in A64 * Implement remaining aliases of BFM for A64 * Implement new callinfo flag for A64 * Add instruction simplification for int->num->int conversion chains * Don't even load execdata for X64 calls * Treat opcode fallbacks the same as manually written fallbacks --------- Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com> Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-05-19 20:37:30 +01:00
return dword[node + offsetof(LuaNode, key) + kOffsetOfTKeyTagNext];
}
inline void setLuauReg(AssemblyBuilderX64& build, RegisterX64 tmp, int ri, OperandX64 op)
{
LUAU_ASSERT(op.cat == CategoryX64::mem);
build.vmovups(tmp, op);
build.vmovups(luauReg(ri), tmp);
}
inline void jumpIfTagIs(AssemblyBuilderX64& build, int ri, lua_Type tag, Label& label)
{
build.cmp(luauRegTag(ri), tag);
build.jcc(ConditionX64::Equal, label);
}
inline void jumpIfTagIsNot(AssemblyBuilderX64& build, int ri, lua_Type tag, Label& label)
{
build.cmp(luauRegTag(ri), tag);
build.jcc(ConditionX64::NotEqual, label);
}
// Note: fallthrough label should be placed after this condition
inline void jumpIfFalsy(AssemblyBuilderX64& build, int ri, Label& target, Label& fallthrough)
{
jumpIfTagIs(build, ri, LUA_TNIL, target); // false if nil
jumpIfTagIsNot(build, ri, LUA_TBOOLEAN, fallthrough); // true if not nil or boolean
build.cmp(luauRegValueInt(ri), 0);
build.jcc(ConditionX64::Equal, target); // true if boolean value is 'true'
}
// Note: fallthrough label should be placed after this condition
inline void jumpIfTruthy(AssemblyBuilderX64& build, int ri, Label& target, Label& fallthrough)
{
jumpIfTagIs(build, ri, LUA_TNIL, fallthrough); // false if nil
jumpIfTagIsNot(build, ri, LUA_TBOOLEAN, target); // true if not nil or boolean
build.cmp(luauRegValueInt(ri), 0);
build.jcc(ConditionX64::NotEqual, target); // true if boolean value is 'true'
}
Sync to upstream/release/566 (#853) * Fixed incorrect lexeme generated for string parts in the middle of an interpolated string (Fixes https://github.com/Roblox/luau/issues/744) * DeprecatedApi lint can report some issues without type inference information * Fixed performance of autocomplete requests when suggestions have large intersection types (Solves https://github.com/Roblox/luau/discussions/847) * Marked `table.getn`/`foreach`/`foreachi` as deprecated ([RFC: Deprecate table.getn/foreach/foreachi](https://github.com/Roblox/luau/blob/master/rfcs/deprecate-table-getn-foreach.md)) * With -O2 optimization level, we now optimize builtin calls based on known argument/return count. Note that this change can be observable if `getfenv/setfenv` is used to substitute a builtin, especially if arity is different. Fastcall heavy tests show a 1-2% improvement. * Luau can now be built with clang-cl (Fixes https://github.com/Roblox/luau/issues/736) We also made many improvements to our experimental components. For our new type solver: * Overhauled data flow analysis system, fixed issues with 'repeat' loops, global variables and type annotations * Type refinements now work on generic table indexing with a string literal * Type refinements will properly track potentially 'nil' values (like t[x] for a missing key) and their further refinements * Internal top table type is now isomorphic to `{}` which fixes issues when `typeof(v) == 'table'` type refinement is handled * References to non-existent types in type annotations no longer resolve to 'error' type like in old solver * Improved handling of class unions in property access expressions * Fixed default type packs * Unsealed tables can now have metatables * Restored expected types for function arguments And for native code generation: * Added min and max IR instructions mapping to vminsd/vmaxsd on x64 * We now speculatively extract direct execution fast-paths based on expected types of expressions which provides better optimization opportunities inside a single basic block * Translated existing math fastcalls to IR form to improve tag guard removal and constant propagation
2023-03-03 20:21:14 +00:00
void jumpOnNumberCmp(AssemblyBuilderX64& build, RegisterX64 tmp, OperandX64 lhs, OperandX64 rhs, IrCondition cond, Label& label);
void getTableNodeAtCachedSlot(AssemblyBuilderX64& build, RegisterX64 tmp, RegisterX64 node, RegisterX64 table, int pcpos);
void convertNumberToIndexOrJump(AssemblyBuilderX64& build, RegisterX64 tmp, RegisterX64 numd, RegisterX64 numi, Label& label);
void callArithHelper(IrRegAllocX64& regs, AssemblyBuilderX64& build, int ra, int rb, OperandX64 c, TMS tm);
void callLengthHelper(IrRegAllocX64& regs, AssemblyBuilderX64& build, int ra, int rb);
void callGetTable(IrRegAllocX64& regs, AssemblyBuilderX64& build, int rb, OperandX64 c, int ra);
void callSetTable(IrRegAllocX64& regs, AssemblyBuilderX64& build, int rb, OperandX64 c, int ra);
void checkObjectBarrierConditions(AssemblyBuilderX64& build, RegisterX64 tmp, RegisterX64 object, int ra, int ratag, Label& skip);
void callBarrierObject(IrRegAllocX64& regs, AssemblyBuilderX64& build, RegisterX64 object, IrOp objectOp, int ra, int ratag);
void callBarrierTableFast(IrRegAllocX64& regs, AssemblyBuilderX64& build, RegisterX64 table, IrOp tableOp);
void callStepGc(IrRegAllocX64& regs, AssemblyBuilderX64& build);
void emitClearNativeFlag(AssemblyBuilderX64& build);
void emitExit(AssemblyBuilderX64& build, bool continueInVm);
void emitUpdateBase(AssemblyBuilderX64& build);
void emitInterrupt(AssemblyBuilderX64& build);
Sync to upstream/release/577 (#934) Lots of things going on this week: * Fix a crash that could occur in the presence of a cyclic union. We shouldn't be creating cyclic unions, but we shouldn't be crashing when they arise either. * Minor cleanup of `luau_precall` * Internal change to make L->top handling slightly more uniform * Optimize SETGLOBAL & GETGLOBAL fallback C functions. * https://github.com/Roblox/luau/pull/929 * The syntax to the `luau-reduce` commandline tool has changed. It now accepts a script, a command to execute, and an error to search for. It no longer automatically passes the script to the command which makes it a lot more flexible. Also be warned that it edits the script it is passed **in place**. Do not point it at something that is not in source control! New solver * Switch to a greedier but more fallible algorithm for simplifying union and intersection types that are created as part of refinement calculation. This has much better and more predictable performance. * Fix a constraint cycle in recursive function calls. * Much improved inference of binary addition. Functions like `function add(x, y) return x + y end` can now be inferred without annotations. We also accurately typecheck calls to functions like this. * Many small bugfixes surrounding things like table indexers * Add support for indexers on class types. This was previously added to the old solver; we now add it to the new one for feature parity. JIT * https://github.com/Roblox/luau/pull/931 * Fuse key.value and key.tt loads for CEHCK_SLOT_MATCH in A64 * Implement remaining aliases of BFM for A64 * Implement new callinfo flag for A64 * Add instruction simplification for int->num->int conversion chains * Don't even load execdata for X64 calls * Treat opcode fallbacks the same as manually written fallbacks --------- Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com> Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-05-19 20:37:30 +01:00
void emitFallback(IrRegAllocX64& regs, AssemblyBuilderX64& build, int offset, int pcpos);
void emitUpdatePcForExit(AssemblyBuilderX64& build);
void emitContinueCallInVm(AssemblyBuilderX64& build);
void emitReturn(AssemblyBuilderX64& build, ModuleHelpers& helpers);
Sync to upstream/release/566 (#853) * Fixed incorrect lexeme generated for string parts in the middle of an interpolated string (Fixes https://github.com/Roblox/luau/issues/744) * DeprecatedApi lint can report some issues without type inference information * Fixed performance of autocomplete requests when suggestions have large intersection types (Solves https://github.com/Roblox/luau/discussions/847) * Marked `table.getn`/`foreach`/`foreachi` as deprecated ([RFC: Deprecate table.getn/foreach/foreachi](https://github.com/Roblox/luau/blob/master/rfcs/deprecate-table-getn-foreach.md)) * With -O2 optimization level, we now optimize builtin calls based on known argument/return count. Note that this change can be observable if `getfenv/setfenv` is used to substitute a builtin, especially if arity is different. Fastcall heavy tests show a 1-2% improvement. * Luau can now be built with clang-cl (Fixes https://github.com/Roblox/luau/issues/736) We also made many improvements to our experimental components. For our new type solver: * Overhauled data flow analysis system, fixed issues with 'repeat' loops, global variables and type annotations * Type refinements now work on generic table indexing with a string literal * Type refinements will properly track potentially 'nil' values (like t[x] for a missing key) and their further refinements * Internal top table type is now isomorphic to `{}` which fixes issues when `typeof(v) == 'table'` type refinement is handled * References to non-existent types in type annotations no longer resolve to 'error' type like in old solver * Improved handling of class unions in property access expressions * Fixed default type packs * Unsealed tables can now have metatables * Restored expected types for function arguments And for native code generation: * Added min and max IR instructions mapping to vminsd/vmaxsd on x64 * We now speculatively extract direct execution fast-paths based on expected types of expressions which provides better optimization opportunities inside a single basic block * Translated existing math fastcalls to IR form to improve tag guard removal and constant propagation
2023-03-03 20:21:14 +00:00
} // namespace X64
} // namespace CodeGen
} // namespace Luau