luau/CodeGen/src/CodeBlockUnwind.cpp

163 lines
5.2 KiB
C++
Raw Normal View History

// This file is part of the Luau programming language and is licensed under MIT License; see LICENSE.txt for details
#include "Luau/CodeBlockUnwind.h"
#include "Luau/CodeAllocator.h"
#include "Luau/UnwindBuilder.h"
#include <string.h>
Sync to upstream/release/577 (#934) Lots of things going on this week: * Fix a crash that could occur in the presence of a cyclic union. We shouldn't be creating cyclic unions, but we shouldn't be crashing when they arise either. * Minor cleanup of `luau_precall` * Internal change to make L->top handling slightly more uniform * Optimize SETGLOBAL & GETGLOBAL fallback C functions. * https://github.com/Roblox/luau/pull/929 * The syntax to the `luau-reduce` commandline tool has changed. It now accepts a script, a command to execute, and an error to search for. It no longer automatically passes the script to the command which makes it a lot more flexible. Also be warned that it edits the script it is passed **in place**. Do not point it at something that is not in source control! New solver * Switch to a greedier but more fallible algorithm for simplifying union and intersection types that are created as part of refinement calculation. This has much better and more predictable performance. * Fix a constraint cycle in recursive function calls. * Much improved inference of binary addition. Functions like `function add(x, y) return x + y end` can now be inferred without annotations. We also accurately typecheck calls to functions like this. * Many small bugfixes surrounding things like table indexers * Add support for indexers on class types. This was previously added to the old solver; we now add it to the new one for feature parity. JIT * https://github.com/Roblox/luau/pull/931 * Fuse key.value and key.tt loads for CEHCK_SLOT_MATCH in A64 * Implement remaining aliases of BFM for A64 * Implement new callinfo flag for A64 * Add instruction simplification for int->num->int conversion chains * Don't even load execdata for X64 calls * Treat opcode fallbacks the same as manually written fallbacks --------- Co-authored-by: Arseny Kapoulkine <arseny.kapoulkine@gmail.com> Co-authored-by: Vyacheslav Egorov <vegorov@roblox.com>
2023-05-19 20:37:30 +01:00
#include <stdlib.h>
#if defined(_WIN32) && defined(_M_X64)
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#elif defined(__linux__) || defined(__APPLE__)
// Defined in unwind.h which may not be easily discoverable on various platforms
extern "C" void __register_frame(const void*);
extern "C" void __deregister_frame(const void*);
extern "C" void __unw_add_dynamic_fde() __attribute__((weak));
#endif
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#if defined(__APPLE__) && defined(__aarch64__)
#include <sys/sysctl.h>
#include <mach-o/loader.h>
#include <dlfcn.h>
struct unw_dynamic_unwind_sections_t
{
uintptr_t dso_base;
uintptr_t dwarf_section;
size_t dwarf_section_length;
uintptr_t compact_unwind_section;
size_t compact_unwind_section_length;
};
typedef int (*unw_add_find_dynamic_unwind_sections_t)(int (*)(uintptr_t addr, unw_dynamic_unwind_sections_t* info));
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#endif
namespace Luau
{
namespace CodeGen
{
#if defined(__APPLE__) && defined(__aarch64__)
static int findDynamicUnwindSections(uintptr_t addr, unw_dynamic_unwind_sections_t* info)
{
// Define a minimal mach header for JIT'd code.
static const mach_header_64 kFakeMachHeader = {
MH_MAGIC_64,
CPU_TYPE_ARM64,
CPU_SUBTYPE_ARM64_ALL,
MH_DYLIB,
};
info->dso_base = (uintptr_t)&kFakeMachHeader;
info->dwarf_section = 0;
info->dwarf_section_length = 0;
info->compact_unwind_section = 0;
info->compact_unwind_section_length = 0;
return 1;
}
#endif
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#if defined(__linux__) || defined(__APPLE__)
static void visitFdeEntries(char* pos, void (*cb)(const void*))
{
// When using glibc++ unwinder, we need to call __register_frame/__deregister_frame on the entire .eh_frame data
// When using libc++ unwinder (libunwind), each FDE has to be handled separately
// libc++ unwinder is the macOS unwinder, but on Linux the unwinder depends on the library the executable is linked with
// __unw_add_dynamic_fde is specific to libc++ unwinder, as such we determine the library based on its existence
if (__unw_add_dynamic_fde == nullptr)
return cb(pos);
for (;;)
{
unsigned partLength;
memcpy(&partLength, pos, sizeof(partLength));
if (partLength == 0) // Zero-length section signals completion
break;
unsigned partId;
memcpy(&partId, pos + 4, sizeof(partId));
if (partId != 0) // Skip CIE part
cb(pos); // CIE is found using an offset in FDE
pos += partLength + 4;
}
}
#endif
void* createBlockUnwindInfo(void* context, uint8_t* block, size_t blockSize, size_t& beginOffset)
{
UnwindBuilder* unwind = (UnwindBuilder*)context;
// All unwinding related data is placed together at the start of the block
Sync to upstream/release/572 (#899) * Fixed exported types not being suggested in autocomplete * `T...` is now convertible to `...any` (Fixes https://github.com/Roblox/luau/issues/767) * Fixed issue with `T?` not being convertible to `T | T` or `T?` (sometimes when internal pointer identity is different) * Fixed potential crash in missing table key error suggestion to use a similar existing key * `lua_topointer` now returns a pointer for strings C++ API Changes: * `prepareModuleScope` callback has moved from TypeChecker to Frontend * For LSPs, AstQuery functions (and `isWithinComment`) can be used without full Frontend data A lot of changes in our two experimental components as well. In our work on the new type-solver, the following issues were fixed: * Fixed table union and intersection indexing * Correct custom type environments are now used * Fixed issue with values of `free & number` type not accepted in numeric operations And these are the changes in native code generation (JIT): * arm64 lowering is almost complete with support for 99% of IR commands and all fastcalls * Fixed x64 assembly encoding for extended byte registers * More external x64 calls are aware of register allocator * `math.min`/`math.max` with more than 2 arguments are now lowered to IR as well * Fixed correctness issues with `math` library calls with multiple results in variadic context and with x64 register conflicts * x64 register allocator learnt to restore values from VM memory instead of always using stack spills * x64 exception unwind information now supports multiple functions and fixes function start offset in Dwarf2 info
2023-04-14 19:06:22 +01:00
size_t unwindSize = unwind->getSize();
unwindSize = (unwindSize + (kCodeAlignment - 1)) & ~(kCodeAlignment - 1); // Match code allocator alignment
LUAU_ASSERT(blockSize >= unwindSize);
Sync to upstream/release/572 (#899) * Fixed exported types not being suggested in autocomplete * `T...` is now convertible to `...any` (Fixes https://github.com/Roblox/luau/issues/767) * Fixed issue with `T?` not being convertible to `T | T` or `T?` (sometimes when internal pointer identity is different) * Fixed potential crash in missing table key error suggestion to use a similar existing key * `lua_topointer` now returns a pointer for strings C++ API Changes: * `prepareModuleScope` callback has moved from TypeChecker to Frontend * For LSPs, AstQuery functions (and `isWithinComment`) can be used without full Frontend data A lot of changes in our two experimental components as well. In our work on the new type-solver, the following issues were fixed: * Fixed table union and intersection indexing * Correct custom type environments are now used * Fixed issue with values of `free & number` type not accepted in numeric operations And these are the changes in native code generation (JIT): * arm64 lowering is almost complete with support for 99% of IR commands and all fastcalls * Fixed x64 assembly encoding for extended byte registers * More external x64 calls are aware of register allocator * `math.min`/`math.max` with more than 2 arguments are now lowered to IR as well * Fixed correctness issues with `math` library calls with multiple results in variadic context and with x64 register conflicts * x64 register allocator learnt to restore values from VM memory instead of always using stack spills * x64 exception unwind information now supports multiple functions and fixes function start offset in Dwarf2 info
2023-04-14 19:06:22 +01:00
char* unwindData = (char*)block;
unwind->finalize(unwindData, unwindSize, block, blockSize);
Sync to upstream/release/572 (#899) * Fixed exported types not being suggested in autocomplete * `T...` is now convertible to `...any` (Fixes https://github.com/Roblox/luau/issues/767) * Fixed issue with `T?` not being convertible to `T | T` or `T?` (sometimes when internal pointer identity is different) * Fixed potential crash in missing table key error suggestion to use a similar existing key * `lua_topointer` now returns a pointer for strings C++ API Changes: * `prepareModuleScope` callback has moved from TypeChecker to Frontend * For LSPs, AstQuery functions (and `isWithinComment`) can be used without full Frontend data A lot of changes in our two experimental components as well. In our work on the new type-solver, the following issues were fixed: * Fixed table union and intersection indexing * Correct custom type environments are now used * Fixed issue with values of `free & number` type not accepted in numeric operations And these are the changes in native code generation (JIT): * arm64 lowering is almost complete with support for 99% of IR commands and all fastcalls * Fixed x64 assembly encoding for extended byte registers * More external x64 calls are aware of register allocator * `math.min`/`math.max` with more than 2 arguments are now lowered to IR as well * Fixed correctness issues with `math` library calls with multiple results in variadic context and with x64 register conflicts * x64 register allocator learnt to restore values from VM memory instead of always using stack spills * x64 exception unwind information now supports multiple functions and fixes function start offset in Dwarf2 info
2023-04-14 19:06:22 +01:00
#if defined(_WIN32) && defined(_M_X64)
if (!RtlAddFunctionTable((RUNTIME_FUNCTION*)block, uint32_t(unwind->getFunctionCount()), uintptr_t(block)))
{
LUAU_ASSERT(!"Failed to allocate function table");
return nullptr;
}
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#elif defined(__linux__) || defined(__APPLE__)
visitFdeEntries(unwindData, __register_frame);
#endif
#if defined(__APPLE__) && defined(__aarch64__)
// Starting from macOS 14, we need to register unwind section callback to state that our ABI doesn't require pointer authentication
// This might conflict with other JITs that do the same; unfortunately this is the best we can do for now.
static unw_add_find_dynamic_unwind_sections_t unw_add_find_dynamic_unwind_sections =
unw_add_find_dynamic_unwind_sections_t(dlsym(RTLD_DEFAULT, "__unw_add_find_dynamic_unwind_sections"));
static int regonce = unw_add_find_dynamic_unwind_sections ? unw_add_find_dynamic_unwind_sections(findDynamicUnwindSections) : 0;
LUAU_ASSERT(regonce == 0);
#endif
beginOffset = unwindSize + unwind->getBeginOffset();
return block;
}
void destroyBlockUnwindInfo(void* context, void* unwindData)
{
#if defined(_WIN32) && defined(_M_X64)
Sync to upstream/release/572 (#899) * Fixed exported types not being suggested in autocomplete * `T...` is now convertible to `...any` (Fixes https://github.com/Roblox/luau/issues/767) * Fixed issue with `T?` not being convertible to `T | T` or `T?` (sometimes when internal pointer identity is different) * Fixed potential crash in missing table key error suggestion to use a similar existing key * `lua_topointer` now returns a pointer for strings C++ API Changes: * `prepareModuleScope` callback has moved from TypeChecker to Frontend * For LSPs, AstQuery functions (and `isWithinComment`) can be used without full Frontend data A lot of changes in our two experimental components as well. In our work on the new type-solver, the following issues were fixed: * Fixed table union and intersection indexing * Correct custom type environments are now used * Fixed issue with values of `free & number` type not accepted in numeric operations And these are the changes in native code generation (JIT): * arm64 lowering is almost complete with support for 99% of IR commands and all fastcalls * Fixed x64 assembly encoding for extended byte registers * More external x64 calls are aware of register allocator * `math.min`/`math.max` with more than 2 arguments are now lowered to IR as well * Fixed correctness issues with `math` library calls with multiple results in variadic context and with x64 register conflicts * x64 register allocator learnt to restore values from VM memory instead of always using stack spills * x64 exception unwind information now supports multiple functions and fixes function start offset in Dwarf2 info
2023-04-14 19:06:22 +01:00
if (!RtlDeleteFunctionTable((RUNTIME_FUNCTION*)unwindData))
LUAU_ASSERT(!"Failed to deallocate function table");
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
#elif defined(__linux__) || defined(__APPLE__)
visitFdeEntries((char*)unwindData, __deregister_frame);
#endif
}
Sync to upstream/release/576 (#928) * `ClassType` can now have an indexer defined on it. This allows custom types to be used in `t[x]` expressions. * Fixed search for closest executable breakpoint line. Previously, breakpoints might have been skipped in `else` blocks at the end of a function * Fixed how unification is performed for two optional types `a? <: b?`, previously it might have unified either 'a' or 'b' with 'nil'. Note that this fix is not enabled by default yet (see the list in `ExperimentalFlags.h`) In the new type solver, a concept of 'Type Families' has been introduced. Type families can be thought of as type aliases with custom type inference/reduction logic included with them. For example, we can have an `Add<T, U>` type family that will resolve the type that is the result of adding two values together. This will help type inference to figure out what 'T' and 'U' might be when explicit type annotations are not provided. In this update we don't define any type families, but they will be added in the near future. It is also possible for Luau embedders to define their own type families in the global/environment scope. Other changes include: * Fixed scope used to find out which generic types should be included in the function generic type list * Fixed a crash after cyclic bound types were created during unification And in native code generation (jit): * Use of arm64 target on M1 now requires macOS 13 * Entry into native code has been optimized. This is especially important for coroutine call/pcall performance as they involve going through a C call frame * LOP_LOADK(X) translation into IR has been improved to enable type tag/constant propagation * arm64 can use integer immediate values to synthesize floating-point values * x64 assembler removes duplicate 64bit numbers from the data section to save space * Linux `perf` can now be used to profile native Luau code (when running with --codegen-perf CLI argument)
2023-05-12 18:50:47 +01:00
bool isUnwindSupported()
{
#if defined(_WIN32) && defined(_M_X64)
return true;
#elif defined(__APPLE__) && defined(__aarch64__)
char ver[256];
size_t verLength = sizeof(ver);
// libunwind on macOS 12 and earlier (which maps to osrelease 21) assumes JIT frames use pointer authentication without a way to override that
return sysctlbyname("kern.osrelease", ver, &verLength, NULL, 0) == 0 && atoi(ver) >= 22;
#elif defined(__linux__) || defined(__APPLE__)
return true;
#else
return false;
#endif
}
} // namespace CodeGen
} // namespace Luau