mirror of
https://github.com/luau-lang/luau.git
synced 2025-01-19 17:28:06 +00:00
24cacc94ed
To implement math.lerp without branches, we add SELECT_NUM, which selects one of the two inputs based on the comparison condition. For simplicity, we only support C == D for now; this can be extended to a more generic version with an IrCondition operand E, but that requires more work on the SSE side (to flip the comparison for some conditions like Greater, and expose a more generic vcmpsd).

Note: On AArch64 this will effectively result in a change in floating point behavior between native code and non-native code: clang synthesizes fmadd (because floating point contraction is allowed by default, and the arch always has the instruction), whereas this change will use fmul+fadd. I am not sure if this is good or bad, and if this is a problem in C or not. Specifically, clang's behavior results in different results between X64 and AArch64 when *not* using codegen, and with this change the behavior when using codegen is... the same? :)

Fixing this will require either using LERP_NUM instead and hand-coding lowering, or exposing some sort of "quasi" MADD_NUM (which would lower to fma on AArch64 and mul+add on X64).

A small benefit to the current approach is that `lerp(1, 5, t)` constant-folds the subtraction. With LERP_NUM this optimization will need to be implemented manually as a partial constant-folding for LERP_NUM. A similar problem exists today for vector.cross & vector.dot. So maybe this is not something we need to fix, unsure.
||
---|---|---|
AddressA64.h | ||
AssemblyBuilderA64.h | ||
AssemblyBuilderX64.h | ||
BytecodeAnalysis.h | ||
BytecodeSummary.h | ||
CodeAllocator.h | ||
CodeBlockUnwind.h | ||
CodeGen.h | ||
CodeGenCommon.h | ||
ConditionA64.h | ||
ConditionX64.h | ||
IrAnalysis.h | ||
IrBuilder.h | ||
IrCallWrapperX64.h | ||
IrData.h | ||
IrDump.h | ||
IrRegAllocX64.h | ||
IrUtils.h | ||
IrVisitUseDef.h | ||
Label.h | ||
NativeProtoExecData.h | ||
OperandX64.h | ||
OptimizeConstProp.h | ||
OptimizeDeadStore.h | ||
OptimizeFinalX64.h | ||
RegisterA64.h | ||
RegisterX64.h | ||
SharedCodeAllocator.h | ||
UnwindBuilder.h | ||
UnwindBuilderDwarf2.h | ||
UnwindBuilderWin.h |
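
The commit message above describes a select-based lowering of math.lerp and the difference between a fused multiply-add and a separate multiply and add. A minimal C++ sketch of those two ideas follows. It is an illustration under stated assumptions, not the actual IR or lowering code: the helper names `selectNum`, `lerpMulAdd` and `lerpFused` are hypothetical, the operand order chosen for the select is an assumption, and the lerp formula `a + (b - a) * t` with an exact result at `t == 1` is assumed rather than taken from the source.

```cpp
#include <cmath>

// Hypothetical model of a SELECT_NUM-style operation: yields `b` when c == d,
// otherwise `a`. The operand order is an assumption made for this sketch.
inline double selectNum(double a, double b, double c, double d)
{
    return c == d ? b : a;
}

// lerp written as a separate multiply and add, followed by a select on t == 1.
// Note: a compiler that allows floating point contraction may still fuse the
// multiply and add here; building with -ffp-contract=off keeps them separate.
inline double lerpMulAdd(double a, double b, double t)
{
    double r = a + (b - a) * t;
    return selectNum(r, b, t, 1.0);
}

// The same computation with an explicit fused multiply-add, which rounds once
// instead of twice and can therefore differ in the last bit for some inputs.
inline double lerpFused(double a, double b, double t)
{
    double r = std::fma(b - a, t, a);
    return selectNum(r, b, t, 1.0);
}
```

With constant endpoints such as `lerpMulAdd(1.0, 5.0, t)`, the subtraction `b - a` is a compile-time constant, which is the constant-folding benefit the message mentions. Both variants agree whenever the product and sum are exactly representable; they can diverge only when the intermediate product needs rounding.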