The RtlU*ByteSwap() family:
- has FASTCALL calling convention
- is only exported from ntdll and ntoskrnl.exe in 32-bit mode
(ARM was not checked, though)
Wine's support for RtlUlonglongByteSwap() doesn't follow these constraints.
Note: in __fastcall, 64-bit parameters are passed on the stack, so
RtlUlonglongByteSwap()'s calling convention behaves like __stdcall.
So:
- fix ntdll.spec (resp. ntoskrnl.exe.spec) to only export
(resp. forward) RtlUlonglongByteSwap for i386
- always provide an inline implementation in winternl.h
- reimplement ntdll.RtlUlonglongByteSwap() for i386 with
__fastcall calling convention.
- fix ntdll/tests/rtl.c to use correct calling convention.
- add test in ntdll/tests/rtl.c for inlined version.
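For illustration only, an inline 64-bit byte swap of the kind mentioned above
could look like the sketch below. The name example_bswap64 is hypothetical and
this is not the definition added to winternl.h; it only shows the operation
that RtlUlonglongByteSwap() performs, in portable C with no calling-convention
attributes.

```c
#include <stdint.h>

/* Hypothetical helper (not the winternl.h code): reverse the byte order
 * of a 64-bit value in three swap steps. */
static inline uint64_t example_bswap64(uint64_t x)
{
    /* swap adjacent bytes */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    /* swap adjacent 16-bit words */
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    /* swap the two 32-bit halves */
    return (x << 32) | (x >> 32);
}
```

Swapping twice returns the original value, which makes a convenient sanity
check for a test of the inlined version.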
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=53536
Signed-off-by: Eric Pouech <eric.pouech@gmail.com>
Manual tests on Windows 10 show that calling Sleep(0) or NtDelayExecution() with a zero timeout in a
loop consumes 100% of a CPU core, which is closer to the behavior of NtYieldExecution() than to
usleep(0). usleep(0) gives up the remaining timeslice even if there are no other threads to switch
to, causing low CPU utilization and performance issues.
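As a rough sketch of the yield-style behavior described above, assuming a POSIX
host: a zero-timeout delay can be mapped to sched_yield(), which returns
immediately when no other thread is runnable. The helper name
example_zero_delay_loop is hypothetical and this is not Wine's implementation.

```c
#include <sched.h>

/* Hypothetical helper (not Wine code): perform `iterations` zero-timeout
 * delays by yielding, and return how many yields succeeded. With no other
 * runnable thread each call returns at once, so a caller polling in a loop
 * keeps its core busy, as observed for Sleep(0) on Windows. */
static int example_zero_delay_loop(int iterations)
{
    int i, ok = 0;

    for (i = 0; i < iterations; i++)
        if (sched_yield() == 0) ok++;   /* sched_yield() returns 0 on success */
    return ok;
}
```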
The original patch is b1a79c6, and the idea was to use usleep(0) to avoid a thread taking 100% of a
CPU core in StarCraft 2 and Shadow of the Tomb Raider. However, with wine-7.22, reverting the
usleep(0) patch causes no behavior change. For Shadow of the Tomb Raider, the 100% CPU issue is
gone with or without the patch. For StarCraft 2, there is always a thread taking 100% CPU, even with
the patch. After discussing with Matteo, we decided it's better to revert the patch.
Fixes the Mortal Kombat X performance drop during tower selection and poor performance in Ragnarok Online.
This reverts commit e86b4015ff.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=53327
This moves unsafe_block_from_ptr calls outside of the heap lock.
We assume here that a concurrent call to another heap function on a block
being freed is undefined behavior, so it should be safe to do this:
* The block type or base offset never change after a block has been
allocated and until it is freed.
* Block flags such as BLOCK_FLAG_LARGE or BLOCK_FLAG_USER_INFO also
never change after a block has been allocated.
* Other block flags are only read and modified inside the heap lock.
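The pattern described above can be sketched as follows. This is a simplified
illustration, not the ntdll heap code: every ex_* name, the struct layout, and
the magic value are hypothetical. Only a field that is fixed at allocation
time is read outside the lock; the mutable flag is read and written with the
lock held.

```c
#include <pthread.h>
#include <stddef.h>

#define EX_MAGIC     0x48454150u  /* hypothetical block type tag, fixed at allocation */
#define EX_FLAG_FREE 0x1u         /* mutable flag, only touched under the heap lock */

struct ex_block { unsigned magic; unsigned flags; };
struct ex_heap  { pthread_mutex_t lock; unsigned free_count; };

/* Validate a user pointer using only data that never changes while the
 * block is allocated, so no lock is needed here. */
static struct ex_block *ex_block_from_ptr(void *ptr)
{
    struct ex_block *block;

    if (!ptr) return NULL;
    block = (struct ex_block *)ptr - 1;   /* header sits just before user data */
    return block->magic == EX_MAGIC ? block : NULL;
}

/* Returns 1 if the block was marked free, 0 on an invalid pointer or double free. */
static int ex_heap_free(struct ex_heap *heap, void *ptr)
{
    struct ex_block *block = ex_block_from_ptr(ptr);   /* outside the lock */
    int ret = 0;

    if (!block) return 0;
    pthread_mutex_lock(&heap->lock);
    if (!(block->flags & EX_FLAG_FREE))   /* mutable state: lock held */
    {
        block->flags |= EX_FLAG_FREE;
        heap->free_count++;
        ret = 1;
    }
    pthread_mutex_unlock(&heap->lock);
    return ret;
}
```

Keeping the validation outside the critical section shortens the time the lock
is held, at the cost of relying on the assumption stated above.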