I was unable to write tests for this; it seems it doesn't consistently work on
Windows. However, Rayman 3 seems to rely on it; it maps the same buffer twice
immediately after creation, with DISCARD flags on both maps, and expects the
same address to be returned.
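
For illustration, a minimal sketch of that pattern, written against the d3d8
vertex buffer API; the exact resource type and lock arguments the game uses
are assumptions:

    #define COBJMACROS
    #include <d3d8.h>

    /* Hypothetical sketch: "buffer" is a freshly created dynamic vertex
     * buffer. */
    static void map_twice(IDirect3DVertexBuffer8 *buffer)
    {
        BYTE *first, *second;

        IDirect3DVertexBuffer8_Lock(buffer, 0, 0, &first, D3DLOCK_DISCARD);
        IDirect3DVertexBuffer8_Lock(buffer, 0, 0, &second, D3DLOCK_DISCARD);
        /* The game expects first == second here. */
        IDirect3DVertexBuffer8_Unlock(buffer);
        IDirect3DVertexBuffer8_Unlock(buffer);
    }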
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=53752
This was probably added on the assumption that
IDirect3DDevice8::CopyRects() behaves like the similar
IDirect3DDevice9::UpdateSurface(), but it does not.
We would like to use two different textures for the CPU and GPU parts of managed
textures, which means that wined3d_resource_access_is_managed() as such will no
longer be useful.
Garou: Mark of the Wolves calls IDirect3D9::GetAdapterModeCount() on every
frame. This results in calling EnumDisplaySettingsExW() once per available mode,
which is a very slow operation, both on Windows and Wine.
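
For reference, a minimal sketch of the calls involved; the adapter ordinal,
the format, and the EnumAdapterModes() loop are illustrative assumptions
rather than exactly what the game does:

    #define COBJMACROS
    #include <d3d9.h>

    /* Hypothetical sketch: enumerate the available display modes the way a
     * game might do once per frame. */
    static void enumerate_modes(IDirect3D9 *d3d)
    {
        D3DDISPLAYMODE mode;
        unsigned int i, count;

        count = IDirect3D9_GetAdapterModeCount(d3d, D3DADAPTER_DEFAULT, D3DFMT_X8R8G8B8);
        for (i = 0; i < count; ++i)
            IDirect3D9_EnumAdapterModes(d3d, D3DADAPTER_DEFAULT, D3DFMT_X8R8G8B8, i, &mode);
    }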
Manual testing shows that Windows caches the mode list (as well as the adapter
list, which is already cached in Wine) in Direct3D 9 and lower. Calls to
GetAdapterModeCount() and EnumAdapterModes() are fast, and their results do not
change if monitors are added or removed.
DXGI behaves differently, however. The list of outputs attached to an adapter is
cached—that is, calls to IDXGIAdapter::EnumOutputs() are fast, and return stale
data. However, at least some other calls are slow and do not seem to be cached,
including IDXGIOutput::GetDisplayModeList() and IDXGIOutput::GetDesc().
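
For reference, a sketch of the dxgi calls in question; the loop structure and
the format argument are assumptions made for illustration:

    #define COBJMACROS
    #include <dxgi.h>

    static void enumerate_outputs(IDXGIAdapter *adapter)
    {
        DXGI_OUTPUT_DESC desc;
        IDXGIOutput *output;
        UINT i, mode_count;

        /* Fast on Windows 10; the output list appears to be cached, and may
         * therefore be stale. */
        for (i = 0; IDXGIAdapter_EnumOutputs(adapter, i, &output) == S_OK; ++i)
        {
            /* Slow on Windows 10; these do not appear to be cached. */
            IDXGIOutput_GetDesc(output, &desc);
            IDXGIOutput_GetDisplayModeList(output, DXGI_FORMAT_B8G8R8A8_UNORM,
                    0, &mode_count, NULL);
            IDXGIOutput_Release(output);
        }
    }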
ddraw is also slow and uncached. Since all testing was done on Windows 10 (for
lack of available older hardware), it is plausible that ddraw was reimplemented
on top of dxgi in newer Windows versions, and that older versions would be fast
and cached, but this is speculation. In any case I have not included patches to
cache ddraw modes.
Tests were done on Windows 10 21H2, both on real hardware with NVidia drivers
and on software drivers via qemu/KVM. In the latter case only speed could be
tested, but this was consistent with the results from the NVidia machine.
Bloodrayne: Terminal Cut (and Bloodrayne 2: Terminal Cut, and probably other
games in the series) streams from a SYSTEMMEM index buffer, updating it and
drawing from it every frame. This is currently slow on Wine, since each map
needs to wait for the previous upload (on the CS) to complete.
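
For illustration, a hypothetical sketch of that per-frame pattern, assuming a
d3d9 SYSTEMMEM index buffer and a triangle list; the primitive type, sizes,
and arguments are assumptions:

    #define COBJMACROS
    #include <string.h>
    #include <d3d9.h>

    /* Hypothetical sketch: rewrite the whole SYSTEMMEM index buffer "ib" and
     * then draw from it, once per frame. */
    static void stream_and_draw(IDirect3DDevice9 *device, IDirect3DIndexBuffer9 *ib,
            const void *indices, unsigned int size, unsigned int vertex_count,
            unsigned int prim_count)
    {
        void *data;

        IDirect3DIndexBuffer9_Lock(ib, 0, size, &data, 0);
        memcpy(data, indices, size);
        IDirect3DIndexBuffer9_Unlock(ib);

        IDirect3DDevice9_SetIndices(device, ib);
        IDirect3DDevice9_DrawIndexedPrimitive(device, D3DPT_TRIANGLELIST,
                0, 0, vertex_count, 0, prim_count);
    }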
There are a few ways to avoid waiting, but this patch takes the approach of
effectively uploading from the SYSTEMMEM buffer on the client side, while using
a dynamic buffer to avoid client/CS synchronization. This brings performance
from 20-30 FPS to a (locked) 60, on NVidia GL drivers.
When blitting from a staging texture in SYSMEM to an as-yet uninitialized GPU
texture, we want to avoid ever loading the source texture into TEXTURE_RGB, or
loading the destination texture into SYSMEM.
This isn't really a proper fix for the relevant tests; it just manages to avoid
the problematic code path.
This manages to fix test_yv12_overlay because it causes those blits to use the
upload path (whereas currently they go through the raw blitter, and before
1632b8e7a4 they would go through the CPU blitter), and
wined3d_texture_gl_upload_data() is one of the few functions that correctly
handle planar YUV formats.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=52684
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>