mirror of
https://github.com/dart-lang/sdk
synced 2024-11-02 15:01:30 +00:00
06d7a2352e
This CL adds the ability to pass the payload address of the source and destination directly to the MemoryCopy instruction as an untagged value. The new translation of the _TypedListBase._memMoveN methods uses the new MemoryCopy constructor, retrieving the untagged value of the data field of both the source and destination. This way, if inlining exposes the allocation of the object from which the data field is being retrieved, then allocation sinking can remove the intermediate allocation if there are no escaping uses of the object.

Since Pointer.asTypedList allocates such ExternalTypedData objects, this CL makes that method inlined if at all possible, which removes the intermediate allocation if the only use of the TypedData object is to call setRange for memory copying purposes.

This CL also separates unboxed native slots into two groups: those that contain untagged addresses and those that do not. The former group now has the kUntagged representation, which mimics the old use of LoadUntagged for the PointerBase data field and also ensures that any arithmetic operations on untagged addresses must first be explicitly converted to an unboxed integer and then explicitly converted back to untagged before being stored in a slot that contains untagged addresses.

When an unboxed native slot that contains untagged addresses is defined, the definition also includes a boolean which represents whether addresses that may be moved by the GC can be stored in this slot or not. The redundancy eliminator uses this to decide whether it is safe to eliminate a duplicate load, replace a load with the value originally stored in the slot, or lift a load out of a loop. In particular, the PointerBase data field may contain GC-moveable addresses, but only for internal TypedData objects and views, not for external TypedData objects or Pointers. To allow load optimizations involving the latter, the LoadField and StoreField instructions now take boolean flags for whether loads or stores from the slot are guaranteed to not be GC-moveable, to override the information from the slot argument.

Notable benchmark changes on x64 (similar for other archs unless noted):

JIT:
* FfiMemory.PointerPointer: 250.7%
* FfiStructCopy.Copy1Bytes: -26.73% (only x64)
* FfiStructCopy.Copy32Bytes: -25.18% (only x64)
* MemoryCopy.64.setRange.Pointer.Uint8: 19.36%
* MemoryCopy.64.setRange.Pointer.Double: 18.96%
* MemoryCopy.8.setRange.Pointer.Double: 17.59%
* MemoryCopy.8.setRange.Pointer.Uint8: 19.46%

AOT:
* FfiMemory.PointerPointer: 323.5%
* FfiStruct.FieldLoadStore: 483.3%
* FileIO_readwrite_64kb: 15.39%
* FileIO_readwrite_512kb (Intel Xeon): 46.22%
* MemoryCopy.512.setRange.Pointer.Uint8: 35.20%
* MemoryCopy.64.setRange.Pointer.Uint8: 55.40%
* MemoryCopy.512.setRange.Pointer.Double: 29.45%
* MemoryCopy.64.setRange.Pointer.Double: 60.37%
* MemoryCopy.8.setRange.Pointer.Double: 59.54%
* MemoryCopy.8.setRange.Pointer.Uint8: 55.40%
* FfiStructCopy.Copy32Bytes: 398.3%
* FfiStructCopy.Copy1Bytes: 1233%

TEST=vm/dart/address_local_pointer, vm/dart/pointer_as_typed_list

Issue: https://github.com/dart-lang/sdk/issues/42072
Fixes: https://github.com/dart-lang/sdk/issues/53124
Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try
Change-Id: I563e0bfac5b1ac6cf1111649934067c12891b631
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324820
Reviewed-by: Alexander Markov <alexmarkov@google.com>
Commit-Queue: Tess Strickland <sstrickl@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com> |
||
---|---|---|
bin | ||
lib | ||
.gitignore | ||
api_readme.md | ||
BUILD.gn | ||
OWNERS |