Truncating the value is mathematically incorrect, this error made the
conversion to grayscale unstable. In other world, calling `to_grayscale`
on a gray value would return a different value. As an example,
`Color::from_string("#686868ff"sv).to_grayscale()` used to return
#676767ff.
Merging registers, constants and locals into single vector means:
- Better data locality
- No need to check type in Interpreter::get() and Interpreter::set()
which are very hot functions
Performance improvement is visible in almost all Octane and Kraken
tests.
Instead of copying the image data pixel-by-pixel, we can memcpy full
scanlines at a time.
This knocks a 4% item down to <1% in profiles of Another World JS.
Filling a typed array with an integer shouldn't have to go through the
generic Set for every element.
This knocks a 7% item down to <1% in profiles of Another World JS at
https://cyxx.github.io/another_js/
When comparing two numbers, we can avoid a lot of implicit type
conversion nonsense and go straight to comparison, saving time in the
most common case.
To align the cursor theme tab to how DisplaySettings themes tab works,
this change forces the theme combo box to not allow free-text. Currently
on a click it puts the text cursor in the box to allow typing anything
rather than acting as a dropdown when clicking anywhere on the field.
Fixes#24306
These were out-of-line because we had some ideas about marking
instruction streams PROT_READ only, but that seems pretty arbitrary and
there's a lot of performance to be gained by putting these inline.
Performing a lookup in the blob URL registry does not work in the case
of a web worker - as the registry is not shared between processes.
However - the URL itself passed to a worker has the blob attached to it,
which we can pull out of the URL on a fetch.
Implement this function to the spec, and use the full blown URL parser
that handles blob URLs, instead of the basic-url-parser.
Also clean up a FIXME that does not seem relevant any more.
Instead, generate bytecode to execute their AST nodes and save the
resulting operands inside the NewClass instruction.
Moving property expression evaluation to happen before NewClass
execution also moves along creation of new private environment and
its population with private members (private members should be visible
during property evaluation).
Before:
- NewClass
After:
- CreatePrivateEnvironment
- AddPrivateName
- ...
- AddPrivateName
- NewClass
- LeavePrivateEnvironment
This matches the IDL definition. Without this we experience a compile
time error when adding the WheelEvent constructor due to a mimatched
type between the enum class and UnsignedLong that the prototype is
passing through to the event init struct for the constructor.
This was an unfortunate artifact from development. I originally added
this class in the DOM namespace alongside HTMLCollection until I
realized it was defined by the HTML spec instead. To remind myself to
move it over to the HTML namespace, I left this comment. It was moved
to the correct namespace before upstreaming to the main repo, but I
forgot to remove this comment!
Once we see a frame with transparent pixels, we now toggle the
"has alpha" bit in the header.
To not require a SeekableStream opened for reading, we now pass the
unmodified original flag bit to WebPAnimationWriter.
The high-level design is that we have a static method on WebPWriter that
returns an AnimationWriter object. AnimationWriter has a virtual method
for writing individual frames. This allows streaming animations to disk,
without having to buffer up the entire animation in memory first.
The semantics of this function, add_frame(), are that data is flushed
to disk every time the function is called, so that no explicit `close()`
method is needed.
For some formats that store animation length at the start of the file,
including WebP, this means that this needs to write to a SeekableStream,
so that add_frame() can seek to the start and update the size when a
frame is written.
This design should work for GIF and APNG writing as well. We can move
AnimationWriter to a new header if we add writers for these.
Currently, `animation` can read any animated image format we can read
(apng, gif, webp) and convert it to an animated webp file.
The written animated webp file is not compressed whatsoever, so this
creates large output files at the moment.
This patch stops emitting the BlockDeclarationInstantiation instruction
when there are no locals, and no function declarations in the scope.
We were spending 20% of CPU time on https://ventrella.com/Clusters/ just
creating empty environments for no reason.
Instead of relying on native stack overflows to kick us out of circular
proxy chains, we now keep track of the recursion depth and kick
ourselves out if it exceeds 10'000.
This fixes an issue where compiler tail/sibling call optimizations would
turn infinite recursion into infinite loops, and thus never hit a stack
overflow to kick itself out.
For whatever reason, we've only seen the issue on SerenityOS with UBSAN,
but it could theoretically happen on any platform.
If the minimal amount of required bindings is known in advance, it could
be used to ensure capacity to avoid resizing the internal vector that
holds bindings.
By doing that all instructions required for instantiation are emitted
once in compilation and then reused for subsequent calls, instead of
running generic instantiation process for each call.
Fix the incorrect parsing in `Chess::Move::from_algebraic`
of moves with a capture and promotion (e.g. gxf8=Q) due to
the promoted piece type not being passed through to the
`Board::is_legal` method. This addresses a FIXME regarding
this issue in `ChessWidget.cpp`.
Instead of storing a full JS::Completion for the "throw completion"
case, we now store a simple JS::Value (wrapped in ErrorValue for the
type system).
This shrinks TCO<void> and TCO<Value> (the most common TCOs) by 8 bytes,
which has a non-trivial impact on performance.
If the callee is already a temporary register, we don't need to copy it
to *another* temporary before evaluating arguments. None of the
arguments will clobber the existing temporary anyway.
We only need to make copies of locals here, in case the locals are
modified by something like increment/decrement expressions.
Registers and constants can slip right through, without being Mov'ed
into a temporary first.
We know that `undefined` in the global scope is always the proper
undefined value. This commit takes advantage of that by simply emitting
a constant undefined value instead.
Unfortunately we can't be so sure in other scopes.
This helps some of the Cloudflare Turnstile stuff run faster, since they
are deliberately screwing with JS engines by asking us to do a bunch of
bitwise operations on e.g 65535.56
By rounding such values in bytecode generation, the interpreter can stay
on the happy path while executing, and finish quite a bit faster.
We now fuse sequences like [LessThan, JumpIf] to JumpLessThan.
This is only allowed for temporaries (i.e VM registers) with no other
references to them.
Specify the time in seconds to wait for a reply for each packet sent.
Replies received out of order will not be printed as replied but they
will be considered as replied when calculating statistics. Setting it
to 0 means infinite timeout.
In flood ping mode, the time interval between each request is set
to zero to provide a rapid display of how many packets are being
dropped. For each request a period '.' is printed, while for every
reply a backspace is printed.
Before this change, switch codegen would interleave bytecode like this:
(test for case 1)
(code for case 1)
(test for case 2)
(code for case 2)
This meant that we often had to make many large jumps while looking for
the matching case, since code for each case can be huge.
It now looks like this instead:
(test for case 1)
(test for case 2)
(code for case 1)
(code for case 2)
This way, we can just fall through the tests until we hit one that fits,
without having to make any large jumps.
This removes a layer of indirection in the bytecode where we had to make
sure all the initializer elements were laid out in sequential registers.
Array expressions no longer clobber registers permanently, and they can
be reused immediately afterwards.
This patch adds a register freelist to Bytecode::Generator and switches
all operands inside the generator to a new ScopedOperand type that is
ref-counted and automatically frees the register when nothing uses it.
This dramatically reduces the size of bytecode executable register
windows, which were often in the several thousands of registers for
large functions. Most functions now use less than 100 registers.
The first block in every executable will always execute first, so if it
ends up doing a ResolveThisBinding, it's fine for all other blocks
within the same executable to use the same `this` value.
In the common case, parseInt() is being called with strings that don't
have leading whitespace. By checking for that first, we can avoid the
cost of trimming and multiple malloc/GC allocations.
Once executed, this instruction will always produce the same result
in subsequent executions, so it's okay to cache it.
Unfortunately it may throw, so we can't just hoist it to the top of
every executable, since that would break observable execution order.
Two bugs:
1. Correctly set bits in VP8X header.
Turns out these were set in the wrong order.
2. Correctly set the `has_alpha` flag.
Also add a test for writing webp files with icc data. With the
additional checks in other commits in this PR, this test catches
the bug in WebPWriter.
Rearrange some existing functions to make it easier to write this test:
* Extract encode_bitmap() from get_roundtrip_bitmap().
encode_bitmap() allows passing extra_args that the test uses to pass
in ICC data.
* Extract expect_bitmaps_equal() from test_roundtrip()
If this turns out to be too strict in practice, we can replace it with
a `dbgln("VP8X and VP8L headers disagree about alpha; ignoring VP8X");`
instead.
ALso update catdog-alert-13-alpha-used-false.webp to not trigger this.
I had manually changed the VP8L alpha flag at offset 0x2a in
da48238fbd to clear it, but I hadn't changed the VP8X flag.
This changes the byte at offset 0x14 from 0x10 (has_alpha) to 0x00
(no alpha) as well, to match.
If this turns out to be too strict in practice, we can replace it with
a dbgln() with the same message, but with with ", ignoring header."
tacked on at the end.