This new Kernel StdLib function will be used to copy contents of a
FixedStringBuffer with a null character to a user process.
The first user of this new function is the prctl option of
PR_GET_PROCESS_NAME which would copy a process name including a null
character to a user provided buffer.
There was a small mishmash of argument order, as seen on the table:
| Traits<T>::equals(U, T) | Traits<T>::equals(T, U)
============= | ======================= | =======================
uses equals() | HashMap | Vector, HashTable
defines equals() | *String[^1] | ByteBuffer
[^1]: String, DeprecatedString, their Fly-type equivalents and KString.
This mostly meant that you couldn't use a StringView for finding a value
in Vector<String>.
I'm changing the order of arguments to make the trait type itself first
(`Traits<T>::equals(T, U)`), as I think it's more expected and makes us
more consistent with the rest of the functions that put the stored type
first (like StringUtils functions and binary_serach). I've also renamed
the variable name "other" in find functions to "entry" to give more
importance to the value.
With this change, each of the following lines will now compile
successfully:
Vector<String>().contains_slow("WHF!"sv);
HashTable<String>().contains("WHF!"sv);
HashMap<ByteBuffer, int>().contains("WHF!"sv.bytes());
The proper syntax for defining user-defined literals does not require a
space between the `operator""` token and the operator name:
> error: identifier 'sv' preceded by whitespace in a literal operator
> declaration is deprecated
This change introduces HeapFunction, which is intended to be used as a
replacement for SafeFunction. The new type behaves like a regular
GC-allocated object, which means it needs to be visited from
visit_edges, and unlike SafeFunction, it does not create new roots for
captured parameters.
Co-Authored-By: Andreas Kling <kling@serenityos.org>
IFF was a generic container fileformat that was popular on the Amiga
since it was the only file format supported by Deluxe Paint.
ILBM is an image format popular in the late eighties/nineties
that uses the IFF container.
This is a very first version of the decoder that only supports
(byterun) compressed files with bpp <= 8.
Only the minimal chunks are decoded: CMAP, BODY, BMHD.
I am planning to add support for the following variants:
- EHB (32 colours + lighter 32 colours)
- HAM6 / HAM8 (special mode that allowed to display the whole Amiga
4096 colours / 262 144 colours palette)
- TrueColor (24bit)
Things that could be fun to do:
- Still images could be animated using color cycle information
The web specs do not expect decoding or decoding to happen when calling
these helpers. This allows us to remove the raw_fragment helper function
from the URL class.
This encoder can handle all integer formats and sample rates, though
only two channels well. It uses fixed LPC and performs a
close-to-optimal parameter search on the LPC order and residual Rice
parameter, leading to decent compression already.
Just like with input buffered streams, we don't currently have a use
case for output buffered streams which aren't seekable, since the main
application are files.
To ensure this happens without duplicating code, we allow forcing a
StringBuilder object to only use the inline buffer, so the code in the
AK/Format.cpp file doesn't need to deal with different underlying
storage types (expandable or inline-fixed) at all.
I couldn't run the parser in a debugger like I normally would, so I
added printouts to understand where the parser is failing.
More could be added, but these are enough to get a good idea of what
the parser is doing. It's very spammy, though, so enable it by flicking
the IMAP_PARSER_DEBUG switch :^)
Instead, use the FixedCharBuffer class to ensure we always use a static
buffer storage for these names. This ensures that if a Process or a
Thread were created, there's a guarantee that setting a new name will
never fail, as only copying of strings should be done to that static
storage.
The limits which are set are 32 characters for processes' names and 64
characters for thread names - this is because threads' names could be
more verbose than processes' names.
This class encapsulates a fixed Array with compile-time size definition
for storing ASCII characters.
There are also new Kernel StdLib functions to copy user data into such
objects so this class will be useful later on.
The spec indicates we should support serializing opaque hosts, but we
were assuming the host contained a String. Opaque hosts are represented
with Empty. Return an empty string here instead to prevent crashing on
an invalid variant access.
Now that ""_string is infallible, the only benefit of explicitly
constructing a short string is the ability to do it at compile-time. But
we never do that, so let's simplify the API and remove this
implementation detail from it.
This could happen if a sequence of '0' parts was followed by a longer
sequence of '0' parts at the end of the host. The first sequence was
being used for the compress, and not the second.
For example, [1:1:0:0:1:0:0:0] was being serialized as: [1:1::1:0:0:0]
instead of [1:1:0:0:1::].
Fix this by checking at the end of the loop if we are in the middle of a
sequence of '0' parts that is longer than the current longest.
Everywhere only ever expects percent encoding to occur, so let's just
remove this flag altogether. At the same time, replace some
DeprecatedString with StringView.
Parsing 'data:' URLs took it's own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.
Because parsing 'data:' didn't use standard fields, running the
following JS code:
new URL('#a', 'data:text/plain,hello').toString()
not only cleared the path as URLParser doesn't check for data from
data_payload() function (making the result be 'data:#a'), but it also
crashes the program because we forbid having an empty MIME type when we
serialize to string.
With this change, 'data:' URLs will be parsed like every other URLs.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
By not clearing the buffer, we were leaking the path part of a URL into
the query for URLs without an authority component (no '//host').
This could be seen most noticeably in mailto: URLs with header fields
set, as the query part of `mailto:user@example.com?subject=test` was
parsed to `user@example.comsubject=test`.
data: URLs didn't have this problem, because we have a special case for
parsing them.
This is defined in the spec, but was missing in our table. Fix this, and
add a spec comment for what is missing. Also begin a basic text based
test for URL, so we can get some coverage of LibWeb's usage of URL too.
This takes the previous alternation optimisation and applies it to all
the alternation blocks instead of just the few instructions at the
start.
By generating a trie of instructions, all logically equivalent
instructions will be consolidated into a single node, allowing the
engine to avoid checking the same thing multiple times.
For instance, given the pattern /abc|ac|ab/, this optimisation would
generate the following tree:
- a
| - b
| | - c
| | | - <accept>
| | - <accept>
| - c
| | - <accept>
which will attempt to match 'a' or 'b' only once, and would also limit
the number of backtrackings performed in case alternatives fails to
match.
This optimisation is currently gated behind a simple cost model that
estimates the number of instructions generated, which is pessimistic for
small patterns, though the change in performance in such patterns is not
particularly large.
Similar to floor and ceil, we were forcing the values to be doubles here
Also adds a big FIXME about my findings trying to add a general
implementation for these
Both GCC and Clang inline this function to use bit-wise logic and/or
appropriate instructions even on -O0 and allow their use in a constexpr
context, see
https://godbolt.org/z/de1393vha
In order to follow spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to deserialized format.
Before this, we were parsing the host and then immediately serializing
it again.
Making that change resulted in a whole bunch of fallout.
After this change, callers can access the serialized data through
this concept-host-serializer. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
This implementation will allow us to fix serialization of IPv6
addresses not being surrounded by '[' and ']'.
Nothing is calling this function yet - this will come in the next
(larger) commit where the underlying host representation inside of
AK::URL is changed from DeprecatedString to URL::Host.
This doesn't seem trivial enough to be defining in the header like this,
and should not be a performance critical function anyhow.
Also add spec comments while we are at it, and a FIXME since we do not
seem to exactly align.
And use them where applicable. This will allow us to store the host in
the deserialized format as the spec specifies.
Ideally these typdefs would instead be the existing AK interfaces, but
in the meantime, we can just use this.
These methods are slightly more convenient than storing the Bytes
separately. However, it it feels unsanitary to reach in and access this
data directly. Both of the users of these already have the
[Readonly]Bytes available in their constructors, and can easily avoid
using these methods, so let's remove them entirely.
Due to overload resolutions rules, this simple code provokes a crash:
ReadonlyBytes readonly_bytes{};
FixedMemoryStream stream{readonly_bytes};
ReadonlyBytes give_them_back{stream.bytes()};
// -> Panics on VERIFY(m_writing_enabled);
// but this is fine:
auto bytes = static_cast<FixedMemoryStream const&>(*stream).bytes()
If we need to be explicit about it, let's rename the overload instead of
adding that `static_cast`.
I misunderstood the spec step for checking whether the host 'ends with a
number'. We can't simply check for it if ends with a number, this check
is actually an algorithm which is required to avoid detecting hosts that
end with a number from an IPv4 host.
Implement this missing step, and add a test to cover this.
clamp_to_int clamps value to valid range of int values so resulting
value does not overflow.
It is going to be used to clamp float or double values to int that
represents fixed-point value of CSSPixels.
This reverts commit d48c68cf3f.
Unfortunately, this currently copies some warn() invocations that we do
*not* want in the debug console, such as test-js's use of OSC command 9
to report progress.
This is just a straight (and fairly inefficient) implementation of IPv6
parsing and serialization from the URL spec.
Note that we don't use AK::IPv6Address here because the URL spec
requires a specific serialization behavior.
The array which contains the actual parameters is always located
immediately after the base `TypeErasedFormatParams` object of
`VariadicFormatParams`. Hence, storing a pointer to it inside a `Span`
is redundant. Changing it to a zero-length array saves 8 bytes.
Secondly, we limit the number of parameters to 256, so `m_size` and
`m_next_index` can be stored in a smaller data type than `size_t`,
saving us another 8 bytes.
This decreases the size of a single-element `VariadicFormatParams` from
48 to 32 bytes, thus reducing the code size overhead of setting up
parameters for `dbgln()`.
Note that [arrays of length zero][1] are a GNU extension, but it's used
elsewhere in the codebase already and is explicitly supported by Clang
and GCC.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Xcode 15 betas 1-3 lack https://reviews.llvm.org/D135772, which fixes a
bug that causes trailing `requires` clauses to be evaluated twice,
failing the second time. Reported as FB12284201.
This caused compile errors when instantiating types derived from RefPtr:
> error: invalid reference to function 'NonnullRefPtr': constraints not
> satisfied
> note: because substituted constraint expression is ill-formed: value
> of type '<dependent type>' is not contextually convertible to 'bool'.
This commit works around the issue by moving the `requires` clauses
after the template parameter list.
In most cases, trailing `requires` clauses and those specified after the
template parameter list work identically, so this change should not
impact the code's behavior. The only difference is that trailing
requires clauses are evaluated *after* constrained placeholder types
(i.e. `Integral auto i` function parameter).
All elements of the vector were moved to the left, for each element to
remove. This patch makes the function move each element exactly once.
On the same test case as the previous commit, it makes the function
disappear from the profile. These two commits combined reduce the
decompression time by 12%.
As confusing as it may sound, reusing them is terrible performance wise.
When profiling the PNG decoder, the result (which is dominated by the
Zlib decompression) shows that the `cleanup_unused_chunks()` function
represented 14.26% of the profile before this patch and only 7.7%
afterward.
On a 6.5 MB PNG image, it reduces the decompression time by more than
5%.
This uses one of Sun OS's algorithms, for a comparison to other
algorithms please refer to
https://gist.github.com/Hendiadyoin1/f58346d66637deb9156ef360aa158bf9
This is used on aarch64 builds and for x86 floats and doubles
for performance gains check
https://quick-bench.com/q/_2jTykshP6cUqtgdepFaoQ53YC8
which shows approximately 2x gains
Co-Authored-By: Ben Wiederhake <BenWiederhake.GitHub@gmx.de>
Co-Authored-By: kleines Filmröllchen <filmroellchen@serenityos.org>
Co-Authored-By: Dan Klishch <danilklishch@gmail.com>
This now searches the memory in blocks, which should be slightly more
efficient. However, it doesn't make much difference (e.g. ~1% in LZMA
compression) in most real-world applications, as the non-hint function
is more expensive by orders of magnitude.