Commit graph

3202 commits

Author SHA1 Message Date
Andreas Kling 25eee91811 AK: Make "foo"_fly_string infallible
Stop worrying about tiny OOMs.

Work towards #20405.
2023-08-07 16:03:27 +02:00
Andreas Kling 34344120f2 AK: Make "foo"_string infallible
Stop worrying about tiny OOMs.

Work towards #20405.
2023-08-07 16:03:27 +02:00
aryanbaburajan a94c0eea94 AK: Add trim_ascii_whitespace method to String 2023-08-06 22:21:10 +02:00
Shannon Booth faf9d08371 AK: Fix IPv6 serialization on multiple '0' parts ending in a '0' part
This could happen if a sequence of '0' parts was followed by a longer
sequence of '0' parts at the end of the host. The first sequence was
being used for the compress, and not the second.

For example, [1:1:0:0:1:0:0:0] was being serialized as: [1:1::1:0:0:0]
instead of [1:1:0:0:1::].

Fix this by checking at the end of the loop if we are in the middle of a
sequence of '0' parts that is longer than the current longest.
2023-08-06 10:53:32 +02:00
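
The fix can be illustrated outside of AK. The following is a minimal sketch, not the AK implementation: `serialize_ipv6_sketch` and its variable names are hypothetical, and it assumes the address is already available as eight 16-bit pieces. It tracks the longest run of two or more zero pieces and, as the commit describes, checks once more after the loop for a run that reaches the end of the address.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper (not AK code): serializes eight 16-bit pieces and
// compresses the first longest run of two or more zero pieces into "::".
std::string serialize_ipv6_sketch(std::array<uint16_t, 8> const& pieces)
{
    std::optional<size_t> compress;  // start index of the run to compress
    size_t longest_length = 1;       // a run must be longer than one piece
    std::optional<size_t> run_start; // start index of the run being scanned
    size_t run_length = 0;

    for (size_t i = 0; i < pieces.size(); ++i) {
        if (pieces[i] == 0) {
            if (!run_start)
                run_start = i;
            ++run_length;
            continue;
        }
        if (run_length > longest_length) {
            compress = run_start;
            longest_length = run_length;
        }
        run_start.reset();
        run_length = 0;
    }
    // The step this commit adds: a zero run that reaches the end of the
    // address is never closed by a non-zero piece, so check it here.
    if (run_length > longest_length) {
        compress = run_start;
        longest_length = run_length;
    }

    std::ostringstream out;
    out << std::hex;
    bool skipping_zeros = false;
    for (size_t i = 0; i < pieces.size(); ++i) {
        if (skipping_zeros && pieces[i] == 0)
            continue;
        skipping_zeros = false;
        if (compress && *compress == i) {
            out << (i == 0 ? "::" : ":");
            skipping_zeros = true;
            continue;
        }
        out << pieces[i];
        if (i != pieces.size() - 1)
            out << ':';
    }
    return out.str();
}
```

With the post-loop check in place, `serialize_ipv6_sketch({1, 1, 0, 0, 1, 0, 0, 0})` yields `1:1:0:0:1::`. Without it, the earlier two-piece run would be chosen instead.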
Shannon Booth db5ad0c2b0 AK: Remove ApplyPercentDecoding from URL
Nothing was changing this flag from its default.
2023-08-06 08:57:23 +02:00
Shannon Booth 98666b012d AK: Remove URL::ApplyPercentEncoding
Every caller expects percent encoding to occur, so let's just
remove this flag altogether. At the same time, replace some
DeprecatedString with StringView.
2023-08-06 08:57:23 +02:00
Shannon Booth c4d7be100e AK: Directly append URL paths where applicable
This is a little closer to the spec text, and helps us avoid using
the ApplyPercentEncoding flag.
2023-08-06 08:57:23 +02:00
Hendiadyoin1 127b966219 AK: Expose Checked::saturating_[add|sub] as static helpers 2023-08-05 20:03:09 +02:00
Hendiadyoin1 af161a8b83 AK+LibWeb: Round to int in clamp_to_int instead of truncating
This caused inaccuracies in float->CssPixel conversions
2023-08-05 20:03:09 +02:00
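
The difference between truncating and rounding is easy to see in isolation. The sketch below is an assumption-laden stand-in, not the code in AK/Math.h: `clamp_to_int_sketch` is a hypothetical name, it uses `std::round` from the standard library, and returning 0 for NaN is a choice made only for this example.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a clamp-to-int helper (not the AK source).
// Values outside the range of int are clamped; values inside are rounded
// to the nearest integer instead of being truncated toward zero.
template<typename T>
int clamp_to_int_sketch(T value)
{
    if (std::isnan(value))
        return 0; // assumption for this sketch only
    if (value >= static_cast<T>(std::numeric_limits<int>::max()))
        return std::numeric_limits<int>::max();
    if (value <= static_cast<T>(std::numeric_limits<int>::min()))
        return std::numeric_limits<int>::min();
    return static_cast<int>(std::round(value));
}
```

For example, `clamp_to_int_sketch(2.7)` gives 3, whereas a plain `static_cast<int>(2.7)` gives 2. Errors of that kind are the sort of inaccuracy the commit refers to.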
Karol Kosek eb41f0144b AK: Decode data URLs to separate class (and parse like every other URL)
Parsing 'data:' URLs took its own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.

Because parsing 'data:' didn't use standard fields, running the
following JS code:

    new URL('#a', 'data:text/plain,hello').toString()

not only cleared the path, as URLParser doesn't check for data from the
data_payload() function (making the result 'data:#a'), but also crashed
the program, because we forbid having an empty MIME type when we
serialize to string.

With this change, 'data:' URLs will be parsed like every other URL.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
2023-08-01 14:19:05 +02:00
Karol Kosek 58017a0581 AK: Clear buffer after leaving CannotBeABaseUrlPath in URLParser
By not clearing the buffer, we were leaking the path part of a URL into
the query for URLs without an authority component (no '//host').

This could be seen most noticeably in mailto: URLs with header fields
set, as the query part of `mailto:user@example.com?subject=test` was
parsed to `user@example.comsubject=test`.

data: URLs didn't have this problem, because we have a special case for
parsing them.
2023-08-01 10:10:07 +02:00
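
The leak is a classic shared-buffer bug in a state machine. Below is a heavily reduced sketch, assuming only two states (opaque path and query) and ignoring fragments and percent-encoding; `split_opaque_path_and_query_sketch` and `OpaquePartsSketch` are hypothetical names, not URLParser API.

```cpp
#include <string>

struct OpaquePartsSketch {
    std::string path;
    std::string query;
};

// Hypothetical miniature of the two parser states involved. `input` is the
// part of the URL after "scheme:", e.g. "user@example.com?subject=test".
OpaquePartsSketch split_opaque_path_and_query_sketch(std::string const& input)
{
    OpaquePartsSketch result;
    std::string buffer; // shared between states, like the parser's buffer
    bool in_query = false;

    for (char c : input) {
        if (!in_query && c == '?') {
            result.path = buffer;
            // The fix: without this clear(), the path characters would
            // still be in the buffer when the query state starts appending.
            buffer.clear();
            in_query = true;
            continue;
        }
        buffer += c;
    }

    if (in_query)
        result.query = buffer;
    else
        result.path = buffer;
    return result;
}
```

For `user@example.com?subject=test` this yields the path `user@example.com` and the query `subject=test`. Removing the `buffer.clear()` line reproduces the `user@example.comsubject=test` query from the commit message.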
Shannon Booth aa7ca80d7c AK: Fix missing step for serialization of IPv6 hosts
This was resulting in the incorrect host serialization of:

http://[0:1:0:1:0:1:0:1] to [::1:0:1:0:1:0:1]

and:

http://[1:0:1:0:1:0:1:0] to [1::1:0:1:0:1:0]
2023-07-31 14:48:24 +02:00
Shannon Booth 4fdd4dd979 AK: Add missing default port definitions for FTP scheme URLs
This is defined in the spec, but was missing in our table. Fix this, and
add a spec comment for what is missing. Also begin a basic text based
test for URL, so we can get some coverage of LibWeb's usage of URL too.
2023-07-31 14:48:24 +02:00
Ali Mohammad Pur 4e69eb89e8 LibRegex: Generate a search tree when patterns would benefit from it
This takes the previous alternation optimisation and applies it to all
the alternation blocks instead of just the few instructions at the
start.
By generating a trie of instructions, all logically equivalent
instructions will be consolidated into a single node, allowing the
engine to avoid checking the same thing multiple times.
For instance, given the pattern /abc|ac|ab/, this optimisation would
generate the following tree:
    - a
    | - b
    | | - c
    | | | - <accept>
    | | - <accept>
    | - c
    | | - <accept>
which will attempt to match 'a' or 'b' only once, and would also limit
the number of backtrackings performed in case an alternative fails to
match.

This optimisation is currently gated behind a simple cost model that
estimates the number of instructions generated, which is pessimistic for
small patterns, though the change in performance in such patterns is not
particularly large.
2023-07-31 05:31:33 +02:00
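
The tree for /abc|ac|ab/ can be reproduced with a plain character trie. This is only a sketch of the idea: LibRegex builds its trie over bytecode instructions, whereas the hypothetical `TrieNodeSketch` below is keyed on single characters, matches whole strings only, and ignores alternative priority.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical character trie (LibRegex uses instructions as keys instead).
struct TrieNodeSketch {
    std::map<char, std::unique_ptr<TrieNodeSketch>> children;
    bool accept = false; // true if an alternative ends at this node
};

// Inserts one alternative; shared prefixes reuse existing nodes.
void insert_alternative(TrieNodeSketch& root, std::string const& alternative)
{
    TrieNodeSketch* node = &root;
    for (char c : alternative) {
        auto& child = node->children[c];
        if (!child)
            child = std::make_unique<TrieNodeSketch>();
        node = child.get();
    }
    node->accept = true;
}

// Counts the nodes below `node`, i.e. the distinct checks the trie holds.
size_t count_nodes(TrieNodeSketch const& node)
{
    size_t total = 0;
    for (auto const& entry : node.children)
        total += 1 + count_nodes(*entry.second);
    return total;
}

// Walks the trie once per input character; no alternative is retried.
bool matches_exactly(TrieNodeSketch const& root, std::string const& input)
{
    TrieNodeSketch const* node = &root;
    for (char c : input) {
        auto it = node->children.find(c);
        if (it == node->children.end())
            return false;
        node = it->second.get();
    }
    return node->accept;
}
```

Inserting "abc", "ac" and "ab" produces four nodes below the root (a, a-b, a-b-c, a-c), compared with seven character checks when the three alternatives are tried one after another.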
Ali Mohammad Pur 7a471b7cf5 AK: Allow customising Trie's underlying map type
This makes it possible to use an ordered map and keep the insertion
order intact.
2023-07-31 05:31:33 +02:00
Hendiadyoin1 07e4358c63 AK: Use correct builtins for fmod and remainder
Similar to floor and ceil, we were forcing the values to be doubles here

Also adds a big FIXME about my findings trying to add a general
implementation for these
2023-07-31 05:22:12 +02:00
Hediadyoin1 c9808f0d4a AK: Use correct builtins for floor and ceil
We were using the double ones, forcing casts to and from them for floats
2023-07-31 05:22:12 +02:00
Hediadyoin1 29e0494e56 AK: Use builtins in fabs implementation and move it to the top
Both GCC and Clang inline this function to use bit-wise logic and/or
appropriate instructions even on -O0 and allow their use in a constexpr
context, see
https://godbolt.org/z/de1393vha
2023-07-31 05:22:12 +02:00
Hediadyoin1 594369121a AK: Add trunc and rint to AK/Math.h
These are useful in some algorithms, which require specific rounding.
2023-07-31 05:22:12 +02:00
Hediadyoin1 6573ace8f1 AK: Move rounding function to the top of AK/Math.h
These are useful in other algorithms, so let's move them up
2023-07-31 05:22:12 +02:00
Shannon Booth 8751be09f9 AK: Serialize URL hosts with 'concept-host-serializer'
In order to follow spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to deserialized format.
Before this, we were parsing the host and then immediately serializing
it again.

Making that change resulted in a whole bunch of fallout.

After this change, callers can access the serialized data through
this concept-host-serializer. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
2023-07-31 05:18:51 +02:00
Shannon Booth 768f070b86 AK: Implement 'concept-host-serializer' in URL spec
This implementation will allow us to fix serialization of IPv6
addresses not being surrounded by '[' and ']'.

Nothing is calling this function yet - this will come in the next
(larger) commit where the underlying host representation inside of
AK::URL is changed from DeprecatedString to URL::Host.
2023-07-31 05:18:51 +02:00
Shannon Booth 803ca8cc80 AK: Make serialize_ipv6_address take a StringBuilder
This will allow us to implement 'concept-host-serializer' without
needing to call String::formatted.
2023-07-31 05:18:51 +02:00
Shannon Booth a1ae701a7d AK: Move URL::cannot_have_a_username_or_password_or_port out of line
This doesn't seem trivial enough to be defining in the header like this,
and should not be a performance critical function anyhow.

Also add spec comments while we are at it, and a FIXME since we do not
seem to exactly align.
2023-07-31 05:18:51 +02:00
Shannon Booth 0c0117fc86 AK: Add typedefs for host URL definitions
And use them where applicable. This will allow us to store the host in
the deserialized format as the spec specifies.

Ideally these typedefs would instead be the existing AK interfaces, but
in the meantime, we can just use this.
2023-07-31 05:18:51 +02:00
Andrew Kaster f5e8bba092 AK: Add argument to LexicalPath::basename to strip the extension 2023-07-30 17:50:44 -06:00
Sam Atkins 3f7d97f098 AK+Libraries: Remove FixedMemoryStream::[readonly_]bytes()
These methods are slightly more convenient than storing the Bytes
separately. However, it feels unsanitary to reach in and access this
data directly. Both of the users of these already have the
[Readonly]Bytes available in their constructors, and can easily avoid
using these methods, so let's remove them entirely.
2023-07-30 19:32:52 +01:00
Shannon Booth bf7af25a82 AK: Allow testing Empty instances for equality
This also makes it possible to compare `Variant<Empty, Ts...>`
objects if operator== exists for all Ts
2023-07-28 20:47:48 +03:30
kleines Filmröllchen a0705202ea Kernel/Ext2: Write superblock backups
We don't ever read them out, but this should make fsck a lot less mad.
2023-07-28 14:51:07 +02:00
Lucas CHOLLET 18b7ddd0b5 AK: Rename the const overload of FixedMemoryStream::bytes()
Due to overload resolution rules, this simple code provokes a crash:

    ReadonlyBytes readonly_bytes{};
    FixedMemoryStream stream{readonly_bytes};
    ReadonlyBytes give_them_back{stream.bytes()};
    // -> Panics on VERIFY(m_writing_enabled);
    // but this is fine:
    auto bytes = static_cast<FixedMemoryStream const&>(stream).bytes();

If we need to be explicit about it, let's rename the overload instead of
adding that `static_cast`.
2023-07-27 14:40:00 +01:00
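
The overload-resolution behaviour behind this crash can be reproduced without AK. A minimal sketch, assuming a hypothetical `StreamSketch` type whose two overloads simply report which one was selected:

```cpp
#include <string>

// Hypothetical type with a const and a non-const overload of bytes().
struct StreamSketch {
    std::string bytes() { return "mutable overload"; }
    std::string bytes() const { return "readonly overload"; }
};

// Calls bytes() on a non-const object: the non-const overload is the
// better match, even if the caller only wants read access.
std::string call_on_non_const()
{
    StreamSketch stream;
    return stream.bytes();
}

// Forces the const overload by viewing the object through a const reference.
std::string call_through_const_reference()
{
    StreamSketch stream;
    return static_cast<StreamSketch const&>(stream).bytes();
}
```

Giving the two functions different names, as this commit does, removes the need for the cast at every read-only call site.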
Shannon Booth 7b3902e3d5 AK: Remove unused URL::scheme_requires_port
This function does not seem to be used anywhere, and I cannot find any
spec equivalent for this function.
2023-07-25 06:43:50 -04:00
Shannon Booth c8da880806 AK: Add spec comments to URL::serialize 2023-07-25 06:43:50 -04:00
Shannon Booth 177b04dcfc AK: Fix url host parsing check for 'ends in a number'
I misunderstood the spec step for checking whether the host 'ends with a
number'. We can't simply check whether it ends with a digit; this check
is actually an algorithm, which is required to tell hosts that merely
end with a digit apart from IPv4 hosts.

Implement this missing step, and add a test to cover this.
2023-07-25 06:43:50 -04:00
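
The commit does not spell out the algorithm, so the following is a sketch based on my reading of the "ends in a number checker" in the WHATWG URL Standard, not a copy of the AK code. Both function names are hypothetical, and the IPv4 number check only validates digits for the detected radix without computing a value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `input` would parse as an IPv4 number
// (decimal, octal with a leading 0, or hexadecimal with 0x/0X).
bool is_ipv4_number_sketch(std::string input)
{
    if (input.empty())
        return false;
    int radix = 10;
    if (input.size() >= 2 && input[0] == '0' && (input[1] == 'x' || input[1] == 'X')) {
        input.erase(0, 2);
        radix = 16;
    } else if (input.size() >= 2 && input[0] == '0') {
        input.erase(0, 1);
        radix = 8;
    }
    // An empty remainder (e.g. from "0x") counts as the number zero.
    for (char c : input) {
        bool is_decimal_digit = c >= '0' && c <= '9';
        bool is_hex_letter = (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
        bool valid = radix == 16 ? (is_decimal_digit || is_hex_letter)
                                 : (c >= '0' && c < '0' + radix);
        if (!valid)
            return false;
    }
    return true;
}

// Hypothetical helper: decides whether a host must be treated as IPv4.
bool ends_in_a_number_sketch(std::string host)
{
    if (host.empty())
        return false;
    if (host.back() == '.')
        host.pop_back(); // ignore a single trailing empty part
    size_t last_dot = host.rfind('.');
    std::string last = last_dot == std::string::npos ? host : host.substr(last_dot + 1);

    // A last part made only of ASCII digits counts, even if it is not
    // valid octal (such as "09").
    if (!last.empty() && last.find_first_not_of("0123456789") == std::string::npos)
        return true;
    return is_ipv4_number_sketch(last);
}
```

Under this sketch, `1.2.3.4`, `1.2.3.4.` and `foo.0x1f` end in a number, while `example.com` and `foo.1a` do not.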
Aliaksandr Kalenik d216621d2a AK: Add clamp_to_int(value) in Math.h
clamp_to_int clamps a value to the valid range of int so that the
resulting value does not overflow.

It is going to be used to clamp float or double values to the int that
represents the fixed-point value of CSSPixels.
2023-07-25 11:52:02 +02:00
Shannon Booth 8d2ccf0f4f AK: Implement IPV4 host URL parsing to specification
This implements both the IPv4 parsing and serialization parts of
the URL spec.
2023-07-24 17:07:16 -04:00
Shannon Booth 50359567e0 AK: Add spec comments for URL spec defined member variables 2023-07-24 17:07:16 -04:00
Timothy Flynn 685c8c3d40 Revert "AK: Automatically copy all warn/warnln logs to debug console"
This reverts commit d48c68cf3f.

Unfortunately, this currently copies some warn() invocations that we do
*not* want in the debug console, such as test-js's use of OSC command 9
to report progress.
2023-07-22 12:19:53 -04:00
Lucas CHOLLET f79165cefe AK: Make FixedArray movable 2023-07-21 10:47:34 -06:00
Andrew Kaster 3533d3e452 AK: Enable consteval workaround for Android NDK
Android isn't shipping clang-15 yet in any NDK, so use the existing
workaround on that platform.
2023-07-19 04:22:28 -06:00
Andreas Kling f0ec104131 AK: Implement IPv6 host parsing in URLParser
This is just a straight (and fairly inefficient) implementation of IPv6
parsing and serialization from the URL spec.

Note that we don't use AK::IPv6Address here because the URL spec
requires a specific serialization behavior.
2023-07-17 07:47:58 +02:00
Sam Atkins d48c68cf3f AK: Automatically copy all warn/warnln logs to debug console
This is only enabled inside Serenity, as on Lagom, all out/warn/dbg logs
go to the same console anyway.
2023-07-16 00:59:13 +02:00
Shannon Booth 5625ca5cb9 AK: Rename URLParser::parse to URLParser::basic_parse
To make it more clear that this function implements
'concept-basic-url-parser' instead of 'concept-url-parser'.
2023-07-15 09:45:16 +02:00
Shannon Booth 7ef4689383 AK: Implement steps for state override in URL parser 2023-07-15 09:45:16 +02:00
Daniel Bertalan aaf1b762ea AK: Remove redundant information from TypeErasedFormatParams
The array which contains the actual parameters is always located
immediately after the base `TypeErasedFormatParams` object of
`VariadicFormatParams`. Hence, storing a pointer to it inside a `Span`
is redundant. Changing it to a zero-length array saves 8 bytes.

Secondly, we limit the number of parameters to 256, so `m_size` and
`m_next_index` can be stored in a smaller data type than `size_t`,
saving us another 8 bytes.

This decreases the size of a single-element `VariadicFormatParams` from
48 to 32 bytes, thus reducing the code size overhead of setting up
parameters for `dbgln()`.

Note that [arrays of length zero][1] are a GNU extension, but it's used
elsewhere in the codebase already and is explicitly supported by Clang
and GCC.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
2023-07-14 06:37:11 +02:00
Nico Weber 9fb0de1cfe AK: Mark Error nodiscard
...instead of manually marking all methods returning Error nodiscard.

No real behavior change.
2023-07-12 17:03:07 +02:00
Daniel Bertalan cfadbcd950 AK: Work around Xcode 15 beta mishandling trailing requires clauses
Xcode 15 betas 1-3 lack https://reviews.llvm.org/D135772, which fixes a
bug that causes trailing `requires` clauses to be evaluated twice,
failing the second time. Reported as FB12284201.

This caused compile errors when instantiating types derived from RefPtr:
> error: invalid reference to function 'NonnullRefPtr': constraints not
> satisfied
> note: because substituted constraint expression is ill-formed: value
> of type '<dependent type>' is not contextually convertible to 'bool'.

This commit works around the issue by moving the `requires` clauses
after the template parameter list.

In most cases, trailing `requires` clauses and those specified after the
template parameter list work identically, so this change should not
impact the code's behavior. The only difference is that trailing
requires clauses are evaluated *after* constrained placeholder types
(i.e. `Integral auto i` function parameter).
2023-07-12 15:43:18 +01:00
Lucas CHOLLET 398f7ae988 AK: Move chunks a single time in cleanup_unused_chunks()
All elements of the vector were moved to the left, for each element to
remove. This patch makes the function move each element exactly once.

On the same test case as the previous commit, it makes the function
disappear from the profile. These two commits combined reduce the
decompression time by 12%.
2023-07-10 21:35:10 -04:00
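
The "move each element exactly once" idea can be shown on a plain vector. This sketch assumes the unused chunks sit at the front of the container, which the commit message does not state explicitly; `drop_front_sketch` is a hypothetical name and `std::vector` stands in for AK's Vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: removes the first `count` elements in one pass.
// Each surviving element is moved left exactly once, instead of once per
// removed element as with repeated "remove index 0" calls.
template<typename T>
void drop_front_sketch(std::vector<T>& items, size_t count)
{
    if (count >= items.size()) {
        items.clear();
        return;
    }
    auto first_kept = items.begin() + static_cast<std::ptrdiff_t>(count);
    // std::move (the algorithm) shifts the kept range to the beginning.
    std::move(first_kept, items.end(), items.begin());
    // The tail now holds moved-from leftovers; drop it.
    items.erase(items.end() - static_cast<std::ptrdiff_t>(count), items.end());
}
```

Removing k elements from the front of n this way costs n - k moves, whereas removing them one at a time costs roughly k times that.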
Lucas CHOLLET 44bedf7844 AK: Don't reuse chunks in AllocatingMemoryStream
As confusing as it may sound, reusing them is terrible performance-wise.
When profiling the PNG decoder, the result (which is dominated by the
Zlib decompression) shows that the `cleanup_unused_chunks()` function
represented 14.26% of the profile before this patch and only 7.7%
afterward.

On a 6.5 MB PNG image, it reduces the decompression time by more than
5%.
2023-07-10 21:35:10 -04:00
Tim Schumacher 8a9761de20 AK: Let Array::from_span take a readonly Span
We are copying here, so there is no need to require a non-const Span.
2023-07-09 15:40:41 +01:00
Hendiadyoin1 467bc5b093 AK: Add a generic version of log2
This uses one of Sun OS's algorithms; for a comparison to other
algorithms, please refer to
https://gist.github.com/Hendiadyoin1/f58346d66637deb9156ef360aa158bf9

This is used on aarch64 builds and for x86 floats and doubles.
For performance gains, check
https://quick-bench.com/q/_2jTykshP6cUqtgdepFaoQ53YC8
which shows approximately 2x gains

Co-Authored-By: Ben Wiederhake <BenWiederhake.GitHub@gmx.de>
Co-Authored-By: kleines Filmröllchen <filmroellchen@serenityos.org>
Co-Authored-By: Dan Klishch <danilklishch@gmail.com>
2023-07-09 15:39:52 +01:00