serenity/AK
Andreas Kling a3e82eaad3 AK: Introduce the new String, replacement for DeprecatedString
DeprecatedString (formerly String) has been with us since the start,
and it has served us well. However, it has a number of shortcomings
that I'd like to address.

Some of these issues are hard if not impossible to solve incrementally
inside of DeprecatedString, so let's build a new String class and move
over to it incrementally instead.

Problems in DeprecatedString:

- It assumes string allocation never fails. This makes it impossible
  to use in allocation-sensitive contexts, and is the reason we had to
  ban DeprecatedString from the kernel entirely.

- The awkward null state. DeprecatedString can be null. It's different
  from the empty state, although null strings are considered empty.
  All code is immediately nicer when using Optional<DeprecatedString>
  but DeprecatedString came before Optional, which is how we ended up
  like this.

- The encoding of the underlying data is ambiguous. For the most part,
  we use it as if it's always UTF-8, but there have been cases where
  we pass around strings in other encodings (e.g. ISO-8859-1).

- operator[] and length() are used to iterate over DeprecatedString one
  byte at a time. This is done all over the codebase, and will *not*
  give the right results unless the string is all ASCII.
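
To illustrate the last point, here is a minimal standalone sketch in
standard C++ (not AK code; count_code_points() is a hypothetical helper).
It shows that the byte length and the code point count of a string
differ as soon as the string contains non-ASCII text, so a loop over
bytes visits more positions than there are characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is already valid UTF-8. Illustration only, not AK code.
inline std::size_t count_code_points(std::string_view utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For "h\xC3\xA9llo" (the word "héllo" encoded as UTF-8), the byte length
is 6 while count_code_points() reports 5. Indexing byte 1 or byte 2 on
its own yields half of the two-byte sequence for "é".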

How we solve these issues in the new String:

- Functions that may allocate now return ErrorOr<String> so that ENOMEM
  errors can be passed to the caller.

- String has no null state. Use Optional<String> when needed.

- String is always UTF-8. This is validated when constructing a String.
  We may need to add a bypass for this in the future, for cases where
  you have a known-good string, but for now: validate all the things!

- There is no operator[] or length(). You can get the underlying data
  with bytes(), but for iterating over code points, you should be using
  a UTF-8 iterator.
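
The validate-on-construction idea can be sketched in standard C++ as
follows. This is not the AK implementation: is_valid_utf8() and
try_make_utf8_string() are hypothetical names, and std::optional merely
stands in for ErrorOr<String>, so the failure case carries no error code
here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Returns true if `bytes` is well-formed UTF-8: no overlong forms,
// no surrogates, and nothing above U+10FFFF.
inline bool is_valid_utf8(std::string_view bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        auto lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;       // number of continuation bytes expected
        std::uint32_t code_point = 0;
        std::uint32_t minimum = 0;   // smallest value allowed for this length
        if (lead < 0x80) {
            ++i;
            continue;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; code_point = lead & 0x1F; minimum = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; code_point = lead & 0x0F; minimum = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; code_point = lead & 0x07; minimum = 0x10000;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (bytes.size() - i <= extra)
            return false; // sequence is cut off at the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            auto next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80)
                return false;
            code_point = (code_point << 6) | (next & 0x3F);
        }
        if (code_point < minimum || code_point > 0x10FFFF
            || (code_point >= 0xD800 && code_point <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

// Hypothetical factory: the only way to obtain the string is through
// validation. std::nullopt plays the role of the error branch.
inline std::optional<std::string> try_make_utf8_string(std::string_view bytes)
{
    if (!is_valid_utf8(bytes))
        return std::nullopt;
    return std::string(bytes);
}
```

A caller has to check the result before using it, which is the same
shape as handling ErrorOr<String>: "caf\xE9" (ISO-8859-1 bytes) is
rejected, while "caf\xC3\xA9" is accepted.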

Furthermore, it has two nifty new features:

- String implements a small string optimization (SSO) for strings that
  can fit entirely within a pointer. This means up to 3 bytes on 32-bit
  platforms, and 7 bytes on 64-bit platforms. Such small strings will
  not be heap-allocated.

- String can create substrings without making a deep copy of the
  substring. Instead, the superstring gets +1 refcount from the
  substring, and it acts like a view into the superstring. To make
  substrings like this, use the substring_with_shared_superstring() API.
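
As a rough illustration of both features, here is a standalone sketch in
standard C++. It is not the AK implementation: SharedText and
substring_sharing_storage() are hypothetical names, std::shared_ptr
stands in for AK's reference counting, and the "one byte reserved for a
tag and length" explanation of the 3- and 7-byte limits is an
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Inline capacity if one byte of a pointer-sized slot is reserved for a
// tag and length (an assumption consistent with the limits above).
constexpr std::size_t max_inline_bytes = sizeof(void*) - 1;

// Hypothetical stand-in for a refcounted string whose substrings keep
// the parent storage alive instead of copying bytes out of it.
class SharedText {
public:
    explicit SharedText(std::string text)
        : m_storage(std::make_shared<std::string>(std::move(text)))
        , m_offset(0)
        , m_length(m_storage->size())
    {
    }

    // Returns a view into the same storage; no bytes are copied.
    // Offsets are in bytes. This sketch does not check that they fall
    // on code point boundaries.
    SharedText substring_sharing_storage(std::size_t offset, std::size_t length) const
    {
        assert(offset <= m_length && length <= m_length - offset);
        return SharedText(m_storage, m_offset + offset, length);
    }

    std::string_view bytes() const
    {
        return std::string_view(*m_storage).substr(m_offset, m_length);
    }

    // Number of SharedText objects currently sharing the storage.
    long storage_use_count() const { return m_storage.use_count(); }

private:
    SharedText(std::shared_ptr<const std::string> storage, std::size_t offset, std::size_t length)
        : m_storage(std::move(storage))
        , m_offset(offset)
        , m_length(length)
    {
    }

    std::shared_ptr<const std::string> m_storage;
    std::size_t m_offset;
    std::size_t m_length;
};
```

Taking a substring of "hello, world" at byte offset 7 with length 5
yields "world", raises the storage use count from 1 to 2, and leaves
the substring's bytes() pointing 7 bytes into the parent's buffer.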

One caveat:

- String does not guarantee that the underlying data is null-terminated
  like DeprecatedString does today. While this was nifty in a handful of
  places where we were calling C functions, it did stand in the way of
  shared-superstring substrings.
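
The conflict between null termination and shared substrings can be seen
in a small standalone sketch (standard C++, not AK code; the helper
names are hypothetical). A view into the middle of a larger buffer has
no terminator of its own, so a C function reading from its data pointer
runs on into the parent. One workaround is to copy the bytes into an
owning, terminated buffer first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Copies a byte range that has no terminator into an owning buffer
// whose c_str() is null-terminated, for use with C APIs.
inline std::string to_null_terminated(std::string_view bytes)
{
    return std::string(bytes);
}

// Example C-API call site: strlen() needs a terminator, so it is given
// the copy rather than the view's data pointer.
inline std::size_t c_length_of(std::string_view bytes)
{
    std::string terminated = to_null_terminated(bytes);
    return std::strlen(terminated.c_str());
}
```

With a 5-byte view over the start of "hello, world", c_length_of()
returns 5, whereas calling strlen() directly on the view's data pointer
would return 12 because it only stops at the parent's terminator.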
2022-12-06 15:21:26 +01:00
.clang-tidy
AllOf.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
AnyOf.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
ArbitrarySizedEnum.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Array.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Assertions.cpp AK: Print VERIFY() error messages in release builds 2022-10-06 15:29:38 +02:00
Assertions.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Atomic.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
AtomicRefCounted.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Badge.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Base64.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
Base64.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
BinaryBufferWriter.h
BinaryHeap.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
BinarySearch.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
BitCast.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Bitmap.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
BitmapView.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
BitStream.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Buffered.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
BuiltinWrappers.h AK: Implement FloatExtractor<f128> 2022-12-02 16:22:51 +01:00
BumpAllocator.h AK: Take the bump-allocated chunk header into account in destroy_all() 2022-12-06 11:19:50 +01:00
ByteBuffer.h Everywhere: Remove redundant inequality comparison operators 2022-11-06 10:25:08 -07:00
ByteReader.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
CharacterTypes.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Checked.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
CheckedFormatString.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
CircularDeque.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
CircularDuplexStream.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
CircularQueue.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
CMakeLists.txt AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
Complex.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Concepts.h AK: Export Details and Concepts into the AK namespace 2022-11-27 23:54:40 +01:00
DateConstants.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
DateTimeLexer.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Debug.h.in LibVideo: Add PlaybackManager to load and decode videos 2022-10-31 14:47:13 +01:00
Demangle.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
DeprecatedString.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
DeprecatedString.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
DisjointChunks.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
DistinctNumeric.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
DoublyLinkedList.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Endian.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
EnumBits.h
Error.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
ExtraMathConstants.h
FileStream.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Find.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
FixedArray.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
FixedPoint.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
FloatingPoint.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
FloatingPointStringConversions.cpp Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
FloatingPointStringConversions.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
FlyString.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
FlyString.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Format.cpp AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Format.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Forward.h AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
FPControl.h
Function.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
FuzzyMatch.cpp
FuzzyMatch.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
GenericLexer.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
GenericLexer.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
GenericShorthands.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
HashFunctions.h
HashMap.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
HashTable.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Hex.cpp AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Hex.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
IDAllocator.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
IntegralMath.h AK: Implement FloatExtractor<f128> 2022-12-02 16:22:51 +01:00
IntrusiveDetails.h
IntrusiveList.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
IntrusiveListRelaxedConst.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
IntrusiveRedBlackTree.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
IPv4Address.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
IPv6Address.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
IterationDecision.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Iterator.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
JsonArray.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonArraySerializer.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
JsonObject.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonObjectSerializer.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
JsonParser.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonParser.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
JsonPath.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonPath.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonValue.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
JsonValue.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
kmalloc.cpp Everywhere: Replace uses of __serenity__ with AK_OS_SERENITY 2022-10-10 12:23:12 +02:00
kmalloc.h AK: Fully qualify some usages of AK features outside of the AK namespace 2022-11-27 23:54:40 +01:00
kstdio.h Everywhere: Replace uses of __serenity__ with AK_OS_SERENITY 2022-10-10 12:23:12 +02:00
LEB128.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
LexicalPath.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LexicalPath.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
MACAddress.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
Math.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
MemMem.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Memory.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
MemoryStream.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
NeverDestroyed.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
NoAllocationGuard.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Noncopyable.h
NonnullOwnPtr.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
NonnullOwnPtrVector.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
NonnullPtrVector.h
NonnullRefPtr.h Everywhere: Remove 'clang-format off' comments that are no longer needed 2022-12-03 23:52:23 +00:00
NonnullRefPtrVector.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
NumberFormat.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
NumericLimits.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Optional.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
OwnPtr.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Platform.h Everywhere: Remove 'clang-format off' comments that are no longer needed 2022-12-03 23:52:23 +00:00
PrintfImplementation.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Ptr32.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Queue.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
QuickSort.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Random.cpp
Random.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
RecursionDecision.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
RedBlackTree.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
RefCounted.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
RefCountForwarder.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
RefPtr.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Result.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
ReverseIterator.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
ScopedValueRollback.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
ScopeGuard.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
ScopeLogger.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
SIMD.h
SIMDExtras.h
SIMDMath.h
Singleton.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
SinglyLinkedList.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
SinglyLinkedListWithCount.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
SourceGenerator.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
SourceLocation.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Span.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Stack.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
StackInfo.cpp Everywhere: Add support for compilation under emscripten 2022-11-26 02:23:15 +03:30
StackInfo.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Statistics.h
StdLibExtraDetails.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
StdLibExtras.h Everywhere: Remove 'clang-format off' comments that are no longer needed 2022-12-03 23:52:23 +00:00
Stream.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
String.cpp AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
String.h AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
StringBuilder.cpp AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
StringBuilder.h AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
StringFloatingPointConversions.cpp Everywhere: Fix a few comment typos 2022-11-09 16:00:32 +00:00
StringFloatingPointConversions.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
StringHash.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
StringImpl.cpp
StringImpl.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
StringUtils.cpp AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
StringUtils.h AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
StringView.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
StringView.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
TemporaryChange.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Time.cpp Lagom: Win32 support baby steps 2022-09-29 17:01:22 +01:00
Time.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Traits.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Trie.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Try.h AK: Document the non-standard extensions in TRY 2022-10-16 22:05:42 +02:00
Tuple.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
TypeCasts.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
TypedTransfer.h AK: Use TypedTransfer to move vector's inline buffer 2022-11-17 20:13:04 +03:30
TypeList.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Types.h AK: Fully qualify some usages of AK features outside of the AK namespace 2022-11-27 23:54:40 +01:00
UBSanitizer.h
UFixedBigInt.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
UnicodeUtils.h
URL.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
URL.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
URLParser.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
URLParser.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Userspace.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
Utf8View.cpp AK: Add Utf8View::iterator_at_byte_offset_without_validation() 2022-11-24 16:06:20 +00:00
Utf8View.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Utf16View.cpp AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Utf16View.h AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
Utf32View.h AK: Make it possible to not using AK classes into the global namespace 2022-11-26 15:51:34 +01:00
UUID.cpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
UUID.h Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
Variant.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Vector.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
Weakable.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00
WeakPtr.h Everywhere: Run clang-format 2022-12-03 23:52:23 +00:00