Commit graph

34 commits

Author SHA1 Message Date
Sahan Fernando b37139e111 AK: Make JsonParser::parse_number properly parse >32bit ints 2020-12-21 00:15:44 +01:00
Tucker Polomik ad8284bac6 AK: check fractional string has_value() in JsonParser
Resolves #3670
2020-10-06 14:27:51 +02:00
Benoît Lormeau f0f6b09acb AK: Remove the ctype adapters and use the actual ctype functions instead
This finally takes care of the kind-of excessive boilerplate code that were the
ctype adapters. On the other hand, I had to link `LibC/ctype.cpp` to the Kernel
(for `AK/JsonParser.cpp` and `AK/Format.cpp`). The previous commit actually makes
sense now: the `string.h` includes in `ctype.{h,cpp}` would require to link more LibC
stuff to the Kernel when it only needs the `_ctype_` array of `ctype.cpp`, and there
wasn't any string stuff used in ctype.
Instead of all this I could have put static derivatives of `is_any_of()` in the
concerned AK files, however that would have meant more boilerplate and workarounds;
so I went for the Kernel approach.
2020-09-27 21:15:25 +02:00
Benoît Lormeau 7b356c33cb
AK: Add a GenericLexer and extend the JsonParser with it (#2696) 2020-08-09 11:34:26 +02:00
Nico Weber ce95628b7f Unicode: Try s/codepoint/code_point/g again
This time, without trailing 's'. Ran:

    git grep -l 'codepoint' | xargs sed -ie 's/codepoint/code_point/g
2020-08-05 22:33:42 +02:00
Nico Weber 19ac1f6368 Revert "Unicode: s/codepoint/code_point/g"
This reverts commit ea9ac3155d.
It replaced "codepoint" with "code_points", not "code_point".
2020-08-05 22:33:42 +02:00
Andreas Kling ea9ac3155d Unicode: s/codepoint/code_point/g
Unicode calls them "code points" so let's follow their style.
2020-08-03 19:06:41 +02:00
LepkoQQ b1c99c1891 AK: Fix JsonParser double encoding multibyte utf-8 chararcters 2020-06-20 17:04:03 +02:00
Hüseyin ASLITÜRK 0aad21fff2 AK: JsonParser, replace char type to u32 for code point 2020-06-16 13:15:17 +02:00
Matthew Olsson e8e728454c AK: JsonParser improvements
- Parsing invalid JSON no longer asserts
    Instead of asserting when coming across malformed JSON,
    JsonParser::parse now returns an Optional<JsonValue>.
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
  parse() call. Examples of things that were valid and no longer are:
    - undefineddfz
    - {"foo": 1}abcd
    - [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
2020-06-13 12:43:22 +02:00
Andreas Kling fdfda6dec2 AK: Make string-to-number conversion helpers return Optional
Get rid of the weird old signature:

- int StringType::to_int(bool& ok) const

And replace it with sensible new signature:

- Optional<int> StringType::to_int() const
2020-06-12 21:28:55 +02:00
Hüseyin ASLITÜRK 8e9776165f AK: JSON, Escape spacial character in string serialization 2020-06-03 21:52:40 +02:00
Tibor Nagy 1bf93d504f AK: Break on end of input in JsonParser::consume_quoted_string
Fixes #1599
2020-04-04 10:31:01 +02:00
Emanuel Sprung c54855682c AK: A few JSON improvements
* Add double number to object serializer

* Handle negative double numbers correctly

* Handle \r and \n in quoted strings independently
  This improves the situation when keys contain \r or \n that currently
  has the effect that "a\rkey" and "a\nkey" in an JSON object are the
  same key value.
2020-03-31 13:42:39 +02:00
Andreas Kling e34fe556ca AK: Fix JsonParser kernel build (no floats/doubles in kernel code) 2020-03-24 22:37:24 +01:00
Emanuel Sprung bca5762542 AK: Add parsing of JSON double values
This patch adds the parsing of double values to the JSON parser.
There is another char buffer that get's filled when a "." is present
in the number parsing. When number finished, a divider is calculated
to transform the number behind the "." to the actual fraction value.
2020-03-24 22:20:07 +01:00
Andreas Kling 900f51ccd0 AK: Move memory stuff (fast memcpy, etc) to a separate header
Move the "fast memcpy" stuff out of StdLibExtras.h and into Memory.h.
This will break a ton of things that were relying on StdLibExtras.h
to include a bunch of other headers. Fix will follow immediately after.

This makes it possible to include StdLibExtras.h from Types.h, which is
the main point of this exercise.
2020-03-08 13:06:51 +01:00
Andreas Kling 22d0a6d92f AK: Remove unnecessary casts to size_t, after Vector changes
Now that Vector uses size_t, we can remove a whole bunch of redundant
casts to size_t.
2020-03-01 12:58:22 +01:00
Andreas Kling 94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Andreas Kling 821484f170 AK: Fix JSON parser crashing when encountering UTF-8
The mechanism that caches the most recently seen string for each first
character was indexing into the cache using a 'char' subscript. Oops!
2019-12-29 22:20:21 +01:00
Andreas Kling 6f4c380d95 AK: Use size_t for the length of strings
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
2019-12-09 17:51:21 +01:00
Andreas Kling 1ac963b5c8 JsonParser: "" is an empty string, not a null value 2019-08-14 15:01:42 +02:00
Andreas Kling 6d97caf124 JsonParser: Scan ahead to find the first special char in quoted strings
This allows us to take advantage of the now-optimized (to do memmove())
Vector::append(const T*, int count) for collecting these strings.

This is a ~15% speedup on the load_4chan_catalog benchmark.
2019-08-07 11:57:51 +02:00
Andreas Kling 0d0230ab10 JsonParser: Fold extract_while() into parse_number()
It wasn't unsed anywhere else anyway, and this is actually ~1% faster
on the load_4chan_catalog benchmark.
2019-08-04 20:23:46 +02:00
Andreas Kling 72b69b82bb JsonParser: Oops, fix build. 2019-08-04 18:59:06 +02:00
Andreas Kling 4e004a664f JsonParser: Cache the last seen string starting with each possible char
Keep a 256-entry string cache during parse to avoid creating some new
strings when possible. This cache is far from perfect but very cheap.
Since none of the strings are transient, this only costs us a couple of
pointers and a bit of ref-count manipulation.

The cache hit rate on 4chan_catalog.json is ~33% and the speedup on
the load_4chan_catalog benchmark is ~7%.
2019-08-04 18:41:24 +02:00
Andreas Kling b62a12c687 JsonParser: Some minor optimizations
- Return more specific types from parse_array() and parse_object().
- Don't create a throwaway String in extract_while().
- Use a StringView in parse_number() to avoid a throwaway String.
2019-08-04 11:47:21 +02:00
Andreas Kling c55129e573 JsonParser: Use Vector<char, 1024> instead of StringBuilder in parsing
This is a 10-12% speedup on the 4chan thread catalog JSON.
2019-08-04 10:38:15 +02:00
Andreas Kling 6797f71e73 JsonParser: When encountering \uXXXX, just emit a "?" for now. 2019-08-04 08:58:45 +02:00
Andreas Kling 2923d39106 JsonParser: Merge the parsing of '\n' and '\r' in quoted strings 2019-08-01 10:45:37 +02:00
Andreas Kling a8aadf73e9 AK: Add JsonObject::set(key, &&value) overload.
This dodges a whole bunch of value copying in JsonParser.
2019-07-08 13:08:21 +02:00
Andreas Kling 2bd8118843 Kernel: Change the format of /proc/all to JSON.
Update ProcessManager, top and WSCPUMonitor to handle the new format.

Since the kernel is not allowed to use floating-point math, we now compile
the JSON classes in AK without JsonValue::Type::Double support.
To accomodate large unsigned ints, I added a JsonValue::Type::UnsignedInt.
2019-06-29 09:04:45 +02:00
Andreas Kling e9b619c4aa JsonParser: Support basic escaped string characters.
I didn't implement \uXXXX-style escape in this patch. That's a FIXME.
2019-06-25 16:58:30 +02:00
Andreas Kling 266fed8c00 AK: Let's put the JSON parsing in a separate class. 2019-06-24 13:39:45 +02:00