From d86cbd2ac59a2cff9531aa3d451d54cb45358d17 Mon Sep 17 00:00:00 2001
From: JMARyA
Date: Thu, 7 Mar 2024 00:06:37 +0100
Subject: [PATCH] update utf8
---
 technology/files/ASCII.md | 2 +-
 technology/files/MessagePack.md | 2 +-
 technology/files/TOML.md | 2 +-
 technology/files/Unicode.md | 1 +
 technology/tools/JSON Schema.md | 2 +-
 5 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/technology/files/ASCII.md b/technology/files/ASCII.md
index 7fc1b7b..eab2f8b 100644
--- a/technology/files/ASCII.md
+++ b/technology/files/ASCII.md
@@ -5,7 +5,7 @@ wiki: https://en.wikipedia.org/wiki/ASCII
 # **American Standard Code for Information Interchange** (ASCII)
-ASCII  abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices ([Binary](../../science/math/Binary%20System.md) to Text). Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 code points, of which only 95 are printable characters, which severely limited its scope. Modern computer systems have evolved to use Unicode, which has millions of code points, but the first 128 of these are the same as the ASCII set.
+ASCII  abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices ([Binary](../../science/math/Binary%20System.md) to Text). Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 code points, of which only 95 are printable characters, which severely limited its scope. Modern computer systems have evolved to use [Unicode](Unicode.md), which has millions of code points, but the first 128 of these are the same as the ASCII set.
 ## ASCII Table
 | Dec | Hex | Oct | ASCII |
diff --git a/technology/files/MessagePack.md b/technology/files/MessagePack.md
index 9af2614..07b3892 100644
--- a/technology/files/MessagePack.md
+++ b/technology/files/MessagePack.md
@@ -18,7 +18,7 @@ MessagePack has two concepts: type system and formats.
 - Boolean represents true or false
 - Float represents a IEEE 754 double precision floating point number including NaN and Infinity
 - Raw
-- - String extending Raw type represents a UTF-8 string
+- - String extending Raw type represents a [UTF-8](Unicode.md) string
 - - Binary extending Raw type represents a byte array
 - Array represents a sequence of objects
 - Map represents key-value pairs of objects
diff --git a/technology/files/TOML.md b/technology/files/TOML.md
index 56d5ce2..191c879 100644
--- a/technology/files/TOML.md
+++ b/technology/files/TOML.md
@@ -86,7 +86,7 @@ In [JSON](JSON.md) land, that would give you the following structure:
 ```
 ### Strings
-There are four ways to express strings: basic, multi-line basic, literal, and multi-line literal. All strings must contain only valid UTF-8 characters.
+There are four ways to express strings: basic, multi-line basic, literal, and multi-line literal. All strings must contain only valid [UTF-8](Unicode.md) characters.
 **Basic strings** are surrounded by quotation marks (`"`). Any [Unicode](Unicode.md) character may be used except those that must be escaped: quotation mark, backslash, and the control characters other than tab (U+0000 to U+0008, U+000A to U+001F, U+007F).
diff --git a/technology/files/Unicode.md b/technology/files/Unicode.md
index 695887b..5162243 100644
--- a/technology/files/Unicode.md
+++ b/technology/files/Unicode.md
@@ -1,6 +1,7 @@
 ---
 obj: concept
 website: https://unicode.org
+aliases: ["utf8", "UTF-8"]
 ---
 # Unicode
diff --git a/technology/tools/JSON Schema.md b/technology/tools/JSON Schema.md
index 9262559..49ebb42 100644
--- a/technology/tools/JSON Schema.md
+++ b/technology/tools/JSON Schema.md
@@ -392,7 +392,7 @@ The acceptable values are `7bit`, `8bit`, `binary`, `quoted-printable`, `base16`
 Without getting into the low-level details of each of these encodings, there are really only two options useful for modern usage:
-- If the content is encoded in the same encoding as the enclosing [JSON](../files/JSON.md) document (which for practical purposes, is almost always UTF-8), leave `contentEncoding` unspecified, and include the content in a string as-is. This includes text-based content types, such as `text/html` or `application/xml`.
+- If the content is encoded in the same encoding as the enclosing [JSON](../files/JSON.md) document (which for practical purposes, is almost always [UTF-8](../files/Unicode.md)), leave `contentEncoding` unspecified, and include the content in a string as-is. This includes text-based content types, such as `text/html` or `application/xml`.
 - If the content is binary data, set `contentEncoding` to `base64` and encode the contents using [Base64](https://tools.ietf.org/html/rfc4648). This would include many image types, such as `image/png` or audio types, such as `audio/mpeg`.
 ```json