json: Fix lexer for lookahead character beyond '\x7F'

The lexer fails to end a valid token when the lookahead character is
beyond '\x7F'.  For instance, input

    true\xC2\xA2

produces the tokens

    JSON_ERROR     true\xC2
    JSON_ERROR     \xA2

This should be

    JSON_KEYWORD   true
    JSON_ERROR     \xC2
    JSON_ERROR     \xA2

instead.

The culprit is

    #define TERMINAL(state) [0 ... 0x7F] = (state)

It leaves [0x80..0xFF] zero, i.e. IN_ERROR.  Has always been broken.
Fix it to initialize the complete array.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-2-armbru@redhat.com>
This commit is contained in:
Markus Armbruster 2018-08-31 09:58:36 +02:00
parent d5a515738e
commit 2a96042a8d

View file

@ -123,7 +123,7 @@ enum json_lexer_state {
QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START_INTERP);
QEMU_BUILD_BUG_ON(IN_START_INTERP != IN_START + 1);
#define TERMINAL(state) [0 ... 0x7F] = (state)
#define TERMINAL(state) [0 ... 0xFF] = (state)
/* Return whether TERMINAL is a terminal state and the transition to it
from OLD_STATE required lookahead. This happens whenever the table