LibPDF: Reject unterminated literal strings with an error

0000459.pdf in 0000.zip in the pdfa dataset contains this as the
very first object:

```
1 0 obj
<<
/Creator (Developer 2000)
/CreatorDate (
/Author (Oracle Reports)
/Producer (Oracle PDF driver)
/Title (2021_06_29 Tutoritzacions APTES.PDF)
>>
endobj
```

The `/CreatorDate` value string is unterminated.

Before, we'd assert when trying to check if the first object is
a linearization dict.

Now, we never read the first object (an error during the linearization
dict reading is treated as "file is not linearized") unless we try
to print the document's metadata -- and there we now show an error
instead of asserting.
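
As a rough illustration of the behavior described above, here is a minimal
standalone sketch of a literal-string parser that reports running out of
input instead of asserting. It does not use LibPDF's `Parser` or `Reader`
classes: the free function `parse_literal_string(input, pos)` and its
`std::optional` return type are assumptions made for this example, and
escape sequences are passed through verbatim rather than decoded.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical standalone helper, NOT LibPDF's Parser::parse_literal_string().
// Parses a PDF literal string that starts at input[pos] (which must be '(').
// Returns the string contents, or std::nullopt if the input ends before the
// matching ')' is found. On success, pos points just past the closing ')'.
std::optional<std::string> parse_literal_string(std::string_view input, std::size_t& pos)
{
    if (pos >= input.size() || input[pos] != '(')
        return std::nullopt;
    ++pos; // consume the opening '('

    std::string result;
    int opened_parens = 0; // depth of nested, unescaped parentheses
    while (true) {
        // The key check: stop with an error when the input is exhausted.
        if (pos >= input.size())
            return std::nullopt; // unterminated string literal

        char c = input[pos++];
        if (c == '\\') {
            // Keep the backslash and the escaped character as-is, so an
            // escaped parenthesis does not change the nesting depth.
            result += c;
            if (pos >= input.size())
                return std::nullopt; // input ends in the middle of an escape
            result += input[pos++];
        } else if (c == '(') {
            ++opened_parens;
            result += c;
        } else if (c == ')') {
            if (opened_parens == 0)
                return result; // this ')' closes the string
            --opened_parens;
            result += c;
        } else {
            result += c;
        }
    }
}
```

Fed the `/CreatorDate` value from the object above, this sketch consumes the
balanced `(Oracle Reports)`, `(Oracle PDF driver)` and `(2021_06_29 ...)`
groups as nested parentheses, reaches the end of the input with the outer
string still open, and returns `std::nullopt`.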
Nico Weber 2023-10-25 00:05:25 -07:00 committed by Andreas Kling
parent c0f3f1674c
commit 4675700057


@@ -282,6 +282,9 @@ PDFErrorOr<DeprecatedString> Parser::parse_literal_string()
auto opened_parens = 0;
while (true) {
if (m_reader.done())
return error("unterminated string literal");
if (m_reader.matches('(')) {
opened_parens++;
builder.append(m_reader.consume());