[3.12] GH-119496: accept UTF-8 BOM in .pth files (GH-119509)

`Out-File -Encoding utf8` and similar commands in Windows PowerShell 5.1 emit
UTF-8 with a BOM, which the plain `utf-8` codec decodes as a leading U+FEFF
character rather than discarding.

`utf-8-sig` accepts a BOM, but also works correctly without one.
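
The difference between the two codecs can be seen in a short sketch. The path
in the byte string is a made-up example, not taken from any real .pth file.

```python
import codecs

# Bytes as a BOM-emitting tool might write them (hypothetical path entry).
raw = codecs.BOM_UTF8 + b"C:/example/site-extras\n"

# The plain codec keeps the BOM as a U+FEFF character at the start of the text.
with_plain_utf8 = raw.decode("utf-8")
assert with_plain_utf8.startswith("\ufeff")

# utf-8-sig strips a leading BOM if one is present.
with_sig = raw.decode("utf-8-sig")
assert with_sig == "C:/example/site-extras\n"

# Input without a BOM decodes identically under both codecs.
no_bom = b"C:/example/site-extras\n"
assert no_bom.decode("utf-8-sig") == no_bom.decode("utf-8")
```

With the plain codec, the stray U+FEFF would become part of the first line of
the file, so the first entry would no longer match the intended path.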

This change also makes .pth files match the way Python source files are handled.

(cherry picked from commit bf5b6467f8)

Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Miss Islington (bot) 2024-05-24 16:52:09 +02:00 committed by GitHub
parent 078da88ad1
commit 4c0bc69238

@@ -185,7 +185,9 @@ def addpackage(sitedir, name, known_paths):
         return
 
     try:
-        pth_content = pth_content.decode()
+        # Accept BOM markers in .pth files as we do in source files
+        # (Windows PowerShell 5.1 makes it hard to emit UTF-8 files without a BOM)
+        pth_content = pth_content.decode("utf-8-sig")
     except UnicodeDecodeError:
         # Fallback to locale encoding for backward compatibility.
         # We will deprecate this fallback in the future.
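
The decode-then-fall-back pattern in this hunk can be sketched as a standalone
helper. This is an illustration only: the name `decode_pth_bytes` and its
`fallback_encoding` parameter are hypothetical, and the use of
`locale.getpreferredencoding(False)` as the fallback is an assumption, since
the fallback code itself is not shown in the hunk above.

```python
import codecs
import locale
from typing import Optional


def decode_pth_bytes(raw: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Decode .pth file bytes, tolerating a UTF-8 BOM (hypothetical helper)."""
    try:
        # utf-8-sig drops a leading BOM and otherwise behaves like utf-8.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback: the locale's preferred encoding, unless overridden.
        if fallback_encoding is None:
            fallback_encoding = locale.getpreferredencoding(False)
        return raw.decode(fallback_encoding)


# BOM-prefixed UTF-8 decodes cleanly, without a stray U+FEFF.
bom_result = decode_pth_bytes(codecs.BOM_UTF8 + b"extras\n")

# Bytes that are not valid UTF-8 go through the fallback encoding.
legacy_result = decode_pth_bytes(b"caf\xe9\n", fallback_encoding="latin-1")
```

The explicit `fallback_encoding` argument exists only so the example behaves
the same on any machine; the patched code takes no such parameter.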