Fix garbled UTF-8 progress bars in wslc build output (#40356)

* Fix garbled UTF-8 progress bars in wslc build output

The CRT locale was set to the system default (L""), which uses the ANSI
codepage (e.g. Windows-1252) for narrow-to-wide string conversions. When
Docker build callbacks print UTF-8 output via wprintf(L"%hs", status),
multi-byte characters such as block elements (U+2588) were decoded
incorrectly, producing garbled output (e.g. "â–ˆ" instead of a solid block).

Override LC_CTYPE to .UTF-8 (with .65001 fallback) so that all CRT
narrow-to-wide conversions correctly handle UTF-8 encoded strings from
Linux/container processes. The system default locale is preserved for
all other categories (numeric, time, collation).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Guard LC_CTYPE UTF-8 override with Mode check

Only set LC_CTYPE to .UTF-8 when the CRT output mode is _O_U8TEXT,
making the function correct for potential future callers with other modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
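
To illustrate the failure mode described above, here is a minimal, portable sketch that decodes the UTF-8 bytes of U+2588 (E2 96 88) two ways: as a single-byte ANSI codepage would, and as UTF-8. Both function names are hypothetical and not part of wslutil. The Windows-1252 table is deliberately partial (only the two bytes this example needs are special-cased), and the UTF-8 decoder assumes well-formed input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treats every byte as one character, the way a
// single-byte ANSI codepage does. Only 0x88 and 0x96 are mapped per
// Windows-1252; all other bytes use the identity mapping, which is only
// correct for ASCII and 0xA0-0xFF.
std::u32string DecodeAsWindows1252(const std::string& bytes)
{
    std::u32string out;
    for (const char ch : bytes)
    {
        const auto byte = static_cast<unsigned char>(ch);
        switch (byte)
        {
        case 0x88: out.push_back(U'\u02C6'); break; // modifier letter circumflex
        case 0x96: out.push_back(U'\u2013'); break; // en dash
        default:   out.push_back(byte);      break; // identity mapping
        }
    }
    return out;
}

// Hypothetical helper: minimal UTF-8 decoder for well-formed input.
// It performs no validation of continuation bytes or overlong forms.
std::u32string DecodeAsUtf8(const std::string& bytes)
{
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size();)
    {
        const auto lead = static_cast<unsigned char>(bytes[i]);
        std::size_t length = 1;
        char32_t codePoint = lead;
        if (lead >= 0xF0)      { length = 4; codePoint = lead & 0x07; }
        else if (lead >= 0xE0) { length = 3; codePoint = lead & 0x0F; }
        else if (lead >= 0xC0) { length = 2; codePoint = lead & 0x1F; }

        // Fold in the low six bits of each continuation byte.
        for (std::size_t k = 1; k < length && i + k < bytes.size(); ++k)
        {
            codePoint = (codePoint << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        out.push_back(codePoint);
        i += length;
    }
    return out;
}
```

With the input "\xE2\x96\x88", the single-byte path yields three code points (U+00E2, U+2013, U+02C6) while the UTF-8 path yields the single code point U+2588, which is the difference between the garbled and the intended progress bar.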

---------

Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ben Hillis
2026-04-29 23:52:18 -07:00
committed by GitHub
parent 06cd17f111
commit 6caee1676d

@@ -1363,8 +1363,15 @@ void wsl::windows::common::wslutil::SetCrtEncoding(int Mode)
     setMode(stdout, Mode);
     setMode(stderr, Mode);
-    // Set the locale to the current environment's default locale.
+    // Set the locale to the current environment's default locale for regional
+    // formatting (numeric, time, collation), then override LC_CTYPE to UTF-8
+    // so that narrow-to-wide conversions (e.g. %hs in wprintf) correctly decode
+    // UTF-8 multi-byte sequences from Linux/container processes.
     WI_VERIFY(_wsetlocale(LC_ALL, L"") != NULL);
+    if (Mode == _O_U8TEXT)
+    {
+        WI_VERIFY(_wsetlocale(LC_CTYPE, L".UTF-8") != NULL);
+    }
 }
 void wsl::windows::common::wslutil::SetThreadDescription(LPCWSTR Name)