Improves virtiofs and VirtioProxy performance by giving each virtio
device its own SWIOTLB aperture instead of sharing a single global
pool. The guest kernel reserves a contiguous physical range at boot,
publishes the (base, size), and the host programs a matching
per-device aperture in wsldevicehost.
1. The WSL kernel allocates a contiguous range at boot
(alloc_contig_pages with __GFP_DMA32 | __GFP_ZERO) and exposes
the chosen physical (base, size) under
/sys/bus/vmbus/drivers/hv_pci/swiotlb_{base,size}
2. mini_init (WSL2) and the WSLC init handler read those sysfs files
and return the values in LX_INIT_GUEST_CAPABILITIES and
WSLC_GET_GUEST_CAPABILITIES_RESULT respectively.
3. WslCoreVm::ReadGuestCapabilities and
WSLCVirtualMachine::ReadGuestCapabilities capture the values.
WSLC forwards them to wslservice via the new
HcsVirtualMachine::ApplyGuestCapabilities IDL method (with a
WSLCGuestCapabilities struct so future kernel-published values
can be added without bumping the interface IID).
4. Both VM owners format "swiotlb=0x{base:x},{size}" once into
m_swiotlbOption and pass it verbatim to AddGuestDevice /
AddSharePath for every virtiofs share and virtio-net adapter
(VirtioProxy networking). wsldevicehost consumes the token and
creates the per-device SWIOTLB aperture.
If the kernel does not publish the sysfs files (older kernel), both
values come back as zero, the host omits the device-options token,
and the WSL2 path emits a one-time user warning via
MessageSwiotlbKernelUnsupported so users understand why performance
is degraded. (The WSLC path always uses the bundled kernel, so the
warning does not apply there.)
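For illustration only, the handshake and fallback described above can be
sketched in Python (the real code is C++ in mini_init and wslservice).
The sysfs directory and the token format come from this message. The
helper names, the one-integer-per-file format, and the "either value is
zero" check are assumptions of the sketch.

```python
from pathlib import Path

# Directory named in the commit message. The file format (one integer per
# file, hex or decimal) is an assumption for this sketch.
HV_PCI_SYSFS = Path("/sys/bus/vmbus/drivers/hv_pci")


def read_swiotlb_value(name: str, sysfs_dir: Path = HV_PCI_SYSFS) -> int:
    """Read swiotlb_base or swiotlb_size; return 0 if the file is absent."""
    try:
        text = (sysfs_dir / name).read_text().strip()
    except FileNotFoundError:
        return 0  # older kernel: nothing published
    return int(text, 0)  # base 0 accepts both "0x..." and decimal text


def format_swiotlb_option(base: int, size: int) -> str:
    """Build the device-options token, or "" when there is nothing to pass."""
    if base == 0 or size == 0:
        return ""  # caller omits the token (and may warn the user)
    return f"swiotlb=0x{base:x},{size}"
```

With a base of 0x10000000 and a size of 67108864 bytes this yields
"swiotlb=0x10000000,67108864"; with both values zero it yields an empty
string, which corresponds to the "omit the token" path.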
Other changes:
* Bump Microsoft.WSL.Kernel to 6.18.26.3-1, which is the first
official kernel that publishes the hv_pci swiotlb_{base,size}
sysfs files this PR consumes.
* Bump Microsoft.WSL.DeviceHost to 1.2.29-0 for the device-side
SWIOTLB aperture support.
* Default pool sizing moves to helpers::ComputeDefaultSwiotlbConfig
and is only requested on the kernel command line when a virtio
device that needs bounce buffers (VirtioFs / Virtio9p /
VirtioProxy) is in use.
* Telemetry: emit GuestKernelInfo / WSLCReadGuestCapabilities /
WSLCApplyGuestCapabilities events with the kernel-chosen base
and size so we can validate the handshake in CI.
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Windows interop in every running WSL2 distro silently breaks whenever a
sibling systemd-enabled distro shuts down, surfacing to users as:
/bin/bash: line 1: /mnt/c/Windows/system32/cmd.exe:
cannot execute binary file: Exec format error
Root cause: `systemd-shutdown` calls `disable_binfmt()` during clean
shutdown, which writes `-1` to `/proc/sys/fs/binfmt_misc/status`.
binfmt_misc is a single kernel-global registry shared across the WSL VM
(distros do not isolate it via a user namespace), so that one write wipes
every entry -- including WSLInterop -- for every running distro.
Fix: each per-distro init bind-mounts a read-only file over
`/proc/sys/fs/binfmt_misc/status` in its own mount namespace before
exec'ing the distro's init. systemd-shutdown's wipe write then fails with
EROFS; systemd logs a warning and continues normally (its
`binfmt_mounted_and_writable()` helper deliberately tolerates this
case). Per-entry unregister (`echo -1 > .../<name>`) and runtime
registration (`echo ... > .../register`) target different files and are
unaffected, so callers retain full control over their own binfmt entries.
`LockBinfmtStatusReadOnly` is idempotent: it bails early if binfmt_misc
isn't mounted, no-ops if `/status` already resolves to our lock file,
and recovers from a stale foreign mount via `umount2(MNT_DETACH)`
followed by a retry. The existing `[boot] protectBinfmt` wsl.conf key
(default true) now controls the bind-mount and acts as a kill switch for
users who want to manage binfmt_misc themselves.
WSLInterop is also re-registered from mini_init with the `F`
(fix-binary) flag so the interpreter is opened at registration time and
remains valid across mount namespaces.
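As a rough illustration of the registration string, the sketch below
builds a magic-type binfmt_misc entry in the kernel's documented
`:name:type:offset:magic:mask:interpreter:flags` layout. The entry name
WSLInterop and the `F` flag come from this message. The `MZ` magic, the
`/init` interpreter and the extra `P` flag are assumptions used only to
make the example concrete, and the helper name is hypothetical.

```python
def binfmt_register_line(name: str, magic: str, interpreter: str,
                         flags: str, offset: str = "", mask: str = "") -> str:
    """Format a magic-type ('M') binfmt_misc registration string."""
    return f":{name}:M:{offset}:{magic}:{mask}:{interpreter}:{flags}"


# Assumed values: 'MZ' is the PE header magic, '/init' the interpreter,
# 'P' preserves argv[0], and 'F' opens the interpreter at registration time.
line = binfmt_register_line("WSLInterop", "MZ", "/init", "PF")
print(line)  # :WSLInterop:M::MZ::/init:PF
```

Writing such a line to `/proc/sys/fs/binfmt_misc/register` (as root) is
what performs the registration; that file is separate from `/status`, so
the read-only bind-mount does not affect it.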
Tests:
* `BinfmtStatusIsLocked` -- mechanism test: `/status` is its own
mountpoint, writes fail with EROFS, WSLInterop survives the wipe
attempt, /register and per-entry unregister still work, and the
`protectBinfmt=false` kill switch removes the bind-mount.
* `BinfmtSurvivesDistroTermination` -- end-to-end regression test:
imports a systemd-enabled peer distro, terminates it, and asserts
that the primary distro's Windows interop still works.
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
HCS fails with E_ACCESSDENIED when starting a VM whose user-supplied
kernelModules or systemDistro VHDs live somewhere VMWP cannot read
(e.g. under the user profile). Eagerly call HcsGrantVmAccess on those
paths while impersonating the user, before the VM is started.
The grant is best-effort: it requires WRITE_DAC on the file (typically
via ownership), which the impersonated user may lack for VHDs they only
have READ access to (e.g. SYSTEM-owned VHDs reachable via inherited
folder ACLs). Failures are logged via CATCH_LOG; if VMWP truly cannot
read the VHD, StartComputeSystem will still surface a clear
E_ACCESSDENIED.
Adds two regression tests:
- CustomVhdsInUserProfile: VHDs under %TEMP%, exercises the grant path.
- CustomVhdsAccessibleViaInheritedAcls: VHDs in the install dir launched
as a non-elevated user, exercises the swallowed-grant-failure path.
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Update the VM termination logic to enforce timeouts and avoid a hang if init is stuck during session termination
* Save state
* Save state
* Rethink IO logic
* Fix build
* Apply PR feedback
* Fix error check
* Apply PR feedback
* Apply PR feedback
* Update Microsoft.WSL.Kernel to 6.18.26.1-1
* Update tests for Linux 6.18 kernel behavior changes
Adjust eventfd size validation, lxtfs writev, and mount option
format expectations to match 6.18 kernel behavior.
* Update test patterns for new kernel /proc/mounts cache format
The kernel now outputs cache=0x5 (hex) instead of cache=5 (decimal) in
/proc/mounts for 9p filesystems. Update the ExpectMount patterns in
WSLCTests::WindowsMounts and WSLCTests::GPU to match the new format.
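A minimal sketch of a mount-option check that tolerates both spellings,
written in Python for illustration (the actual tests are C++ and may
simply match the new hex form). The helper name and the sample option
strings are hypothetical.

```python
import re

# Match "cache=5" or "cache=0x5" as a whole comma-separated option.
CACHE_OPTION = re.compile(r"(?:^|,)cache=(?:0x)?5(?:,|$)")


def has_cache_mode_5(mount_options: str) -> bool:
    """Return True if the option list sets the 9p cache mode to 5."""
    return CACHE_OPTION.search(mount_options) is not None
```

Anchoring on the surrounding commas keeps values such as `cache=0x50`
from matching by accident.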
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix VHD ownership after cross-volume move to prevent E_ACCESSDENIED
When MoveDistribution moves a VHD across volumes, MoveFileEx copies the
file and the new file's owner may not be the user's SID. This causes
HcsGrantVmAccess to fail with E_ACCESSDENIED when later launching the
distro, because the impersonated user lacks WRITE_DAC on the file
(only implicitly granted to the owner).
Fix by explicitly setting the VHD owner to the user's SID after the
move, matching what CreateVhd already does at creation time. Uses
handle-based SetSecurityInfo with FILE_FLAG_OPEN_REPARSE_POINT to
avoid TOCTOU races and symlink following.
Also fixes a pre-existing build break in MountTests.cpp from the test
refactor (WSL2_TEST_ONLY -> WSL2_TEST_METHOD).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Preserve original VHD owner instead of using GetUserSid()
Instead of unconditionally setting the VHD owner to the caller's SID
after a cross-volume move, read the original owner before the move and
restore it afterward. This avoids changing ownership to someone who
didn't originally own the file.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cherry-pick WSL1/WSL2 test changes from 9c4dba91 (feature/wsl-for-apps).
Replace runtime WSL1_TEST_ONLY()/WSL2_TEST_ONLY() skip macros with
WSL1_TEST_METHOD()/WSL2_TEST_METHOD() TAEF metadata macros. This moves
version filtering to the test runner level via /select: queries, so
inapplicable tests are excluded entirely instead of appearing as skipped.
Updated files:
- test/windows/Common.h: New macros + removed old skip macros
- test/windows/*.cpp: Converted all test methods
- tools/test/run-tests.ps1: Auto-add /select: when no user filter
- cloudtest/TestGroup.xml.in: Add version filter to TAEF args
- test/README.md: Document new macros
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Mask console-getty.service to prevent multi-distro failures (#13595)
When multiple WSL distros run concurrently, /dev/tty devices are shared
at the VM level. The second distro's console-getty.service fails because
the tty is already held by the first, causing systemd to report failed
units and triggering user@UID.service failures.
Mask console-getty.service during WSL systemd unit generation, similar
to the existing masking of networkd-wait-online. This service provides
no value in WSL since users don't connect to the underlying tty.
Fixes #13595
* format source
* pr feedback
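For context, masking a systemd unit amounts to placing a symlink to
/dev/null under the unit's name in a unit directory that systemd reads.
The Python sketch below shows that mechanism only; the helper name is
hypothetical and the directory WSL's unit generator actually writes to is
not stated here, so a temporary directory stands in for it.

```python
import os
import tempfile


def mask_unit(unit_dir: str, unit_name: str) -> str:
    """Mask a unit by symlinking its name to /dev/null; safe to repeat."""
    link = os.path.join(unit_dir, unit_name)
    if os.path.lexists(link):
        os.remove(link)  # replace any existing file or stale symlink
    os.symlink("/dev/null", link)
    return link


# Stand-in directory for the example; not the real generator output path.
with tempfile.TemporaryDirectory() as unit_dir:
    link = mask_unit(unit_dir, "console-getty.service")
    print(os.readlink(link))  # /dev/null
```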
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* Revert "test: enable virtiofs tests and enable WSLG during testing (#14387)"
* enable wslg for SystemdNoClearTmpUnit test
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* test: Add arm64 test distro support
* update unit test baseline
* more test baseline updates
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* detach terminal before running mount -a
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* use _exit on error before execv in child process to avoid unintentional resource release
* Add regression test
* Fix clang format issue
* fix all clang format issues
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* resolve ai comments
* move test to unit test
* Fix string literal
* Overwrite fstab to resolve pipeline missing file issue
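The `_exit` item above follows a general fork/exec rule: if exec fails,
the child should leave through `_exit` so it does not run the parent's
exit handlers or flush inherited stdio buffers. A minimal Python sketch
of that pattern follows (the real change is in WSL's C++ init code). The
function name and the 127 exit code are assumptions of the sketch.

```python
import os


def run_and_wait(path: str, argv: list[str]) -> int:
    """Fork, exec `path` in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: execv only returns (by raising) when it fails.
        try:
            os.execv(path, argv)
        finally:
            # Skip atexit handlers and buffered-output flushing that
            # belong to the parent's copied state.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

Calling `run_and_wait` with a path that does not exist returns 127
instead of letting the failed child continue running the parent's code.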
---------
Co-authored-by: Feng Wang <wangfen@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Refactor: trim unnecessary DLL deps from COMMON_LINK_LIBRARIES
- Split MSI/Wintrust install functions from wslutil.cpp into install.cpp
- Remove MI.lib, wsldeps.lib, msi.lib, Wintrust.lib, computecore.lib,
computenetwork.lib, Iphlpapi.lib from COMMON_LINK_LIBRARIES
- Add per-target MSI_LINK_LIBRARIES, HCS_LINK_LIBRARIES, SERVICE_LINK_LIBRARIES
- Delay-load msi.dll and WINTRUST.dll for wsl.exe and wslg.exe
- Result: wslhost, wslrelay, wslcsdk, testplugin lose msi/wintrust startup imports;
wsl.exe and wslg.exe defer msi/wintrust loading until actually needed;
wslservice is the only target that imports computecore/computenetwork/Iphlpapi
* minor fixes to install.cpp that were caught during PR
* move to wsl::windows::common::install namespace
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* Mask NetworkManager-wait-online.service during boot
Fixes #13772. Similar to PR #13611, this masks
NetworkManager-wait-online.service to prevent 60-second timeouts during
boot, since WSL interfaces are not managed by NetworkManager. Also adds
the service to the discouraged units list in validate-modern.py and adds
a unit test.
* Addressed Copilot feedback
* Fix
* virtiofs: add support for mounting directories (not just full volumes)
* disable virtiofs tests for now
* spelling
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* Resolve issue with config file writing sections outside of their expected header.
* add more writewslconfig variations
* formatting
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* build: fix minor compiler errors when building with VS2026
* s
* use VS2022 for clang format and cross compiling
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
* Introduce a new wsl.conf config value to allow distributions to opt in to cgroupv1 mounts
* Add test coverage
* Fix tmpfs on wsl1
---------
Co-authored-by: Ben Hillis <benhillis@gmail.com>
* Introduce a new kernel command line argument to collect hvsocket event logs during boot
* Cleanup diff
* unset env
* Add test coverage
* Fix format
* Remove prefix
* Switch WSLg to use wslinfo --vm-id instead of relying on environment variable
* DO NOT MERGE: bad WSLg nuget
* dead code removal
* always send response to LxInitMessageQueryVmId message
* add back invalid WslInfoMode error
* remove unneeded wsl2 check
* use temporary workaround until WSLg update is ready
* unit test update
* Update string compare
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Ben Hillis <benhill@ntdev.microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add --vm-id to wslinfo usage string
* Pass the VM id to init
This change ensures that we pass the VM id to an
instance's init. The id is then set as an environment
variable and can be accessed at runtime.
* Expose VM id to wslinfo
Add a new argument --vm-id to wslinfo so that
the caller can retrieve the VM id by calling the
binary.
Although the id is also available as an environment variable, exposing it
through wslinfo saves the caller additional string parsing.