Lock in the prefix-normalization behavior for the real-world case
surfaced by trace 395395a, where the server collapsed
'claude-opus-4.7-1m-internal' to 'claude-opus-4-7'.
Extends normalizeResponseModel to also echo the request model when the
canonical response is a less specific prefix of the canonical request.
This handles cases where the server strips a request-only qualifier
(e.g. reasoning effort 'high') and uses '-' punctuation, so request
'claude-opus-4.7-high' and response 'claude-opus-4-7' no longer create
distinct rows when grouping by gen_ai.response.model.
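A minimal sketch of the prefix rule described above. It assumes the canonical form is simply the model string with '.' replaced by '-', and that the prefix must end on a '-' segment boundary; both are assumptions of this sketch, as is the `canonicalModel` helper name. The real implementation may differ.

```typescript
// Hypothetical canonical form: treat '.' and '-' as the same separator.
function canonicalModel(model: string): string {
    return model.replace(/\./g, '-');
}

function normalizeResponseModel(requestModel: string, responseModel: string): string {
    const req = canonicalModel(requestModel);
    const res = canonicalModel(responseModel);
    // Same logical model, only the punctuation differs: echo the request.
    if (req === res) {
        return requestModel;
    }
    // The response is a less specific prefix of the request (the server
    // dropped a request-only qualifier): echo the request. Requiring a
    // trailing '-' keeps the match on a segment boundary (assumption).
    if (req.startsWith(res + '-')) {
        return requestModel;
    }
    // Otherwise the response adds real specificity or is unrelated: keep it.
    return responseModel;
}
```

With this sketch, the request 'claude-opus-4.7-1m-internal' paired with the response 'claude-opus-4-7' yields the request value, while 'gpt-5.4-mini' paired with 'gpt-5.4-mini-2026-03-17' keeps the more specific response.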
* Fix WorktreeCreatedTaskDispatcher overfiring on restored sessions (#318241)
Introduce onDidStartSession event to distinguish sessions we just started from sessions that appear in the catalog (via cloud sync, refresh, etc).
Problem:
WorktreeCreatedTaskDispatcher listened to onDidChangeSessions and tried to filter on status=Untitled to identify newly started sessions. This was wrong because:
- Agent-host sessions are never Untitled when arriving via change events (skeleton gets setStatus(InProgress) before firing)
- onDidChangeSessions fires for any catalog change including cloud-synced/restored sessions from other devices
- This caused tasks to run for existing sessions on every sync, not just newly started ones
Solution:
- Add onDidStartSession: Event<ISession> to ISessionsManagementService
- Fire it from sendNewChatRequest after provider.sendRequest() resolves with committed session
- Rewrite WorktreeCreatedTaskDispatcher to listen ONLY to onDidStartSession instead of added/changed events
- This correctly identifies 'sessions we just started locally' vs 'sessions in the catalog'
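
The split between the two events can be sketched as follows. `MiniEmitter`, `SessionsManagementServiceSketch`, `WorktreeTaskDispatcherSketch` and `acceptCatalogUpdate` are illustrative stand-ins invented for this sketch, not the actual VS Code classes; only `onDidStartSession`, `onDidChangeSessions`, `sendNewChatRequest` and `provider.sendRequest()` are names taken from the description above.

```typescript
interface ISession { readonly id: string; }
type Listener<T> = (e: T) => void;

// Tiny stand-in for an event emitter; not the VS Code Emitter class.
class MiniEmitter<T> {
    private readonly listeners = new Set<Listener<T>>();
    readonly event = (listener: Listener<T>): { dispose(): void } => {
        this.listeners.add(listener);
        return { dispose: () => { this.listeners.delete(listener); } };
    };
    fire(e: T): void {
        for (const l of [...this.listeners]) { l(e); }
    }
}

class SessionsManagementServiceSketch {
    private readonly _onDidChangeSessions = new MiniEmitter<ISession[]>();
    readonly onDidChangeSessions = this._onDidChangeSessions.event;
    private readonly _onDidStartSession = new MiniEmitter<ISession>();
    readonly onDidStartSession = this._onDidStartSession.event;

    constructor(private readonly provider: { sendRequest(prompt: string): Promise<ISession> }) { }

    async sendNewChatRequest(prompt: string): Promise<ISession> {
        const session = await this.provider.sendRequest(prompt);
        // Fired only for sessions committed by this local request.
        this._onDidStartSession.fire(session);
        return session;
    }

    // Catalog updates (cloud sync, refresh) only reach the change event.
    acceptCatalogUpdate(sessions: ISession[]): void {
        this._onDidChangeSessions.fire(sessions);
    }
}

class WorktreeTaskDispatcherSketch {
    readonly dispatched: string[] = [];
    constructor(service: SessionsManagementServiceSketch) {
        // Listens ONLY to onDidStartSession, never to catalog changes.
        service.onDidStartSession(session => { this.dispatched.push(session.id); });
    }
}
```

In this sketch a synced session that arrives through `acceptCatalogUpdate` never reaches the dispatcher, while a session returned by `sendNewChatRequest` does.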
Changes:
- ISessionsManagementService: add onDidStartSession event
- SessionsManagementService: implement emitter + fire in sendNewChatRequest
- WorktreeCreatedTaskDispatcher: rewrite to use onDidStartSession exclusively
- Tests: comprehensive rewrite with regression test for #318241
- Docs: update SESSIONS.md spec
Fixes #318241
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address Copilot PR review comments on WorktreeCreatedTaskDispatcher (#318880)
Fix 4 issues identified in review:
1. **Autorun synchronous-dispatch leak** (line 79-91):
- autorun was running synchronously before store.add() completed
- when conditions were met immediately, it would dispose the store before the autorun was registered
- Switch to registerAutorunSelfDisposable to ensure the autorun is registered before it can dispose itself
2. **Config default documentation mismatch** (line 18-21):
- JSDoc claimed setting defaults to false, but chat.contribution.ts sets true
- Update JSDoc to say 'Defaults to `true`'
3. **Test doesn't reflect new default** (line 258-268):
- Test 'skips agent host sessions when the setting is disabled (default)' relied on default
- Now that default is true, explicitly set config to false
- Remove '(default)' from test name
4. **Stale comment** (line 74-77):
- Comment still referred to removed 'dispatched' flag
- Update to describe actual guard (disposing store removes subscriptions)
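
The leak in item 1 can be illustrated with a small stand-in. This is a sketch only: `Observable`, `autorun` and `autorunSelfDisposable` below are simplified, hypothetical helpers written for this example and do not reproduce the signature of the real `registerAutorunSelfDisposable`. The point is that the callback receives a handle it can dispose during the synchronous first run, and the helper honors that request once the underlying subscription exists.

```typescript
interface IDisposable { dispose(): void; }

// Minimal observable stand-in: holds a value and notifies subscribers on set.
class Observable<T> {
    private readonly listeners = new Set<() => void>();
    constructor(private value: T) { }
    get(): T { return this.value; }
    set(v: T): void {
        this.value = v;
        for (const l of [...this.listeners]) { l(); }
    }
    subscribe(l: () => void): IDisposable {
        this.listeners.add(l);
        return { dispose: () => { this.listeners.delete(l); } };
    }
    get listenerCount(): number { return this.listeners.size; }
}

// Simplified autorun: runs once synchronously, then again on every change.
function autorun<T>(source: Observable<T>, fn: (value: T) => void): IDisposable {
    const sub = source.subscribe(() => fn(source.get()));
    fn(source.get()); // synchronous first run, before the caller sees `sub`
    return sub;
}

// Hypothetical self-disposing variant: safe to dispose from inside the callback.
function autorunSelfDisposable<T>(source: Observable<T>, fn: (value: T, self: IDisposable) => void): IDisposable {
    const state = { disposed: false };
    let inner: IDisposable | undefined;
    const self: IDisposable = {
        dispose() {
            state.disposed = true;
            inner?.dispose();
        }
    };
    inner = autorun(source, value => {
        if (!state.disposed) { fn(value, self); }
    });
    // The callback may have asked to stop during the synchronous first run,
    // before `inner` was assigned; release the subscription now.
    if (state.disposed) { inner.dispose(); }
    return self;
}
```

When the condition is already met on the first run, the callback fires once and no subscription is left behind.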
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address Copilot review on #318869:
- chatMLFetcher: compute normalizeResponseModel once and reuse for
metricAttrs, span attributes, and emitInferenceDetailsEvent so token
metrics and inference details event stay consistent with the chat span.
- toolCallingLoop: consolidate two getChatEndpoint calls into one and
drop the misleading 'will be set on response' comment (there is no
later fallback).
* Chronicle: Support subcommands
* Feedback update
* Lazy-hydrate known prompt slash command names
Defer the initial getPromptSlashCommands() call and onDidChangeSlashCommands subscription until the first hasPromptSlashCommand() call. The previous constructor-time hydration fired during test workspace setup, priming cachedSlashCommands with empty results before mockFiles() registered files, causing 13 PromptsService test failures.
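
A minimal sketch of the lazy-hydration pattern described above. The `KnownSlashCommandNames` class and `SlashCommandSource` interface are hypothetical names for this example, and the signatures of `getPromptSlashCommands` and `onDidChangeSlashCommands` are assumed; only the idea of deferring both calls to the first `hasPromptSlashCommand()` comes from the change itself.

```typescript
// Assumed shape of the service that owns the slash commands.
interface SlashCommandSource {
    getPromptSlashCommands(): Promise<string[]>;
    onDidChangeSlashCommands(listener: () => void): void;
}

class KnownSlashCommandNames {
    private names = new Set<string>();
    private hydrated = false;

    // The constructor deliberately does no work, so nothing is cached
    // before the workspace (or a test's mock files) is ready.
    constructor(private readonly source: SlashCommandSource) { }

    hasPromptSlashCommand(name: string): boolean {
        this.ensureHydrated();
        return this.names.has(name);
    }

    private ensureHydrated(): void {
        if (this.hydrated) { return; }
        this.hydrated = true;
        // Subscribe and fetch only on first use.
        this.source.onDidChangeSlashCommands(() => this.refresh());
        this.refresh();
    }

    private refresh(): void {
        this.source.getPromptSlashCommands().then(
            list => { this.names = new Set(list); },
            () => { /* keep the previously known names on failure */ }
        );
    }
}
```

One consequence of this sketch is that the very first lookup can return false until the initial fetch resolves.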
* vscode to agents window handoff
* fix plan review stuff in this PR and address some comments
* fix CI
* address more comments, empty workspace
* address comments
* fix sometimes session is not ready yet
* add telemetry and setting
* off by default
* add exp in setting
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
The Linux desktop sanity tests (deb/rpm/snap) ran cleanup unconditionally
after `testDesktopApp` without a try/finally, so when that step threw
(e.g. extension install flake tracked in microsoft/vscode-engineering#2877)
the uninstall step was skipped. Mocha then retried the test, the install
step hit `package code-insiders is already installed`, and the retry
masked the real failure with a misleading error
(microsoft/vscode-engineering#2878).
Fix:
- Wrap install/test/uninstall in try/finally so cleanup runs even when
the test body throws.
- Make the uninstall helpers idempotent by detecting whether the package
binary exists; this lets the finally block run safely even if install
itself fails (and avoids double-uninstall on the happy path).
- Add a pre-install assertion to each install helper that fails fast
with a clear diagnostic when the package is already present, so any
leftover state from a prior killed run surfaces as a real bug pointing
at the prior failure instead of as a confusing `rpm -i` /
`dpkg -i` / `snap install` error.
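
The three parts of the fix can be sketched together. This is an illustration under stated assumptions: an in-memory `installed` set stands in for the system package database, and `install`, `uninstall` and `runSanityTest` are hypothetical helper names, not the actual test helpers.

```typescript
// In-memory stand-in for the system package database (hypothetical).
const installed = new Set<string>();

function install(pkg: string): void {
    // Pre-install assertion: fail fast with a clear diagnostic on leftover state.
    if (installed.has(pkg)) {
        throw new Error(`${pkg} is already installed; a previous run probably did not clean up`);
    }
    installed.add(pkg);
}

function uninstall(pkg: string): void {
    // Idempotent: nothing to do when the package is absent.
    if (!installed.has(pkg)) {
        return;
    }
    installed.delete(pkg);
}

function runSanityTest(pkg: string, testBody: () => void): void {
    try {
        install(pkg);
        testBody();
    } finally {
        // Runs even when install or the test body throws.
        uninstall(pkg);
    }
}
```

Because cleanup always runs, a retry after a failing test body starts from a clean state and reports the original error rather than an "already installed" one.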
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Anthropic responses echo the same logical model with a different separator
(e.g. request 'claude-opus-4.6' resolves to 'claude-opus-4-6'), causing
GROUP BY / DISTINCT on gen_ai.response.model to produce duplicate rows.
Add normalizeResponseModel(requestModel, responseModel) that echoes the
request value when the two strings only differ in '.' vs '-', and returns
the resolved value unchanged when it adds real specificity (e.g.
'gpt-5.4-mini' -> 'gpt-5.4-mini-2026-03-17'). Wire into the three call
sites that set gen_ai.response.model: chatMLFetcher (chat spans),
toolCallingLoop (foreground invoke_agent), and claudeOTelTracker
(Claude invoke_agent).
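
To illustrate the effect on grouping, here is a small sketch of the separator-only rule with a stand-in for GROUP BY. The span data is made up for the example and `groupByResponseModel` is a hypothetical helper; the function body is an assumption about one way to implement the rule, not the committed code.

```typescript
// Separator-only rule: echo the request when the strings differ only in '.' vs '-'.
function normalizeResponseModel(requestModel: string, responseModel: string): string {
    const sameModuloSeparator =
        requestModel.replace(/\./g, '-') === responseModel.replace(/\./g, '-');
    return sameModuloSeparator ? requestModel : responseModel;
}

// Stand-in for GROUP BY gen_ai.response.model: count spans per recorded value.
function groupByResponseModel(
    spans: { request: string; response: string }[],
    normalize: boolean
): Map<string, number> {
    const counts = new Map<string, number>();
    for (const span of spans) {
        const key = normalize ? normalizeResponseModel(span.request, span.response) : span.response;
        counts.set(key, (counts.get(key) ?? 0) + 1);
    }
    return counts;
}

// Made-up spans for illustration only.
const exampleSpans = [
    { request: 'claude-opus-4.6', response: 'claude-opus-4-6' },
    { request: 'claude-opus-4.6', response: 'claude-opus-4.6' },
    { request: 'gpt-5.4-mini', response: 'gpt-5.4-mini-2026-03-17' },
];
```

Grouping `exampleSpans` without normalization produces three keys; with normalization the two Claude spellings collapse into one key and the dated GPT response stays as resolved.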
Fixes #318805
* Bump @github/copilot-sdk to 1.0.0-beta.8 (Written by Copilot)
Also bumps @github/copilot CLI to 1.0.55-3 to satisfy the SDK's
`^1.0.55-1` peer requirement.
Key SDK breaking changes adapted in `src/vs/platform/agentHost/`:
- `CopilotClientOptions`: `useStdio`/`cliPath`/`autoStart` →
`connection: RuntimeConnection.forStdio({ path })`; `remote` →
`enableRemoteSessions`.
- `SessionContext.cwd` → `workingDirectory`.
- `CopilotSession.getMessages()` → `getEvents()`,
`destroy()` → `disconnect()`.
- `PermissionRequest` is now a discriminated union; the host-side
`ITypedPermissionRequest` is no longer `extends PermissionRequest` but
a standalone bag-of-optionals interface so existing call sites continue
to work. Extended host signal with the new `extension-management` /
`extension-permission-access` kinds.
- `BaseHookInput`: `timestamp: number` → `Date`, `cwd` →
`workingDirectory`, new required `sessionId`.
- `Tool.handler` is now optional — tests use `tool.handler!(...)`.
- `ToolBinaryResult.type` narrowed to literal `'image' | 'resource'`.
- `AssistantUsageData.copilotUsage` and `ShutdownData.totalPremiumRequests`
removed; corresponding accumulation/trace code dropped.
Validation: 0 TS errors, layers check passes, all 1357 agent-host unit
tests pass. Real-SDK integration tests: 9/11 passing. Two failures
documented in the session plan:
- plan-mode session-state test times out (likely needs migration of the
`_enablePlanModeOnClient` shim to the new public
`SessionConfig.onExitPlanModeRequest`).
- subagent-routing test: `read_agent` now appears on the parent session
(likely a CLI 1.0.55 behavior change).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix broken @github/copilot-win32-x64 entry in remote/package-lock.json
The committed remote/package-lock.json had a malformed stub entry:
"node_modules/@github/copilot/node_modules/@github/copilot-win32-x64": {
"optional": true
}
with no version field, which caused npm install to crash with:
TypeError: Invalid Version:
at Node.canDedupe (.../@npmcli/arborist/lib/node.js:1137:32)
at PlaceDep.pruneDedupable (.../@npmcli/arborist/lib/place-dep.js:426:14)
Replaced with a proper top-level node_modules entry for
@github/copilot-win32-x64@1.0.55-3 (matches the resolution used for
the other platform optionals on the @github/copilot dep). This is what
npm naturally produces when regenerating the lockfile from scratch.
Fixes the failing 'Install dependencies' step on the macOS / Remote
CI job.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Bump deb amd64 libc6 floor to 2.15 for new @github/copilot runtime.node
The new @github/copilot 1.0.55-3 ships a Linux x64 runtime.node that
references GLIBC_2.15 (was GLIBC_2.14 in 1.0.49). dpkg-shlibdeps now
emits libc6 (>= 2.15) for amd64; update the reference list so the deb
prepare task no longer fails the build.
Only amd64 is affected: arm64 runtime.node GLIBC version set is
unchanged, and @github/copilot does not ship an armhf binary. The
overall libc6 floor for the package is still 2.28, so distro support
is unchanged.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Make sure permission modes are correct across chat kinds for Claude ext host
* Allow yolo in Agents window for Claude welcome chat
* Allow auto mode in Agents window for Claude chat sessions
* feedback
- Complete the trailing comment documenting the reasoning / reasoning_content fields.
- Correct field-ownership comments in thinking.ts (reasoning_content is DeepSeek/Kimi/Minimax; reasoning is OpenRouter).
- Add SSE stream-parsing regression tests for the reasoning_content and reasoning delta field names.
* CAPI package update
* Missing updates
* Missing lock
* Render cloud task history from typed events
- Extract shared session-event-to-chat-parts renderer into
chatSessions/common/sessionEventRenderer.ts so both the Copilot CLI
and Cloud Tasks providers produce identical tool/text formatting.
- Render Task API history via the shared renderer, remapping
custom_agent.* to the equivalent subagent.* names so tool cards,
bash terminal output, edits, search results, MCP results and
subagent groups all render the same way they do in the CLI.
- Match github.com/github/github-ui presentation by suppressing
intermediate assistant.message events that echo tool input/output
and only rendering the final summary message per turn.
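
The `custom_agent.*` to `subagent.*` remapping could look roughly like this. It is a sketch: `remapTaskEventType` is a hypothetical helper name, and it assumes the two families differ only in their prefix.

```typescript
const CUSTOM_AGENT_PREFIX = 'custom_agent.';
const SUBAGENT_PREFIX = 'subagent.';

// Rewrites custom_agent.* event types to subagent.*; other types pass through.
function remapTaskEventType(type: string): string {
    return type.startsWith(CUSTOM_AGENT_PREFIX)
        ? SUBAGENT_PREFIX + type.slice(CUSTOM_AGENT_PREFIX.length)
        : type;
}
```

Remapping before rendering means the shared renderer only needs to know one set of event names.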
* Address review: break circular dep & restore flush timing
- Inject CLI tool-event handlers (processStart/processComplete/
enrichSubagent/isEditToolCall/getEditedUris) into the shared
renderer via a ToolEventHandlers<T> bundle, so common/sessionEventRenderer
no longer imports from copilotcli/. Layering now only flows one way.
- Flush buffered assistant.message_delta chunks at the top of the
CLI loop for non-message events so the session.model_change /
assistant.usage guards see streamed text exactly the way the
pre-extraction code did.
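
The handler injection can be sketched as below. Everything except the `ToolEventHandlers<T>` name and the `processStart` / `processComplete` handler names is an assumption: the handler signatures, the `SessionEvent` shape, the `tool.start` / `tool.complete` event types and `renderEvents` are invented for illustration, and only a subset of the handlers is shown.

```typescript
// Hypothetical event shape for this sketch.
interface SessionEvent { type: string; toolName?: string; }

// Subset of the injected bundle; signatures are assumed.
interface ToolEventHandlers<T> {
    processStart(event: SessionEvent): T | undefined;
    processComplete(event: SessionEvent): T | undefined;
}

// Shared renderer: depends only on the interface, never on a provider module,
// so the dependency flows from provider to common code and not back.
function renderEvents<T>(events: SessionEvent[], handlers: ToolEventHandlers<T>): T[] {
    const parts: T[] = [];
    for (const event of events) {
        const part =
            event.type === 'tool.start' ? handlers.processStart(event)
                : event.type === 'tool.complete' ? handlers.processComplete(event)
                    : undefined;
        if (part !== undefined) {
            parts.push(part);
        }
    }
    return parts;
}

// A provider supplies its own handlers at the call site.
const cliHandlers: ToolEventHandlers<string> = {
    processStart: event => `start:${event.toolName ?? 'unknown'}`,
    processComplete: event => `done:${event.toolName ?? 'unknown'}`,
};
```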
* Missing lock changes
* Revert unintended copilot package manifest edits
* Revert unintended root package manifest edits
* remote package revert
* lock
---------
Co-authored-by: Dmitriy Vasyura <dmitriv@microsoft.com>
Recurring transient failures of the desktop-linux-arm64 sanity test
surface only as 'Failed to install extension after 3 attempts' with no
context. This makes diagnosis hard.
- Carry the last per-attempt failure message into the final thrown error
- Capture a screenshot on each failed install attempt
- Log any visible Marketplace message during failed install attempts
- Add a 5s backoff between install retries
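
A minimal sketch of the retry loop with these changes. `installWithRetry` and its parameters are hypothetical; the diagnostics hook stands in for the screenshot and Marketplace-message logging, and `sleep` is injected so the backoff can be faked in tests.

```typescript
async function installWithRetry(
    attemptInstall: () => Promise<void>,
    onAttemptFailed: (attempt: number, message: string) => Promise<void>,
    sleep: (ms: number) => Promise<void>,
    maxAttempts = 3,
    backoffMs = 5000
): Promise<void> {
    let lastMessage = 'unknown error';
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            await attemptInstall();
            return;
        } catch (error) {
            lastMessage = error instanceof Error ? error.message : String(error);
            // Diagnostics hook, e.g. capture a screenshot or log a visible message.
            await onAttemptFailed(attempt, lastMessage);
            if (attempt < maxAttempts) {
                await sleep(backoffMs); // back off before the next attempt
            }
        }
    }
    // Carry the last per-attempt failure into the final error.
    throw new Error(`Failed to install extension after ${maxAttempts} attempts: ${lastMessage}`);
}
```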
Refs microsoft/vscode-engineering#2877
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes #312746. DeepSeek / Moonshot (Kimi) / Minimax and OpenRouter reject the turn after a tool call with HTTP 400 unless the assistant message replays its reasoning. OpenAIEndpoint now emits reasoning_content and reasoning (alongside cot_id / cot_summary) on the Chat Completions path, gated on the model's thinking capability so non-reasoning endpoints are unaffected.
Add reasoning_content (DeepSeek / Moonshot (Kimi) / Minimax) and reasoning (OpenRouter) to ThinkingDataInMessage and RawThinkingDelta, and read them when extracting reasoning text from responses.
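
A rough sketch of the replay and extraction described above. The `AssistantMessage` shape and the `toAssistantMessage` and `readReasoningDelta` helpers are hypothetical and much simpler than the real endpoint code; only the `reasoning_content` and `reasoning` field names and the capability gate come from the change. The order of precedence when both fields are present is an assumption.

```typescript
// Simplified assistant message for the Chat Completions path (assumed shape).
interface AssistantMessage {
    role: 'assistant';
    content: string;
    reasoning_content?: string; // DeepSeek / Moonshot (Kimi) / Minimax
    reasoning?: string;         // OpenRouter
}

function toAssistantMessage(
    content: string,
    thinkingText: string | undefined,
    supportsThinking: boolean
): AssistantMessage {
    const message: AssistantMessage = { role: 'assistant', content };
    // Gate on the model's thinking capability so other endpoints are unaffected.
    if (supportsThinking && thinkingText) {
        message.reasoning_content = thinkingText;
        message.reasoning = thinkingText;
    }
    return message;
}

// Reads reasoning text from a streamed delta, whichever field the provider uses.
function readReasoningDelta(delta: { reasoning_content?: string; reasoning?: string }): string | undefined {
    return delta.reasoning_content ?? delta.reasoning;
}
```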
The rules at `.agent-sessions-workbench .interactive-session
.interactive-item-container`, `> .chat-suggest-next-widget`, and
`.interactive-input-part` used to apply to every chat surface inside the
Agents window. That includes surfaces that reuse the chat widget
components but do not want the 950px-wide centered "card" layout,
notably the terminal inline chat (Ctrl/Cmd+I in terminal), the editor
inline chat (Ctrl/Cmd+I in an editor), and the agent-session hover
widget. In those surfaces the items got centered with a 950px max-width
which left empty space on the left and overflowed on the right.
Scope the three rules to `.part.sessionspart` and `.chat-editor-relative`,
the two surfaces that actually want the card layout. With the narrowed
needed (no competing rules at the target selectors) and have been
removed.
Verified via CDP-injected test DOM against the running Agents window:
- `.part.sessionspart` -> max-width 950px, auto-centered (preserved)
- `.chat-editor-relative` -> max-width 950px, auto-centered (preserved)
- `.terminal-inline-chat` -> max-width none, margin 0 (fixed)
- `.zone-widget.inline-chat-widget` -> max-width none, margin 0 (fixed)
- `.agent-session-hover` -> max-width none, margin 0 (fixed)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reverts the default change made in #316900 which set the default of editor.quickSuggestions.other to 'offWhenInlineCompletions'. When Copilot inline completions are active, that default suppresses the suggest widget for snippet, IntelliSense and emmet completions and disrupts IME composition.
Fixes #317916
Fixes #318380
Fixes #318549
Fixes #318735
Fixes #318522
Fixes #317863
Fixes #318694
Fixes #314220