Microsoft is testing its Phi Silica small language model family on NVIDIA RTX GPUs, according to WinBuzzer, expanding the hardware that can run the company’s built-in Windows AI models beyond the NPU-equipped Copilot+ PCs they were designed for.
Phi Silica was built from the ground up to run on the dedicated Neural Processing Units found in Copilot+ PCs, which Microsoft introduced in 2024 as a hardware category defined by on-device AI requirements. The models, derived from the Phi-3 architecture, deliver low-latency language processing without a cloud call. Extending the same models to discrete GPUs is a directional shift: the NPU was the point of distinction for Copilot+ hardware, and bringing Phi Silica to graphics cards softens that boundary.
The GPU path has a specific hardware floor. An RTX 30-series card or newer with at least 6GB of video memory is required. AMD support is listed as coming later, with no date given. Access is further gated by enrollment in Microsoft’s Experimental Channel for Windows Insider, Developer Mode being enabled, the Windows App SDK build 2.2.2-experimental9 or higher, and current GPU drivers. That stack of requirements keeps this firmly in developer-preview territory rather than a consumer toggle.
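The requirements above amount to a checklist, which can be sketched in a few lines of Python. This is an illustrative model only, not Microsoft's detection logic: the `Machine` record, its field names, and the `unmet_requirements` helper are hypothetical, and the version parser assumes SDK builds follow the `X.Y.Z-experimentalN` pattern seen in the article. Anything that does not match that pattern is treated as unmet, which is a simplification.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Thresholds taken from the article; everything else here is hypothetical.
MIN_RTX_SERIES = 30            # RTX 30-series or newer
MIN_VRAM_GB = 6                # at least 6GB of video memory
MIN_SDK = (2, 2, 2, 9)         # Windows App SDK 2.2.2-experimental9


@dataclass
class Machine:
    """Hypothetical snapshot of the facts the GPU path depends on."""
    rtx_series: Optional[int]  # e.g. 30 or 40; None if no RTX card is present
    vram_gb: float
    experimental_channel: bool
    developer_mode: bool
    sdk_version: str
    drivers_current: bool


def parse_experimental_sdk(version: str) -> Optional[Tuple[int, ...]]:
    """Parse 'X.Y.Z-experimentalN' into (X, Y, Z, N), or None if it does not match."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-experimental(\d+)", version)
    return tuple(int(part) for part in match.groups()) if match else None


def unmet_requirements(machine: Machine) -> List[str]:
    """Return a human-readable list of the requirements the machine fails."""
    problems = []
    if machine.rtx_series is None or machine.rtx_series < MIN_RTX_SERIES:
        problems.append("RTX 30-series or newer GPU")
    if machine.vram_gb < MIN_VRAM_GB:
        problems.append("at least 6GB of video memory")
    if not machine.experimental_channel:
        problems.append("Windows Insider Experimental Channel enrollment")
    if not machine.developer_mode:
        problems.append("Developer Mode enabled")
    sdk = parse_experimental_sdk(machine.sdk_version)
    if sdk is None or sdk < MIN_SDK:
        problems.append("Windows App SDK 2.2.2-experimental9 or higher")
    if not machine.drivers_current:
        problems.append("current GPU drivers")
    return problems
```

An empty list means every gate is satisfied. A machine that fails any single item stays outside the preview, which is why the stack reads as a developer gate and not a consumer setting.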
The GPU path ships with fewer capabilities than the NPU path. GPU-based execution does not include prompt compression or speculative decoding, two NPU-specific features that affect context handling and generation throughput. NPUs in Copilot+ laptops are low-power, purpose-built inference chips; discrete GPUs bring more raw compute but do not replicate every part of the NPU software stack that Microsoft has built around Phi Silica. The release announcement does not include independent benchmarks comparing inference latency between the two paths.
Microsoft’s Windows AI API surface now officially covers three hardware classes: Copilot+ NPUs, supported discrete GPUs, and CPUs meeting recommended specifications. Phi Silica on GPU occupies a narrower slot within that matrix, limited for now to the NVIDIA RTX cards that meet the requirements above. Separately, Windows ML, Microsoft’s unified inference framework, already supports custom and open-source models across NPUs, GPUs, and CPUs from AMD, Intel, NVIDIA, and Qualcomm. The Phi Silica GPU path does not alter Windows ML’s broader remit; the two systems address different use cases, with Phi Silica remaining a tightly specified first-party model API.
For developers, the practical question is whether on-device inference for a Windows app should depend on NPU availability or accept a broader GPU audience. Today, Phi Silica on GPU is a developer preview with missing features and a multi-step setup barrier. Apps must call GetReadyState and inspect the result before calling the model, and must not invoke EnsureReadyAsync on unsupported hardware. The model is not preinstalled on GPU-lane devices; it downloads on first request. Those guardrails exist to prevent apps from advertising GPU-based local AI before all system conditions are confirmed, but they also reflect how far this path is from a shipping consumer experience.
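The readiness guardrail described above can be modeled as a small state check. The sketch below is a Python stand-in, not the Windows App SDK API: `ReadyState`, `FakeLanguageModel`, and `prepare_model` are hypothetical names, and the three states are an assumed simplification of whatever GetReadyState actually reports. It shows only the ordering the article describes: query readiness first, never trigger the download on unsupported hardware, and fetch the model on first request otherwise.

```python
from enum import Enum, auto


class ReadyState(Enum):
    """Assumed, simplified readiness states (hypothetical names)."""
    READY = auto()          # model is present and usable
    NOT_READY = auto()      # hardware is supported, model not yet downloaded
    NOT_SUPPORTED = auto()  # hardware or system conditions are not met


class FakeLanguageModel:
    """Stand-in for the real model API so the flow can run anywhere."""

    def __init__(self, state: ReadyState) -> None:
        self.state = state
        self.ensure_calls = 0

    def get_ready_state(self) -> ReadyState:
        return self.state

    def ensure_ready(self) -> None:
        # Mirrors the rule that this must not be invoked on unsupported hardware.
        if self.state is ReadyState.NOT_SUPPORTED:
            raise RuntimeError("ensure_ready called on unsupported hardware")
        self.ensure_calls += 1
        self.state = ReadyState.READY  # simulates the first-request download


def prepare_model(model: FakeLanguageModel) -> bool:
    """Return True only when the model can be called safely."""
    state = model.get_ready_state()
    if state is ReadyState.NOT_SUPPORTED:
        return False            # do not advertise local AI, do not download
    if state is ReadyState.NOT_READY:
        model.ensure_ready()    # download happens here, on first request
    return model.get_ready_state() is ReadyState.READY
```

In an app, a `False` result would be the cue to hide or disable the local AI feature instead of surfacing an error after the user has already tried it.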
Microsoft has not published a consumer release timeline for the GPU path. WinBuzzer reported the current state as of June 15, 2026.
Developers building Windows AI features today should treat the NPU path as the production target and the GPU path as an expansion to prototype against. If Microsoft ships GPU support broadly before late 2026, the addressable hardware base for Phi Silica-powered features grows significantly, since RTX-equipped desktops and gaming PCs outnumber Copilot+ laptops in most developer households. That math matters when deciding whether to build feature parity around the NPU-only capabilities now or to wait for the GPU path to close the gap.
Reported by WinBuzzer (winbuzzer.com), published June 15, 2026.