Computex has been quite the wild ride this year. We've seen new RAM announced amid severe supply constraints, intensified demand from data centers, and obscene pricing on consumer kits. There has also been quite a stir within the AI community, with Nvidia rolling out RTX Spark, a new Windows-on-Arm platform built with Microsoft and aimed at running local AI agents. Qualcomm isn't the only Arm kid on the Windows block anymore. But while Nvidia is excellent at creating an ecosystem and working with developers, all that spectacle can obscure what most people will actually need local inferencing to do.

Think about it. Just how large a large language model (LLM) do we actually need running on a laptop, let alone a desktop PC? I'm comfortably running a 7B model on a mini PC and enjoying the prompting and responses with specific models, even having it help me with some coding. It fits my needs entirely. Would increased performance be welcomed? Absolutely. I miss running models on my RTX 4060 Ti with 16GB of VRAM, but I barely utilized it, and I doubt I'd make any fuller use of an Nvidia RTX Spark-enabled laptop. I much prefer to have AI running in the cloud, so I can access it from anywhere.

Nvidia's RTX Spark is a flex, nothing more

I remain skeptical whether it will sell well enough

Even if you're a hobbyist running 35B models from home, it's likely you won't need an entire agentic AI OS stack. Ample memory, a decent enough token speed, privacy, and a stable environment for running all your prompts are all you really need from a local AI deployment. It's why AMD, with its own Ryzen AI offerings, isn't particularly worried just yet. The company is most certainly keeping tabs on Nvidia since Team Green is now directly competing on the SoC front, but there's a clear distinction between the two approaches.

According to Nvidia, RTX Spark PCs will feature up to 1 petaflop of AI compute with a whopping 128GB of unified memory (shared between the CPU and GPU), an integrated Blackwell RTX GPU with more than 6,000 CUDA cores, fifth-gen Tensor Cores, and a 20-core CPU connected over NVLink-C2C. All that means these PCs should be absolute monsters at specific tasks: AI, gaming, and content generation. Essentially, anything that can leverage all that high-speed, advanced technology within the SoC should see the RTX Spark platform utterly dominate the competition. But it feels a lot like the GeForce RTX 5090: a showcase of what's possible rather than what most people need.

If everything holds up at launch, a top-spec RTX Spark PC should be able to run a 120B-parameter LLM with a context of up to a million tokens, which is huge compared to local CPU and GPU-bound agent deployments. The system is slated to be able to render 90GB 3D scenes, edit 12K video (because 8K apparently isn't far out enough already), and generate 4K AI video, with the ability to play modern games at 1440p and 100 frames per second (FPS). DLSS will help a lot there, but it's a serious platform that should command quite the price tag. So, Nvidia isn't really building an AI chip, but more of a complete package.

An RTX Spark PC will be a developer box, gaming laptop, local agent platform, and rendering powerhouse for Windows on Arm. All of Nvidia's other technologies will be fully utilized to make this a reality, including CUDA, TensorRT, DLSS 4.5, OptiX, Reflex, and G-SYNC. Honestly, it's bloody impressive and something I had to read through a couple of times. Nvidia is going all-out with RTX Spark. But that's just the thing. People are already suffering from the prices of hardware today. How is anyone going to afford such a PC?

AMD is going about it differently

Unified platform with quantization

AMD has been testing its latest Ryzen AI Max+ chip using ROCm, Ollama, and Qwen 3.5 models. Because the CPU and GPU within these chips share the same physical memory pool, much like Nvidia's approach with RTX Spark, it's actually pretty good at running local agents, especially when paired with 128GB of RAM. AMD itself used quantization with Qwen 3.5 to achieve 29.84 tok/s with a dense 9B model, 42.04 tok/s with a 35B MoE model, and 8.59 tok/s with a 122B MoE model. That's impressive enough and perfectly usable. These results came with both the CPU and GPU under load.
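To put those figures into everyday terms, here's a rough sketch that converts AMD's reported decode rates into the wait for a single answer. The 400-token answer length is my own assumption, picked purely for illustration, and the sum ignores the time spent processing the prompt, so treat the results as ballpark figures rather than benchmarks.

```python
# Decode rates reported by AMD for its Ryzen AI Max+ tests (tokens per second).
REPORTED_TOK_PER_S = {
    "Qwen 3.5 9B dense": 29.84,
    "Qwen 3.5 35B MoE": 42.04,
    "Qwen 3.5 122B MoE": 8.59,
}

# Assumption: a 400-token answer, chosen only for illustration.
ANSWER_TOKENS = 400


def seconds_to_generate(tokens: int, tok_per_s: float) -> float:
    """Time to stream `tokens` output tokens at a steady decode rate.

    Ignores prompt processing, so real-world waits will be somewhat longer.
    """
    return tokens / tok_per_s


for name, rate in REPORTED_TOK_PER_S.items():
    wait = seconds_to_generate(ANSWER_TOKENS, rate)
    print(f"{name}: about {wait:.1f} s for {ANSWER_TOKENS} tokens")
```

Under those assumptions, the two smaller models return an answer in well under 15 seconds, while the 122B model takes the better part of a minute.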

It's vital to point out just how much faster the 35B model was compared to the 9B model, thanks to just 3B parameters being active per token within the MoE architecture. Improved models, quantization, and memory residency are king when it comes to powering these agents. It's not really about whether a system can win a TOPS competition and break world records, but whether the model fits, whether it can stay adequately fed with memory, and whether I can tolerate the latency of each response. I personally find anything north of 20 tok/s to be perfectly acceptable.
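The "does it fit" and "3B active parameters" points come down to simple arithmetic, which can be sketched as follows. This is a back-of-the-envelope estimate, not AMD's methodology: the 4-bit quantization level is my assumption (the exact level used in AMD's tests isn't stated here), and it counts the weights only, leaving out the context cache and runtime overhead.

```python
# Assumption: 4-bit quantization. The level AMD actually used isn't specified here.
BITS_PER_WEIGHT = 4
MEMORY_GB = 128  # unified memory on the tested system

# Total parameter counts (in billions) for the three models AMD tested.
TOTAL_PARAMS_B = {"9B dense": 9, "35B MoE": 35, "122B MoE": 122}


def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


for name, total in TOTAL_PARAMS_B.items():
    size = weights_gb(total, BITS_PER_WEIGHT)
    print(f"{name}: ~{size:.1f} GB of weights, fits in {MEMORY_GB} GB: {size < MEMORY_GB}")

# Weights touched per generated token: all 9B for the dense model,
# but only about 3B for the 35B MoE model.
dense_read = weights_gb(9, BITS_PER_WEIGHT)
moe_read = weights_gb(3, BITS_PER_WEIGHT)
print(f"Dense 9B reads {dense_read / moe_read:.1f}x more weight data per token than the 35B MoE")
```

On this estimate the MoE model touches roughly a third of the weight data per token, yet AMD's measured gap (42.04 versus 29.84 tok/s) is closer to 1.4x, so weight traffic clearly isn't the only cost in play.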

Once again, I see Nvidia optimizing for the edge case, only to market it as the mainstream later on. Average PC users like you and me aren't going to be rendering 90GB scenes or running a 120B model kitted out with an obscenely large context. We're summarising, getting help with coding, making lightweight apps, working with home automation, transcription, classification, or general queries and research. These are tasks that don't require a lot of computing power. It's how and why AMD can position Ryzen AI Max+ as the better choice for those who don't need an overkill system.

Nvidia still has CUDA ... and Arm

It's an uphill struggle for AMD

Regardless of how Nvidia and AMD are positioning their own SoC solutions, there's no denying just how far ahead CUDA is. It's the default mental model for GPU acceleration, and all of this (and more) is coming with RTX Spark. But AMD is making moves, with ROCm 7.2 rolling out support for Ryzen AI 400-series processors on both Linux and Windows. Local AI is becoming increasingly tool-led, as we've covered right here on XDA. People don't care about TOPS. They want to know how well Ollama, llama.cpp, LM Studio, and ComfyUI run with decent models.

On the flip side, Nvidia is building on Arm. While Windows on Arm has come a long way and even Valve is working to bring Steam to the architecture, x86 is still where it's at for most tasks. Nvidia does have plenty of vendors looking to release their own RTX Spark systems, while AMD has a long history with x86 on both Linux and Windows, along with integrated Radeon graphics and unified memory. While AMD wants to evolve the PC into an actual AI PC, Nvidia is almost rolling out a new class of PC. Both are valid approaches, but only time will tell how effective they are.

RTX Spark could well be the future

It seems like Nvidia is betting on the future of AI. RTX Spark could seed the ecosystem for local agents the same way Nvidia used RTX to launch ray tracing and DLSS. I've argued here that RTX Spark looks like overkill, but it could well become mainstream one day. For now, though, most people who dabble with LLMs seem to focus more on usable speeds, workflows, and affordable systems. So the question isn't which platform is outright better or performs best in a demo, but which will offer the best value for PC users in 2026.