So I treated myself to one of these Strix Halo mini PCs.
It measures about 6x9x3.5 inches, weighs more or less the same as a brick, and has 128GB of RAM, 96GB of which I’ve assigned as video memory. I can run sizeable large language models at home, with… leisurely speed, let’s say, but it’ll run models you can’t run without putting several Nvidia GPUs in a single machine or building some kind of cluster. And it’s dead silent most of the time, when you’re not asking it for heavy work. Lovely little box.
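If you want to sanity-check that carve-out, and you happen to have a ROCm build of PyTorch installed (an assumption on my part, not something the box ships with), the GPU reports its memory through the usual torch.cuda namespace, which PyTorch reuses for AMD hardware:

```python
# Sanity-check how much memory the GPU reports after carving out VRAM.
# Assumes a ROCm build of PyTorch, which exposes AMD GPUs through the
# torch.cuda namespace.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"Device: {props.name}")
    print(f"Reported VRAM: {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No ROCm/CUDA device visible to PyTorch.")
```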
I’m learning, though, especially under Windows, how much work lies ahead before common AI tooling is comfortably compatible with ROCm, compared to how much just works out of the box on Nvidia. I’m making good progress: I’ve got Docker, ollama, and some chatbots up, a nice UI for chatting and image generation, and a workflow engine that can eventually support agentic AI. I’m trying to understand how all these pieces actually fit together in the real world by building myself a modest but meaningfully capable environment of my own. So… many… moving parts, and then the fact that some core things don’t work under ROCm without A LOT of unsupported hoop-jumping and hacks. Still, it’s gratifying to see progress.
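To give a flavor of what “up” means here: ollama serves a local HTTP API on port 11434, so a minimal round-trip from Python looks roughly like this (the model name is just an example; substitute whatever you’ve pulled):

```python
# Minimal round-trip against a local ollama server (default port 11434).
# The model name is an example; use anything fetched with `ollama pull`.
import json
import urllib.request

payload = {
    "model": "llama3.1:8b",   # example model, not a recommendation
    "prompt": "Explain ROCm in one sentence.",
    "stream": False,          # request a single JSON response, not a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```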
I can only assume that a) on Linux a lot of this works better (don’t ask me about Docker Desktop on Windows not exposing your graphics hardware to anything running inside a container) and b) when Lisa Su sells this AI hardware to hyperscalers and large enterprises, they do A LOT of handholding.
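For what it’s worth, on Linux the documented route is to pass the kernel’s GPU device nodes straight into the container. A sketch using the Docker SDK for Python, assuming AMD’s public rocm/pytorch image:

```python
# How you'd hand an AMD GPU to a container on Linux: pass /dev/kfd and
# /dev/dri through, per the ROCm container docs. Sketch using the Docker
# SDK for Python; rocm/pytorch is AMD's public image.
import docker

client = docker.from_env()
output = client.containers.run(
    "rocm/pytorch:latest",
    command=["rocminfo"],                        # list the GPUs ROCm can see
    devices=["/dev/kfd:/dev/kfd", "/dev/dri:/dev/dri"],
    group_add=["video"],                         # GPU device nodes belong to 'video'
    remove=True,
)
print(output.decode("utf-8"))
```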
On a tangentially related note: it scares me that there are Strix Halo handheld consoles with 128GB of RAM and the same ability to assign massive amounts of memory as VRAM and run the kind of stack I have now. Who needs THAT for a game?