A lot of people misunderstand the market for the DGX Spark.
If you want to run a small model at home, or create a LoRA for a tiny model, you don't want to do it on this - you want to do it on gaming GPUs.
If you want to create a large foundation model, or run commercial inference, you don't want to do it on this - you want to do it on high-end AI servers.
This fits the middle ground between those two. It gives you far more memory than you can get on gaming GPUs (allowing you to do inference on, tune, or train much larger models, especially when you combine two Sparks). It sacrifices some memory bandwidth and FLOPs and costs somewhat more, but it lets you do things that simply aren't feasible on gaming GPUs - things you'd normally have to buy or rent big, expensive servers for.
The closest current alternative is a Mac Studio with an M2 or M3 Ultra. You get better memory bandwidth on the Macs, but way worse TOPS. How these factors balance out depends heavily on the workload, but in most cases the two will be in the same ballpark. For example, one $7.5k Mac Studio M3 Ultra with 256GB is said to run Qwen 3 235B GGUF at 16 tok/s, while two linked $4.2k DGX Sparks with the same total 256GB are said to do it at 12 tok/s, at similar quantization. Your mileage may vary depending on what you're doing.
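The intuition behind the bandwidth-vs-TOPS tradeoff can be sketched with a back-of-envelope roofline estimate: during single-stream decoding, each generated token requires reading the active model weights from memory once, so memory bandwidth sets a hard ceiling on tokens/sec. The figures below (bandwidth numbers, active parameter counts, bits per weight) are rough illustrative assumptions, not measured specs:

```python
# Back-of-envelope ceiling on decode speed for bandwidth-bound LLM inference.
# Assumption: each token requires one full read of the active weights from
# memory, so tok/s <= bandwidth / bytes_of_active_weights.

def decode_ceiling_tok_s(bandwidth_gb_s: float, active_weight_gb: float) -> float:
    """Upper bound on single-stream tokens/sec when decode is bandwidth-bound."""
    return bandwidth_gb_s / active_weight_gb

# Illustrative assumption: a MoE model like Qwen 3 235B-A22B touches roughly
# 22B active parameters per token; at ~4.5 bits/weight that's on the order of
# 12 GB read per token (ignoring KV cache and routing overhead).
active_gb = 12.0

# Approximate published bandwidth figures (treat as assumptions):
for name, bw in [("M3 Ultra (~819 GB/s)", 819.0),
                 ("DGX Spark (~273 GB/s)", 273.0)]:
    print(f"{name}: ceiling ~{decode_ceiling_tok_s(bw, active_gb):.0f} tok/s")
```

Real throughput lands far below these ceilings (KV cache reads, expert routing, software overhead), and compute and interconnect pull the two machines closer together in practice than raw bandwidth alone would suggest - which is consistent with the reported numbers above being in the same ballpark.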
Either way, you're not going to be training a big foundation model or serving commercial inference on either of them, at least not economically. But if you want something that can work with large models at home, these are the sorts of solutions you want. The Spark is the kind of system you train your toy and small models on before renting a cluster for a YOLO run, or use to run inference on a large open model for your personal or internal office use.