Basically the only thing that matters for LLM hosting is VRAM capacity.
I’ll also add that some frameworks and backends still require CUDA. This is improving but before you go and buy an AMD card, make sure the things you want to run will actually run on it.
For example, bitsandbytes support for non-CUDA backends is still in alpha: https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
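If you already have a card (or can borrow one), a quick sanity check is to ask PyTorch what it actually sees. This is just a rough sketch and assumes PyTorch is the framework you care about; the function name `detect_gpu_backend` and the labels it returns are made up for illustration. As far as I know, ROCm builds of PyTorch reuse the `torch.cuda` API and set `torch.version.hip`, so that's what it keys on.

```python
def detect_gpu_backend():
    """Return 'cuda', 'rocm', 'none', or 'no-torch' (labels are illustrative)."""
    try:
        import torch  # assumption: PyTorch is the framework being checked
    except ImportError:
        return "no-torch"

    # On ROCm builds, torch.cuda.is_available() also reports AMD GPUs.
    if not torch.cuda.is_available():
        return "none"  # no CUDA/ROCm device visible to PyTorch

    # torch.version.hip is a version string on ROCm builds, None on CUDA builds.
    if getattr(torch.version, "hip", None):
        return "rocm"
    return "cuda"


if __name__ == "__main__":
    print(detect_gpu_backend())
```

Getting "rocm" back only tells you PyTorch itself works. Individual libraries like bitsandbytes can still be missing support, so check those separately.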
Yeah, AMD is lagging behind Nvidia in machine learning performance by like a full generation, maybe more. Same story with ray tracing.
If you want absolute top-tier performance, then the RTX 4090 is the best consumer card out there, period. Considering the price and power consumption, this is not surprising. It’s hardly fair to compare AMD’s top-end to Nvidia’s top-end when Nvidia’s is over twice the price in the real world.
If your budget for a GPU is <$1600, the 7900 XTX is probably your best bet if you don’t absolutely need CUDA. Any performance advantage Nvidia has goes right out the window if you can’t fit your whole model in VRAM. I’d take a 24GB AMD card over a 16GB Nvidia card any day.
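To put rough numbers on the "does it fit" question, here is a back-of-the-envelope sketch. The weights take roughly (parameter count × bits per weight ÷ 8) bytes. The 20% overhead for KV cache and activations is my own guess, not a measured figure, and the helper names are hypothetical. Real usage depends on context length and the runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead=1.2):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    overhead=1.2 is an assumed 20% margin for KV cache and activations.
    """
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead


def fits_in_vram(params_billion, bits_per_weight, vram_gb, overhead=1.2):
    """True if the estimated footprint is within the card's VRAM."""
    return estimate_vram_gb(params_billion, bits_per_weight, overhead) <= vram_gb


if __name__ == "__main__":
    # Example model sizes and quantization levels, chosen for illustration.
    for params, bits in [(7, 16), (13, 16), (13, 4), (34, 4)]:
        need = estimate_vram_gb(params, bits)
        print(
            f"{params}B @ {bits}-bit: ~{need:.1f} GB | "
            f"16GB card: {fits_in_vram(params, bits, 16)} | "
            f"24GB card: {fits_in_vram(params, bits, 24)}"
        )
```

Under those assumptions a 34B model at 4-bit lands around 20 GB, which fits on a 24GB card but not a 16GB one. That's exactly the case where the extra VRAM matters more than raw speed.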
You could also look at an RTX 3090 (which also has 24GB), but then you’d take a big hit to gaming/raster performance and it’d still probably cost you more than a 7900 XTX. Not really sure how a 3090 compares to a 7900 XTX in Blender. Anyway, that’s probably a fairer comparison if you care about VRAM and price.