Comment Re:Technobabble translation... (Score 5, Insightful) 70
The demand for DDR5 will evaporate in 2 years whether the AI bubble pops or not. If new factories are coming online, they need to be sized for new tech, but only LPDDR6 has a ratified spec and it's not ideal for AI. So ramping up a lot of extra factories for LPDDR6 makes little sense.
Sandisk has been developing an HBM-killer memory called High-Bandwidth-Flash... but they aren't sharing much information. I'd say this announcement either means High-Bandwidth-Flash isn't panning out or that they won't need new factories to produce it.
As far as what others are doing that makes investing in DDR5 dumb:
1. Cerebras doesn't use memory chips, just a crazy amount of on-chip SRAM.
2. Intel is re-entering the memory market with a sorta radical new memory called ZAM that is being pitched as an HBM-killer.
3. TurboQuant is fairly useless at actually speeding things up right now, but it shows that we can shrink the need for memory if compute can handle this type of quantization inline. In the near term, this likely means accelerators coming out in the next few years will have these schemes baked into the hardware and need far less RAM. This is roughly comparable to when we started getting hardware-accelerated texture compression on 3D graphics cards. It looks like the TurboQuant idea was plagiarized and overhyped, but the idea is likely amazing if applied at the CPU level.
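To put rough numbers on the "need far less RAM" point in item 3, here is a minimal sketch of plain uniform block quantization. To be clear, this is NOT the TurboQuant algorithm, just a generic stand-in to show the memory arithmetic. The function names, the 128-value block size, the 4-bit width, and the 32 bits of per-block overhead (an fp16 offset plus an fp16 scale) are all my own assumptions for illustration.

```python
import random

def quantize(values, bits=4):
    """Generic uniform quantization of one block (illustrative, not TurboQuant).

    Returns integer codes plus the offset and scale needed to decode them.
    """
    levels = (1 << bits) - 1            # e.g. 15 steps for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0   # avoid a zero scale for constant blocks
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Decode on the fly; this is the step hardware would do inline."""
    return [lo + c * scale for c in codes]

def memory_ratio(n, bits=4, src_bits=16, overhead_bits=32):
    """How many times smaller a block gets, counting per-block metadata."""
    return (n * src_bits) / (n * bits + overhead_bits)

if __name__ == "__main__":
    random.seed(0)
    block = [random.uniform(-1.0, 1.0) for _ in range(128)]
    codes, lo, scale = quantize(block)
    restored = dequantize(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"compression: {memory_ratio(len(block)):.2f}x, worst error: {worst:.4f}")
```

Under those assumptions, a 128-value fp16 block shrinks by about 3.76x (2048 bits down to 544). The catch is that dequantize has to run on every read, which is why it only pays off when the hardware can do it inline.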