In your glass tower, yes.
In the real world, not so much.
Here is an example of one of the world's most optimized pieces of software: x264. It's also one of the few real-world loads that can take advantage of multiple processor and SSE. So how much speedup did this incredible piece of software see with AVX2, which DOUBLED the width of the integer pipelines?
FIVE PERCENT! Yup, that's it!
All that work for so very little improvement, because in the REAL WORLD data does not align on perfect AVX2 boundaries, and data fetch is as much of a hindrance as the actual processing of that data. Read more about WHY this is the best that could be done here, if you don't mind paying for SCRIBD.
Parading around test results form something like Passmark is just self-delusion. It only tests that the features do in-fact work, and these tests tend to work directly from cache in small data sets that are usually not branch-heavy. IT gives score for number of MIPS, but does not take into account the fact that most real software can't actually make use of these features at-speed.
And when they increase the vector size yet-again to 512-bits wide in a year, it will once-again be a limited real-world improvement, because optimization of real loads is hard, and auto-vectorization of arbitrary loads is even harder problem to solve. So Intel keeps adding new features, and they keep adding about 5-7% each (real world). So I don't see how you get above 3x from those puny performance increases, while not deluding yourself.