Just use something like libsimdpp (https://github.com/p12tic/libsimdpp) and you can be sure your code stays vectorized across compiler versions, since the vector operations are written out explicitly instead of being left to the auto-vectorizer. As a bonus, this and similar wrapper libraries let you produce assembly for multiple instruction sets (say SSE2, AVX and NEON) from the same code.
Sorry, my post was directed at the parent of your post. I misclicked somewhere and didn't notice.
Just use libsimdpp (https://github.com/p12tic/libsimdpp) or any of the myriad similar wrappers. With a modest time investment you get an almost optimal implementation for multiple instruction sets on any compiler you use.
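To make the wrapper idea concrete, here is a rough sketch of how such a library can work underneath. This is not libsimdpp's API: `vec4f` and `add_arrays` are hypothetical names made up for illustration, and a real wrapper covers far more types and operations. The point is only that the loop at the bottom is written once, and the preprocessor picks the SSE2, NEON or plain scalar backend.

```cpp
#include <cstddef>

// Hypothetical 4-lane float vector type; the backend is chosen at compile time.
#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>
  struct vec4f {
      __m128 v;
      static vec4f load(const float* p) { return {_mm_loadu_ps(p)}; }  // unaligned load
      void store(float* p) const { _mm_storeu_ps(p, v); }              // unaligned store
      friend vec4f operator+(vec4f a, vec4f b) { return {_mm_add_ps(a.v, b.v)}; }
  };
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
  struct vec4f {
      float32x4_t v;
      static vec4f load(const float* p) { return {vld1q_f32(p)}; }
      void store(float* p) const { vst1q_f32(p, v); }
      friend vec4f operator+(vec4f a, vec4f b) { return {vaddq_f32(a.v, b.v)}; }
  };
#else
  // Scalar fallback so the same source still builds on other targets.
  struct vec4f {
      float v[4];
      static vec4f load(const float* p) {
          vec4f r;
          for (int i = 0; i < 4; ++i) r.v[i] = p[i];
          return r;
      }
      void store(float* p) const {
          for (int i = 0; i < 4; ++i) p[i] = v[i];
      }
      friend vec4f operator+(vec4f a, vec4f b) {
          vec4f r;
          for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
          return r;
      }
  };
#endif

// Written once against vec4f; explicitly vectorized on every backend above.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)               // full 4-lane blocks
        (vec4f::load(a + i) + vec4f::load(b + i)).store(out + i);
    for (; i < n; ++i)                       // leftover elements
        out[i] = a[i] + b[i];
}
```

Calling `add_arrays(a, b, out, n)` looks the same regardless of the target; only the compiler flags (or target architecture) decide which branch of the `#if` is compiled.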