Programmers were moving to DirectX and off of 3Dfx's Glide API.
By that point (the 2000s), Glide itself wasn't all that relevant anymore: most game engines relied on high-level APIs (Direct3D, as you mention, and also OpenGL: Quake III had already been out for a year, and "mini GL drivers" that served as an adaptation layer between high-level OpenGL and a low-level API such as Glide were all the rage).
Very few engines had Glide-specific optimizations.
API exclusivity wasn't playing a role anymore.
But failing to have distinguishing features that attract users did play a role.
The VSA-100 in the Voodoo 4/5/6 had a few interesting new features (*it vastly improved on Voodoo 1, 2, and 3's ability to do "pseudo 22-bit": more than 65k colors in 16-bit modes, with less error and dither amplification in those modes; new FSAA with a rotated grid that both gives better edge anti-aliasing and circumvents the need for anisotropic filtering; motion blur in supporting games; better texture compression; etc.), but these failed to attract users' interest (I suppose most users didn't even understand those features), whereas Nvidia managed to gain more interest with better numbers in 24/32-bit benchmarks and some marketing around T&L (despite T&L not actually being used that much in games of that era -- better CPUs with SIMD achieved similar scene processing in practice).
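To give a feel for the rotated-grid idea, here's a tiny sketch (the sample positions are my own illustrative values, not 3Dfx's actual pattern): with an ordered 2x2 grid the 4 samples share only two distinct x coordinates, so a near-vertical edge sweeping across a pixel produces very few coverage steps, while a rotated grid spreads the samples over four distinct x (and y) positions.

```c
/* Minimal sketch (illustrative sample positions, not 3Dfx's real pattern):
 * why a rotated 4x grid antialiases near-vertical edges better than an
 * ordered 2x2 grid. The ordered grid has only two distinct x coordinates,
 * so a vertical edge can only yield coverage 0, 2 or 4 samples; the
 * rotated grid has four distinct x coordinates, giving all steps 0..4. */
#include <stdio.h>

typedef struct { double x, y; } Sample;

/* count samples left of a vertical edge at x = edge (pixel spans 0..1) */
static int coverage(const Sample s[4], double edge) {
    int c = 0;
    for (int i = 0; i < 4; i++)
        if (s[i].x < edge) c++;
    return c;
}

int main(void) {
    /* ordered 2x2 grid: only two distinct x positions */
    const Sample ordered[4] = {
        {0.25, 0.25}, {0.75, 0.25}, {0.25, 0.75}, {0.75, 0.75}
    };
    /* rotated grid: four distinct x (and y) positions */
    const Sample rotated[4] = {
        {0.375, 0.125}, {0.875, 0.375}, {0.125, 0.625}, {0.625, 0.875}
    };

    printf("edge x   ordered  rotated\n");
    for (double e = 0.0; e <= 1.0001; e += 0.125)
        printf("%5.3f    %d/4      %d/4\n",
               e, coverage(ordered, e), coverage(rotated, e));
    return 0;
}
```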
The next generation would have been more interesting: 3Dfx planned to add programmable pipelines with Rampage and Sage (the "Spectre" graphics cards) instead of fixed pipelines (i.e., to add geometry and pixel shaders, in the parlance of high-level APIs like D3D and OpenGL), but it never reached market; only prototypes existed when 3Dfx folded.
Though some of that development eventually helped the GeForce FX.
Of all the graphics card what-ifs, the one I always wondered about is: what if the PowerVR card had working drivers?
Well, look at Apple's iPhones...
It was a card based on the same techniques and architecture as the Sega Dreamcast, and for about $120 it could perform like a $300 GeForce card.
Speaking of the Sega Dreamcast, the whole snafu of "Katana" (the actual NEC / SuperH + PowerVR design) vs "Black Belt" (a competing prototype with a 3Dfx graphics card) also cost 3Dfx quite a lot of money and accelerated its demise.
When it worked.
The main problem is that PowerVR worked in a way that was completely alien compared to everything else (outputting fully rendered tiles, while everyone else was drawing polygons one by one into a pair of frame and depth buffers).
And tile-based rendering's performance boost gets less significant the more transparency layers there are in a scene, which is where most of the industry was heading (so the advantage the Kyro II had would have melted away with successor cards).
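A toy cost model of that trade-off (my own simplification, not how any of these chips actually schedule work, and submitting opaque geometry back to front as a worst case for the immediate-mode path): deferring shading until visibility is resolved removes the cost of opaque overdraw, but translucent layers have to be shaded by both approaches, so the gap narrows as they pile up.

```c
/* Counts how many times one pixel gets shaded under (a) immediate-mode
 * rendering with a depth buffer and (b) tile-based deferred rendering that
 * resolves visibility before shading. Opaque overdraw is where deferral
 * wins; translucent layers cost the same either way. */
#include <stdio.h>

typedef struct { float depth; int opaque; } Surface;

/* (a) immediate mode: shade whatever currently passes the depth test */
static int shades_immediate(const Surface *s, int n) {
    float zbuf = 1.0f;            /* far plane */
    int shaded = 0;
    for (int i = 0; i < n; i++) {
        if (s[i].depth < zbuf) {  /* depth test */
            shaded++;             /* fragment shaded (possibly wasted later) */
            if (s[i].opaque) zbuf = s[i].depth;
        }
    }
    return shaded;
}

/* (b) tile-based deferred: find the nearest opaque surface first, shade it
 * once, then shade only the translucent surfaces lying in front of it */
static int shades_deferred(const Surface *s, int n) {
    float nearest_opaque = 1.0f;
    for (int i = 0; i < n; i++)
        if (s[i].opaque && s[i].depth < nearest_opaque)
            nearest_opaque = s[i].depth;
    int shaded = (nearest_opaque < 1.0f) ? 1 : 0;
    for (int i = 0; i < n; i++)
        if (!s[i].opaque && s[i].depth < nearest_opaque)
            shaded++;
    return shaded;
}

int main(void) {
    /* 3 opaque layers of overdraw, then a growing stack of translucent ones */
    Surface scene[8] = {
        {0.9f, 1}, {0.6f, 1}, {0.3f, 1},
        {0.25f, 0}, {0.2f, 0}, {0.15f, 0}, {0.1f, 0}, {0.05f, 0}
    };
    for (int layers = 0; layers <= 5; layers++) {
        int n = 3 + layers;
        printf("%d translucent layers: immediate=%d shades, deferred=%d shades\n",
               layers, shades_immediate(scene, n), shades_deferred(scene, n));
    }
    return 0;
}
```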
Speaking of TBR, 3Dfx had acquired GigaPixel to get patents on such technologies, did toy with software-based hidden surface removal (HSR) that worked in some OpenGL engines (everything built on id Tech 3 / the Quake III engine), and was hoping to add hardware HSR with TBR in the next iteration of Rampage/Sage ("Spectre 2" / Mojo), but never got there before folding.
(And again, more transparency layers in scenes would eventually have made the performance gains less significant in future games.)
---
*: Even the venerable Voodoo 1's "pseudo 22-bit" 16-bit mode was different from everyone else's on the market.
All competitors used a pure 16-bit math pipeline. When combining multiple pixels (multi-texturing, transparency, effects, etc.), rounding errors accumulate. It's even more visible when Bayer dithering is used: with an Intel integrated graphics core back then, this was visible on, e.g., Quake III's logo, which had a multi-layer translucent flame effect -- the errors accumulated with each layer and the logo ended up looking like a checkerboard.
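Here's a little sketch of that amplification (the 5-bit channel, the additive flame layers, and the dither thresholds are made-up illustrative values, and assuming the flame layers blend additively, which is a guess): two neighbouring pixels with the same "true" colour but different Bayer thresholds drift apart by roughly one quantisation step per layer when the framebuffer is re-dithered after every blend, but stay within one step when the pipeline keeps full precision and only dithers once at output.

```c
/* Why per-pass 16-bit dithering turns into a checkerboard: two adjacent
 * pixels share the same true colour but sit on different Bayer thresholds.
 * A pure 16-bit pipeline re-dithers the framebuffer after every layer, so
 * the pixels drift apart; a wider internal pipeline dithers only once. */
#include <stdio.h>
#include <math.h>

#define LEVELS 31.0   /* 5-bit colour channel, as in the RGB565 red/blue */

static double dither(double v, double threshold) {
    return floor(v * LEVELS + threshold) / LEVELS;
}

int main(void) {
    const double t_a = 0.1, t_b = 0.9;  /* adjacent Bayer-matrix thresholds */
    const double layer = 0.07;          /* additive flame-layer contribution */
    double a = 6 / LEVELS, b = 6 / LEVELS;   /* 16-bit path, per-pass dither */
    double wide = 6 / LEVELS;                /* wide path, dither at the end */

    for (int i = 1; i <= 6; i++) {
        a = dither(a + layer, t_a);
        b = dither(b + layer, t_b);
        wide += layer;                       /* full precision until output */
        double out_a = dither(wide, t_a), out_b = dither(wide, t_b);
        printf("layer %d: 16-bit path gap=%.3f   wide path gap=%.3f\n",
               i, fabs(a - b), fabs(out_a - out_b));
    }
    return 0;
}
```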
In contrast, when in 16-bit mode, all 3Dfx chips run at "pseudo 22 bits" internally and on their video output, combining the values of 4 source pixels (hence the 2 extra bits per channel): less error accumulates on translucent layers, and less dithering is visible on the output.
On Voodoo 1 and 2, this is simply done by using 4 horizontally adjacent source pixels, giving the characteristic "slightly horizontally blurred" look that is typical of early 3Dfx cards.
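As a rough sketch of that output filter (my reading of how the combine yields the extra bits, not a register-level description): summing four neighbouring 5-bit channel values gives a 7-bit result, and the ordered-dither noise in the four sources largely averages out.

```c
/* Illustrative 4-tap "pseudo 22-bit" output filter on one 5-bit channel,
 * using four horizontally adjacent pixels as on Voodoo 1/2. */
#include <stdio.h>
#include <math.h>

/* store a value in a 5-bit channel with an ordered-dither threshold */
static int store5(double v, double threshold) {
    int q = (int)floor(v * 31.0 + threshold);
    return q < 0 ? 0 : (q > 31 ? 31 : q);
}

int main(void) {
    const double truth = 0.413;                    /* intended channel value */
    const double bayer[4] = {0.125, 0.625, 0.375, 0.875}; /* 1x4 dither row */
    int fb[4];                                     /* 16-bit framebuffer row */

    for (int x = 0; x < 4; x++)
        fb[x] = store5(truth, bayer[x]);

    /* plain 5-bit readout of one pixel vs the 4-tap filtered output */
    double plain  = fb[1] / 31.0;
    double filter = (fb[0] + fb[1] + fb[2] + fb[3]) / (4.0 * 31.0); /* 7 bits */

    printf("truth=%.4f  single 5-bit pixel=%.4f  4-tap filtered=%.4f\n",
           truth, plain, filter);
    return 0;
}
```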
On Voodoo 3, the 4 sources can also be arranged in a square, and there's more logic in how they are combined (I suspect something like a conditional blur, but I've never read a full description).
On Voodoo 4/5/6, the chip introduces multiple buffers (2 per chip, up to 4 on the most common dual-chip Voodoo 5), equivalent to OpenGL's accumulation buffers. This enables tons of cool effects (introduce a sub-pixel offset between the buffers and you get both anti-aliasing on edges and, with plain trilinear filtering on texture surfaces, a result similar to anisotropic filtering; render each buffer at a different time increment and you get motion blur; etc.) -- and it also allows even less blur when picking the 4 source pixels for the "pseudo 22-bit" output (just pick the 4 pixels at the same coordinate in each of the 4 buffers).
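And a last sketch of that multi-buffer resolve as I understand it (the offsets and the "scene" below are placeholders, not anything VSA-100 documents): render the same frame into each buffer with its own sub-pixel jitter and/or time offset, then average the four samples at the same coordinate.

```c
/* Illustrative multi-buffer resolve: each buffer holds the scene rendered
 * with its own sub-pixel (anti-aliasing) or time (motion blur) offset; the
 * output pixel is the average of the four co-located samples, which also
 * gives the extra 2 bits per channel without spatial blur. */
#include <stdio.h>

#define W 4
#define H 1
#define NBUF 4

/* placeholder for "render the scene into buf with offset (dx,dy,dt)" */
static void render_pass(float buf[H][W], float dx, float dy, float dt) {
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            buf[y][x] = (x + dx + dt) / (float)W;  /* stand-in scene */
    (void)dy;
}

int main(void) {
    static float tbuf[NBUF][H][W];
    /* illustrative jitter/time offsets, one per buffer */
    const float dx[NBUF] = {0.125f, 0.375f, 0.625f, 0.875f};
    const float dy[NBUF] = {0.625f, 0.125f, 0.875f, 0.375f};
    const float dt[NBUF] = {0.0f, 0.01f, 0.02f, 0.03f};

    for (int b = 0; b < NBUF; b++)
        render_pass(tbuf[b], dx[b], dy[b], dt[b]);

    /* resolve: average the four samples at the SAME coordinate */
    for (int x = 0; x < W; x++) {
        float sum = 0.0f;
        for (int b = 0; b < NBUF; b++)
            sum += tbuf[b][0][x];
        printf("pixel %d -> %.3f\n", x, sum / NBUF);
    }
    return 0;
}
```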