That definitely makes a difference. The quality of response you see between something like Gemini Flash and Gemini Pro is astounding because it's indexing on getting it right rather than getting it fast.
I assume you're saying Pro is massively better for your workload. IMO, thinking is either good or bad, depending on whether it moves you closer to or farther away from correctness.
For example, I've seen certain types of workloads (e.g. anything involving image recognition or image segmentation) do massively better with Flash, because Pro overthinks things and ends up changing perfectly correct answers into wrong ones, either by coming up with creative ways to misinterpret the prompt or by mangling the JSON image segmentation fragment so that it can no longer be parsed.
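To make the "can no longer be parsed" failure concrete, here is a minimal sketch of the kind of check that catches it. The function name `parse_segments` and the `label`/`box` keys are hypothetical, picked only for illustration; the real segmentation schema depends on the model and the prompt.

```python
import json

def parse_segments(raw, required_keys=("label", "box")):
    """Return the parsed list of segments, or None if the output is unusable.

    The key names are hypothetical; adjust them to the schema you asked for.
    """
    fence = "`" * 3  # Markdown code fence marker
    text = raw.strip()
    # Models sometimes wrap JSON in a Markdown fence; drop the fence lines.
    if text.startswith(fence):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == fence:
            lines = lines[:-1]
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None  # truncated or otherwise malformed JSON
    if not isinstance(data, list):
        return None
    for item in data:
        # Every segment must be an object carrying all required keys.
        if not isinstance(item, dict) or any(k not in item for k in required_keys):
            return None
    return data
```

A `None` result is the signal to retry, or to fall back to a configuration that thinks less.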
And I've also found that LLMs struggle when an existing term is used in a context they weren't trained on. As a result, I've had to substitute nonsense terms for terms based on common English words and phrases; otherwise the model ignores my in-context definitions of those phrases, substitutes its own understanding of their meaning, and gives incorrect results. The more thinking you allow, the more likely that is to occur.
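A rough sketch of that substitution trick, assuming a hand-built mapping from the overloaded terms to made-up words. The helper names and the example words (`zorvex`, `quibblat`) are hypothetical, not from any library.

```python
import re

def substitute_terms(text, mapping):
    """Replace each key of `mapping` with its value (whole words, any case)."""
    if not mapping:
        return text
    lowered = {k.lower(): v for k, v in mapping.items()}
    # Longest terms first, so "hot session" wins over "session".
    terms = sorted(lowered, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: lowered[m.group(0).lower()], text)

def restore_terms(text, mapping):
    """Map the nonsense words in a model response back to the real terms."""
    reverse = {v: k for k, v in mapping.items()}
    return substitute_terms(text, reverse)

# Hypothetical mapping: real domain terms -> nonsense stand-ins.
mapping = {"session": "zorvex", "hot session": "quibblat"}
prompt = substitute_terms("A Session ends when a hot session starts.", mapping)
```

The prompt then defines `zorvex` and `quibblat` explicitly, and `restore_terms` translates the response back before anyone reads it. Note that the original capitalization is lost on the round trip.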