Modern JIT compilers do not do all the performance optimisations that they could, in theory, do. Either it costs too much development time to cater for all the combinations and possibilities, or it costs more CPU time to calculate the optimisation than would be saved by not executing the slower code.
GC is faster than malloc for allocating, but when it comes to deallocating it's a lot slower. It has to do a lot more work, such as compacting the heap, which is a pretty slow operation.
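To illustrate why allocation under a compacting GC can be so cheap, here is a minimal sketch of bump-pointer allocation in C++. `BumpArena` and its members are hypothetical names made up for this example, not any real runtime's API, and it only shows the allocation side: a real collector still has to trace and move live objects before it can reuse the space, which is where the cost goes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical arena, for illustration only: allocation is just
// "round the offset up, then advance it", with no free-list search.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the arena is full. A real GC would run a
    // collection (and possibly compact the heap) at this point.
    // Assumes the buffer base itself is suitably aligned, which holds
    // for the default allocator on common platforms.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned > buffer_.size() || size > buffer_.size() - aligned) {
            return nullptr;
        }
        offset_ = aligned + size;
        return buffer_.data() + aligned;
    }

    // Releases everything at once; individual objects cannot be freed.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

Compare this with malloc, which typically has to search or maintain free lists on allocation but can release a single block without touching any other object.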
Many small functions in C are inlined - the compiler has plenty of time to optimise a C program (i.e. the compile stage is a lot slower), so it can decide which functions are better inlined (based on size and/or call count).
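As a rough sketch of the kind of function an ahead-of-time compiler will usually inline when optimisation is enabled (the names `square` and `sum_of_squares` are made up for this example, and whether inlining actually happens depends on the compiler and flags):

```cpp
#include <cassert>
#include <cstddef>

// Tiny and called in a hot loop: a typical inlining candidate.
static inline int square(int x) { return x * x; }

// Once square() is inlined, the loop body is effectively
// `total += values[i] * values[i]`, with no call overhead.
int sum_of_squares(const int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += square(values[i]);
    }
    return total;
}
```

The compiler makes this decision once, at build time, so it can afford a more expensive analysis than a JIT that has to pay for it while the program is running.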
Now you don't tend to notice much performance difference because modern CPUs are mostly sitting idle with cycles to spare, so the slower Java/.NET code still appears to run roughly as fast. But with server-side code that uses the hardware heavily - like this database system - you really are going to notice the difference.
There's one killer way to know if Java/.NET code is in fact slower than C++ native code: see whether the companies that produce it decide to create a natively compiled version of their language. Microsoft has finally decided to do exactly that - .NET Native builds .NET source using the existing C++ backend technology to produce native binaries. Microsoft says this makes their .NET programs run 30% faster. That is still not as fast as C++, due to design limitations such as greater memory usage than a C++ program, but it is faster than old-style bytecode .NET.
The proof that C/C++ is faster than Java/.NET doesn't get more damning than that.