Just to clarify: writing real operating systems in managed environments is difficult and still a research area (see Singularity, or Java OSs like JNODE). It strikes me that what you're saying is that using a managed language to simulate part of an OS won't give the student the complete feeling of what it's like to program in an unmanaged environment. I imagine this is largely a consequence of how much you can fit into a course. When I've taught classes involving C and C++ I've also ended up teaching tools like Electric Fence and Purify. Those tools would be out of place in an OS, but you have to teach them unless your class is small enough that you can hand-debug every program that's being written. Reducing the debugging burden and the obvious pitfalls (e.g. returning the address of a local variable in C/C++) is the reason for using a managed language.
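As a tiny sketch of that last pitfall (format_greeting is a made-up function, just to show the pattern): the buffer lives on the stack, so the returned pointer dangles the moment the function returns. This is exactly the class of error a managed language rules out by construction, and that tools like Electric Fence can only hope to catch at runtime.

    #include <stdio.h>

    /* Hypothetical example: returns the address of a local variable. */
    static char *format_greeting(const char *name)
    {
        char buf[64];                      /* storage on the stack     */
        snprintf(buf, sizeof buf, "hello, %s", name);
        return buf;                        /* BUG: buf dies on return  */
    }

    int main(void)
    {
        /* Undefined behaviour: the buffer no longer exists here. */
        printf("%s\n", format_greeting("world"));
        return 0;
    }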
I think the issue here is that superficial courses aren't useful if you want someone with detailed knowledge. That's kind of a no-brainer, and it isn't specific to any particular programming language. Maybe knowledge of C/C++ is an indicator of someone who has more than just superficial knowledge, though :-)
There are real issues with OS design and the use of pointers, though:
(Real World) Scenario 1:
Imagine a micro-kernel that allows messages to contain addresses, for example a shared buffer that an application and the font server both use for drawing fonts. Now imagine that your OS needs to support 32-bit and 64-bit, big-endian (e.g. PowerPC) and little-endian (e.g. Intel) architectures. Since an application can generate four different kinds of address, you either need one mega-server that knows how to handle all four types of request, or four individual servers, each configured for one address size and endianness. Either way you are likely to end up with a large (hundreds of MBs) memory overhead.
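To make the address problem concrete, here is a minimal sketch (the message layout is hypothetical, not taken from any real micro-kernel) of a request that carries a raw address to the shared buffer. The size and byte order of that field depend entirely on the sender's architecture, which is why the font server ends up needing four decoders or four separate instances.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical message passed from an application to a font server. */
    struct draw_request {
        uint32_t  glyph;    /* which glyph to render                      */
        uintptr_t buffer;   /* raw address of the shared buffer: 4 bytes  */
                            /* from a 32-bit sender, 8 from a 64-bit one, */
                            /* laid out in the sender's endianness        */
    };

    int main(void)
    {
        /* The server has to cope with every layout a client might send. */
        printf("draw_request is %zu bytes on this architecture\n",
               sizeof(struct draw_request));
        return 0;
    }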
(Real World) Scenario 2:
Imagine a monolithic kernel where device drivers are written in C, often by people only concerned with getting their flavour of architecture to work with their hardware. If you try to use such a driver on an architecture with a different endianness or address size, what guarantees do you have that it will work? In my experience, little (unless someone else has tried it before you and fixed the problems).
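Here is a minimal sketch of the kind of assumption that bites (the device and its register layout are invented for illustration): the device exposes a 32-bit little-endian status register, and the original driver author read it with a plain load, which only happens to be correct on a little-endian host.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Simulate a little-endian device register holding 0x12345678. */
    static const uint8_t fake_reg[4] = { 0x78, 0x56, 0x34, 0x12 };

    static uint32_t read_status_naive(const void *reg)
    {
        /* Plain load: correct only when the host is also little-endian. */
        uint32_t v;
        memcpy(&v, reg, sizeof v);  /* stands in for *(volatile uint32_t *)reg */
        return v;
    }

    static uint32_t read_status_portable(const void *reg)
    {
        /* Reassemble byte by byte, independent of host endianness. */
        const uint8_t *p = reg;
        return (uint32_t)p[0]       | (uint32_t)p[1] << 8
             | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    }

    int main(void)
    {
        /* The naive read is wrong on big-endian hosts;                */
        /* the portable one prints 0x12345678 everywhere.              */
        printf("naive:    0x%08x\n", read_status_naive(fake_reg));
        printf("portable: 0x%08x\n", read_status_portable(fake_reg));
        return 0;
    }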
I believe that constraining programmers is often the right thing to do; you can design whole classes of potential bugs out of existence. The programmer should also be aware of why they are being constrained, though: higher levels of abstraction are better (by design), but you need to keep sight of the bottom of the stack and not stay superficial, unless the course is an introduction (which many are).