I am the author of the CHERI C/C++ implementation (and a bunch of the instructions and key parts of the abstract model). A few corrections:
> does pointer bounds and 'const' qualifier checking and some protection of C++ private member variables in hardware
We don't enforce const in hardware, because it broke too much code. For example, the C standard library function strstr takes a const char* argument and returns a char* derived from it. Our hardware doesn't permit you to add permissions to a pointer (or increase its bounds), so if const removed store permission, the returned pointer would lack it too. Instead, we add __input and __output qualifiers that allow this to work: if you cast a pointer to an __input-qualified one, then the compiler removes store permission, and no pointer derived from that pointer can be used to modify an object.
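To make the strstr problem concrete, here is a minimal sketch in plain standard C (no CHERI extensions). The names find_byte and replace_first are hypothetical helpers invented for illustration; find_byte simply has the same shape as strstr, taking a const pointer and returning a non-const pointer derived from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper with the same shape as strstr: takes a
 * const-qualified pointer, returns a non-const pointer derived from it. */
char *find_byte(const char *s, char c)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == c) {
            /* Legal C. If const stripped store permission in hardware,
             * this cast could not give it back. */
            return (char *)s;
        }
    }
    return NULL;
}

/* The caller owns a writable buffer, passes it through the const
 * parameter, then stores through the returned pointer. */
int replace_first(char *buf, char c, char replacement)
{
    char *hit = find_byte(buf, c);
    if (hit == NULL)
        return 0;
    *hit = replacement; /* would fault if hit had lost store permission */
    return 1;
}
```

Calling replace_first on a writable array works on any conventional C implementation. Under hardware-enforced const, the store in replace_first is the one that would fail, even though the caller's original pointer was writable.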
> sufficient to replace the security properties of virtual memory (with the caveat you can't unmap a page without killing the process)
CHERI doesn't require that you throw away the MMU and some things work a lot better when you compose the two. The MMU is good for coarse-grained isolation, CHERI is good for fine-grained sharing. You can use the MMU to revoke objects (unmap the pages), without having to find and invalidate all pointers (which we can do - a couple of my students and I have added accurate garbage collection to C).
> It's implemented in FPGA and only requires a few gates over a standard MIPS and is working with comparable performance without any exotic kinds of cache memory.
Note that we do rely on tagged memory, though we are able to efficiently implement this in commodity DRAM via a tag cache (some of my colleagues have done some great work on improving the efficiency here). We need one tag bit per 128 bits of memory (the tag bit tells you whether an aligned 128-bit value contains a pointer or normal data), so 256 tag bits per 4 KiB page. You can read 256 bits from a single DRAM read, so anything with vaguely good cache locality rarely needs to pull the tags from DRAM.
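The tag arithmetic can be checked with a few lines of C. This is only a sketch of the bookkeeping, not how the tag cache itself is implemented; the 4 KiB page size is an assumption (it is the size that makes the 256-bit figure work out), and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    TAG_GRANULE_BITS = 128, /* one tag bit per aligned 128-bit value */
    PAGE_BYTES = 4096       /* assumption: 4 KiB pages */
};

/* Number of tag bits needed to cover a page of the given size. */
size_t tag_bits_per_page(size_t page_bytes)
{
    return page_bytes * 8 / TAG_GRANULE_BITS;
}

/* Index of the tag bit that covers a given byte offset within a page. */
size_t tag_index(size_t byte_offset)
{
    assert(byte_offset < PAGE_BYTES);
    return byte_offset / (TAG_GRANULE_BITS / 8);
}
```

With these definitions, tag_bits_per_page(PAGE_BYTES) is 256, so under the 4 KiB assumption the tags for a whole page fit in the 256 bits that a single DRAM read returns.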
> It's only "approaching" because I don't think it can solve use-after-free
We can implement an accurate garbage collector in C with CHERI. Some of my current work is attempting to push the performance of this up to where C programmers will be willing to just leave it on, because they won't be able to measure a slowdown vs malloc / free. For the Java interop work that I published at ASPLOS this year, we did very coarse-grained revocation, allowing the JVM to invalidate pointers that the native code held for longer than it claimed that it would, thus preventing any spatial or temporal memory safety violations in C code from affecting the Java heap in any way.