I recommend looking at some of the Multics features, particularly the use of segmentation and paging in place of a conventional file system.
See http://www.multicians.org/features.html for an introduction to Multics features, along with references that explain how they really work. Multics was written to run on an SMP system, so a system with one processor was the special case.
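To make the segmentation-plus-paging idea a bit more concrete, here is a rough sketch of how a (segment number, offset) pair could be turned into a physical address. This is not how Multics actually implements it; the page size, the table layout, and the names `translate`, `SegmentFault`, and `PageFault` are all made up for illustration.

```python
PAGE_SIZE = 1024  # assumed page size, chosen only for this example


class SegmentFault(Exception):
    """Raised when the segment is unknown or the offset is out of bounds."""


class PageFault(Exception):
    """Raised when the page is not resident; an OS would load it here."""


def translate(segment_table, seg_no, offset):
    """Map (segment number, offset) to a physical address."""
    seg = segment_table.get(seg_no)
    if seg is None or offset >= seg["length"]:
        raise SegmentFault((seg_no, offset))
    # Split the offset within the segment into a page and an offset in it.
    page_no, page_offset = divmod(offset, PAGE_SIZE)
    frame = seg["page_table"].get(page_no)
    if frame is None:
        raise PageFault((seg_no, page_no))
    return frame * PAGE_SIZE + page_offset


# Hypothetical segment 3: 3000 units long, pages 0 and 2 resident.
segment_table = {
    3: {"length": 3000, "page_table": {0: 7, 2: 4}},
}
```

With this table, `translate(segment_table, 3, 10)` lands in frame 7, an offset inside page 1 raises `PageFault`, and an offset past the segment length raises `SegmentFault`. The point is that a program only ever names a segment and an offset; where the data physically lives is the system's business.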
I would also suggest looking into security, again both the permissions and perhaps the ring-based permission levels. The more the CPU can help with building in security from the start, the easier it will be in the long run.
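As a very simplified illustration of ring-based permission levels, the sketch below treats a lower ring number as more privileged and gives each segment a highest ring allowed to write and a highest ring allowed to read. Real ring mechanisms (including Multics ring brackets) are more involved; the `Segment` fields and the `may_access` function are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    write_limit: int  # highest ring number still allowed to write
    read_limit: int   # highest ring number still allowed to read


def may_access(ring, segment, mode):
    """Return True if code running in `ring` may access `segment`.

    Ring 0 is the most privileged; larger numbers are less privileged.
    """
    if mode == "write":
        return ring <= segment.write_limit
    if mode == "read":
        return ring <= segment.read_limit
    raise ValueError(f"unknown access mode: {mode}")


# Hypothetical segment: only rings 0-1 may write, rings 0-4 may read.
config = Segment("system_config", write_limit=1, read_limit=4)
```

Here a user program in ring 4 can read `config` but not write it, while supervisor code in ring 1 can do both. If the CPU performs this comparison on every memory reference, software cannot skip the check.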
You might also want to look into the NS32000 (NS32xxx) series from National Semiconductor in the 1980s (not sure exactly when). The parts all shared the same instruction set, but you could get the chip with an 8-bit, 16-bit, or 32-bit external memory/data bus. At this stage of CPU design, I'd suggest detaching the instruction set from the 'bit' size, so that external access to memory/data is independent of it. With a 32-bit data/address bus it may take two transfers to handle a 64-bit data item, but the hardware can do that without the software having to know about it. All the software sees is a 64-bit (or maybe later, with added instructions, a 128-bit) address/data path, with the actual path width hidden by the hardware. A 64-bit program might run a little slower on a 32-bit bus, but it will run.
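The idea of hiding the bus width from software can be modelled in a few lines. This is only a toy simulation, not a description of how the NS32000 parts work internally: the `Bus` class, `load64`, and the little-endian byte order are assumptions for the example, and alignment is ignored.

```python
class Bus:
    """Hypothetical memory bus that moves at most `width_bits` per transfer."""

    def __init__(self, width_bits, memory):
        self.width_bytes = width_bits // 8
        self.memory = memory
        self.transfers = 0  # number of physical bus transfers performed

    def read(self, addr, nbytes):
        """Fetch nbytes starting at addr, split into bus-width transfers."""
        data = bytearray()
        for offset in range(0, nbytes, self.width_bytes):
            chunk = min(self.width_bytes, nbytes - offset)
            data += self.memory[addr + offset:addr + offset + chunk]
            self.transfers += 1
        return bytes(data)


def load64(bus, addr):
    """What the software sees: a single 64-bit load (little-endian here)."""
    return int.from_bytes(bus.read(addr, 8), "little")


memory = bytearray(16)
memory[0:8] = (0x1122334455667788).to_bytes(8, "little")

# Same "program" on four bus widths: same value, different transfer count.
results = {}
for width in (8, 16, 32, 64):
    bus = Bus(width, memory)
    results[width] = (load64(bus, 0), bus.transfers)
```

Every bus width returns the same 64-bit value from `load64`; only the number of transfers changes (8, 4, 2, and 1 respectively). That is the "runs a little slower, but it will run" trade-off.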