One of the problems with increasing clock speed is gate capacitance: the RC charging curve means the switching FETs spend part of every transition in the linear region, so power dissipation goes up with clock speed. This is why a decrease in process size has typically yielded a corresponding decrease in power dissipation at a given clock speed.
If you make the capacitance smaller, you can increase the switching speed. Gate capacitance depends on gate area, so it decreases with the square of the feature size. Resistance depends on cross-sectional area, so, assuming the feature depth doesn't change, it increases only in inverse proportion to feature width. The net effect is that the RC time constant shrinks roughly in proportion to feature size.
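For anyone who wants to play with that scaling argument, here is a minimal Python sketch. It assumes only the simplified model above (capacitance goes with the square of the feature size, resistance with its reciprocal, depth fixed); the function name and the normalized starting values are mine, purely for illustration.

```python
def scaled_rc(scale, r0=1.0, c0=1.0):
    """RC time constant after shrinking features by `scale` (0.5 = half size).

    Simplified model from the post, not a real process model:
      capacitance ~ gate area       -> c0 * scale**2
      resistance  ~ 1/feature width -> r0 / scale   (depth unchanged)
    """
    c = c0 * scale ** 2
    r = r0 / scale
    return r * c


if __name__ == "__main__":
    for scale in (1.0, 0.7, 0.5):
        print(f"feature scale {scale:.2f} -> relative RC {scaled_rc(scale):.2f}")
```

Under these assumptions, halving the feature size halves the RC time constant; real processes have plenty of other effects that this ignores.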
Another poster has already mentioned asynchronous designs, so I'll pass on that particular nuance.
But clock propagation is a serious issue, and I can see a vacuum transistor improving this considerably.
Now, figuring out how far a wavefront will propagate in some period of time isn't too hard.
The velocity factor (VF) of a dielectric is the reciprocal of the square root of its relative permittivity. Undoped silicon has a relative permittivity of 11.68, so its VF is about 30% of c. Silicon dioxide, as used for most of the insulation on the typical MOSFET design, has a relative permittivity of 3.9 and thus a VF of about 51%. On a stripline laid on silicon dioxide (silica glass) the velocity of propagation is about 153 million meters per second, or 153 meters per microsecond, or 153 millimeters per nanosecond, or 153 microns per picosecond. For comparison, 153 microns is a bit larger than the cladding on a typical fiber optic strand: most have a cladding diameter of 125 microns (OM1 multimode is 62.5 micron core/125 micron cladding, OM4 is 50 micron core/125 micron cladding, and single-mode is 8 micron core/125 micron cladding). That's best case propagation time.
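If you want to check those velocity numbers yourself, a quick Python sketch follows. The function names are mine, and it models nothing beyond the ideal 1/sqrt(relative permittivity) rule. Without rounding the velocity factor to 51%, silicon dioxide comes out nearer 152 microns per picosecond than 153, which is close enough for this kind of estimate.

```python
import math

C_M_PER_S = 299_792_458  # speed of light in vacuum, meters per second


def velocity_factor(rel_permittivity):
    """Ideal velocity factor of a dielectric: 1 / sqrt(relative permittivity)."""
    return 1.0 / math.sqrt(rel_permittivity)


def speed_um_per_ps(rel_permittivity):
    """Best-case propagation speed in microns per picosecond."""
    # 1 m/s = 1e6 microns per 1e12 picoseconds = 1e-6 microns/ps
    return C_M_PER_S * velocity_factor(rel_permittivity) * 1e-6


if __name__ == "__main__":
    for name, eps_r in (("undoped silicon", 11.68), ("silicon dioxide", 3.9)):
        print(f"{name}: VF = {velocity_factor(eps_r):.3f}, "
              f"speed = {speed_um_per_ps(eps_r):.1f} um/ps")
```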
Now, to see how this translates to something of today, at least one of the models of the latest Haswell-DT Core i7 chips has a die size of 177 square millimeters. The chip is not square, and seems to be about a 4:1 rectangle in photos, which would yield about a 6.5 mm by 27.25 mm die (yes, I know that gives 177.125 square millimeters; close enough).
Now, if a clock signal needs to go straight across the narrow dimension, it will take about 42.5 picoseconds, assuming transmission across silicon dioxide alone. Propagation in the long direction would take about 178 picoseconds, with the same assumption. The published top speed of this processor is, at the time of this writing, about 4.5GHz (I know it's a bit higher, but that's a moving target). That is a 222 picosecond clock period: easily doable in the short dimension, a bit more difficult in the long dimension, and probably already requiring some asynchronous elements and delay compensation. If you limit solely on clock propagation time, and can tolerate a slip of a full clock cycle, the long dimension will give you a limit of a bit over 5.5GHz; the short dimension will similarly give you a limit of about 23.5GHz.
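Here is the same arithmetic as a small Python sketch, in case anyone wants to plug in other die sizes or clock speeds. It uses the 153 microns per picosecond figure and the 6.5 mm by 27.25 mm die estimate from above; the function names are mine, and the model is the same straight-line, propagation-only simplification.

```python
SPEED_UM_PER_PS = 153.0  # best-case speed over silicon dioxide, as used in the post


def crossing_time_ps(distance_mm, speed_um_per_ps=SPEED_UM_PER_PS):
    """Time for a wavefront to cover distance_mm in a straight line."""
    return distance_mm * 1000.0 / speed_um_per_ps  # mm -> microns


def clock_period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency in GHz."""
    return 1000.0 / freq_ghz


def max_clock_ghz(crossing_ps):
    """Clock limit if one full period of slip is tolerated (period == crossing time)."""
    return 1000.0 / crossing_ps


if __name__ == "__main__":
    period = clock_period_ps(4.5)
    print(f"clock period at 4.5 GHz: {period:.0f} ps")
    for label, mm in (("short side", 6.5), ("long side", 27.25)):
        t = crossing_time_ps(mm)
        print(f"{label} ({mm} mm): {t:.1f} ps, "
              f"{100 * t / period:.0f}% of the period, "
              f"limit {max_clock_ghz(t):.1f} GHz")
```

Swapping in a different speed (say, the unrounded silicon dioxide figure) only moves the results by a percent or so, so the conclusions don't change.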
That's drastically oversimplified; each gate has its own propagation delay that must be figured in, and there are four cores (which makes it pretty understandable why the chip would have a 4:1 die dimension ratio, no?). A 20% clock delay factor will allow, with care, a good chance for synchronous operation (42.5 is pretty close to 20% of 222), but that's assuming straight clock traces (and they are not just straight across the chip).
Food for thought.