Desktop (and, to a lesser extent, laptop) processors use multiple pipelines to improve performance and limit stalls.
ARM chips have multiple cores, each with its own pipeline. In fact, ARM processors using the "big.LITTLE" heterogeneous design pair performance-optimized cores with power-optimized cores and move work between them depending on load and power state. Are you referring to "superscalar" execution, in which a core issues multiple instructions from one thread in the same cycle (and, in out-of-order designs, also reorders them)? Or are you referring to simultaneous multithreading (SMT), where two hardware threads, each with its own instruction stream, feed into a single set of execute units? Intel Atom uses SMT to hide stalls. Recent AMD microarchitectures take a related approach: the two cores in a "module" have their own integer execute units but share the FPU and other resources.
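To make the superscalar vs. SMT distinction concrete, here is a rough toy model in Python. It is only a sketch under loud assumptions: every instruction takes one cycle, issue is strictly in order within a thread, and the instruction format and the name `issue_cycles` are made up for illustration. It does not describe how any real Atom, AMD, or ARM core schedules work.

```python
from typing import List, Tuple

# Hypothetical toy format: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def issue_cycles(threads: List[List[Instr]], width: int) -> int:
    """Count cycles needed to issue every instruction.

    Assumptions of this toy model (not real hardware):
    - `width` issue slots per cycle, shared by all threads.
    - Each thread issues strictly in program order.
    - A result is usable only in the cycle after it is produced, so an
      instruction cannot issue in the same cycle as an earlier instruction
      from its own thread whose result it reads.
    """
    pos = [0] * len(threads)  # index of the next instruction per thread
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        slots = width
        written = [set() for _ in threads]  # registers written this cycle
        blocked = [False] * len(threads)    # thread stalled for this cycle
        progress = True
        while slots > 0 and progress:
            progress = False
            # Visit threads round-robin so they share the issue slots.
            for i, thread in enumerate(threads):
                if slots == 0:
                    break
                if blocked[i] or pos[i] >= len(thread):
                    continue
                dest, srcs = thread[pos[i]]
                if any(s in written[i] for s in srcs):
                    blocked[i] = True  # must wait for next cycle
                    continue
                written[i].add(dest)
                pos[i] += 1
                slots -= 1
                progress = True
    return cycles


# Four instructions with no dependencies between them.
independent = [("r1", ("a",)), ("r2", ("b",)), ("r3", ("c",)), ("r4", ("d",))]
# Four instructions where each one reads the previous result.
chain = [("r1", ("a",)), ("r2", ("r1",)), ("r3", ("r2",)), ("r4", ("r3",))]

print(issue_cycles([independent], 1))   # scalar core: 4
print(issue_cycles([independent], 2))   # 2-wide superscalar: 2
print(issue_cycles([chain], 2))         # dependencies waste the 2nd slot: 4
print(issue_cycles([chain, chain], 2))  # SMT: a second thread fills it: 4
```

In this model a 2-wide core helps a single thread only when neighbouring instructions are independent. With a dependent chain the second slot sits idle, and a second hardware thread can use it, so eight instructions finish in the same four cycles. That idle-slot filling is the sense in which SMT "hides stalls".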