I agree that some kind of benchmarking system would help, maybe "FLOPS per amp-hour" or something, but your formula would lead manufacturers to game the system for higher scores, and it ignores that people want different devices for different needs.
For example, a smartphone may have a smaller screen and lower performance than a tablet without an equal gain in battery life or reduction in weight, resulting in a lower score, and yet it may still be "better" for someone who can't fit a tablet in their pocket.
Likewise, sedans and motorcycles are lighter and more fuel-efficient than larger, more powerful pickup trucks, but that just makes them suited to different markets and uses, and it wouldn't work to subject them all to a one-size-fits-all Automobile Score.
Telling customers the class of a mobile device, its processing ability, power efficiency, and battery capacity would be just as good as describing an automobile's class, engine output, MPG rating, and fuel capacity.