Ever wonder why E = mc^2? It has always bugged me why it was
so. Hardly any college introductory physics courses go into why. In fact, hardly any college courses derive the
formula; they just assume it is correct. I hardly ever accept anything at face value: I like running experiments, confirming
observations, compiling source code, and deriving formulas. By doing so, I get a better understanding of the stuff I use.
That is a worthwhile endeavor.
Now I finally figured out why E = mc^2, all by hand. It doesn't take much more than what you learn after a single year of
college physics courses (Physics I + II) and a year and a half of calculus (Calc I + II + III). I highly encourage
anyone who has taken those classes to try it out. It is very enlightening!
The Speed of Light Is a Constant
To start, you have to accept the following axiom.
The laws of the universe are valid in all inertial reference frames.
That seems super obvious, but there are interesting consequences. You see, one law of the universe is that the strength of electric fields in empty space is set by a certain constant. This is
represented by the variable e - AKA the "Permittivity of free space" - and the higher e is, the weaker
the electric field.
Another property of the universe is the strength of
magnetic fields, represented by the variable u - AKA the "Permeability of free space". This term is mostly subsumed into the permittivity constant, since magnetic fields are created by
changing electric fields, so you don't really need to consider it separately. This magnetic constant is essentially a geometric factor (4*pi) that
is conventionally multiplied by 10^-7 for practical reasons.
The reason this is important is that light is a wave. If you solve Maxwell's equations for an electromagnetic field (in
space), you find that the speed of light only
depends on the permittivity of space, e, and the strength of magnetic fields in space. In fact, the final result
describing the speed of light is simple:
c^2 = 1/(e*u)
- c
- Speed of light
- e
- Strength of Electric fields (Permittivity)
- u
- Strength of Magnetic fields (Permeability)
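As a quick sanity check, the formula above can be evaluated numerically. This is a minimal sketch, assuming the commonly quoted SI values for e and u (they are not given in the text) and the standard conversion of 1609.344 metres per mile:

```python
import math

# Assumed SI values; these numbers are not part of the original text.
EPSILON_0 = 8.8541878128e-12    # e: permittivity of free space, F/m
MU_0 = 4 * math.pi * 1e-7       # u: permeability of free space, H/m (4*pi * 10^-7)

def speed_of_light(e, u):
    """Return c from the relation c^2 = 1/(e*u)."""
    return 1.0 / math.sqrt(e * u)

c_m_per_s = speed_of_light(EPSILON_0, MU_0)
c_miles_per_s = c_m_per_s / 1609.344  # metres per international mile

print(f"c is about {c_m_per_s:.4e} m/s, or {c_miles_per_s:.1f} miles/s")
```

Nothing about light itself is fed into the calculation: only the electric and magnetic constants of empty space.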
Such a simple result is rarely a fluke. It also makes little
sense. For many years, scientists tried to disprove this result with many experiments. Their skepticism was
based on the prevailing theory of the day: that light is not only a wave but a particle too. It is like Jello: while hot it looks like a liquid
and waves can be seen. While cold, Jello looks like a solid and can hold its shape. Light is similar - in certain
circumstances it acts like a wave and in others it acts like a particle. A speed of light fixed by constants of the universe, though, appears to violate the principle of relativity.
The Principle of Relativity
Relativity is a basic observation. Say you are in a bus, and you walk from one end to the other. You don't feel like you
are traveling very fast... only a few miles per hour. And to the rest of the people in the bus, you are moving that slowly.
However, to people on the street, you are moving very fast. Your speed, to them, is your velocity (relative to the people on
the bus) plus the velocity of the bus (relative to the people on the street). Take this example:
Look at this picture (from this page).
- O
- Represents the "stationary" reference frame with respect to us (you and me).
- O'
- Represents the "moving" reference frame wrt O (and us).
- v
- Velocity of O' relative to O.
- x, y, z
- Coordinates according to O.
- x', y', z'
- Coordinates according to O'.
- t
- Time in O (not shown in diagram) since some start event.
- t'
- Time in O' (same as O).
We are only going to consider one dimension, so y, z, y', z' can be ignored for now. Time
(t) and accelerations are the same in each frame (O and O') if v is a constant - i.e. O'
doesn't accelerate and moves with a constant speed in a constant direction (wrt O). Say we have a point called
P that sits at coordinate x' in O' and is not moving in that frame. Relative to O it is moving with speed v
(i.e. the same as O'). To figure out what coordinate P has in O - called x - you use:
x = x' + v*t'
y = y'
z = z'
t = t'
Velocity times time is the distance O' moved, and x' is the distance P is in O', so added
together you get the distance P is from O.
From the vantage point of O', things are just the opposite. You come up with the following equations, which can be
derived from the equations above:
x' = x - v*t
y' = y
z' = z
t' = t
Just use algebra to work those out. That second set (and the following set) seems almost trivial, but there are
important effects later. There is one more set of equations to note:
ux = ux' + v
uy = uy'
uz = uz'
These are the velocities of a particle at the point if it were moving instead of stationary wrt O'. We used
u to differentiate it from the speed of the reference frame, v.
Those are the Galilean Transformation Equations and are the main result of the Principle of Relativity. See if you can
understand that before going on. It is pretty standard and makes sense if you think about it.
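The Galilean equations are simple enough to try out directly. Here is a small sketch of the one-dimensional case; the function names and the bus numbers are made up for illustration:

```python
def galilean_to_O(x_prime, t_prime, v):
    """Coordinates in O from coordinates in O': x = x' + v*t', t = t'."""
    return x_prime + v * t_prime, t_prime

def galilean_to_O_prime(x, t, v):
    """Coordinates in O' from coordinates in O: x' = x - v*t, t' = t."""
    return x - v * t, t

def galilean_velocity_to_O(ux_prime, v):
    """x-velocity in O from x-velocity in O': ux = ux' + v."""
    return ux_prime + v

# Hypothetical bus example: the bus (O') moves at 20.0 relative to the street (O).
v_bus = 20.0
x, t = galilean_to_O(3.0, 2.0, v_bus)            # a point 3.0 from the back of the bus, at t' = 2.0
x_back, t_back = galilean_to_O_prime(x, t, v_bus)
walker_speed = galilean_velocity_to_O(1.5, v_bus)  # someone walking at 1.5 inside the bus

print(x, x_back, walker_speed)
```

Going to O and back again returns the original coordinate, which is the "just use algebra" step in code form.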
How the Speed of Light Mucks Everything Up
The speed of light depends on the strength of electric fields in space, as shown above. But in two reference frames, the
speeds observed have to be different according to relativity. If c is the speed of light, and it travels only in the
x-direction:
c = c' + v (Not true as explained below!)
Thus, each frame must see a different speed of a light beam. But Maxwell says the speed is a property of the universe (as I
said before), so c = c'. One of them is wrong. After lots of experiments, it seems Galileo's Relativity is
WRONG! The speed of light, no matter how it was measured, was the same for all frames of reference. So we have the
following axiom:
The speed of light is 186,282.397 miles/second for ALL reference frames.
So how do we fix Relativity, since it seems to work in most cases? There must be a correction factor that has to be added
in. Let's call that factor gamma, or g. That means the transformation equations will look something like:
x = g*(x' + v*t')
x' = g*(x - v*t)
Those are the same equations as above, just with the correction factor added. But what is the correction factor? Let's
perform an experiment to find out!
Now, even though O' is moving, let's start it at the same place as O. So this start occurs where t =
t' = 0. At this time, a light pulse is emitted from the origin of the frames (remember, they are at the same place
right now) and moves in the x-axis direction. According to O it moves a distance of:
x = c*t
But according to O' it moves a distance of (note c'=c):
x' = c*t'
Plugging these into the equations above:
c*t = g*(c*t' + v*t')
c*t' = g*(c*t - v*t)
Factoring out t' and t:
c*t = g*(c + v)*t'
c*t' = g*(c - v)*t
This is a system of two equations. Solving one for t' (say the second one) and substituting into the first one can
help you solve for g:
t' = g*(c - v)*t/c
and
c*t = g*(c + v)*t'
c*t = g*(c + v)*g*(c - v)*t/c
c = g^2*(c + v)*(c - v)/c
c^2 = g^2*(c^2 - v^2)
1 = g^2*(1 - v^2/c^2)
g^2 = 1 / (1 - v^2/c^2)
g = 1 / (1 - v^2/c^2)^.5
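That is what g actually is. For small values of v, it is about one, so the Galilean equations still work at everyday speeds. But as v approaches the speed of light, g
grows to infinity, and for v faster than light it isn't even a real number! That was an early indication you couldn't travel faster than light.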
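To get a feel for the correction factor, here is a short sketch that evaluates g = 1 / (1 - v^2/c^2)^.5 at a few speeds. The value of c in metres per second is an assumed SI value, not taken from the text:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(v, c=C):
    """Correction factor g = 1 / (1 - v^2/c^2)^.5 for a frame speed v below c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

# Print g for speeds given as fractions of c.
for fraction in (0.0, 0.001, 0.1, 0.5, 0.9, 0.99, 0.999999):
    print(f"v = {fraction} c  ->  g = {gamma(fraction * C):.6f}")
```

At everyday speeds g is indistinguishable from one, and it only climbs steeply once v is a large fraction of c.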
Another result is that, since the space direction (x) changes, the time direction must also change. This makes sense,
since speed is distance over time. If the speed is constant, but the space dimension changes, then time should change too. You can
solve this by using the two equations above again:
x = g*(x' + v*t')
x' = g*(x - v*t)
But because we know what g is, we can solve for t now:
x = g*(x' + v*t')
and
x' = g*(x - v*t)
x' = g*(g*(x' + v*t') - v*t)
x'/g = g*(x' + v*t') - v*t
v*t = g*(x' + v*t') - x'/g
t = g*(x'/v + t') - x'/(v*g)
t = g*(x'/v + t') - g*x'/(v*g^2)
t = g*(x'/v + t' - x'/(v*g^2))
t = g*(t' + x'/v*(1 - (1 - v^2/c^2)))
t = g*(t' + x'/v*v^2/c^2)
t = g*(t' + v*x'/c^2)
Together with the distance-equation, these form the new relativity equations, called the Lorentz Transformations:
x = g*(x' + v*t')
y = y'
z = z'
t = g*(t' + v*x'/c^2)
g = 1 / (1 - v^2/c^2)^.5
And from the other point-of-view:
x' = g*(x - v*t)
y' = y
z' = z
t' = g*(t - v*x/c^2)
g = 1 / (1 - v^2/c^2)^.5
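The two sets of Lorentz Transformations can be checked against each other numerically. This is a minimal sketch for the x and t equations only; the function names are made up, and c in metres per second is an assumed SI value:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(v):
    """Correction factor g = 1 / (1 - v^2/c^2)^.5."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def lorentz_to_O(x_prime, t_prime, v):
    """x = g*(x' + v*t'), t = g*(t' + v*x'/c^2)."""
    g = gamma(v)
    return g * (x_prime + v * t_prime), g * (t_prime + v * x_prime / C**2)

def lorentz_to_O_prime(x, t, v):
    """x' = g*(x - v*t), t' = g*(t - v*x/c^2)."""
    g = gamma(v)
    return g * (x - v * t), g * (t - v * x / C**2)

v = 0.6 * C          # hypothetical frame speed
t = 1.0              # one second after the light pulse is emitted, as measured in O
x = C * t            # position of the light pulse in O

x_prime, t_prime = lorentz_to_O_prime(x, t, v)
pulse_speed_in_O_prime = x_prime / t_prime
x_back, t_back = lorentz_to_O(x_prime, t_prime, v)

print(pulse_speed_in_O_prime / C)   # ratio of the pulse speed seen in O' to c
print(x_back / x, t_back / t)       # round trip back to O
```

The light pulse has the same speed in both frames even though x' and t' each differ from x and t, which is exactly the property g was chosen to guarantee.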
Lorentz Velocity Transformations
But wait: there's more! Velocity is the derivative of distance with respect to time. Time and distance are distorted
between the two reference frames, so velocities measured between the two should also be distorted. Say a particle moves with
a velocity u (components ux uy uz) in O, and say that it moves with a velocity u'
(components ux' uy' uz') in O'. It is the same particle looked at from two vantage points
O and O'. How is the velocity in O' related to the velocity in O? First note the
definitions:
ux = dx/dt
uy = dy/dt
uz = dz/dt
and
ux' = dx'/dt'
uy' = dy'/dt'
uz' = dz'/dt'
Let's begin by looking at the x-direction velocities. Note the following identity from the chain rule:
dx/dt' = dx/dt * dt/dt'
or
dx/dt = (dx/dt') / (dt/dt')
ux = (dx/dt') / (dt/dt')
We want the derivatives of the variables in O wrt variables in O', since that's how the Lorentz transforms
are defined. The velocity ux is dx/dt of course. Now taking the derivatives of the transforms and
plugging them into the equation yields:
dx/dt' = d/dt' (g*(x' + v*t'))
dx/dt' = g * d/dt' (x' + v*t')
dx/dt' = g * (dx'/dt' + v*dt'/dt')
dx/dt' = g * (ux' + v)
and
dt/dt' = d/dt' (g*(t' + v*x'/c^2))
dt/dt' = g * d/dt' (t' + v*x'/c^2)
dt/dt' = g * (dt'/dt' + v*dx'/dt'/c^2)
dt/dt' = g * (1 + v*ux'/c^2)
Therefore
ux = (dx/dt') / (dt/dt')
ux = (ux' + v) / (1 + v*ux'/c^2)
Compare with the Galilean result, ux = (ux' + v). There is a correction term, and it is caused by
the time distortion of Relativity. What about y-direction and z-direction velocities? Well, the two will have the same
form, since no direction perpendicular to the direction of motion (the x-direction) is special compared to any other
perpendicular direction. (Can you see why?) So let's solve for the y-direction, and the
z-direction follows the same logic:
dy/dt' = dy/dt * dt/dt'
or
dy/dt = (dy/dt') / (dt/dt')
uy = (dy/dt') / (dt/dt')
also
dy/dt' = d/dt' (y')
dy/dt' = dy'/dt'
dy/dt' = uy'
and remember
dt/dt' = g * (1 + v*ux'/c^2)
Therefore
uy = uy' / g / (1 + v*ux'/c^2)
And in the z-direction
uz = uz' / g / (1 + v*ux'/c^2)
It is interesting that the changes to the Galilean result in the y/z-directions depend on the distortion from the
x-direction (and the ux' velocity). This is completely counter-intuitive at first glance, but makes sense after
thinking about it. The distortion is from movement in the x-direction of the O' frame, so that is what the change
depends on. In summary (from both points of view):
ux = (ux' + v) / (1 + v*ux'/c^2)
uy = uy' / g / (1 + v*ux'/c^2)
uz = uz' / g / (1 + v*ux'/c^2)
and
ux' = (ux - v) / (1 - v*ux/c^2)
uy' = uy / g / (1 - v*ux/c^2)
uz' = uz / g / (1 - v*ux/c^2)
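A few test values make the velocity transformations more concrete. The sketch below implements the O' to O direction; the function name and the sample speeds are made up, and c in metres per second is an assumed SI value:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(v):
    """Correction factor g = 1 / (1 - v^2/c^2)^.5."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def velocity_to_O(ux_prime, uy_prime, uz_prime, v):
    """Velocity components in O from the components measured in O'."""
    g = gamma(v)
    denom = 1.0 + v * ux_prime / C**2
    return (ux_prime + v) / denom, uy_prime / g / denom, uz_prime / g / denom

# A particle moving at 0.5c inside a frame that itself moves at 0.5c.
ux, uy, uz = velocity_to_O(0.5 * C, 0.0, 0.0, 0.5 * C)

# A light beam along x' in a frame moving at 0.5c.
light_x = velocity_to_O(C, 0.0, 0.0, 0.5 * C)

# A light beam along y' in a frame moving at 0.6c: it picks up an x-component in O.
light_y = velocity_to_O(0.0, C, 0.0, 0.6 * C)
light_y_speed = math.hypot(light_y[0], light_y[1])

print(ux / C, light_x[0] / C, light_y_speed / C)
```

Galileo would add 0.5c and 0.5c to get c, but the corrected formula stays below c. The light beams keep the speed c in O, even the one whose direction changes.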
Momentum changes
Momentum depends heavily on velocity. So does it change too? Let's perform an experiment and find out! :-) Here's
the skinny:
Look at this picture in the O frame and this picture in the O' frame from this page.
- a
- Ball thrown by O
- uax
- x-velocity of "ball a" wrt O before the collision
- uay
- y-velocity of "ball a" wrt O before the collision
- uax'
- x-velocity of "ball a" wrt O' before the collision
- uay'
- y-velocity of "ball a" wrt O' before the collision
- wax
- x-velocity of "ball a" wrt O after the collision
- way
- y-velocity of "ball a" wrt O after the collision
- wax'
- x-velocity of "ball a" wrt O' after the collision
- way'
- y-velocity of "ball a" wrt O' after the collision
and
- b
- Ball' thrown by O'
- ubx
- x-velocity of "ball b" wrt O before the collision
- uby
- y-velocity of "ball b" wrt O before the collision
- ubx'
- x-velocity of "ball b" wrt O' before the collision
- uby'
- y-velocity of "ball b" wrt O' before the collision
- wbx
- x-velocity of "ball b" wrt O after the collision
- wby
- y-velocity of "ball b" wrt O after the collision
- wbx'
- x-velocity of "ball b" wrt O' after the collision
- wby'
- y-velocity of "ball b" wrt O' after the collision
also
- uy
- Velocity each person measures throwing their ball in their reference frame (x-component=0)
My notation is slightly different from the picture. Note the differences!
Here is what the situation is. Say the person at O throws a baseball straight out (y direction) relative to
her. Say the person' at O' also throws a baseball' straight out (-y' direction) relative to her'. Each ball
has constant velocity (no gravity), and each person throws the ball with the same velocity as measured in their reference
frame. Well, relative to the person at O, the baseball' moves in a diagonal line.
(Think of it this way. If you throw a ball up in a car, it seems to go straight. To a person on the sidewalk, it is
moving in a diagonal. You just don't notice any horizontal direction because you are moving at the same speed in that
direction.)
Well, the velocity of the "ball a" thrown by O is:
before the collision
uax = 0
uay = uy
and after the collision
wax = 0
way = -uy
Using the classical definition of momentum, p = m * u, the change of momentum observed by O
is:
before the collision
pax = 0
pay = m * uy
and after the collision
qax = 0
qay = m * (-uy)
So the net momentum change
Pax = qax - pax = 0 - 0 = 0
Pay = qay - pay
Pay = m * (-uy) - m * uy
Pay = -2 * m * uy
Likewise for "ball b" thrown by O':
before the collision (using the velocity transforms)
ubx = v
uby = -uy / g
and after the collision
wbx = v
wby = uy / g
and the momentums:
before the collision
pbx = m * v
pby = - m * uy / g
and after the collision
qbx = m * v
qby = m * uy / g
So the net momentum change
Pbx = qbx - pbx = m * v - m * v = 0
Pby = qby - pby
Pby = m * uy / g + m * uy / g
Pby = 2 * m * uy / g
This makes no sense! The momentum from one side of the collision is not balanced by momentum on the other side:
Should be zero, but isn't:
Pynet = Pay + Pby
Pynet = -2 * m * uy + 2 * m * uy / g
Pynet = 2 * m * uy * ( 1 / g - 1)
Pynet != 0
Since the net is not zero, momentum was NOT conserved using Lorentz Transforms! To preserve the law of conservation of momentum the
definition of momentum must be changed. How should it be adjusted? Well the problems occurred when we calculated the
momentum of the particle in the O' frame:
Remember this?
Pby = 2 * m * uy / g
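The 1/g factor came from the Lorentz velocity transformations. To fix it, we adjust the definition of momentum
from:
p = m * u
To:
p = m * u / (1 - u^2/c^2)^.5
Here the correction factor is computed from the particle's own speed u in the frame doing the measuring. This hardly affects the "ball a" result: ball a only has the slow throwing speed uy in the O frame, so its factor is very nearly one (not zero). For "ball b" the factor works out to g times that of ball a, which cancels the 1/g. Plugging it
into the above result allows momentum to be conserved. Cool!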
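The collision can be replayed numerically to compare the two definitions of momentum. This is a sketch under stated assumptions: the adjusted momentum is taken to be p = m * u / (1 - u^2/c^2)^.5 with u the ball's own total speed as seen from O, and the mass and speeds are made-up values:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(speed):
    """Correction factor 1 / (1 - speed^2/c^2)^.5 for a given speed."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

m = 0.145        # hypothetical baseball mass in kg
v = 0.6 * C      # hypothetical speed of O' relative to O
uy = 0.1 * C     # hypothetical throwing speed in each thrower's own frame
g = gamma(v)

# y-velocities in the O frame, before and after the collision (from the text).
a_before, a_after = uy, -uy
b_before, b_after = -uy / g, uy / g

# Classical momentum, p = m * u: net change in the y-direction.
classical_net = m * (a_after - a_before) + m * (b_after - b_before)

# Adjusted momentum: each ball gets the factor for its own total speed in O.
g_a = gamma(uy)                       # ball a moves only in y
g_b = gamma(math.hypot(v, uy / g))    # ball b moves at v in x and uy/g in y
relativistic_net = g_a * m * (a_after - a_before) + g_b * m * (b_after - b_before)

print(classical_net / (m * uy))      # clearly non-zero
print(relativistic_net / (m * uy))   # zero up to rounding error
print(g_b / (g * g_a))               # ball b's factor is g times ball a's
```

With the classical definition the y-momentum changes do not cancel, while with the adjusted definition they do, because the factor for ball b carries exactly the extra g needed to cancel the 1/g from the velocity transformation.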
Proving E = m * c ^ 2
In summary so far... The effects of relativity mean measurements between O and O' are different. The
changes include the Lorentz Transforms of position and velocity. The law of momentum was adjusted:
p = m * u / (1 - u^2/c^2)^.5
Also note the original definitions of velocity, force and energy (work):
v = dx/dt
F = dp/dt
E = integral of F dx
The first step is to get a more specific equation for force. Momentum changed, so the force is not just F=ma. Let's put a particle right at the origin of O' and see what happens. This is a modification of Young's derivation of Kinetic Energy. By evaluating the derivative you find:
Note: u=v
F = dp/dt
F = d/dt (m * v / (1 - v^2 / c^2)^.5)
F = m * v * d/dt((1 - v^2 / c^2)^-.5) + m / (1 - v^2 / c^2)^.5 * dv/dt
F = m * v * d/dt((1 - v^2 / c^2)^-.5) + m * a / (1 - v^2 / c^2)^.5
F = m * v * (-1/2) * (1 - v^2 / c^2)^(-3/2) * d/dt(1 - v^2 / c^2) + m * a / (1 - v^2 / c^2)^.5
F = m * v * g^3 * (-1/2) * (-2 * v / c^2) * dv/dt + m * a * g
F = m * a * g * (1 + g^2 * v^2 / c^2)
F = m * a * g * (1 + v^2 / c^2 / (1 - v^2 / c^2))
F = m * a * g * (1 - v^2 / c^2 + v^2 / c^2) / (1 - v^2 / c^2)
F = m * a * g^3 * (1 - v^2 / c^2 + v^2 / c^2)
F = m * a * g^3
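Since F = dp/dt = (dp/dv) * a, the result F = m * a * g^3 is the same as saying dp/dv = m * g^3. That can be checked with a finite difference. This is a rough numerical sketch, not part of the derivation; the mass, speed and step size are made-up values:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(v):
    """Correction factor g = 1 / (1 - v^2/c^2)^.5."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def momentum(m, v):
    """Adjusted momentum p = m * v / (1 - v^2/c^2)^.5."""
    return m * v * gamma(v)

def dp_dv_numeric(m, v, h=1000.0):
    """Central-difference estimate of dp/dv, with a step h in m/s."""
    return (momentum(m, v + h) - momentum(m, v - h)) / (2.0 * h)

m = 1.0          # hypothetical mass in kg
v = 0.6 * C      # hypothetical particle speed

numeric = dp_dv_numeric(m, v)
analytic = m * gamma(v) ** 3   # the g^3 factor from the derivation

print(numeric, analytic)
```

The two numbers agree closely, which is a useful guard against sign or exponent slips in the chain rule steps.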
Now going back to the definition of energy:
E = integral of F dx
F = m * a * g^3
E = integral of m * a * g^3 dx
E = integral of m * g^3 * dv/dt dx
E = integral of m * g^3 * dx/dt dv
E = integral of m * g^3 * v dv
E = integral of m * v / (1 - v^2 / c^2)^(3/2) dv
Note (from integral table):
integral of du / (a^2 - u^2)^(3/2) = u / a^2 / (a^2 - u^2)^(1/2) + C
Using integration by parts:
J = m * v
dJ = m dv
dK = dv / (1 - v^2 / c^2)^(3/2)
K = v / (1 - v^2 / c^2)^(1/2)
E = J * K - integral of K dJ
E = m * v^2 * g - integral of m * v / (1 - v^2 / c^2)^(1/2) dv
Note (from integral table):
integral of du / (a^2 - u^2)^(1/2) = arcsin(u / a) + C
Using integration by parts again:
J = m * v
dJ = m dv
dK = dv / (1 - v^2 / c^2)^(1/2)
K = c * arcsin(v / c)
E = m * v^2 * g - J * K + integral of K dJ
E = m * v^2 * g - m * v * c * arcsin(v / c) + integral of m * c * arcsin(v / c) dv
E = m * v^2 * g - m * v * c * arcsin(v / c) + m * c^2 * integral of arcsin(v / c) d(v / c)
Note (from integral table):
integral of arcsin(u) du = u * arcsin(u) + (1 - u^2)^.5 + C
E = m * v^2 * g - m * v * c * arcsin(v / c) + m * c^2 * (v / c) * arcsin(v / c) + m * c^2 * (1 - v^2 / c^2)^(1/2)
E = m * v^2 * g - m * v * c * arcsin(v / c) + m * v * c * arcsin(v / c) + m * c^2 * (1 - v^2 / c^2) * g
E = m * v^2 * g + m * c^2 * g - m * v^2 * g
E = m * c^2 * g
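The antiderivative m * c^2 * g can be checked by integrating m * g^3 * v dv numerically. Because the table integrals drop the constant of integration, the sketch below compares a definite integral, from rest up to a final speed, with the difference of m * c^2 * g between those two speeds. The mass, final speed and step count are made-up values, and Simpson's rule is just one convenient way to do the sum:

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def gamma(v):
    """Correction factor g = 1 / (1 - v^2/c^2)^.5."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def integrand(m, v):
    """m * g^3 * v, the quantity integrated over dv in the text."""
    return m * gamma(v) ** 3 * v

def work_from_rest(m, v_final, steps=10_000):
    """Simpson's rule estimate of the integral from 0 to v_final (steps must be even)."""
    h = v_final / steps
    total = integrand(m, 0.0) + integrand(m, v_final)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(m, i * h)
    return total * h / 3.0

m = 1.0          # hypothetical mass in kg
v = 0.6 * C      # hypothetical final speed

numeric = work_from_rest(m, v)
closed_form = m * C**2 * gamma(v) - m * C**2 * gamma(0.0)

print(numeric / closed_form)
```

The ratio comes out at one to within numerical error, so the long chain of integrations by parts lands on the right function of v.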
Almost there! When the velocity is zero, g = 1. Therefore, when the object is at rest, it still has some energy. This is called rest energy. So what is the equation for rest energy?