Original site of the document: http://www.thphys.may.ie/staff/bdolan/SR/sr-LT/sr-LT.html

3rd Year Honours Mathematical Physics
Special Relativity
Lorentz Transformations

Brian Dolan

Let two inertial co-ordinate systems (ICR's), S and $S^\prime$, be in standard configuration. This means that $S^\prime$ is moving with constant velocity ${ \underline {\hbox{v}} }$ relative to S in the x-direction, as in the following figure
 
  


[Figure LT1.eps: the frames S and $S^\prime$ in standard configuration]


and that the spatial origins, O and $O^\prime$, and the three spatial axes, $({ \underline {\hbox{e}} }_1,{ \underline {\hbox{e}} }_2,{ \underline {\hbox{e}} }_3)$ and $({ \underline {\hbox{e}} }^\prime_1,{ \underline {\hbox{e}} }^\prime_2,{ \underline {\hbox{e}} }^\prime_3)$, co-incide at time $t=t^\prime=0$.

We shall convert units of time (e.g. seconds) to a length (e.g. meters) by multiplying time by the speed of light c. Thus our measure of time will be ct, which has units of length (this assumes implicitly that c is the same in both reference frames, one of the postulates of relativity). We wish to determine Cartesian co-ordinates in $S^\prime$, $(ct^\prime,x^\prime,y^\prime,z^\prime)$, as functions of Cartesian co-ordinates in S, (ct,x,y,z), using reasonable assumptions. In other words $x^\prime$ will be a function of ct, x, y and z, i.e. $x^\prime(ct,x,y,z)$, etc. It will sometimes be convenient to adopt an index notation where the four co-ordinates (ct,x,y,z) are labelled by an index a taking on four possible values 0,1,2,3 with $x^0=ct$, $x^1=x$, $x^2=y$ and $x^3=z$ so that $(ct,x,y,z)=(x^a)$. Similarly for primed co-ordinates an index notation is sometimes useful, $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$, where the prime is placed on the index so that the index itself can be used to distinguish between the two co-ordinate systems. The change from $(ct,x,y,z)=(x^a)$ to $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$ is called a co-ordinate transformation.

The derivation of the explicit form of the co-ordinate transformation proceeds in five steps:
 
 Step 1): The transformations are linear


Consider a clock moving with constant velocity, showing a time $\tau$. The path of the clock in S can be described by four functions $x^a(\tau)$. Since the clock is moving with constant velocity in S, equal increments of $\tau$ must correspond to equal increments of the co-ordinates $(x^a)$ labelling the position of the clock in S. Thus $dx^a/d\tau$ is constant and $d^2x^a/d\tau^2=0$. Since $S^\prime$ is moving with constant velocity relative to S, the clock must also be moving with constant velocity in $S^\prime$, hence the same argument implies that $dx^{{a^\prime} }/d\tau$ is constant and $d^2x^{{a^\prime}}/d\tau^2=0$. Now treating $x^{a^\prime}$ as functions of $x^a$, the chain rule for differentiation implies

$\displaystyle {dx^{{a^\prime}}\over d\tau}$ = $\displaystyle \sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}{dx^b\over d\tau}$ (1)
$\displaystyle {d^2x^{{a^\prime}}\over d\tau^2}$ = $\displaystyle \sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}{d^2x^b\over d\tau^2}+\sum^3_{b,c=0}{\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}{dx^b\over d\tau}{dx^c\over d\tau}.$ (2)


Thus $d^2x^a/d\tau^2=0$ and $d^2x^{{a^\prime}}/d\tau^2=0$ can only both be true, for every constant velocity $dx^b/d\tau$ of the clock, if ${\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}=0$, in other words the transformations must be linear in $x^a$. In mathematical symbols this means

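$\displaystyle ct^\prime$ = $\displaystyle {L^{0^\prime}}_0 ct + {L^{0^\prime}}_1 x+ {L^{0^\prime}}_2 y + {L^{0^\prime}}_3 z + C^{0^\prime}$
$\displaystyle x^\prime$ = $\displaystyle {L^{1^\prime}}_0 ct + {L^{1^\prime}}_1 x+ {L^{1^\prime}}_2 y + {L^{1^\prime}}_3 z + C^{1^\prime}$
$\displaystyle y^\prime$ = $\displaystyle {L^{2^\prime}}_0 ct + {L^{2^\prime}}_1 x+ {L^{2^\prime}}_2 y + {L^{2^\prime}}_3 z + C^{2^\prime}$
$\displaystyle z^\prime$ = $\displaystyle {L^{3^\prime}}_0 ct + {L^{3^\prime}}_1 x+ {L^{3^\prime}}_2 y + {L^{3^\prime}}_3 z + C^{3^\prime}$ (3)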


where $C^{a^\prime}$ are constants and ${L^{a^\prime}}_b(v)$ are sixteen functions, independent of $x^a$ but possibly depending on v, the velocity of $S^\prime$ relative to S. If S and $S^\prime$ are in standard configuration then the origins co-incide at $t=t^\prime=0$, so $C^{a^\prime}=0$ for all four values of the index $a^\prime$.

These conditions can be summarised in the single formula

\begin{displaymath}x^{{a^\prime}}=\sum^3_{b=0}{L^{a^\prime}}_b(v)x^b,\end{displaymath}


which can be thought of as a matrix formula with ${L^{a^\prime}}_b(v)$ a $4\times 4$ matrix and $x^{a^\prime}$ and $x^a$ column vectors,

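\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right)=\left(\matrix{{L^{0^\prime}}_0&{L^{0^\prime}}_1&{L^{0^\prime}}_2&{L^{0^\prime}}_3\cr{L^{1^\prime}}_0&{L^{1^\prime}}_1&{L^{1^\prime}}_2&{L^{1^\prime}}_3\cr{L^{2^\prime}}_0&{L^{2^\prime}}_1&{L^{2^\prime}}_2&{L^{2^\prime}}_3\cr{L^{3^\prime}}_0&{L^{3^\prime}}_1&{L^{3^\prime}}_2&{L^{3^\prime}}_3}\right)\left(\matrix{ ct\cr x\cr y\cr z}\right)=L(v)\left(\matrix{ ct\cr x\cr y\cr z}\right).\end{displaymath}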
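As a numerical sanity check on Step 1, the following minimal Python sketch illustrates the converse direction of the argument: a linear map of the co-ordinates sends a constant-velocity world line to a constant-velocity world line. The matrix entries, the world line and the helper names (`apply`, `worldline`, `second_difference`, `primed_worldline`) are arbitrary illustrative choices and are not part of the notes.

```python
def apply(L, x):
    """Matrix-vector product: x'^a = sum_b L[a][b] * x[b]."""
    return [sum(L[a][b] * x[b] for b in range(4)) for a in range(4)]

# Arbitrary coefficients chosen for illustration only (not a Lorentz transformation).
L = [
    [2.0, 0.5, 0.0, 1.0],
    [0.3, 1.5, 0.2, 0.0],
    [0.0, 0.1, 1.0, 0.4],
    [0.7, 0.0, 0.0, 1.2],
]

def worldline(tau):
    """Clock with constant velocity in S: x^a(tau) = x0^a + u^a * tau."""
    x0 = [0.0, 1.0, -2.0, 0.5]   # hypothetical starting point
    u = [1.0, 0.3, 0.1, -0.2]    # hypothetical constant dx^a/dtau
    return [x0[a] + u[a] * tau for a in range(4)]

def primed_worldline(tau):
    """The same clock described in the primed co-ordinates."""
    return apply(L, worldline(tau))

def second_difference(path, tau, h=1.0):
    """Finite-difference stand-in for d^2 x^a / dtau^2 (zero for straight lines)."""
    plus, mid, minus = path(tau + h), path(tau), path(tau - h)
    return [plus[a] - 2.0 * mid[a] + minus[a] for a in range(4)]

accel_S = second_difference(worldline, 3.0)
accel_S_prime = second_difference(primed_worldline, 3.0)
```

Both `accel_S` and `accel_S_prime` vanish up to rounding error, as expected for uniform motion in both frames.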


The matrix with components ${L^{a^\prime}}_b$ can be inverted to give $x^a$ in terms of $x^{a^\prime}$,

\begin{displaymath}\left(\matrix{ ct\cr x\cr y\cr z}\right)=L^{-1}(v)\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right),\end{displaymath}


where $L^{-1}(v)$ is the inverse matrix to L(v), i.e. $L(v){L^{-1}}(v)={\bf 1}$ with ${\bf 1}$ the identity matrix. Since the $(x^a)$ co-ordinate system is moving in the negative $x^\prime$ direction relative to the $(x^{a^\prime})$ system with speed v, it is clear that $(x^{a^\prime})$ bear the same relation to $(x^a)$ as $(x^a)$ do to $(x^{a^\prime})$, except that the sign of v is reversed. Mathematically this means that $L^{-1}(v)=L(-v)$.

We shall now determine the sixteen functions ${L^{a^\prime}}_b(v)$.

Step 2): $ \bf {L^{0^\prime}}_2= {L^{0^\prime}}_3= {L^{1^\prime}}_2= {L^{1^\prime}}_3=0 $


At time $t=t^\prime=0$ the two planes $x=x^\prime=0$ co-incide for all y and z, as in the following figure
 
  


[Figure LT3.eps: the planes $x=x^\prime=0$ co-inciding at $t=t^\prime=0$]
thus
$\displaystyle \hbox{equation (3) for }ct^\prime\quad\Rightarrow$ $\textstyle \quad 0={L^{0^\prime}}_2 \;y +{L^{0^\prime}}_3 \;z\qquad\forall \;y,z$   (4)
$\displaystyle \hbox{equation (3) for }x^\prime\quad\Rightarrow$ $\textstyle \quad 0={L^{1^\prime}}_2 \;y +{L^{1^\prime}}_3 \;z\qquad\forall \;y,z$   (5)
$\displaystyle \hbox{therefore}\qquad\qquad$ $\textstyle {L^{0^\prime}}_2= {L^{0^\prime}}_3={L^{1^\prime}}_2= {L^{1^\prime}}_3=0.$   (6)
Step 3): $ \bf z^\prime = z$ and $\bf y^\prime = y$


At time $t=t^\prime=0$ the two planes $z^\prime=z=0$ co-incide. Since the relative motion is in the x-direction and there is no rotation (by assumption), the planes $z^\prime=z=0$ co-incide $\forall t$, as in the following figure


[Figure LT2.eps: the planes $z^\prime=z=0$ co-inciding for all t]


thus

$\displaystyle \hbox{equation (3) for }z^\prime$ $\textstyle \Rightarrow$ $\displaystyle 0={L^{3^\prime}}_0\; ct + {L^{3^\prime}}_1 \;x+{L^{3^\prime}}_2 \;y \qquad\forall \;t,x,y$ (7)
  $\textstyle \Leftrightarrow$ $\displaystyle {L^{3^\prime}}_0= {L^{3^\prime}}_1={L^{3^\prime}}_2=0$ (8)
$\displaystyle \hbox{therefore}$   $\displaystyle \quad z^\prime = {L^{3^\prime}}_3(v)z.$ (9)


We can apply the same argument with S and $S^\prime$ interchanged, which requires that we replace L(v) by $L^{-1}(v)=L(-v)$, to deduce that $z = {L^{3^\prime}}_{3}(-v)z^\prime$. Hence ${L^{3^\prime}}_3(v){L^{3^\prime}}_{3}(-v)=1$.

Now if we reflect $x\rightarrow -x$, without changing the other co-ordinates in S, it should be clear that z and $z^\prime$ do not change (since we have just proven that $z^\prime$ is independent of x). But changing the sign of x changes the sign of v, since the relative motion is in the x-direction. Hence ${L^{3^\prime}}_{3}(-v)={L^{3^\prime}}_{3}(v)$, thus ${L^{3^\prime}}_3(v)^2=1$, so ${L^{3^\prime}}_3(v)=\pm 1$. The sign can be determined by the trivial observation that v=0 should give the identity transformation, thus ${L^{3^\prime}}_3(v)=1$.

A similar argument applied to the two planes $y=y^\prime=0$ allows us to conclude that

\begin{displaymath}{L^{2^\prime}}_0= {L^{2^\prime}}_1={L^{2^\prime}}_3=0\end{displaymath}


and ${L^{2^\prime}}_2(v)=1$.
 

In summary, we have now that the transformation matrix must be of the form

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{{L^{0^\prime}}_0(v)&{L^{0^\prime}}_1(v)&0&0\cr{L^{1^\prime}}_0(v)&{L^{1^\prime}}_1(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}

Step 4): The functional form of $\bf {L^{0^\prime}}_0(v)$, $\bf {L^{0^\prime}}_1(v)$, $\bf {L^{1^\prime}}_0(v)$ and $\bf {L^{1^\prime}}_1(v)$


Up until now we have only really used the postulates of relativity to streamline the notation. Now they will be used to full effect. First suppose a flash of light is emitted from the origin O=(0,0,0) of S at t=0 (and so also from the origin $O^\prime=(0,0,0)$ of $S^\prime$ at $t^\prime=0$). The flash expands with the speed of light c, which is the same in both reference frames, as a spherical shell described at time t by $x^2+y^2+z^2=c^2t^2$ in S and by $x^{\prime\; 2}+y^{\prime\; 2}+z^{\prime\; 2}=c^2t^{\prime\; 2}$ in $S^\prime$. Now we already know that $y^\prime =y$ and $z^\prime = z$ so

\begin{displaymath}(ct^\prime)^2-(x^\prime)^2 =(ct)^2-x^2=y^2+z^2.\end{displaymath}


Also

\begin{displaymath}x^\prime = {L^{1^\prime}}_0(v)\;ct+{L^{1^\prime}}_1(v)\;x \qquad\hbox{and}\qquad ct^\prime= {L^{0^\prime}}_0(v)\; ct+{L^{0^\prime}}_1(v)\;x\end{displaymath}


so

\begin{displaymath}\bigl({L^{0^\prime}}_0\;ct+{L^{0^\prime}}_1\;x\bigr)^2-\bigl({L^{1^\prime}}_0\;ct+{L^{1^\prime}}_1\;x\bigr)^2=(ct)^2-x^2.\end{displaymath}


Demanding that this hold true for all t and any x with |x|<ct gives three conditions

$\displaystyle \bigl({L^{1^\prime}}_1\bigr)^2- \bigl({L^{0^\prime}}_1\bigr)^2=$ 1   (10)
$\displaystyle \bigl({L^{0^\prime}}_0\bigr)^2-\bigl({L^{1^\prime}}_0\bigr)^2=$ 1   (11)
$\displaystyle {L^{1^\prime}}_0 {L^{1^\prime}}_1 - {L^{0^\prime}}_0 {L^{0^\prime}}_1=$ 0   (12)


on four unknowns. We can express these as four functions of a single parameter by using the identity $\cosh^2\alpha - \sinh^2\alpha=1$ for any real $\alpha$ to write

$\displaystyle {L^{1^\prime}}_1{=\cosh\alpha}$ $\textstyle \qquad$ $\displaystyle {L^{0^\prime}}_0=\cosh\alpha$ (13)
$\displaystyle {L^{0^\prime}}_1={-\sinh\alpha}$ $\textstyle \qquad$ $\displaystyle {L^{1^\prime}}_0=-\sinh\alpha,$ (14)


where $\alpha(v)$ is a function of v which is yet to be determined (the minus sign is for later convenience). We have now arrived at the following form for the transformation matrix:

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{\cosh\alpha(v)&-\sinh\alpha(v)&0&0\cr-\sinh\alpha(v)&\cosh\alpha(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}

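The hyperbolic parametrisation can be checked numerically. The short Python sketch below evaluates the left-hand sides of equations (10), (11) and (12) with the entries taken from equations (13) and (14), for a few arbitrarily chosen values of $\alpha$. The function names `boost_entries` and `conditions` and the sample values are illustrative assumptions only.

```python
import math

def boost_entries(alpha):
    """Return (L00, L01, L10, L11) as given by equations (13) and (14)."""
    return (math.cosh(alpha), -math.sinh(alpha),
            -math.sinh(alpha), math.cosh(alpha))

def conditions(alpha):
    """Left-hand sides of equations (10), (11) and (12), in that order."""
    L00, L01, L10, L11 = boost_entries(alpha)
    return (L11**2 - L01**2,          # should equal 1
            L00**2 - L10**2,          # should equal 1
            L10 * L11 - L00 * L01)    # should equal 0

# Arbitrary sample rapidities, positive and negative.
sample_alphas = (-2.0, -0.5, 0.0, 0.7, 1.5)
results = [conditions(alpha) for alpha in sample_alphas]
```

Each entry of `results` agrees with (1, 1, 0) up to floating-point rounding, whatever the value of $\alpha$.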
Step 5): The functional form of $\bf \alpha (v)$


The spatial origin $O^\prime$ of $S^\prime$ is determined by $x^\prime=y^\prime=z^\prime=0$. In S the point $x^\prime=y^\prime=z^\prime=0$ moves with speed v in the x-direction, i.e. it has x co-ordinate x=vt. Thus

\begin{displaymath}x^\prime=-ct\sinh\alpha + vt\cosh\alpha =0 \qquad\Rightarrow\qquad\tanh\alpha = {v\over c}\end{displaymath}


so $\alpha$ can be written as an inverse hyperbolic function

\begin{displaymath}\alpha(v)=\tanh^{-1}(v/c).\end{displaymath}


Note that the properties of the $\tanh$ function now imply that $-c<v<c$ (see the figure below). $\alpha$ is called the rapidity of the transformation.
 
  


[Figure: a plot of $v/c=\tanh\alpha$ as a function of the rapidity $\alpha$.]
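The relation between speed and rapidity is easy to explore numerically. The following Python sketch is a minimal illustration, assuming units in which c = 1 by default; the function names `rapidity` and `speed` are hypothetical. It relies on the standard-library function `math.atanh`, which raises `ValueError` when its argument is 1, mirroring the restriction $-c<v<c$.

```python
import math

def rapidity(v, c=1.0):
    """alpha(v) = artanh(v/c); only defined for -c < v < c."""
    return math.atanh(v / c)

def speed(alpha, c=1.0):
    """Inverse relation v = c tanh(alpha)."""
    return c * math.tanh(alpha)

# Rapidities for a few sample speeds (fractions of c, chosen arbitrarily).
alphas = [rapidity(v) for v in (0.0, 0.5, 0.9, 0.99)]

# A large rapidity still corresponds to a speed below c.
large_alpha_speed = speed(10.0)

# v = c lies outside the domain of artanh.
try:
    rapidity(1.0)
    reached_c = True
except ValueError:
    reached_c = False
```

The rapidity grows without bound as v approaches c, while `speed` stays below c for the finite rapidities sampled here.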


Using $\cosh^2\alpha - \sinh^2\alpha=1$ we have

\begin{displaymath}\cosh\alpha={1\over\sqrt{1-\tanh^2\alpha}}={1\over\sqrt{1-(v/c)^2}}\qquad\hbox{and}\qquad\sinh\alpha={1\over\sqrt{1-(v/c)^2}}\left({v\over c}\right).\end{displaymath}


It is conventional to define

\begin{displaymath}\gamma(v)={1\over\sqrt{1-(v/c)^2}}\end{displaymath}


and then the transformation matrix can be written as

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{\gamma(v)&-\gamma(v)v/c&0&0\cr-\gamma(v)v/c&\gamma(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}
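As a hedged numerical check, the Python sketch below builds this matrix for one sample speed and confirms two statements made earlier: that $L^{-1}(v)=L(-v)$, and that $\cosh\alpha=\gamma$ and $\sinh\alpha=\gamma v/c$ when $\alpha=\tanh^{-1}(v/c)$. It assumes units with c = 1 by default; the names `gamma`, `boost` and `matmul` and the sample speed 0.6c are illustrative choices.

```python
import math

def gamma(v, c=1.0):
    """gamma(v) = 1 / sqrt(1 - (v/c)^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(v, c=1.0):
    """4x4 transformation matrix L(v) for standard configuration."""
    g, beta = gamma(v, c), v / c
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

v = 0.6                                  # sample speed in units of c
product = matmul(boost(v), boost(-v))    # should be the identity matrix
alpha = math.atanh(v)                    # rapidity for this speed
```

Here `product` reproduces the identity matrix up to rounding, and `math.cosh(alpha)` agrees with `gamma(v)`.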


Thus we have finally arrived at the following form for the transformation

$\displaystyle t^\prime$ = $\displaystyle \gamma(v)(t-xv/c^2)$ (15)
$\displaystyle x^\prime$ = $\displaystyle \gamma(v)(x-vt)$ (16)
$\displaystyle y^\prime$ = y (17)
$\displaystyle z^\prime$ = z. (18)
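Equations (15) to (18) translate directly into code. The following Python sketch is a minimal illustration, assuming units with c = 1 by default and arbitrarily chosen sample events; the names `lorentz_transform` and `interval` are hypothetical. It checks that $(ct)^2-x^2-y^2-z^2$ takes the same value in both frames and that the origin of $S^\prime$, which has x = vt in S, is mapped to $x^\prime=0$.

```python
import math

def lorentz_transform(t, x, y, z, v, c=1.0):
    """Apply equations (15)-(18) to the event (t, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g * (t - x * v / c**2), g * (x - v * t), y, z)

def interval(t, x, y, z, c=1.0):
    """The combination (ct)^2 - x^2 - y^2 - z^2."""
    return (c * t) ** 2 - x**2 - y**2 - z**2

v = 0.6                                   # sample speed in units of c
event = (2.0, 1.0, 0.5, -0.3)             # arbitrary event (t, x, y, z) in S
event_prime = lorentz_transform(*event, v)

# The spatial origin of S' at time t = 3 in S sits at x = v * t.
origin_prime = lorentz_transform(3.0, v * 3.0, 0.0, 0.0, v)
```

Comparing `interval(*event)` with `interval(*event_prime)` shows agreement up to rounding, and `origin_prime[1]` vanishes as Step 5 requires.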


These are called Lorentz Transformations or sometimes Lorentz Boosts, to distinguish them from rotations - the name ``boost'' is unfortunate as there is no acceleration involved.

In matrix notation the Lorentz Transformations can be represented as

\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime \cr z^\prime}\right)=\left(\matrix{\gamma(v)&-\gamma(v)v/c&0&0\cr-\gamma(v)v/c&\gamma(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right)\left(\matrix{ct\cr x\cr y \cr z}\right).\end{displaymath}


The rapidity has the useful property that it is additive under successive transformations in the same direction with $v_1=c\tanh\alpha_1$ and $v_2=c\tanh\alpha_2$. This is most easily established using matrix multiplication to show that

\begin{displaymath}L(\alpha_1)L(\alpha_2)=L(\alpha_1+\alpha_2)\end{displaymath}


(use the hyperbolic trigonometric identities

$\displaystyle \cosh\alpha_1\cosh\alpha_2+\sinh\alpha_1\sinh\alpha_2$ = $\displaystyle \cosh(\alpha_1 + \alpha_2)$ (19)
$\displaystyle \cosh\alpha_1\sinh\alpha_2+\sinh\alpha_1\cosh\alpha_2$ = $\displaystyle \sinh(\alpha_1 + \alpha_2)$ (20)


). Thus these two transformations are equivalent to a single transformation with rapidity $\alpha_3 = \alpha_1 + \alpha_2$.
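The additivity of rapidity can also be confirmed numerically. The Python sketch below is a minimal illustration, assuming c = 1 and two arbitrarily chosen speeds; the names `boost_from_rapidity` and `matmul` are hypothetical. It multiplies $L(\alpha_1)$ by $L(\alpha_2)$ and compares the result with $L(\alpha_1+\alpha_2)$.

```python
import math

def boost_from_rapidity(alpha):
    """4x4 transformation matrix written in terms of the rapidity alpha."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [
        [ch, -sh, 0.0, 0.0],
        [-sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

c = 1.0
v1, v2 = 0.5, 0.7                          # sample speeds in units of c
alpha1, alpha2 = math.atanh(v1 / c), math.atanh(v2 / c)

composed = matmul(boost_from_rapidity(alpha1), boost_from_rapidity(alpha2))
single = boost_from_rapidity(alpha1 + alpha2)

# Speed of the equivalent single transformation.
v3 = c * math.tanh(alpha1 + alpha2)
```

The matrices `composed` and `single` agree entry by entry up to rounding. Note that `v3` remains below c even though `v1 + v2` exceeds c, because it is the rapidities, not the speeds, that add.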



Brian Dolan

1998-11-27