Original site of the document: http://www.thphys.may.ie/staff/bdolan/SR/sr-LT/sr-LT.html

3rd Year Honours Mathematical Physics
Special Relativity
Lorentz Transformations

Brian Dolan

Let two inertial co-ordinate systems (ICR's), S and $S^\prime$ , be in standard configuration. This means that $S^\prime$ is moving with constant velocity ${ \underline {\hbox{v}} }$ relative to S in the x-direction as in the following figure

[Figure: S and $S^\prime$ in standard configuration]

and that the spatial origins, O and $O^\prime$ , and the three spatial axes, $({ \underline {\hbox{e}} }_1,{ \underline {\hbox{e}} }_2,{ \underline {\hbox{e}} }_3)$ and $({ \underline {\hbox{e}} }^\prime_1,{ \underline {\hbox{e}} }^\prime_2,{ \underline {\hbox{e}} }^\prime_3)$ , co-incide at time $t=t^\prime=0$ .

We shall convert units of time (e.g. seconds) to units of length (e.g. meters) by multiplying time by the speed of light c. Thus our measure of time will be ct, which has units of length (this implicitly assumes that c is the same in both reference frames, which is one of the postulates of relativity). We wish to determine Cartesian co-ordinates in $S^\prime$ , $(ct^\prime,x^\prime,y^\prime,z^\prime)$ , as functions of Cartesian co-ordinates in S, (ct,x,y,z), using reasonable assumptions. In other words $x^\prime$ will be a function of ct, x, y and z, i.e. $x^\prime(ct,x,y,z)$ , etc. It will sometimes be convenient to adopt an index notation where the four co-ordinates (ct,x,y,z) are labelled by an index a taking on four possible values 0,1,2,3 with x⁰=ct, x¹=x, x²=y and x³=z so that (ct,x,y,z)=(x^a). Similarly for primed co-ordinates an index notation is sometimes useful, $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$ , where the prime is placed on the index so that the index itself can be used to distinguish between the two co-ordinate systems. The change from (ct,x,y,z)=(x^a) to $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$ is called a co-ordinate transformation.

The derivation of the explicit form of the co-ordinate transformation proceeds in five steps:

Step 1): The transformations are linear

Consider a clock moving with constant velocity, showing a time $\tau$ . The path of the clock in S can be described by four functions $x^a(\tau)$ . Since the clock is moving with constant velocity in S, equal increments of $\tau$ must correspond to equal increments of the co-ordinates (x^a) labelling the position of the clock in S. Thus $dx^a/d\tau$ is constant and $d^2x^a/d\tau^2=0$ . Since $S^\prime$ is moving with constant velocity relative to S, the clock must also be moving with constant velocity in $S^\prime$ , hence the same argument implies that $dx^{{a^\prime} }/d\tau$ is constant and $d^2x^{{a^\prime}}/d\tau^2=0$ . Now treating $x^{a^\prime}$ as functions of x^a the chain rule for differentiation implies

$\displaystyle {dx^{{a^\prime}}\over d\tau}$	=	$\displaystyle \sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}{dx^b\over d\tau}$	(1)
$\displaystyle {d^2x^{{a^\prime}}\over d\tau^2}$	=	$\displaystyle \sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}{d^2x^b\over d\tau^2}+\sum^3_{b,c=0}{\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}{dx^b\over d\tau}{dx^c\over d\tau}.$	(2)

Thus $d^2x^a/d\tau^2=0$ and $d^2x^{{a^\prime}}/d\tau^2=0$ can only both be true, for every possible constant velocity $dx^b/d\tau$ of the clock, if ${\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}=0$ , in other words the transformations must be linear in x^a. In mathematical symbols this means

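$\displaystyle ct^\prime$	=	$\displaystyle {L^{0^\prime}}_0 ct + {L^{0^\prime}}_1 x+ {L^{0^\prime}}_2 y + {L^{0^\prime}}_3 z + C^{0^\prime}$	
$\displaystyle x^\prime$	=	$\displaystyle {L^{1^\prime}}_0 ct + {L^{1^\prime}}_1 x+ {L^{1^\prime}}_2 y + {L^{1^\prime}}_3 z + C^{1^\prime}$	
$\displaystyle y^\prime$	=	$\displaystyle {L^{2^\prime}}_0 ct + {L^{2^\prime}}_1 x+ {L^{2^\prime}}_2 y + {L^{2^\prime}}_3 z + C^{2^\prime}$	
$\displaystyle z^\prime$	=	$\displaystyle {L^{3^\prime}}_0 ct + {L^{3^\prime}}_1 x+ {L^{3^\prime}}_2 y + {L^{3^\prime}}_3 z + C^{3^\prime}$	(3)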

where $C^{a^\prime}$ are constants and ${L^{a^\prime}}_b(v)$ are sixteen functions, independent of x^a but possibly depending on v - the velocity of $S^\prime$ relative to S. If S and $S^\prime$ are in standard configuration then the origins co-incide at $t=t^\prime=0$ , so $C^{a^\prime}=0$ for all four values of the index.

These conditions can be summarised in the single formula

$\begin{displaymath}x^{{a^\prime}}=\sum^3_{b=0}{L^{a^\prime}}_b(v)x^b,\end{displaymath}$

which can be thought of as a matrix formula with ${L^{a^\prime}}_b(v)$ a $4\times 4$ matrix and $x^{a^\prime}$ and x^a column vectors,

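$\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right)=\left(\matrix{{L^{0^\prime}}_0&{L^{0^\prime}}_1&{L^{0^\prime}}_2&{L^{0^\prime}}_3\cr{L^{1^\prime}}_0&{L^{1^\prime}}_1&{L^{1^\prime}}_2&{L^{1^\prime}}_3\cr{L^{2^\prime}}_0&{L^{2^\prime}}_1&{L^{2^\prime}}_2&{L^{2^\prime}}_3\cr{L^{3^\prime}}_0&{L^{3^\prime}}_1&{L^{3^\prime}}_2&{L^{3^\prime}}_3}\right)\left(\matrix{ ct\cr x\cr y\cr z}\right)=L(v)\left(\matrix{ ct\cr x\cr y\cr z}\right).\end{displaymath}$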

The matrix with components ${L^{a^\prime}}_b$ can be inverted to give x^a in terms of $x^{a^\prime}$ ,

$\begin{displaymath}\left(\matrix{ ct\cr x\cr y\cr z}\right)=L^{-1}(v)\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right),\end{displaymath}$

where L^-1(v) is the inverse matrix to L(v), i.e. $L(v){L^{-1}}(v)={\bf 1}$ with ${\bf 1}$ the identity matrix. Since the (x^a) co-ordinate system is moving in the negative $x^\prime$ direction relative to the $(x^{a^\prime})$ system with speed v, it is clear that the $(x^{a^\prime})$ bear the same relation to the (x^a) as the (x^a) do to the $(x^{a^\prime})$ , except that the sign of v is reversed. Mathematically this means that L^-1(v)=L(-v).
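The conclusion of Step 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the coefficients in `COEFFS` and the extra term in `nonlinear_map` are arbitrary made-up numbers, not the Lorentz transformation itself, and all function names are chosen here. It follows a clock on a uniformly moving path $x^a(\tau)$ and estimates $d^2x^{a^\prime}/d\tau^2$ by a second finite difference.

```python
def worldline(tau):
    """Uniform motion in S: every x^a is linear in tau (values are arbitrary)."""
    start = (0.0, 1.0, 2.0, 3.0)
    rate = (1.25, 0.75, 0.0, 0.5)  # constant dx^a/dtau
    return tuple(s + r * tau for s, r in zip(start, rate))


# Hypothetical 4x4 coefficients standing in for L^{a'}_b.
COEFFS = (
    (1.5, -0.5, 0.25, 0.0),
    (-0.5, 1.5, 0.0, 0.75),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
)


def linear_map(x):
    """x^{a'} = sum_b L^{a'}_b x^b with the made-up coefficients above."""
    return tuple(sum(row[b] * x[b] for b in range(4)) for row in COEFFS)


def nonlinear_map(x):
    """Same map plus a hypothetical term that is quadratic in the co-ordinates."""
    y = list(linear_map(x))
    y[1] += 0.5 * x[0] * x[1]
    return tuple(y)


def second_difference(transform, tau, h=0.5):
    """Central second difference of the transformed path at parameter tau."""
    before = transform(worldline(tau - h))
    here = transform(worldline(tau))
    after = transform(worldline(tau + h))
    return tuple((a - 2.0 * m + b) / h**2 for a, m, b in zip(after, here, before))


linear_acc = second_difference(linear_map, tau=2.0)
nonlinear_acc = second_difference(nonlinear_map, tau=2.0)
print("linear map:    ", linear_acc)
print("non-linear map:", nonlinear_acc)
```

The linear map leaves the transformed path unaccelerated, while the quadratic term produces a non-zero second derivative, in line with the argument above.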

We shall now determine the sixteen functions ${L^{a^\prime}}_b(v)$ .

Step 2): $\bf {L^{0^\prime}}_2= {L^{0^\prime}}_3= {L^{1^\prime}}_2= {L^{1^\prime}}_3=0$

At time $t=t^\prime=0$ the two planes $x=x^\prime=0$ co-incide for all y and z, as in the following figure

[Figure: the planes $x=x^\prime=0$ co-inciding at $t=t^\prime=0$]

thus

$\displaystyle \hbox{the }ct^\prime\hbox{ row of (3)}\quad\Rightarrow$	$\textstyle \quad 0={L^{0^\prime}}_2 \;y +{L^{0^\prime}}_3 \;z\qquad\forall \;y,z$	(4)
$\displaystyle \hbox{the }x^\prime\hbox{ row of (3)}\quad\Rightarrow$	$\textstyle \quad 0={L^{1^\prime}}_2 \;y +{L^{1^\prime}}_3 \;z\qquad\forall \;y,z$	(5)
$\displaystyle \hbox{therefore}\qquad\qquad$	$\textstyle {L^{0^\prime}}_2= {L^{0^\prime}}_3={L^{1^\prime}}_2= {L^{1^\prime}}_3=0.$	(6)

Step 3): $\bf z^\prime = z$ and $\bf y^\prime = y$

At time $t=t^\prime=0$ the two planes $z^\prime=z=0$ co-incide. Since the relative motion is in the x-direction and there is no rotation (by assumption), the planes $z^\prime=z=0$ co-incide $\forall t$ as in the following figure

[Figure: the planes $z^\prime=z=0$ co-inciding at all times]

thus

$\displaystyle \hbox{the }z^\prime\hbox{ row of (3)}$	$\textstyle \Rightarrow$	$\displaystyle 0={L^{3^\prime}}_0\; ct + {L^{3^\prime}}_1 \;x+{L^{3^\prime}}_2 \;y \qquad\forall \;t,x,y$	(7)
	$\textstyle \Leftrightarrow$	$\displaystyle {L^{3^\prime}}_0= {L^{3^\prime}}_1={L^{3^\prime}}_2=0$	(8)
$\displaystyle \hbox{therefore}$		$\displaystyle \quad z^\prime = {L^{3^\prime}}_3(v)z.$	(9)

We can apply the same argument with S and $S^\prime$ interchanged, which requires that we replace L(v) by L^-1(v)=L(-v), to deduce that $z = {L^{3^\prime}}_{3}(-v)z^\prime$ . Hence ${L^{3^\prime}}_3(v){L^{3^\prime}}_{3}(-v)=1$ .

Now if we reflect $x\rightarrow -x$ , without changing the other co-ordinates in S, it should be clear that z and $z^\prime$ do not change (since we have just proven that $z^\prime$ is independent of x). But changing the sign of x changes the sign of v, since the relative motion is in the x-direction. Hence ${L^{3^\prime}}_{3}(-v)={L^{3^\prime}}_{3}(v)$ , thus ${L^{3^\prime}}_3(v)^2=1$ , so ${L^{3^\prime}}_3(v)=\pm 1$ . The sign can be determined by the trivial observation that v=0 should give the identity transformation, thus ${L^{3^\prime}}_3(v)=1$ .

A similar argument applied to the two planes $y=y^\prime=0$ allows us to conclude that

$\begin{displaymath}{L^{2^\prime}}_0= {L^{2^\prime}}_1={L^{2^\prime}}_3=0\end{displaymath}$

and ${L^{2^\prime}}_2(v)=1$ .

In summary, we have now that the transformation matrix must be of the form

$\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{{L^{0^\prime}}_0(v)&{L^{0^\prime}}_1(v)&0&0\cr{L^{1^\prime}}_0(v)&{L^{1^\prime}}_1(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}$

Step 4): The functional form of $\bf {L^{0^\prime}}_0(v)$ , $\bf {L^{0^\prime}}_1(v)$ , $\bf {L^{1^\prime}}_0(v)$ and $\bf {L^{1^\prime}}_1(v)$

Up until now we have only really used the postulates of relativity to streamline the notation. Now they will be used to full effect. First suppose a flash of light is emitted from the origin O=(0,0,0) of S at t=0 (and so also from the origin $O^\prime=(0,0,0)$ of $S^\prime$ at $t^\prime=0$ ). The flash expands with the speed of light c, which is the same in both reference frames, as a spherical shell of radius ct at time t. The shell is therefore described by x²+y²+z²=c²t² in S and by $x^{\prime\; 2}+y^{\prime\; 2}+z^{\prime\; 2}=c^2t^{\prime\; 2}$ in $S^\prime$ . Now we already know that $y^\prime =y$ and $z^\prime = z$ so

$\begin{displaymath}(ct^\prime)^2-(x^\prime)^2 =(ct)^2-x^2=y^2+z^2.\end{displaymath}$

Also

$\begin{displaymath}x^\prime = {L^{1^\prime}}_0(v)\;ct+{L^{1^\prime}}_1(v)\;x \qquad\hbox{and}\qquad ct^\prime= {L^{0^\prime}}_0(v)\; ct+{L^{0^\prime}}_1(v)\;x\end{displaymath}$

so

$\begin{displaymath}\bigl({L^{0^\prime}}_0\;ct+{L^{0^\prime}}_1\;x\bigr)^2-\bigl({L^{1^\prime}}_0\;ct+{L^{1^\prime}}_1\;x\bigr)^2=(ct)^2-x^2.\end{displaymath}$
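Expanding the two squares on the left-hand side and collecting the terms in $(ct)^2$ , $ct\,x$ and $x^2$ makes the comparison with the right-hand side explicit. This is only a worked expansion of the equation above, with nothing new assumed:

$\begin{displaymath}\bigl[({L^{0^\prime}}_0)^2-({L^{1^\prime}}_0)^2\bigr](ct)^2+2\bigl[{L^{0^\prime}}_0{L^{0^\prime}}_1-{L^{1^\prime}}_0{L^{1^\prime}}_1\bigr]ct\,x+\bigl[({L^{0^\prime}}_1)^2-({L^{1^\prime}}_1)^2\bigr]x^2=(ct)^2-x^2.\end{displaymath}$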

Demanding that this hold true for all t and any x with |x|<ct gives three conditions

$\displaystyle \bigl({L^{1^\prime}}_1\bigr)^2- \bigl({L^{0^\prime}}_1\bigr)^2=$	1	(10)
$\displaystyle \bigl({L^{0^\prime}}_0\bigr)^2-\bigl({L^{1^\prime}}_0\bigr)^2=$	1	(11)
$\displaystyle {L^{1^\prime}}_0 {L^{1^\prime}}_1 - {L^{0^\prime}}_0 {L^{0^\prime}}_1=$	0	(12)

on four unknowns. We can express the four unknowns as functions of a single parameter by using the identity $\cosh^2\alpha - \sinh^2\alpha=1$ for any real $\alpha$ to write

$\displaystyle {L^{1^\prime}}_1{=\cosh\alpha}$	$\textstyle \qquad$	$\displaystyle {L^{0^\prime}}_0=\cosh\alpha$	(13)
$\displaystyle {L^{0^\prime}}_1={-\sinh\alpha}$	$\textstyle \qquad$	$\displaystyle {L^{1^\prime}}_0=-\sinh\alpha,$	(14)

where $\alpha(v)$ is a function of v which is yet to be determined (the minus sign is for later convenience). We have now arrived at the following form for the transformation matrix:

$\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{\cosh\alpha(v)&-\sinh\alpha(v)&0&0\cr-\sinh\alpha(v)&\cosh\alpha(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}$

Step 5): The functional form of $\bf \alpha (v)$

The spatial origin $O^\prime$ of $S^\prime$ is determined by $x^\prime=y^\prime=z^\prime=0$ . In S the point $x^\prime=y^\prime=z^\prime=0$ moves with speed v in the x-direction, i.e. it has x co-ordinate x=vt. Thus

$\begin{displaymath}x^\prime=-ct\sinh\alpha + vt\cosh\alpha =0 \qquad\Rightarrow\qquad\tanh\alpha = {v\over c}\end{displaymath}$

so $\alpha$ can be written as an inverse hyperbolic function

$\begin{displaymath}\alpha(v)=\tanh^{-1}(v/c).\end{displaymath}$

Note that the properties of the $\tanh$ function now imply that -c<v<c (see the figure below). $\alpha$ is called the rapidity of the transformation.

[Figure: a plot of $v/c=\tanh\alpha$ as a function of the rapidity $\alpha$ .]
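As a rough numerical companion to the plot, the relation $v=c\tanh\alpha$ can be tabulated in a few lines of Python. This is a sketch only: the helper names `rapidity` and `speed` are chosen here, and c is set to its SI value although any units would do.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def rapidity(v, c=C):
    """alpha = atanh(v/c); only defined for -c < v < c."""
    beta = v / c
    if not -1.0 < beta < 1.0:
        raise ValueError("rapidity is only defined for |v| < c")
    return math.atanh(beta)


def speed(alpha, c=C):
    """Inverse relation v = c tanh(alpha); mathematically |v| < c for finite alpha."""
    return c * math.tanh(alpha)


# The rapidity grows without bound as v approaches c.
for beta in (0.1, 0.5, 0.9, 0.99, 0.999):
    print(f"v/c = {beta:<6} alpha = {rapidity(beta * C):.4f}")
```

For small v/c the rapidity is close to v/c itself, while speeds near c correspond to ever larger values of $\alpha$ .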

Using $\cosh^2\alpha - \sinh^2\alpha=1$ we have

$\begin{displaymath}\cosh\alpha={1\over\sqrt{1-\tanh^2\alpha}}={1\over\sqrt{1-(v/c)^2}}\qquad\hbox{and}\qquad\sinh\alpha={1\over\sqrt{1-(v/c)^2}}\left({v\over c}\right).\end{displaymath}$

It is conventional to define

$\begin{displaymath}\gamma(v)={1\over\sqrt{1-(v/c)^2}}\end{displaymath}$

and then the transformation matrix can be written as

$\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{\gamma(v)&-\gamma(v)v/c&0&0\cr-\gamma(v)v/c&\gamma(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right).\end{displaymath}$
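The matrix above can be sketched in plain Python as nested lists. The function names (`gamma`, `boost`, `boost_from_rapidity`, `matmul`) and the default choice of units with c = 1 are assumptions made for this illustration. The sketch builds the matrix once from $\gamma(v)$ and once from the rapidity, and multiplies L(v) by L(-v) to illustrate the relation L^-1(v)=L(-v) from Step 1.

```python
import math


def gamma(v, c=1.0):
    """gamma(v) = 1/sqrt(1 - (v/c)^2), defined for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost(v, c=1.0):
    """Transformation matrix written with gamma(v), acting on (ct, x, y, z)."""
    g, beta = gamma(v, c), v / c
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def boost_from_rapidity(alpha):
    """The same matrix written with cosh(alpha) and sinh(alpha)."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [
        [ch, -sh, 0.0, 0.0],
        [-sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


v = 0.6  # speed in units of c (arbitrary sample value)
L = boost(v)
L_alpha = boost_from_rapidity(math.atanh(v))
product = matmul(boost(v), boost(-v))  # should be the identity matrix

for row in product:
    print([round(entry, 12) for entry in row])
```

Nested lists keep the sketch free of dependencies; a numerical array library would be the more usual choice for anything larger.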

Thus we have finally arrived at the following form for the transformation

$\displaystyle t^\prime$	=	$\displaystyle \gamma(v)(t-xv/c^2)$	(15)
$\displaystyle x^\prime$	=	$\displaystyle \gamma(v)(x-vt)$	(16)
$\displaystyle y^\prime$	=	y	(17)
$\displaystyle z^\prime$	=	z.	(18)
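Equations (15)-(18) translate directly into code. The following is a minimal sketch in which the function names `lorentz_transform` and `interval` and the sample numbers are assumptions, with c = 1 by default. It transforms a single event and checks two properties that appear in the derivation: a point with x = vt (the origin of $S^\prime$ ) is mapped to $x^\prime=0$ , and the combination $(ct)^2-x^2-y^2-z^2$ takes the same value in both frames. The derivation above used the second property only for the light flash, where the combination vanishes, but a short calculation with (15)-(18) shows that it holds for any event.

```python
import math


def lorentz_transform(t, x, y, z, v, c=1.0):
    """Apply equations (15)-(18) to one event; requires |v| < c."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_prime = g * (t - x * v / c**2)
    x_prime = g * (x - v * t)
    return t_prime, x_prime, y, z


def interval(t, x, y, z, c=1.0):
    """The combination (ct)^2 - x^2 - y^2 - z^2 for one event."""
    return (c * t) ** 2 - x**2 - y**2 - z**2


v = 0.6                         # speed of S' relative to S, in units of c
event = (2.0, 0.5, -1.0, 3.0)   # arbitrary sample event (t, x, y, z) in S
event_prime = lorentz_transform(*event, v)

# A point on the path x = vt of the spatial origin of S'.
origin_prime = lorentz_transform(5.0, v * 5.0, 0.0, 0.0, v)

print("event in S' :", event_prime)
print("intervals   :", interval(*event), interval(*event_prime))
print("origin of S':", origin_prime)
```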

These are called Lorentz Transformations or sometimes Lorentz Boosts, to distinguish them from rotations - the name "boost" is unfortunate as there is no acceleration involved.

In matrix notation the Lorentz Transformations can be represented as

$\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime \cr z^\prime}\right)=\left(\matrix{\gamma(v)&-\gamma(v)v/c&0&0\cr-\gamma(v)v/c&\gamma(v)&0&0\cr0&0&1&0\cr0&0&0&1}\right)\left(\matrix{ct\cr x\cr y \cr z}\right).\end{displaymath}$

The rapidity has the useful property that it is additive under successive transformations in the same direction with $v_1=c\tanh\alpha_1$ and $v_2=c\tanh\alpha_2$ . This is most easily established using matrix multiplication to show that

$\begin{displaymath}L(\alpha_1)L(\alpha_2)=L(\alpha_1+\alpha_2)\end{displaymath}$

which follows from the hyperbolic trigonometric identities

$\displaystyle \cosh\alpha_1\cosh\alpha_2+\sinh\alpha_1\sinh\alpha_2$	=	$\displaystyle \cosh(\alpha_1 + \alpha_2)$	(19)
$\displaystyle \cosh\alpha_1\sinh\alpha_2+\sinh\alpha_1\cosh\alpha_2$	=	$\displaystyle \sinh(\alpha_1 + \alpha_2).$	(20)

Thus these two transformations are equivalent to a single transformation with rapidity $\alpha_3 = \alpha_1 + \alpha_2$ .
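The additivity of the rapidity can also be checked numerically. The sketch below (helper names are illustrative assumptions, units with c = 1) multiplies two transformation matrices written in terms of the rapidity and compares the product with the single matrix for $\alpha_1+\alpha_2$ . The last lines use the standard identity $\tanh(\alpha_1+\alpha_2)=(\tanh\alpha_1+\tanh\alpha_2)/(1+\tanh\alpha_1\tanh\alpha_2)$ , which is not derived in these notes, to express the combined speed directly in terms of $v_1$ and $v_2$ .

```python
import math


def boost_from_rapidity(alpha):
    """Transformation matrix written with cosh(alpha) and sinh(alpha)."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [
        [ch, -sh, 0.0, 0.0],
        [-sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


v1, v2 = 0.5, 0.8  # two speeds in units of c (arbitrary sample values)
a1, a2 = math.atanh(v1), math.atanh(v2)

combined = matmul(boost_from_rapidity(a1), boost_from_rapidity(a2))
single = boost_from_rapidity(a1 + a2)
max_diff = max(abs(combined[i][j] - single[i][j])
               for i in range(4) for j in range(4))

# Speed of the equivalent single transformation, computed in two ways.
v3 = math.tanh(a1 + a2)
v3_from_speeds = (v1 + v2) / (1.0 + v1 * v2)  # tanh addition identity (assumed)

print("largest matrix difference:", max_diff)
print("combined speed / c       :", v3, v3_from_speeds)
```

Although the rapidities simply add, the combined speed stays below c, as the bound -c<v<c requires.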


Brian Dolan

1998-11-27
