Lectures on Quantum Field Theory, Ashok Das
CHAPTER 3
Properties of the Dirac equation
3.1 Lorentz transformations
In three dimensions, we are well acquainted with rotations. For example, we know that a rotation of coordinates around the z-axis by an angle θ can be represented as the transformation
where R represents the rotation matrix such that
Here we are using a three dimensional notation, but this can also be written in terms of the four vector notation we have developed earlier. The rotation around the z-axis in (3.2) can also be written in matrix form as
so that the coefficient matrix on the right hand side can be identified with the rotation matrix R in (3.1), namely,
Thus, we see from (3.4) that a finite rotation around the 3-axis (z-axis) or in the 1-2 plane is denoted by an orthogonal matrix, R with unit determinant (det R = 1). We also note from (3.2) that an infinitesimal rotation around the 3-axis (z-axis) takes the form
where we have identified θ = ϵ = infinitesimal. We observe here that the matrix representing the infinitesimal change under a rotation (namely, see also (3.4) with θ = ϵ) is anti-symmetric.
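The properties of the rotation matrix quoted above can be checked numerically. The following is a minimal sketch, assuming the sign convention x′ = x cos θ − y sin θ, y′ = x sin θ + y cos θ (the opposite sign of θ works equally well); the function name `rotation_z` is ours and is introduced only for illustration.

```python
import numpy as np

def rotation_z(theta):
    """Rotation by angle theta about the z-axis (a rotation in the 1-2 plane)."""
    c, s = np.cos(theta), np.sin(theta)
    # Assumed sign convention: x' = x cos(theta) - y sin(theta),
    #                          y' = x sin(theta) + y cos(theta).
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

theta = 0.7
R = rotation_z(theta)

# A finite rotation is orthogonal with unit determinant.
is_orthogonal = np.allclose(R.T @ R, np.eye(3))
det_R = np.linalg.det(R)

# For an infinitesimal angle, R = 1 + delta + O(eps^2) and delta is anti-symmetric.
eps = 1e-6
delta = rotation_z(eps) - np.eye(3)
is_antisymmetric = np.allclose(delta, -delta.T, atol=1e-10)
```

The anti-symmetry holds only to first order in ϵ, which is why the comparison uses a tolerance larger than ϵ².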
Under a Lorentz boost along the x-axis, we also know that the coordinates transform as (boost velocity β = v since c = 1, otherwise, β = v/c)
such that
where the Lorentz factor γ is defined in terms of the boost velocity to be
We recognize that (3.7) can also be written in the matrix form as
where we have defined
so that
Since the range of the boost velocity is given by −1 ≤ β ≤ 1 (namely, |v| ≤ c = 1), we conclude from (3.10) that −∞ ≤ ω ≤ ∞.
Thus, we note that a Lorentz boost along the x-direction can be written as a matrix relation
where
From this, we can obtain,
which would lead to the transformation of the covariant coordinate vector as
The matrix representing the Lorentz transformation of the coordinates, Λ^µ_ν (or Λ_µ^ν), is easily seen from (3.13) or (3.14) to be an orthogonal matrix in the sense that
where we have used
From (3.16), we see that the matrix Λµν has a unit determinant, much like the rotation matrix R in (3.3). (Incidentally, (3.16) also shows that the covariant vector transforms in an inverse manner compared with the contravariant vector.) Therefore, we can think of the Lorentz boost along the 1-axis (x-axis) as a rotation in the 0-1 plane with an imaginary angle (so that we have hyperbolic functions instead of ordinary trigonometric functions). (That these rotations become complex is related to the fact that the metric has opposite signature for time and space components.) Furthermore, as we have seen, the “angle” of rotation, ω, can take any real value and, as a result, Lorentz boosts correspond to noncompact transformations unlike space rotations.
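The statements about the boost matrix can be illustrated with a short numerical sketch. It assumes the metric η = diag(1, −1, −1, −1), the parametrization cosh ω = γ, sinh ω = γβ (so that tanh ω = β), and the sign convention x′⁰ = γ(x⁰ − βx¹); the helper `boost_x` is a name of our own.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+, -, -, -)

def boost_x(omega):
    """Boost along the x-axis with rapidity omega (units with c = 1)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(omega)
    L[0, 1] = L[1, 0] = -np.sinh(omega)  # sign assumes x'^0 = gamma (x^0 - beta x^1)
    return L

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
omega = np.arctanh(beta)                 # tanh(omega) = beta
Lam = boost_x(omega)

# cosh(omega) = gamma and sinh(omega) = gamma * beta
matches_gamma = bool(np.isclose(Lam[0, 0], gamma) and np.isclose(-Lam[0, 1], gamma * beta))

# "Orthogonality" with respect to the metric, and unit determinant
preserves_metric = np.allclose(Lam.T @ eta @ Lam, eta)
det_Lam = np.linalg.det(Lam)

# Rapidities add under composition, and omega is unbounded (noncompact parameter)
rapidities_add = np.allclose(boost_x(0.3) @ boost_x(0.9), boost_x(1.2))
```

Since ω = arctanh β grows without bound as β → 1, the parameter space of boosts is not compact, in contrast to the angle of a rotation.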
Let us finally note that if we are considering an infinitesimal Lorentz boost along the 1-axis (or a rotation in the 0-1 plane), then we can write (see (3.13) with ω = ϵ = infinitesimal)
where,
It follows from this that
In other words, the matrix representing the change under an infinitesimal Lorentz boost is anti-symmetric just like the case of an infinitesimal rotation. In a general language, therefore, we note that we can combine rotations and Lorentz boosts into what are known as the homogeneous Lorentz transformations, which can be thought of as rotations in the four dimensional space-time.
General Lorentz transformations are defined as transformations
which leave the length of the vector invariant, namely,
where we have used the fact that the metric, ηµν, remains invariant under a Lorentz transformation. Equation (3.22) is, of course, what we have seen before in (3.16). Lorentz transformations define the maximal symmetry of the space-time manifold which leaves the origin invariant.
Choosing ρ = σ = 0, we can write out the relation (3.22) explicitly as
Therefore, we conclude that
If Λ00 ≥ 1, then the transformation is called orthochronous. (The Greek prefix “ortho” means straight up. Thus, orthochronous means straight up in time. Namely, such a Lorentz transformation does not change the direction of time. Incidentally, “gonia” in Greek means an angle or a corner and, therefore, orthogonal means the corner that is straight up (perpendicular). In the same spirit, an orthodontist is someone who can make your teeth straight.) Note also that since (see (3.16))
we obtain
where we have used det ΛT = det Λ. The set of homogeneous Lorentz transformations satisfying
are known as the proper orthochronous Lorentz transformations and constitute a set of continuous transformations that can be connected to the identity matrix. (Just to emphasize, we note that the set of transformations with det Λ = 1 are known as proper transformations and the set for which Λ00 ≥ 1 are called orthochronous.) In general, however, there are four kinds of Lorentz transformations, namely,
Given the proper orthochronous Lorentz transformations, we can obtain the other Lorentz transformations by simply appending space reflection or time reflection or both (which are discrete transformations). Thus, if Λprop denotes a proper orthochronous Lorentz transformation, then by adding space reflection, x → −x, we obtain a Lorentz transformation
This would correspond to having Λ00 ≥ 1, det Λ = −1 (which is orthochronous but no longer proper). If we add time reversal, t → −t, to a proper orthochronous Lorentz transformation, then we obtain a Lorentz transformation
satisfying Λ00 ≤ −1 and det Λ = −1 (which is neither proper nor orthochronous). Finally, if we add both space and time reflections, xµ → −xµ, to a proper orthochronous Lorentz transformation, we obtain a Lorentz transformation
with Λ00 ≤ −1 and det Λ = 1 (which is proper but not orthochronous). These additional transformations, however, cannot be continuously connected to the identity matrix since they involve discrete reflections. In these lectures, we will refer to proper orthochronous Lorentz transformations simply as the Lorentz transformations.
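The four kinds of Lorentz transformations can be told apart by the sign of det Λ and the sign of Λ00. The sketch below assumes the metric η = diag(1, −1, −1, −1) and represents space reflection and time reversal by diagonal matrices; the helper names `classify` and `boost_x` are ours.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+, -, -, -)

def boost_x(omega):
    """Proper orthochronous boost along the x-axis with rapidity omega."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(omega)
    L[0, 1] = L[1, 0] = -np.sinh(omega)
    return L

def classify(Lam):
    """Return (proper, orthochronous) for a Lorentz matrix Lam."""
    if not np.allclose(Lam.T @ eta @ Lam, eta):
        raise ValueError("matrix does not preserve the metric")
    proper = bool(np.linalg.det(Lam) > 0)   # det = +1 rather than -1
    orthochronous = bool(Lam[0, 0] > 0)     # Lambda^0_0 >= 1 rather than <= -1
    return proper, orthochronous

Lam_prop = boost_x(0.5)
P = np.diag([1.0, -1.0, -1.0, -1.0])   # space reflection
T = np.diag([-1.0, 1.0, 1.0, 1.0])     # time reversal

kinds = {
    "proper orthochronous": classify(Lam_prop),
    "with space reflection": classify(P @ Lam_prop),
    "with time reversal": classify(T @ Lam_prop),
    "with both reflections": classify(P @ T @ Lam_prop),
}
```

Because |Λ00| ≥ 1 for every Lorentz transformation, testing the sign of Λ00 is enough to decide whether the transformation is orthochronous.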
3.2 Covariance of the Dirac equation
Given any dynamical equation of the form
where L is a linear operator, we say that it is covariant under a given transformation provided the transformed equation has the form
where ψ′ represents the transformed wavefunction and L′ stands for the transformed operator (namely, the operator L with the transformed variables). In simple terms, covariance implies that a given equation is form invariant under a particular transformation (has the same form in different reference frames).
With this general definition, let us now consider the Dirac equation
Under a Lorentz transformation
if the transformed equation has the form
where ψ′(x′) is the Lorentz transformed wave function, then the Dirac equation would be covariant under a Lorentz transformation. Note that the Dirac matrices, γµ, are a set of four space-time independent matrices and, therefore, do not change under a Lorentz transformation.
Let us assume that, under a Lorentz transformation, the transformed wavefunction has the form
where S(Λ) is a 4 × 4 matrix, since ψ(x) is a four component spinor. Parenthetically, what this means is that we are finding a representation of the Lorentz transformation on the Hilbert space. In the notation of other symmetries that we know from studies in non-relativistic quantum mechanics, we can define an operator L(Λ) to represent the Lorentz transformation on the coordinate states as (with indices suppressed)
However, since the Dirac wavefunction is a four component spinor, in addition to the change in the coordinates, the Lorentz transformation can also mix up the spinor components (much like angular momentum/rotation does). Thus, we can define the Lorentz transformation acting on the Dirac Hilbert space (Hilbert space of states describing a Dirac particle) as, (with S(Λ) representing the 4 × 4 matrix which rotates the matrix components of the wave function)
where the wave function is recognized to be
so that, from (3.39) we obtain (see (3.37))
Namely, the effect of the Lorentz transformation, on the wave function, can be represented by a matrix S(Λ) which depends only on the parameter of transformation Λ and not on the space-time coordinates. A more physical way to understand this is to note that the Dirac wave function simply consists of four functions which do not change, but get rotated by the S(Λ) matrix.
Since the Lorentz transformations are invertible, the matrix S(Λ) must possess an inverse so that from (3.37) we can write
Let us also note from (3.35) that
define a set of real quantities. Thus, we can write
where we have used (3.42).
Therefore, we see from (3.44) that the Dirac equation will be form invariant (covariant) under a Lorentz transformation provided there exists a matrix S(Λ), generating Lorentz transformations (for the Dirac wavefunction), such that
Let us note that if we define
then,
where we have used the orthogonality of the Lorentz transformations (see (3.16)). Therefore, the matrices γ′µ satisfy the Clifford algebra and, by Pauli’s fundamental theorem, there must exist a matrix connecting the two representations, γµ and γ′µ. It now follows from (3.46) that the matrix S exists and all we need to show is that it also generates Lorentz transformations in order to prove that the Dirac equation is covariant under a Lorentz transformation.
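That the transformed matrices γ′µ = Λ^µ_ν γ^ν satisfy the same Clifford algebra can be verified numerically. The sketch below assumes the Pauli-Dirac representation of the γ matrices, the metric η = diag(1, −1, −1, −1), and a sample Λ built from a boost followed by a rotation; the helper `clifford_ok` is a name of our own.

```python
import numpy as np

# Pauli matrices and Dirac matrices (Pauli-Dirac representation, assumed)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def clifford_ok(g):
    """True if {g[m], g[n]} = 2 eta[m, n] times the identity for all m, n."""
    return all(
        np.allclose(g[m] @ g[n] + g[n] @ g[m], 2 * eta[m, n] * np.eye(4))
        for m in range(4) for n in range(4)
    )

# Sample proper orthochronous transformation: boost along x, then rotation about z
w, th = 0.8, 0.5
boost = np.eye(4)
boost[0, 0] = boost[1, 1] = np.cosh(w)
boost[0, 1] = boost[1, 0] = -np.sinh(w)
rot = np.eye(4)
rot[1, 1] = rot[2, 2] = np.cos(th)
rot[1, 2], rot[2, 1] = -np.sin(th), np.sin(th)
Lam = rot @ boost

# gamma'^mu = Lambda^mu_nu gamma^nu
gam_prime = [sum(Lam[m, n] * gam[n] for n in range(4)) for m in range(4)]

original_ok = clifford_ok(gam)
transformed_ok = clifford_ok(gam_prime)
```

The check succeeds for any Λ that preserves the metric, which is the numerical counterpart of using the orthogonality relation in the argument above.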
Next, let us note that since the parameters of Lorentz transformation are real (namely, (Λ∗)µν = Λµν)
Here we have used (3.45) and the relations (γ0)† = γ0 = (γ0)−1 as well as γ0(γµ)†γ0 = γµ. It is clear from (3.48) that the matrix Sγ0S†γ0 commutes with the four Dirac γµ matrices and, therefore, with all the 16 basis matrices in the 4 × 4 space given in (2.101) and must be proportional to the identity matrix (this follows simply because each of the sixteen basis matrices in (2.101) consists of products of γµ which commute with Sγ0S†γ0). As a result, we can denote
Taking the Hermitian conjugate of (3.49), we obtain
which, therefore, determines that the parameter b is real, namely,
We also note that det γ0 = 1 and since we are interested in proper Lorentz transformations, det S = 1. Using these in (3.49), we determine
The real roots of this equation are
In fact, we can determine the unique value of b in the following way.
Let us note, using (3.45) and (3.49), that
which follows since S†S represents a non-negative matrix. The two solutions of this equation are obvious
Since we are dealing with proper Lorentz transformations, we are assuming
which implies (see (3.55)) that b > 0 and, therefore, it follows from (3.53) that
Thus, we conclude from (3.49) that
These are some of the properties satisfied by the matrix S which will be useful in showing that it provides a representation for the Lorentz transformations.
Next, let us consider an infinitesimal Lorentz transformation of the form (ϵ^µ_ν infinitesimal)
From our earlier discussion in (3.20), we recall that the infinitesimal transformation matrix is anti-symmetric, namely,
For an infinitesimal transformation, therefore, we can expand the matrix S(Λ) as
where the matrices Mµν are assumed to be anti-symmetric in the Lorentz indices (for different values of the Lorentz indices, Mµν denote matrices in the Dirac space since S(ϵ) is a matrix in this 4 × 4 space),
since
We can also write
so that
To the leading order, therefore, S−1(ϵ) indeed represents the inverse of the matrix S(ϵ).
The defining relation for the matrix S(Λ) in (3.45) now takes the form
At this point, let us recall the commutation relation (2.107)
and note from (3.66) that if we identify
then,
which coincides with the right hand side of (3.66). Therefore, we see that for infinitesimal transformations, we have determined the form of S(ϵ) to be
Let us note here from the form of S(ϵ) that we can identify
with the generators of infinitesimal Lorentz transformations for the Dirac wave function. (The other factor of 1/2 is there to avoid double counting.) We will see in the next chapter (when we study the representations of the Lorentz group) that the algebra (2.110) which the generators of the infinitesimal transformations, σµν, satisfy can be identified with the Lorentz algebra (which also explains why they are closed under multiplication).
Thus, at least for infinitesimal Lorentz transformations, we have shown that there exists an S(Λ) which satisfies (3.45) and generates Lorentz transformations and, as a result, the Dirac equation is form invariant (covariant) under such a Lorentz transformation. A finite transformation can, of course, be constructed out of a series of infinitesimal transformations and, consequently, the matrix S(Λ) for a finite Lorentz transformation will be the product of a series of such infinitesimal matrices, which leads to an exponentiation of the infinitesimal generators with the appropriate parameters of transformation.
For completeness, let us note that infinitesimal rotations around the 3-axis or in the 1-2 plane would correspond to choosing
with all other components of ϵµν vanishing. In such a case (see also (2.99)),
A finite rotation by angle θ in the 1-2 plane would, then, be obtained from an infinite sequence of infinitesimal transformations resulting in an exponentiation of the infinitesimal generators as
Note that since σ12 is Hermitian, we have S†(θ) = S−1(θ), namely, rotations define unitary transformations. Furthermore, recalling that
we have
and, therefore, we can determine
This shows that
That is, the rotation operator, in this case, is double valued and, therefore, corresponds to a spinor representation. This is, of course, consistent with the fact that the Dirac equation describes spin 1/2 particles.
Let us next consider an infinitesimal rotation in the 0-1 plane, namely, we are considering an infinitesimal boost along the 1-axis (x-axis). In this case, we can identify
with all other components of ϵµν vanishing, so that we can write (see also (2.99))
In this case, the matrix for a finite boost ω can be obtained through exponentiation as
Furthermore, recalling that
and, therefore,
we can determine
We note here that since
That is, in this four dimensional space (namely, as 4 × 4 matrices), operators defining boosts are not unitary. This is related to the fact that Lorentz boosts are non-compact transformations and for such transformations, there does not exist any finite dimensional unitary representation. All the unitary representations are necessarily infinite dimensional.
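The contrast between rotations and boosts can be made concrete with a small numerical sketch. It assumes the Pauli-Dirac representation, σ^{µν} = (i/2)[γ^µ, γ^ν], the finite forms S = exp(−(i/2) θ σ^{12}) and S = exp(−(i/2) ω σ^{01}) (the overall sign of the parameter is a convention and may differ from the one used in the text), and the defining relation in the form S⁻¹γ^µS = Λ^µ_ν γ^ν. The function names are ours.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def sigma(m, n):
    """sigma^{mn} = (i/2) [gamma^m, gamma^n]."""
    return 0.5j * (gam[m] @ gam[n] - gam[n] @ gam[m])

def S_rotation(theta):
    # exp(-(i/2) theta sigma^{12}) in closed form, using (sigma^{12})^2 = 1
    return np.cos(theta / 2) * np.eye(4) - 1j * np.sin(theta / 2) * sigma(1, 2)

def S_boost(omega):
    # exp(-(i/2) omega sigma^{01}) in closed form, using (sigma^{01})^2 = -1
    return np.cosh(omega / 2) * np.eye(4) - 1j * np.sinh(omega / 2) * sigma(0, 1)

def lorentz_from_S(S):
    """Extract Lambda^m_n from S^{-1} gamma^m S = Lambda^m_n gamma^n via traces."""
    Sinv = np.linalg.inv(S)
    Lam = np.zeros((4, 4))
    for m in range(4):
        for n in range(4):
            lowered = eta[n, n] * gam[n]  # gamma_n with lowered index
            Lam[m, n] = 0.25 * np.trace(Sinv @ gam[m] @ S @ lowered).real
    return Lam

theta, omega = 0.9, 0.9
Sr, Sb = S_rotation(theta), S_boost(omega)

rotation_is_unitary = np.allclose(Sr.conj().T @ Sr, np.eye(4))
boost_is_unitary = np.allclose(Sb.conj().T @ Sb, np.eye(4))
double_valued = np.allclose(S_rotation(2 * np.pi), -np.eye(4))

# gamma^0 S^dagger gamma^0 = S^{-1} holds for both rotations and boosts
adjoint_relation_holds = all(
    np.allclose(gam[0] @ S.conj().T @ gam[0], np.linalg.inv(S)) for S in (Sr, Sb)
)

Lam_rot, Lam_boost = lorentz_from_S(Sr), lorentz_from_S(Sb)
```

In this sketch the rotation matrix is unitary and changes sign under θ → θ + 2π, while the boost matrix is Hermitian but not unitary; both reproduce a 4 × 4 matrix Λ that preserves the metric.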
3.3 Transformation of bilinears
In the last section, we have shown how to construct the matrix S(Λ) for finite Lorentz transformations (for both rotations and boosts). Let us note next that, since under a Lorentz transformation
it follows that
where we have used the relation (3.58). In other words, we see that the adjoint wave function transforms inversely, under a Lorentz transformation, compared to the wave function ψ(x). This implies that a bilinear product such as ψ̄ψ would transform under a Lorentz transformation as
Namely, such a product will not change under a Lorentz transformation – would behave like a scalar – which is what we had discussed earlier in connection with the normalization of the Dirac wavefunction (see (2.50) and (2.55)).
Similarly, under a Lorentz transformation
where we have used (3.45). Thus, we see that if we define a current of the form jµ(x) = ψ̄(x)γµψ(x), it would transform as a four vector under a proper Lorentz transformation, namely,
This is, of course, what we had observed earlier. Namely, the probability current density (see also (2.86)) transforms like a four vector so that the probability density transforms as the time component of a four vector. Finally, we note that in this way, we can determine the transformation properties of the other bilinears under a Lorentz transformation in a straightforward manner.
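The scalar and vector character of these two bilinears can be illustrated for a boost along the x-axis. The sketch assumes the Pauli-Dirac representation, ψ̄ = ψ†γ0, and the boost matrix S = cosh(ω/2) + α1 sinh(ω/2) with α1 = γ0γ1; with this sign choice the relation S⁻¹γ^µS = Λ^µ_ν γ^ν gives a Λ with +sinh ω in the off-diagonal entries. The spinor components below are arbitrary illustrative numbers.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

omega = 0.7
alpha1 = gam[0] @ gam[1]
# Spinor representation of the boost (sign of omega is a convention)
S = np.cosh(omega / 2) * np.eye(4) + np.sinh(omega / 2) * alpha1

# Vector representation tied to this S through S^{-1} gamma^mu S = Lambda^mu_nu gamma^nu
Lam = np.eye(4)
Lam[0, 0] = Lam[1, 1] = np.cosh(omega)
Lam[0, 1] = Lam[1, 0] = np.sinh(omega)

def bar(w):
    """Dirac adjoint: w-bar = w^dagger gamma^0."""
    return w.conj() @ gam[0]

def scalar(w):
    return (bar(w) @ w).real

def current(w):
    return np.array([(bar(w) @ g @ w).real for g in gam])

psi = np.array([0.3 + 0.1j, -0.7j, 1.1, 0.2 - 0.5j])  # arbitrary spinor
psi_prime = S @ psi

scalar_invariant = np.isclose(scalar(psi_prime), scalar(psi))
current_is_vector = np.allclose(current(psi_prime), Lam @ current(psi))
norm_invariant = np.isclose(current(psi_prime) @ eta @ current(psi_prime),
                            current(psi) @ eta @ current(psi))
```

Note that ψ†ψ, which is the time component of the current, is not invariant under the boost; only the combination with γ0 inserted is.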
3.4 Projection operators, completeness relation
Let us note that the positive energy solutions of the Dirac equation satisfy
where
while the negative energy solutions satisfy
with the same value of p0 as in (3.92). It is customary to identify (see (2.49), the reason for this will become clear when we discuss the quantization of Dirac field theory later)
so that the equations satisfied by u(p) and v(p) (positive and negative energy solutions), (3.91) and (3.93), can be written as
and
Given these equations, the adjoint equations are easily obtained to be (taking the Hermitian conjugate and multiplying γ0 on the right)
where we have used (γµ)†γ0 = γ0γµ (see (2.84)). As we have seen earlier there are two positive energy solutions and two negative energy solutions of the Dirac equation. Let us denote them by
where r, as we had seen earlier, can represent the spin projection of the two component spinors (in terms of which the four component solutions were obtained). Let us also note that each of the four solutions really represents a four component spinor. Let us denote the spinor index by α = 1, 2, 3, 4. With these notations, we can write down the Lorentz invariant conditions we had derived earlier from the normalization of a massive Dirac particle as (see (2.50))
Although we had noted earlier that the last relation in (3.99) can be checked to be true simply because v(p) = u−(−p0, −p), namely, because the direction of momentum changes for v(p) (see the derivation in (2.52)). This also allows us to write
For completeness we note here that it is easy to check
for any two spin components of the positive and the negative energy spinors.
From the form of the equations satisfied by the positive and the negative energy spinors, (3.95) and (3.96), it is clear that we can define projection operators for such solutions as
These are, of course, 4 × 4 matrices and their effect on the Dirac spinors is quite clear,
Similar relations also hold for the adjoint spinors and it is clear that Λ+(p) projects only on to the space of positive energy solutions, while Λ−(p) projects only on to the space of negative energy ones.
Let us note that
where we have used p̸² = p² = m². Thus, we see that Λ±(p) are indeed projection operators and they are orthogonal to each other. Furthermore, let us also note that
as it should be since all the solutions can be divided into either positive or negative energy ones.
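The algebra of the energy projection operators can be checked numerically. The sketch below assumes the forms Λ±(p) = (±p̸ + m)/2m with p̸ = γ0p0 − γ·p and p0 = √(p² + m²), in the Pauli-Dirac representation; the sample mass and momentum are arbitrary.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]
one = np.eye(4)

m = 1.0
p3 = np.array([0.3, -0.4, 1.2])          # arbitrary three-momentum
E = np.sqrt(m**2 + p3 @ p3)              # on-shell energy p0

# p-slash = gamma^0 p0 - gamma . p
pslash = E * gam[0] - sum(p3[i] * gam[i + 1] for i in range(3))

# Assumed form of the projection operators
Lam_plus = (pslash + m * one) / (2 * m)
Lam_minus = (-pslash + m * one) / (2 * m)

on_shell = np.allclose(pslash @ pslash, m**2 * one)       # p-slash^2 = m^2
idempotent = (np.allclose(Lam_plus @ Lam_plus, Lam_plus)
              and np.allclose(Lam_minus @ Lam_minus, Lam_minus))
orthogonal = np.allclose(Lam_plus @ Lam_minus, 0 * one)
complete = np.allclose(Lam_plus + Lam_minus, one)
rank_plus = np.trace(Lam_plus).real                        # dimension of the projected space
```

The trace of each projector equals 2, reflecting the two independent positive energy and the two independent negative energy solutions.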
Let us next consider the outer product of the spinor solutions. Let us define a 4 × 4 matrix P with elements
This matrix has the property that acting on a positive energy spinor it gives back the same spinor. Namely,
where we have used (3.99). Thus, we see that the matrix P projects only on to the space of positive energy solutions and, therefore, we can identify
Similarly, if we define
then, it is straightforward to see that
Namely, the matrix Q projects only on to the space of negative energy solutions with a phase (a negative sign). Hence we can identify
The completeness relation for the solutions of the Dirac equation now follows from the observation that (see (3.105))
In a matrix notation, the completeness relation (3.112) can also be written as
We note here that the relative negative sign between the two terms in (3.112) or in (3.113) can be understood as follows. As we have seen, ūu and v̄v have opposite sign, the latter being negative while the former is positive. Hence, we can think of the space of solutions of the Dirac equation as an indefinite metric space. In such a space, the completeness relation does not involve a sum of terms with positive definite sign, rather it involves a sum with the metric structure of the space built in.
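The spin sums and the relative sign in the completeness relation can be illustrated with explicit spinors. The sketch below assumes the Pauli-Dirac representation and the standard plane wave solutions normalized to ūu = 1 and v̄v = −1, namely u ∝ (χ, σ·p χ/(E + m)) and v ∝ (σ·p χ/(E + m), χ) with overall factor √((E + m)/2m); these explicit forms are our assumption for the solutions referred to in the text.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]

m = 1.0
p3 = np.array([0.3, -0.4, 1.2])          # arbitrary three-momentum
E = np.sqrt(m**2 + p3 @ p3)
pslash = E * gam[0] - sum(p3[i] * gam[i + 1] for i in range(3))
sp = sum(p3[i] * sig[i] for i in range(3))   # sigma . p

chi = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
N = np.sqrt((E + m) / (2 * m))
u = [N * np.concatenate([c, sp @ c / (E + m)]) for c in chi]   # positive energy
v = [N * np.concatenate([sp @ c / (E + m), c]) for c in chi]   # negative energy

def bar(w):
    """Dirac adjoint: w-bar = w^dagger gamma^0."""
    return w.conj() @ gam[0]

# Outer products summed over the two spin states
P = sum(np.outer(w, bar(w)) for w in u)
Q = sum(np.outer(w, bar(w)) for w in v)

P_expected = (pslash + m * np.eye(4)) / (2 * m)
Q_expected = (pslash - m * np.eye(4)) / (2 * m)
completeness = np.allclose(P - Q, np.eye(4))
```

With these normalizations P coincides with Λ+(p) and Q with −Λ−(p), so that the identity matrix is recovered from the difference P − Q rather than from the sum.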
These relations are particularly useful in simplifying the evaluations of transition amplitudes and probabilities. For example, let us suppose that we are interested in a transition amplitude which has the form
where M stands for a 4 × 4 matrix (a combination of the 16 Dirac matrices). If the initial and the final states are the same, this may represent the expectation value of a given operator in a given electron state and will have the form (r not summed)
If we are not interested in the expectation value in a particular electron state, but rather wish to obtain an average over the two possible electron states (in experiments we may want to average over the spin polarization states), then we will have
Similarly, if we have a transition from a given electron state to another and if we are interested in a process where we average over the initial electron states and sum over the final electron states (for example, think of an experiment with unpolarized initial electron states where the final spin polarization is not measured), the probability for such a transition will be determined from
The trace is over the 4 × 4 matrix indices and can be easily performed using the properties of the Dirac matrices that we have discussed earlier in section 2.6.
3.5 Helicity
As we have seen in section 2.3, the Dirac Hamiltonian
does not commute either with the orbital angular momentum or with spin (rather, it commutes with the total angular momentum). Thus, unlike the case of non-relativistic systems where we specify a given energy state by the projection of spin along the z-axis (namely, by the eigenvalue of Sz), in the relativistic case this is not useful since spin is not a constant of motion. In fact, we have already seen that the spin operator
satisfies the commutation relation (see (2.68))
As a consequence, it can be easily checked that the plane wave solutions which we had derived earlier are not eigenstates of the spin operator. Note, however, that for a particle at rest, spin commutes with the Hamiltonian (since in this frame p = 0) and such solutions can be labelled by the spin projection.
On the other hand, we note that since momentum commutes with the Dirac Hamiltonian, namely,
the operator S · p does also (momentum and spin commute and, therefore, the order of these operators in the product is not relevant). Namely,
Therefore, this operator is a constant of motion. The normalized operator
measures the longitudinal component of the spin of the particle or the projection of the spin along the direction of motion. This is known as the helicity operator and we note that since the Hamiltonian commutes with helicity, the eigenstates of energy can also be labelled by the helicity eigenvalues. Note that
where we have used (this is the generalization of the identity satisfied by the Pauli matrices)
Therefore, the eigenvalues of the helicity operator, for a spin 1/2 Dirac particle, can only be ±1/2 and we can label the positive and the negative energy solutions also as u(p, h), v(p, h) with h = ±1/2 (the two helicity eigenvalues). The normalization relations in this case will take the forms
Furthermore, the completeness relation (3.112) or (3.113) can now be written as
3.6 Massless Dirac particle
Let us consider the free Dirac equation for a massive spin 1/2 particle,
where we are not assuming any relation between p0 and p as yet. Let us represent the four component spinor (as before) as
where u1(p) and u2(p) are two component spinors. In terms of u1(p) and u2(p), the Dirac equation takes the form
Explicitly, this leads to the two (2-component) coupled equations
which can also be written as
Taking the sum and the difference of the two equations in (3.132), we obtain
We note that if we define two new (2-component) spinors as
then, the equations in (3.133) can be rewritten as a set of two coupled (2-component) spinor equations of the form
This shows that it is the mass term which couples the two equations.
Let us note that in the limit m → 0, the two equations in (3.135) reduce to two (2-component) spinor equations which are decoupled and have the simpler forms
These two equations, like the Dirac equation, can be shown to be covariant under proper Lorentz transformations (as they should be, since vanishing of the mass which is a Lorentz scalar should not change the behavior of the equation under proper Lorentz transformations). These equations, however, are not invariant under parity or space reflection and are known as the Weyl equations. The corresponding two component spinors uL and uR are also known as Weyl spinors.
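For reference, the steps leading to the Weyl equations can be written out as follows. This is a sketch of the derivation in the Pauli-Dirac representation; the normalization factor 1/2 in the definition of u_R and u_L is an assumption (the equations are linear, so any common factor gives the same result).

```latex
% Dirac equation (\slashed{p} - m) u(p) = 0 with u = (u_1, u_2)^T in the Pauli-Dirac representation
\begin{align*}
(p_0 - m)\,u_1 - \boldsymbol{\sigma}\cdot\mathbf{p}\;u_2 &= 0, \\
\boldsymbol{\sigma}\cdot\mathbf{p}\;u_1 - (p_0 + m)\,u_2 &= 0 .
\end{align*}
% Difference and sum of the two equations
\begin{align*}
(p_0 - \boldsymbol{\sigma}\cdot\mathbf{p})\,(u_1 + u_2) &= m\,(u_1 - u_2), \\
(p_0 + \boldsymbol{\sigma}\cdot\mathbf{p})\,(u_1 - u_2) &= m\,(u_1 + u_2).
\end{align*}
% With u_R = (u_1 + u_2)/2 and u_L = (u_1 - u_2)/2 (normalization assumed)
\begin{align*}
(p_0 - \boldsymbol{\sigma}\cdot\mathbf{p})\,u_R &= m\,u_L, &
(p_0 + \boldsymbol{\sigma}\cdot\mathbf{p})\,u_L &= m\,u_R .
\end{align*}
% Massless limit: the two equations decouple
\begin{align*}
p_0\,u_R &= \boldsymbol{\sigma}\cdot\mathbf{p}\;u_R, &
p_0\,u_L &= -\,\boldsymbol{\sigma}\cdot\mathbf{p}\;u_L .
\end{align*}
% Applying each operator twice and using (\sigma\cdot p)^2 = p^2
\begin{equation*}
\left(p_0^{2} - \mathbf{p}^{2}\right) u_{R,L} = 0 .
\end{equation*}
```

For p0 = |p| the decoupled equations state that u_R and u_L are eigenstates of σ·p/|p| with eigenvalues +1 and −1, respectively.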
Let us note that, in the massless limit,
Similarly, we can show that uL(p) also satisfies
Thus, for a nontrivial solution of these equations to exist, we must have
which is the Einstein relation for a massless particle. It is clear, therefore, that for such solutions, we must have
For p0 = |p|, namely, for the positive energy solutions, we note that
while
In other words, the two different Weyl equations really describe particles with opposite helicity. Recalling that σ/2 denotes the spin operator for a two component spinor, we note that uL(p) describes a particle with helicity −1/2 or a particle with spin anti-parallel to its direction of motion. If we think of spin as arising from a circular motion, then we conclude that for such a particle, the circular motion would correspond to that of a left-handed screw. Correspondingly, such a particle is called a left-handed particle (which is the reason for the subscript L). On the other hand, uR(p) describes a particle with helicity +1/2 or a particle with spin parallel to its direction of motion. Such a particle is known as a right-handed particle since its spin motion would correspond to that of a right-handed screw. This is shown in Fig. 3.1 and we note here that this nomenclature is opposite of what is commonly used in optics. (Handedness is also referred to as chirality and these (4 component Dirac) spinors can be shown to be eigenstates of the γ5 matrix which can also be understood more easily from the chiral symmetry associated with massless Dirac systems.)
Figure 3.1: Right-handed and left-handed particles with spins parallel and anti-parallel to the direction of motion.
As we know, the electron neutrino emitted in a beta decay
is massless (present experiments suggest they are almost massless) and, therefore, can be described by a two component Weyl equation. We also know, experimentally, that νe is left-handed, namely, its helicity is −1/2. In the hole theoretic language, then, the absence of a negative energy neutrino would appear as a “hole” with the momentum reversed. Therefore, the anti-neutrino, in this description, will have opposite helicity or will be right-handed. Alternatively, the neutrino is left-handed and hence satisfies the equation
and has negative helicity. It is helicity which is the conserved quantum number and, hence, the absence of a negative energy neutrino would appear as a “hole” with opposite helicity. That the anti-neutrino is right-handed is, of course, observed in experiments such as
A very heuristic way to conclude that parity is violated in processes involving neutrinos is as follows. The neutrino is described by the equation
Under parity or space reflection,
Since σ represents an angular momentum, we conclude that it must transform under parity like L, so that under a space reflection
Consequently, the neutrino equation is not invariant under parity, and processes involving neutrinos, therefore, would violate parity. This has been experimentally verified in a number of processes.
3.7 Chirality
With the normalization for massless spinors discussed in (2.53) and (2.54), the solutions of the massless Dirac equation (m = 0)
can be written as (see (2.53) and compare with the massive case (3.94))
From the structure of the massless Dirac equation (3.149), we note that if u(p) (or v(p)) is a solution, then γ5u(p) (or γ5v(p)) is also a solution. Therefore, the solutions of the massless Dirac equation can be classified according to the eigenvalues of γ5 also known as the chirality or the handedness.
This can also be seen from the fact that the Hamiltonian for a massless Dirac fermion (see (1.100))
commutes with γ5 (in fact, in the Pauli-Dirac representation γ5 = ρ defined in (2.60) and ρ commutes with α, see, for example, (2.61)). Since
it follows that the eigenvalues of γ5 are ±1 and spinors with the eigenvalue +1, namely,
are known as right-handed (positive chirality) spinors while those with the eigenvalue −1, namely,
are called left-handed (negative chirality) spinors. We note that if the fermion is massive (m ≠ 0), then the Dirac Hamiltonian (1.100) would no longer commute with γ5 and in this case chirality would not be a good quantum number to label the states with.
Given a general spinor, the right-handed and the left-handed components can be obtained through the projection operators (𝟙 denotes the identity matrix in the appropriate space)
where we have defined
We note that by definition these projection operators satisfy
which implies that any four component spinor can be uniquely decomposed into a right-handed and a left-handed component. (In the Pauli-Dirac representation, these projection operators have the explicit forms (see (2.92))
We note from (3.155) that in the massless Dirac theory, the four component spinors can be effectively described by two component spinors. This is connected with our earlier observation (see section 3.6) that in the massless limit, the Dirac equation reduces to two decoupled two component Weyl equations (recall that it is the mass term which generally couples these two spinors). The reducibility of the spinors is best seen in the Weyl representation for the Dirac matrices discussed in (2.120). However, we will continue our discussion in the Pauli-Dirac representation of the Dirac matrices which we have used throughout. From the definition of the helicity operator in (3.123) (for the two component spinors S = σ), we note that spinors of the form
correspond to states with definite helicity, namely,
so that the right-handed (four component) spinors in (3.155) are described by two component spinors with positive helicity while the left-handed (four component) spinors are described in terms of two component spinors of negative helicity. Explicitly, we see from (3.155) and (3.159) that we can identify
We note here that the operators (see (3.159))
can also be written in a covariant notation as
with defined in (2.121) and It is straightforward to check that the operators P (±) satisfy the relations
and, therefore, define projection operators into the space of positive and negative helicity two component spinors. They can be easily generalized to a reducible representation of operators acting on the four component spinors and have the form (see (2.71) or (3.119))
and it is straightforward to check from (3.158) and (3.165) that
which is the reason the spinors can be simultaneous eigenstates of chirality and helicity (when mass vanishes). In fact, from (3.161) as well as (3.165) we see that the right-handed spinors with chirality +1 are characterized by helicity +1 while the left-handed spinors with chirality −1 have helicity −1.
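The interplay between chirality and helicity can be checked numerically. The sketch assumes the Pauli-Dirac representation, γ5 = iγ0γ1γ2γ3, the Hamiltonian H = α·p + βm with α = γ0γ and β = γ0, chirality projectors (𝟙 ± γ5)/2, and the four component helicity operator Σ·p/|p| normalized to eigenvalues ±1; the sample momentum is arbitrary.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gam = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sig]
one = np.eye(4)

g5 = 1j * gam[0] @ gam[1] @ gam[2] @ gam[3]          # assumed definition of gamma_5
alpha = [gam[0] @ gam[i] for i in (1, 2, 3)]
# Sigma_k = (1/2) eps_{ijk} sigma^{ij}: block-diagonal Pauli matrices
Sigma = [1j * gam[2] @ gam[3], 1j * gam[3] @ gam[1], 1j * gam[1] @ gam[2]]

p = np.array([0.3, -0.4, 1.2])                       # arbitrary three-momentum
pabs = np.linalg.norm(p)

def hamiltonian(m):
    """Dirac Hamiltonian H = alpha . p + beta m."""
    return sum(p[i] * alpha[i] for i in range(3)) + m * gam[0]

def comm(a, b):
    return a @ b - b @ a

hel = sum(p[i] * Sigma[i] for i in range(3)) / pabs  # helicity, eigenvalues +1 and -1
P_R, P_L = (one + g5) / 2, (one - g5) / 2            # chirality projectors

g5_squares_to_one = np.allclose(g5 @ g5, one)
massless_commutes = np.allclose(comm(hamiltonian(0.0), g5), 0 * one)
massive_commutes = np.allclose(comm(hamiltonian(1.0), g5), 0 * one)
projectors_ok = (np.allclose(P_R @ P_R, P_R) and np.allclose(P_L @ P_L, P_L)
                 and np.allclose(P_R @ P_L, 0 * one) and np.allclose(P_R + P_L, one))
helicity_commutes_with_g5 = np.allclose(comm(hel, g5), 0 * one)
# Massless Hamiltonian factorizes as |p| * gamma_5 * helicity
massless_factorizes = np.allclose(hamiltonian(0.0), pabs * g5 @ hel)
```

The last relation shows that on positive energy massless solutions (H = |p|) the product of chirality and helicity equals +1, so chirality +1 goes with helicity +1 and chirality −1 with helicity −1.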
For completeness as well as for later use, let us derive some properties of these spinors. We note from (3.159) that we can write the positive and the negative energy solutions as (we will do this in detail for the right-handed spinors and only quote the results for the left-handed spinors)
Each of these spinors, is one dimensional (namely, each of them has only one non-zero component) and together they span the two dimensional spinor space. We can choose and to be normalized so that we have
For example, we can choose
such that when p1 = p2 = 0, the helicity spinors simply reduce to eigenstates of σ3. Furthermore, we can also define normalized spinors u(+) and v(+). For example, with the choice of the basis in (3.169), the normalized spinors take the forms (here we are using the three dimensional notation so that pi = (p)i)
However, we do not need to use any particular representation for our discussions. In general, the positive helicity spinors satisfy
which can be checked from the explicit forms of the spinors in (3.170). Here we note that the second relation follows from the fact that a positive helicity spinor changes into an orthogonal negative helicity spinor when the direction of the momentum is reversed (which is also manifest in the projection operators in (3.162)).
Given the form of the right-handed spinors in (3.161), together with (3.171), it now follows in a straightforward manner that
The completeness relation in (3.172) can be simplified by noting the following identity. We note that with p0 = |p|, we can write
so that we can write the completeness relation in (3.172) as
which can also be derived using the methods in section 3.4. We conclude this section by noting (without going into details) that similar relations can be derived for the left-handed spinors and take the forms
3.8 Non-relativistic limit of the Dirac equation
Let us recall that the positive energy solutions of the Dirac equation have the form (see (2.49))
while the negative energy solutions have the form
In (3.176) we have defined
and we emphasize that the subscript “L” here does not stand for the left-handed particles introduced in the last section. Similarly, in (3.177) we have denoted
It is clear that in the non-relativistic limit, when |p| ≪ m, the component uS(p) is much smaller than uL(p) (smaller by a factor of the order of |p|/m) and, correspondingly, uL(p) and uS(p) are known as the large and the small components of the positive energy Dirac solution. Similarly, vL(p) and vS(p) are also known as the large and the small components of the negative energy solution. In the non-relativistic limit, we expect the large components to give the dominant contribution to the wave function.
Let us next look at the positive energy solutions in (3.176), which satisfy the equation
This would lead to the two (2-component) equations
We note that the second equation in (3.181) gives the relation
while, with the substitution of this, the first equation in (3.181) takes the form
where we have used the fact that for a non-relativistic system, |p| ≪ m, and, therefore, E ≈ m (recall that we have set c = 1). Furthermore, if we identify the non-relativistic energy (without the rest mass term) as
then, equation (3.183) has the form
Namely, the Dirac equation in this case reduces to the familiar Schrödinger equation for a two component spinor. This is, of course, what we know for a free non-relativistic electron (spin 1/2 particle).
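The chain of steps in this section can be summarized compactly as follows. This is a sketch in standard notation, assuming the Dirac (Pauli-Dirac) representation in which β is diagonal, and writing E_NR for the non-relativistic energy E − m; the symbol E_NR and the layout are illustrative choices.

```latex
\begin{align}
(E - m)\, u_L(p) - \boldsymbol{\sigma}\cdot\mathbf{p}\; u_S(p) &= 0, \\
-\,\boldsymbol{\sigma}\cdot\mathbf{p}\; u_L(p) + (E + m)\, u_S(p) &= 0, \\
u_S(p) = \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E + m}\, u_L(p)
  &\approx \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2m}\, u_L(p), \\
\frac{(\boldsymbol{\sigma}\cdot\mathbf{p})^{2}}{2m}\, u_L(p)
  = \frac{\mathbf{p}^{2}}{2m}\, u_L(p)
  &= (E - m)\, u_L(p) = E_{NR}\, u_L(p).
\end{align}
```

The third line makes the size of the small component explicit: it is suppressed relative to the large component by a factor of order |p|/m.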
3.9 Electron in an external magnetic field
The coupling of a charged particle to an external electromagnetic field can be achieved through what is conventionally known as the minimal coupling. This preserves the gauge invariance associated with Maxwell’s equations and corresponds to defining
where e denotes the charge of the particle and Aµ represents the four vector potential of the associated electromagnetic field. Since the coordinate representation of pµ is given by (see (1.33) and remember that we are choosing ħ = 1)
the minimal coupling prescription also corresponds to defining (in the coordinate representation)
Let us next consider an electron interacting with a time independent external magnetic field. In this case, we have
where we are assuming that A = A(x). The Dirac equation for the positive energy electrons, in this case, takes the form
Explicitly, we can write the two (2-component) equations as
In this case, the second equation in (3.191) leads to
where in the last relation, we have used |p| ≪ m in the non-relativistic limit. Substituting this back into the first equation in (3.191), we obtain
Let us simplify the expression on the left hand side of (3.193) using the following identity for the Pauli matrices
Note that (here, we are going to use purely three dimensional notation for simplicity)
We can use this in (3.194) to write
Consequently, in the non-relativistic limit, when we can approximate the Dirac equation by that satisfied by the two component spinor uL(p), equation (3.193) takes the form
where we have identified (as before)
We recognize (3.197) to be the Schrödinger equation for a charged electron with a minimal coupling to an external vector field along with a magnetic dipole interaction with the external magnetic field. Namely, a minimally coupled Dirac particle automatically leads, in the non-relativistic limit, to a magnetic dipole interaction (recall that in the non-relativistic theory, we have to add such an interaction by hand) and we can identify the magnetic moment operator associated with the electron to correspond to
Of course, this shows that a point Dirac particle has a magnetic moment corresponding to a gyro-magnetic ratio
Let us recall that the magnetic moment of a particle is defined in general as (c = 1)
Since S = σ/2 for a two component electron (with ħ = 1), comparing with (3.199) we obtain g = 2. Quantum mechanical corrections (higher order corrections) in an interacting theory such as quantum electrodynamics, however, change this value slightly and the experimental deviation of g from the value of 2 (g − 2 experiment) for the electron agrees exceptionally well with the theoretical predictions of quantum electrodynamics. Particles with a nontrivial structure (that is, particles which are not point like and have extended structures), however, can have g-factors quite different from 2. In this case, one says that there is an anomalous contribution to the magnetic moment. Thus, for example, for the proton and the neutron, we know that the magnetic moments are given by
where the nuclear magneton is defined to be
with mP denoting the mass of the proton.
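For reference, the steps leading from the minimally coupled equation for the large component to the value g = 2 can be sketched as follows. The sketch assumes the minimal coupling p → p − eA with e the charge of the particle, ħ = c = 1, a magnetic interaction energy of the form −µ · B, and the notation E_NR for the non-relativistic energy; it is a compact restatement in standard notation rather than a verbatim copy of the numbered equations.

```latex
\begin{align}
\big(\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\big)^{2}
  &= (\mathbf{p} - e\mathbf{A})^{2}
   + i\,\boldsymbol{\sigma}\cdot\big((\mathbf{p} - e\mathbf{A})\times(\mathbf{p} - e\mathbf{A})\big) \\
  &= (\mathbf{p} - e\mathbf{A})^{2} - e\,\boldsymbol{\sigma}\cdot\mathbf{B},
  \qquad \mathbf{B} = \boldsymbol{\nabla}\times\mathbf{A}, \\
\Big[\frac{(\mathbf{p} - e\mathbf{A})^{2}}{2m}
  - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}\Big]\, u_L
  &= E_{NR}\, u_L, \\
\boldsymbol{\mu} = \frac{e}{2m}\,\boldsymbol{\sigma}
  &= g\,\frac{e}{2m}\,\mathbf{S},
  \qquad \mathbf{S} = \tfrac{1}{2}\,\boldsymbol{\sigma}
  \;\Longrightarrow\; g = 2 .
\end{align}
```

The cross product does not vanish because p and A do not commute: acting on a wave function, (p − eA) × (p − eA) = −e (p × A + A × p) = ie (∇ × A).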
Anomalous magnetic moments can be accommodated through an additional interaction Hamiltonian (in the Dirac system) of the form (this is known as a non-minimal coupling)
where
denotes the electromagnetic field strength tensor and κ represents the anomalous magnetic moment of the particle. This is commonly known as the Pauli coupling or the Pauli interaction.
3.10 Foldy-Wouthuysen transformation
In the last two sections, we have described how the non-relativistic limit of a Dirac theory can be taken in a simple manner. In the non-relativistic limit, the relevant expansion parameter is |p|/m and the method works quite well in the lowest order of expansion, as we have seen explicitly. However, at higher orders, this method runs into difficulty. For example, if we were to calculate the electric dipole interaction of an electron in a background electromagnetic field using the method described in the earlier sections, the electric dipole moment becomes imaginary at order 1/m² (namely, the Hamiltonian becomes non-Hermitian). This puzzling feature can be understood in a simple manner as follows. The process of eliminating the “small” components from the Dirac equation described in the earlier sections can be understood mathematically as
where the matrix A, in the case of the free Dirac equation, for example, has the form (see (3.182))
for the positive energy spinors. The matrix T that takes us to the two component “large” spinors (from the original spinor) in (3.206) has the form
It is clear from the form of the matrix in (3.208) that it is not unitary and this is the reason that the Hamiltonian becomes non-Hermitian at higher orders in the inverse mass expansion (non-relativistic expansion). This difficulty in taking a consistent non-relativistic limit to any order in the expansion in |p|/m was successfully solved by Foldy and Wouthuysen and also independently by Tani. We describe their method below.
Since the lack of unitarity in (3.208) is the source of the problem in taking the non-relativistic limit consistently, the main idea in the works of Foldy-Wouthuysen as well as Tani is to ensure that the relevant transformation used in going to the non-relativistic limit is manifestly unitary. Thus, for example, let us look at the free Dirac theory where we know that the Hamiltonian has the form (see (1.100) as well as (1.101))
Let us next look for a unitary transformation that will diagonalize the Hamiltonian in (3.209). In this case, such a transformation would also transform the spinor into two 2-component spinors that will be decoupled and we do not have to eliminate one in favor of the other (namely, avoid the problem with “large” and “small” spinors). Let us consider a transformation of the type
where the real scalar parameter of the transformation is a function of p and m,
From the properties of the gamma matrices in (1.83) or (1.91), we note that
and using this we can simplify and write
It follows now that
which leads to
Namely, the transformation (3.210) is indeed unitary.
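The unitarity argument can be verified numerically. The sketch below assumes the Dirac representation of α and β and parametrizes the transformation as U = cos φ + β α · n sin φ with n = p/|p| and φ a real angle. The angle φ may differ from the parameter θ of (3.210) by a normalization factor, and all helper names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def alpha_dot(v):
    """4x4 matrix alpha.v in the Dirac representation (sigma.v in the off-diagonal blocks)."""
    s = [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)] for i in range(2)]
    zero = [0, 0]
    return [zero + s[0], zero + s[1], s[0] + zero, s[1] + zero]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def lincomb(x, a, y, b):
    """Entrywise x*a + y*b."""
    return [[x * a[i][j] + y * b[i][j] for j in range(4)] for i in range(4)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

n = (1 / 3, 2 / 3, 2 / 3)  # unit vector along the momentum
phi = 0.7                  # arbitrary real angle

generator = matmul(BETA, alpha_dot(n))
u = lincomb(math.cos(phi), IDENTITY, math.sin(phi), generator)

# (beta alpha.n)^2 = -1 is what turns the exponential into cos + sin.
generator_error = max_abs_diff(matmul(generator, generator), lincomb(-1, IDENTITY, 0, IDENTITY))
# U U^dagger = 1 for any real phi.
unitarity_error = max_abs_diff(matmul(u, dagger(u)), IDENTITY)
print(generator_error, unitarity_error)
```

Both printed deviations should vanish to rounding accuracy, independently of the value chosen for φ.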
Under the unitary transformation (3.210), the free Dirac Hamiltonian (3.209) would transform as
So far, our discussion has been quite general and the parameter of the transformation, θ, has been arbitrary. However, if we want the transformation to diagonalize the Hamiltonian, it is clear from (3.216) that we can choose the parameter of transformation to satisfy
In this case, we have
which, from (3.216), leads to the diagonalized Hamiltonian
We see from (3.219) that the Hamiltonian is now diagonalized in the positive and the negative energy spaces. As a result, the two components of the transformed spinor
would be decoupled in the energy eigenvalue equation and we can without any difficulty restrict ourselves to the positive energy sector where the energy eigenvalue equation takes the form
For |p| ≪ m, this leads to the non-relativistic equation in (3.185) to the lowest order and it can be expanded to any order in |p|/m without any problem. We also note that with the parameter θ determined in (3.217), the unitary transformation in (3.213) takes the form
which has a natural non-relativistic expansion in powers of |p|/m. This analysis can be generalized even in the presence of interactions and the higher order terms in the interaction Hamiltonian are all well behaved without any problem of non-hermiticity.
There is a second limit of the Dirac equation, namely, the ultrarelativistic limit |p| ≫ m, for which the generalized Foldy-Wouthuysen transformation (3.213) is also quite useful. In this case, the transformation is known as the Cini-Touschek transformation and is obtained as follows. Let us note from (3.216) that if we choose the parameter of transformation to satisfy
this would lead to
As a result, in this case, the transformed Hamiltonian (3.216) will have the form
which has a natural expansion in powers of m/|p|. In fact, in this case, the unitary transformation (3.213) has the form
which clearly has a natural expansion in powers of m/|p| (ultrarelativistic expansion). Therefore, we can think of the Foldy-Wouthuysen transformation (3.222) as transforming away the α · p term in the Hamiltonian (3.209) while the Cini-Touschek transformation rotates away the mass term βm from the Hamiltonian (3.209).
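Both choices of the transformation parameter can be checked on explicit matrices. The sketch below assumes the Dirac representation, the parametrization U = cos φ + β α · n sin φ with n = p/|p|, and the transformed Hamiltonian U H U†. The angle 2φ plays the role of the parameter fixed by the two conditions above, up to the normalization used in the text, and the helper names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def alpha_dot(v):
    """4x4 matrix alpha.v in the Dirac representation."""
    s = [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)] for i in range(2)]
    zero = [0, 0]
    return [zero + s[0], zero + s[1], s[0] + zero, s[1] + zero]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def lincomb(x, a, y, b):
    """Entrywise x*a + y*b."""
    return [[x * a[i][j] + y * b[i][j] for j in range(4)] for i in range(4)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def rotated_hamiltonian(p, m, two_phi):
    """U H U^dagger for H = alpha.p + beta m and U = cos(phi) + beta alpha.n sin(phi)."""
    pmag = math.sqrt(sum(c * c for c in p))
    n = [c / pmag for c in p]
    generator = matmul(BETA, alpha_dot(n))
    u = lincomb(math.cos(two_phi / 2), IDENTITY, math.sin(two_phi / 2), generator)
    h = lincomb(1, alpha_dot(p), m, BETA)
    return matmul(matmul(u, h), dagger(u))

p, m = (0.3, -0.4, 1.2), 1.0
pmag = math.sqrt(sum(c * c for c in p))
energy = math.sqrt(pmag ** 2 + m ** 2)

# Foldy-Wouthuysen choice: tan(2 phi) = |p|/m removes the alpha.p term.
h_fw = rotated_hamiltonian(p, m, math.atan2(pmag, m))
# Cini-Touschek choice: tan(2 phi) = -m/|p| removes the beta m term.
h_ct = rotated_hamiltonian(p, m, math.atan2(-m, pmag))

fw_error = max_abs_diff(h_fw, lincomb(energy, BETA, 0, BETA))
ct_error = max_abs_diff(h_ct, lincomb(energy / pmag, alpha_dot(p), 0, BETA))
print(fw_error, ct_error)
```

In this parametrization the first choice yields β √(p² + m²) and the second yields α · p √(p² + m²)/|p|, which is the sense in which one transformation removes the α · p term and the other removes the mass term.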
3.11 Zitterbewegung
The presence of negative energy solutions for the Dirac equation leads to various interesting consequences. For example, let us consider the free Dirac Hamiltonian (1.100)
In the Heisenberg picture, where operators carry time dependence and states are time independent, the Heisenberg equations of motion take the forms (ħ = 1)
Here a dot denotes differentiation with respect to time.
The second equation in (3.228) shows that the momentum is a constant of motion as it should be for a free particle. The first equation, on the other hand, identifies α(t) with the velocity operator. Let us recall that, by definition,
where we have denoted the operator in the Schrödinger picture by
Furthermore, using (1.101) we conclude that
As a result, it follows that
In other words, even though the momentum of a free particle is a constant of motion, the velocity is not. Secondly, since the eigenvalues of α are ±1 (see, for example, (1.101)), it follows that the eigenvalues of α(t) are ±1 as well. This is easily understood from the fact that the eigenvalues of an operator do not change under a unitary transformation. More explicitly, we note that if
where λ denotes the eigenvalue of the velocity operator α, then, it follows that
where we have identified
Equation (3.234) shows that the eigenvalues of α(t) are the same as those of α (only the eigenfunctions are transformed) and, therefore, are ±1. This would seem to imply that the velocity of an electron is equal to the speed of light which is unacceptable even classically, since the electron is a massive particle.
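The statement that α(t) has the same eigenvalues as α, while still changing in time, can be illustrated on explicit matrices. The sketch below assumes the Dirac representation and a fixed momentum eigenvalue p, so that exp(−iHt) = cos(Et) − i sin(Et) H/E (which follows from H² = E²). The numerical values of p, m and t and the helper names are arbitrary illustrative choices.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def alpha_dot(v):
    """4x4 matrix alpha.v in the Dirac representation."""
    s = [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)] for i in range(2)]
    zero = [0, 0]
    return [zero + s[0], zero + s[1], s[0] + zero, s[1] + zero]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def lincomb(x, a, y, b):
    """Entrywise x*a + y*b."""
    return [[x * a[i][j] + y * b[i][j] for j in range(4)] for i in range(4)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def heisenberg(op, p, m, t):
    """exp(iHt) op exp(-iHt) for the free Hamiltonian H = alpha.p + beta m."""
    energy = math.sqrt(sum(c * c for c in p) + m * m)
    h = lincomb(1, alpha_dot(p), m, BETA)
    # exp(-iHt) = cos(Et) - i sin(Et) H/E, valid because H^2 = E^2.
    u = lincomb(math.cos(energy * t), IDENTITY, -1j * math.sin(energy * t) / energy, h)
    return matmul(matmul(dagger(u), op), u)

p, m, t = (0.3, -0.4, 1.2), 1.0, 0.5
alpha_1 = alpha_dot((1, 0, 0))
alpha_1_t = heisenberg(alpha_1, p, m, t)

# alpha_1(t)^2 = 1 and tr alpha_1(t) = 0 together force the eigenvalues +1, +1, -1, -1.
square_error = max_abs_diff(matmul(alpha_1_t, alpha_1_t), IDENTITY)
trace = sum(alpha_1_t[i][i] for i in range(4))
# The operator itself has nevertheless changed between time 0 and time t.
drift = max_abs_diff(alpha_1_t, alpha_1)
print(square_error, abs(trace), drift)
```

The first two numbers should vanish to rounding accuracy, while the third is of order one: the eigenvalues are unchanged, but the velocity operator is not a constant of motion.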
These peculiarities of the relativistic theory can be understood as follows. We note from Heisenberg’s equations of motion that the time derivative of the velocity operator is given by
Here we have used the relations (see (1.102))
as well as the fact that momentum commutes with the Hamiltonian (so that p commutes with α(t)). Let us note next that both p and H are constants of motion. Therefore, differentiating (3.236) with respect to time, we obtain
On the other hand, from (3.236) we have
Substituting this back into (3.238), we obtain
Furthermore, using this relation in (3.236), we finally determine
The first term in (3.241) is quite expected. For example, in an eigenstate of momentum it would have the form p/E, which is the true relativistic expression for velocity. We note that, for a relativistic particle, (c = 1)
so that
which is the first term in (3.241). It is the second term, however, which is unexpected. It represents an additional component to the velocity which is oscillating at a very high frequency (for an electron at rest, for example, the energy is ≈ 0.5 MeV corresponding to a frequency of the order of 10^21/sec) and gives a time dependence to α(t). Let us also note from (3.228) that since
integrating this over time, we obtain
where a is a constant. The first two terms in (3.245) are again what we will expect classically for uniform motion. However, the third term represents an additional contribution to the electron trajectory which is oscillatory with a very high frequency. Its occurrence is quite surprising, since there is no potential whatsoever in the problem. This quivering motion of the electron was first studied by Schrödinger and is known as Zitterbewegung (“jittery motion”).
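The derivation in this section can be summarized as follows. This is a sketch in standard notation, assuming ħ = c = 1 and the free Hamiltonian H = α · p + βm, with operator ordering kept explicit because α does not commute with H; a is the constant of integration mentioned above.

```latex
\begin{align}
\dot{\boldsymbol{\alpha}}(t) &= i\,[H, \boldsymbol{\alpha}(t)]
  = 2i\,\big(\mathbf{p} - \boldsymbol{\alpha}(t)\, H\big), \\
\ddot{\boldsymbol{\alpha}}(t) &= -2i\,\dot{\boldsymbol{\alpha}}(t)\, H
  \;\Longrightarrow\;
  \dot{\boldsymbol{\alpha}}(t) = \dot{\boldsymbol{\alpha}}(0)\, e^{-2iHt}, \\
\boldsymbol{\alpha}(t) &= \mathbf{p}\, H^{-1}
  + \big(\boldsymbol{\alpha}(0) - \mathbf{p}\, H^{-1}\big)\, e^{-2iHt}, \\
\mathbf{x}(t) &= \mathbf{a} + \mathbf{p}\, H^{-1}\, t
  + \frac{i}{2}\,\big(\boldsymbol{\alpha}(0) - \mathbf{p}\, H^{-1}\big)\, H^{-1} e^{-2iHt}.
\end{align}
```

The first line uses the anticommutator of α with H, which equals 2p. The oscillating terms have angular frequency 2E, which for an electron at rest is 2m ≈ 1 MeV, or about 1.6 × 10^21 per second.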
The unconventional operator relations in (3.241) and (3.245) can be shown in the Schrödinger picture to arise from the presence of negative energy solutions. In fact, it is easy to check that for a positive energy electron state
we have
This shows that even though the operator relations are unconventional, in a positive energy electron state
as we should expect from the Ehrenfest theorem. This shows that even though the eigenvalues of the operator α(t) are ±1 corresponding to motion with the speed of light, the physical velocity of the electron (observed velocity which is the expectation value of the operator in the positive energy electron state) is what we would expect. This also shows that the eigenstates of the velocity operator, α(t), which are not simultaneous eigenstates of the Hamiltonian must necessarily contain both positive and negative energy solutions as superposition and that the extra terms have non-zero value only in the transition between a positive energy and a negative energy state. (This makes clear that neglecting the negative energy solutions of the Dirac equation would lead to inconsistencies.)
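The expectation value of the velocity operator in a positive energy state can be checked directly. The sketch below assumes the Dirac representation and the standard unnormalized positive energy spinor u = ((E + m) χ, σ · p χ) for an arbitrary two component spinor χ; the numerical values and function names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(v, chi):
    """Apply the 2x2 matrix sigma.v to the two-component spinor chi."""
    return [
        sum(sum(v[k] * SIGMA[k][i][j] for k in range(3)) * chi[j] for j in range(2))
        for i in range(2)
    ]

def positive_energy_spinor(p, m, chi):
    """Unnormalized solution of (alpha.p + beta m) u = E u with E = +sqrt(p^2 + m^2)."""
    energy = math.sqrt(sum(c * c for c in p) + m * m)
    return [(energy + m) * c for c in chi] + sigma_dot(p, chi)

def alpha_expectation(u, axis):
    """u^dagger alpha_axis u / u^dagger u, with alpha = [[0, sigma], [sigma, 0]]."""
    unit = [1 if k == axis else 0 for k in range(3)]
    upper, lower = u[:2], u[2:]
    alpha_u = sigma_dot(unit, lower) + sigma_dot(unit, upper)
    numerator = sum(complex(a).conjugate() * b for a, b in zip(u, alpha_u))
    denominator = sum(abs(a) ** 2 for a in u)
    return (numerator / denominator).real

p, m = (0.3, -0.4, 1.2), 1.0
chi = [0.6, 0.8j]  # arbitrary spin orientation
energy = math.sqrt(sum(c * c for c in p) + m * m)

u = positive_energy_spinor(p, m, chi)
velocity = [alpha_expectation(u, k) for k in range(3)]
expected = [c / energy for c in p]
speed = math.sqrt(sum(v * v for v in velocity))
print(velocity)
print(expected)
print(speed)
```

The two printed vectors should agree, and the resulting speed |p|/E stays below 1 even though each component of α has eigenvalues ±1.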
3.12 References
1. J. D. Bjorken and S. Drell, Relativistic Quantum Mechanics, McGraw-Hill, New York (1964).
2. M. Cini and B. Touschek, Nuovo Cimento 7, 422 (1958).
3. L. L. Foldy and S. A. Wouthuysen, Physical Review 78, 29 (1950).
4. C. Itzykson and J.-B. Zuber, Quantum Field Theory, McGraw-Hill, New York (1980).
5. S. Okubo, Progress of Theoretical Physics 12, 102 (1954); ibid. 12, 603 (1954).
6. L. I. Schiff, Quantum Mechanics, McGraw-Hill, New York (1968).
7. S. Tani, Progress of Theoretical Physics 6, 267 (1951).