Lectures on Quantum Field Theory - Ashok Das
CHAPTER 1
Relativistic equations
1.1 Introduction
As we know, in single particle, non-relativistic quantum mechanics, we start with the Hamiltonian description of the corresponding classical, non-relativistic physical system and promote each of the observables to a Hermitian operator. The time evolution of the quantum mechanical system (state), in this case, is given by the time dependent Schrödinger equation which has the form
$$
i\hbar \frac{\partial \psi(\mathbf{x}, t)}{\partial t} = H \psi(\mathbf{x}, t). \tag{1.1}
$$
Here ψ(x, t) represents the wave function of the system which corresponds to the probability amplitude for finding the particle at the coordinate x at a given time t and the Hamiltonian, H, has the generic form
$$
H = \frac{\mathbf{p}^{2}}{2m} + V(\mathbf{x}), \tag{1.2}
$$
with p denoting the momentum of the particle and V(x) representing the potential through which the particle moves. (Throughout the book we will use a bold symbol to represent a three dimensional vector.)
This formalism is clearly non-relativistic (non-covariant) which can be easily seen by noting that, even for a free particle, the dynamical equation (1.1) takes the form
$$
i\hbar \frac{\partial \psi}{\partial t} = \frac{\mathbf{p}^{2}}{2m}\, \psi. \tag{1.3}
$$
In the coordinate basis, the momentum operator has the form
$$
\mathbf{p} \rightarrow -i\hbar \boldsymbol{\nabla}, \tag{1.4}
$$
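so that the time dependent Schrödinger equation, in this case, takes the form
$$
i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^{2}}{2m} \boldsymbol{\nabla}^{2} \psi. \tag{1.5}
$$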
This equation is linear in the time derivative while it is quadratic in the space derivatives. Therefore, space and time are not treated on an equal footing in this case and, consequently, the equation cannot have the same form (covariant) in different Lorentz frames. A relativistic equation, on the other hand, must treat space and time coordinates on an equal footing and remain form invariant in all inertial frames (Lorentz frames). Let us also recall that, even for a simple fundamental system such as the Hydrogen atom, the ground state electron is fairly relativistic ($v/c$ for the ground state electron is of the order of the fine structure constant). Consequently, there is a need to generalize the non-relativistic quantum mechanical description to relativistic systems. In this chapter, we will study how we can systematically develop a quantum mechanical description of a single relativistic particle and the difficulties associated with such a description.
1.2 Notations
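Before proceeding any further, let us fix our notations. We note that in the three dimensional Euclidean space, which we are all familiar with, a vector is labelled uniquely by its three components. (We denote three dimensional vectors in boldface.) Thus,
$$
\mathbf{x} = (x_{1}, x_{2}, x_{3}), \qquad \mathbf{J} = (J_{1}, J_{2}, J_{3}), \qquad \mathbf{A} = (A_{1}, A_{2}, A_{3}), \tag{1.6}
$$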
where x and J represent respectively the position and the angular momentum vectors of a particle (system) while A stands for any arbitrary vector. In such a space, as we know, the scalar product of any two arbitrary vectors is defined to be
$$
\mathbf{A} \cdot \mathbf{B} = A_{i} B_{i} = \delta_{ij} A_{i} B_{j}, \qquad i, j = 1, 2, 3, \tag{1.7}
$$
where repeated indices are assumed to be summed. The scalar product of two vectors is invariant under rotations of the three dimensional space which is the maximal symmetry group of the Euclidean space that leaves the origin invariant. This also allows us to define the length (squared) of a vector simply as
$$
\mathbf{A}^{2} = \mathbf{A} \cdot \mathbf{A} = A_{i} A_{i} = \delta_{ij} A_{i} A_{j}. \tag{1.8}
$$
The Kronecker delta, $\delta_{ij}$, in this case, represents the metric of the Euclidean space and is trivial (in the sense that all the nonzero components are positive and simply unity). Consequently, it does not matter whether we write the indices “up” or “down”. Let us note from the definition of the length of a vector in Euclidean space that, for any nontrivial vector, it is necessarily positive definite, namely,
$$
\mathbf{A}^{2} = \mathbf{A} \cdot \mathbf{A} \geq 0, \tag{1.9}
$$
with the equality holding only for the trivial vector $\mathbf{A} = 0$.
When we treat space and time on an equal footing and enlarge our three dimensional Euclidean manifold to the four dimensional space-time manifold, we can again define vectors in this manifold. However, these would now consist of four components. Namely, any point in this manifold will be specified uniquely by four coordinates and, consequently, any vector would also have four components. However, unlike the case of the Euclidean space, there are now two distinct four vectors that we can define on this manifold, namely, (µ = 0, 1, 2, 3 and we are being a little sloppy in representing the four vector by what may seem like its component)
$$
x^{\mu} = (ct, \mathbf{x}), \qquad x_{\mu} = (ct, -\mathbf{x}). \tag{1.10}
$$
Here c represents the speed of light (necessary to give the same dimension to all the components) and we note that the two four vectors simply represent the two distinct possible ways space and time components can be embedded into the four vector. On a more fundamental level, the two four vectors have distinct transformation properties under Lorentz transformations (in fact, one transforms inversely with respect to the other) and are known respectively as contravariant and covariant vectors.
The contravariant and the covariant vectors are related to each other through the metric tensor of the four dimensional manifold, commonly known as the Minkowski space, namely,
$$
x_{\mu} = \eta_{\mu\nu} x^{\nu}, \qquad x^{\mu} = \eta^{\mu\nu} x_{\nu}. \tag{1.11}
$$
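From the forms of the contravariant and the covariant vectors in (1.10) as well as using (1.11), we can immediately read out the components of the metric tensors for the four dimensional Minkowski space which are diagonal with the signature (+, −, −, −). Namely, we can write them in the matrix form as
$$
\eta^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{1.12}
$$
$$
\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{1.13}
$$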
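The contravariant metric tensor, $\eta^{\mu\nu}$, and the covariant metric tensor, $\eta_{\mu\nu}$, are inverses of each other, since they satisfy
$$
\eta^{\mu\lambda} \eta_{\lambda\nu} = \delta^{\mu}_{\ \nu}. \tag{1.14}
$$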
Furthermore, each is symmetric as they are expected to be, namely,
$$
\eta^{\mu\nu} = \eta^{\nu\mu}, \qquad \eta_{\mu\nu} = \eta_{\nu\mu}. \tag{1.15}
$$
This particular choice of the metric is conventionally known as the Bjorken-Drell metric and this is what we will be using throughout these lectures. Different authors, however, use different metric conventions and you should be careful in reading the literature. (As is clear from the above discussion, the nonuniqueness in the choice of the metric tensors reflects the nonuniqueness of the embedding of space and time components into a four vector. Physical results, however, are independent of the choice of a metric.)
Given two arbitrary four vectors
$$
A^{\mu} = (A^{0}, \mathbf{A}), \qquad B^{\mu} = (B^{0}, \mathbf{B}), \tag{1.16}
$$
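we can define an invariant scalar product of the two vectors as
$$
A \cdot B = A^{\mu} B_{\mu} = A_{\mu} B^{\mu} = \eta_{\mu\nu} A^{\mu} B^{\nu} = \eta^{\mu\nu} A_{\mu} B_{\nu} = A^{0} B^{0} - \mathbf{A} \cdot \mathbf{B}. \tag{1.17}
$$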
Since the contravariant and the covariant vectors transform in an inverse manner, such a product is easily seen to be invariant under Lorentz transformations. This is the generalization of the scalar product of the three dimensional Euclidean space (1.7) to the four dimensional Minkowski space and is invariant under Lorentz transformations which are the analogs of rotations in Minkowski space. In fact, any product of Lorentz tensors defines a scalar if all the Lorentz indices are contracted, namely, if there is no free Lorentz index left in the resulting product. (Two Lorentz indices are said to be contracted if a contravariant and a covariant index are summed over all possible values.)
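The invariance of the scalar product can also be checked numerically. The following is a minimal sketch, assuming c = 1, the (+, −, −, −) metric chosen above and a boost along the x-axis; the function names (`minkowski_dot`, `boost_x`, `transform`) and the sample components are purely illustrative and are not part of the text.

```python
import math

# Minkowski metric with signature (+, -, -, -), as chosen in the text (c = 1).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]


def minkowski_dot(a, b):
    """Return eta_{mu nu} a^mu b^nu for two contravariant four vectors."""
    return sum(ETA[m][n] * a[m] * b[n] for m in range(4) for n in range(4))


def boost_x(v):
    """Lorentz boost along the x-axis with velocity v (|v| < 1 in units c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0, 0],
            [-gamma * v, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def transform(lam, a):
    """Apply a Lorentz matrix to the components of a contravariant four vector."""
    return [sum(lam[m][n] * a[n] for n in range(4)) for m in range(4)]


# Arbitrary sample components (A^0, A) and (B^0, B), chosen only for illustration.
A = [2.0, 0.3, -1.0, 0.5]
B = [1.5, -0.7, 0.2, 1.1]
LAM = boost_x(0.6)

before = minkowski_dot(A, B)
after = minkowski_dot(transform(LAM, A), transform(LAM, B))
print(before, after)  # the two numbers agree up to rounding
```

The individual components of A and B change under the boost, while the contracted combination does not, which is what is meant by a Lorentz scalar.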
Given this, we note that the length of a (four) vector in Minkowski space can be determined to have the form (compare with (1.8))
$$
A^{2} = A \cdot A = A^{\mu} A_{\mu} = (A^{0})^{2} - \mathbf{A}^{2}. \tag{1.18}
$$
Unlike the Euclidean space, however, here we see that the length of a vector need not always be positive semi-definite (recall (1.9)). In fact, if we look at the Minkowski space itself, we find that
$$
x^{2} = x^{\mu} x_{\mu} = c^{2} t^{2} - \mathbf{x}^{2}. \tag{1.19}
$$
This is the invariant length (of any point from the origin) in this space. The invariant length between two points infinitesimally close to each other follows from this to be
$$
ds^{2} = c^{2} d\tau^{2} = c^{2} dt^{2} - d\mathbf{x}^{2}, \tag{1.20}
$$
where τ is known as the proper time.
For coordinates which satisfy (see (1.19), we will set c = 1 from now on for simplicity)
$$
x^{2} = t^{2} - \mathbf{x}^{2} > 0, \tag{1.21}
$$
we say that the region of space-time is time-like for obvious reasons. On the other hand, for coordinates which satisfy
$$
x^{2} = t^{2} - \mathbf{x}^{2} < 0, \tag{1.22}
$$
the region of space-time is known as space-like. The boundary of the two regions, namely, the region for which
$$
x^{2} = t^{2} - \mathbf{x}^{2} = 0, \tag{1.23}
$$
defines trajectories for light-like particles and is, consequently, known as the light-like region. (Light-like vectors, for which the invariant length vanishes, are nontrivial unlike the case of the Euclidean space in (1.9).)
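As a small illustration of this classification, the sign of the invariant length can be evaluated directly. This is only a sketch with c = 1; the function name `interval_type` and the tolerance parameter are assumptions introduced here and do not appear in the text.

```python
def interval_type(t, x, y, z, tol=1e-12):
    """Classify the point (t, x, y, z) by the sign of x^2 = t^2 - |x|^2 (c = 1).

    The tolerance only guards against floating point rounding when the
    invariant length is meant to vanish.
    """
    s2 = t * t - (x * x + y * y + z * z)
    if s2 > tol:
        return "time-like"
    if s2 < -tol:
        return "space-like"
    return "light-like"


print(interval_type(2.0, 1.0, 0.0, 0.0))  # time-like
print(interval_type(1.0, 2.0, 0.0, 0.0))  # space-like
print(interval_type(5.0, 3.0, 4.0, 0.0))  # light-like
```

Note that the third point is a nontrivial vector whose invariant length vanishes, which has no analog in the Euclidean case.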
Figure 1.1: Different invariant regions of Minkowski space.
Thus, we see that, unlike the Euclidean space, the Minkowski space-time manifold separates into four invariant cones (namely, regions which do not mix under Lorentz transformations), which in a two dimensional projection has the form of wedges shown in Fig. 1.1. The different invariant cones (wedges) are known as
All physical processes are assumed to take place in the future light cone or the forward light cone defined by
$$
t > 0, \qquad x^{2} \geq 0. \tag{1.25}
$$
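Given the contravariant and the covariant coordinates, we can define the contragradient and the cogradient respectively as (c = 1)
$$
\partial^{\mu} = \frac{\partial}{\partial x_{\mu}} = \left(\frac{\partial}{\partial t}, -\boldsymbol{\nabla}\right), \qquad \partial_{\mu} = \frac{\partial}{\partial x^{\mu}} = \left(\frac{\partial}{\partial t}, \boldsymbol{\nabla}\right). \tag{1.26}
$$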
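From these, we can construct the Lorentz invariant quadratic operator
$$
\Box = \partial^{\mu} \partial_{\mu} = \frac{\partial^{2}}{\partial t^{2}} - \boldsymbol{\nabla}^{2}, \tag{1.27}
$$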
which is known as the D’Alembertian. It is the generalization of the Laplacian to the four dimensional Minkowski space.
Let us note next that energy and momentum also define four vectors in this case. (Namely, they transform like four vectors under Lorentz transformations.) Thus, we can write (remember that c = 1, otherwise, we have to write $E/c$ in place of $E$)
$$
p^{\mu} = (E, \mathbf{p}), \qquad p_{\mu} = (E, -\mathbf{p}). \tag{1.28}
$$
Given the energy-momentum four vectors, we can construct the Lorentz scalar
$$
p^{2} = p^{\mu} p_{\mu} = E^{2} - \mathbf{p}^{2}. \tag{1.29}
$$
The Einstein relation for a free particle (remember c = 1)
$$
E^{2} = \mathbf{p}^{2} + m^{2}, \tag{1.30}
$$
where m represents the rest mass of the particle, can now be seen as the Lorentz invariant condition
$$
p^{2} = E^{2} - \mathbf{p}^{2} = m^{2}. \tag{1.31}
$$
In other words, in this space, the energy and the momentum of a free particle must lie on a hyperbola satisfying the relation (1.31).
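We already know that the coordinate representations of the energy and the momentum operators take the forms
$$
E \rightarrow i\hbar \frac{\partial}{\partial t}, \qquad \mathbf{p} \rightarrow -i\hbar \boldsymbol{\nabla}. \tag{1.32}
$$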
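We can combine these to write the coordinate representation for the energy-momentum four vector operator as
$$
p^{\mu} \rightarrow i\hbar\, \partial^{\mu} = i\hbar \frac{\partial}{\partial x_{\mu}}, \qquad p_{\mu} \rightarrow i\hbar\, \partial_{\mu} = i\hbar \frac{\partial}{\partial x^{\mu}}. \tag{1.33}
$$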
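Finally, let us note that in the four dimensional space-time, we can construct two totally anti-symmetric fourth rank tensors $\epsilon^{\mu\nu\lambda\rho}$, $\epsilon_{\mu\nu\lambda\rho}$, the four dimensional contravariant and covariant Levi-Civita tensors respectively. We will choose the normalization $\epsilon^{0123} = 1 = -\epsilon_{0123}$ so that
$$
\epsilon^{0ijk} = -\epsilon_{0ijk} = \epsilon_{ijk}, \tag{1.34}
$$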
where ϵijk denotes the three dimensional Levi-Civita tensor with ϵ123 = 1. An anti-symmetric tensor such as ϵijk is then understood to denote
and so on. This completes the review of all the essential basic notation that we will be using in this book. We will introduce new notations as they arise in the context of our discussions.
1.3 Klein-Gordon equation
With all these basics, we are now ready to write down the simplest of the relativistic equations. We recall that in the case of a non-relativistic particle, we start with the non-relativistic energy-momentum relation
$$
E = \frac{\mathbf{p}^{2}}{2m}, \tag{1.36}
$$
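and promote the dynamical variables (observables) to Hermitian operators to obtain the time-dependent Schrödinger equation (see (1.1))
$$
i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^{2}}{2m} \boldsymbol{\nabla}^{2} \psi. \tag{1.37}
$$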
Let us consider the simplest of relativistic systems, namely, a relativistic free particle of mass m. In this case, we have seen that the energy-momentum relation is none other than the Einstein relation (1.30), namely,
$$
E^{2} = \mathbf{p}^{2} + m^{2}. \tag{1.38}
$$
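Thus, as before, promoting these to operators, we obtain the simplest relativistic quantum mechanical equation to be (see (1.33))
$$
\left(i\hbar \frac{\partial}{\partial t}\right)^{2} \phi = \left((-i\hbar \boldsymbol{\nabla})^{2} + m^{2}\right) \phi, \quad \text{or,} \quad -\hbar^{2} \frac{\partial^{2} \phi}{\partial t^{2}} = \left(-\hbar^{2} \boldsymbol{\nabla}^{2} + m^{2}\right) \phi. \tag{1.39}
$$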
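Setting ħ = 1 from now on for simplicity, the equation above takes the form
$$
\left(\frac{\partial^{2}}{\partial t^{2}} - \boldsymbol{\nabla}^{2} + m^{2}\right) \phi = 0, \quad \text{or,} \quad \left(\Box + m^{2}\right) \phi = \left(\partial_{\mu} \partial^{\mu} + m^{2}\right) \phi = 0. \tag{1.40}
$$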
Since the operator in the parenthesis is a Lorentz scalar and since we assume the quantum mechanical wave function, ϕ(x, t), to be a scalar function, this equation is invariant under Lorentz transformations.
This equation, (1.40), is known as the Klein-Gordon equation and, for m = 0, or when the rest mass of the particle vanishes, it reduces to the wave equation (recall Maxwell’s equations). Like the wave equation, the Klein-Gordon equation also has plane wave solutions which are characteristic of free particle solutions. In fact, the functions
$$
e^{\mp i k \cdot x} = e^{\mp i k_{\mu} x^{\mu}} = e^{\mp i (k^{0} t - \mathbf{k} \cdot \mathbf{x})}, \tag{1.41}
$$
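with $k^{\mu} = (k^{0}, \mathbf{k})$ are eigenfunctions of the energy-momentum operator, namely, using (1.33) (remember that ħ = 1) we obtain
$$
p^{\mu} e^{\mp i k \cdot x} = i \partial^{\mu} e^{\mp i k \cdot x} = i \frac{\partial}{\partial x_{\mu}} e^{\mp i k \cdot x} = \pm k^{\mu} e^{\mp i k \cdot x}, \tag{1.42}
$$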
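so that $\pm k^{\mu}$ are the eigenvalues of the energy-momentum operator. (In fact, the eigenvalues should be $\pm \hbar k^{\mu}$, but we have set ħ = 1.) This shows that the plane waves define a solution of the Klein-Gordon equation (1.39) or (1.40) provided
$$
k^{2} - m^{2} = (k^{0})^{2} - \mathbf{k}^{2} - m^{2} = 0, \quad \text{or,} \quad k^{0} = \pm E = \pm \sqrt{\mathbf{k}^{2} + m^{2}}. \tag{1.43}
$$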
Thus, we see the first peculiarity of the Klein-Gordon equation (which is a relativistic equation), namely, that it allows for both positive and negative energy solutions. This basically arises from the fact that, for a relativistic particle (even a free one), the energy-momentum relation is given by the Einstein relation which is a quadratic relation in E, as opposed to the case of a non-relativistic particle, where the energy-momentum relation is linear in E. If we accept the Klein-Gordon equation as describing a free, relativistic, quantum mechanical particle of mass m, then, we will see shortly that the presence of the negative energy solutions would render the theory inconsistent.
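To proceed further, let us note that the Klein-Gordon equation and its complex conjugate (remember that a quantum mechanical wave function is, in general, complex), namely,
$$
\left(\Box + m^{2}\right) \phi = 0, \qquad \left(\Box + m^{2}\right) \phi^{*} = 0, \tag{1.44}
$$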
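would imply
$$
\phi^{*} \Box \phi - \phi\, \Box \phi^{*} = 0, \quad \text{or,} \quad \partial_{\mu} \left(\phi^{*} \partial^{\mu} \phi - \phi\, \partial^{\mu} \phi^{*}\right) = 0. \tag{1.45}
$$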
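Defining the probability current density four vector as
$$
J^{\mu} = (J^{0}, \mathbf{J}) = (\rho, \mathbf{J}), \tag{1.46}
$$
where
$$
\mathbf{J} = \frac{1}{2im} \left(\phi^{*} \boldsymbol{\nabla} \phi - \phi \boldsymbol{\nabla} \phi^{*}\right), \qquad \rho = \frac{i}{2m} \left(\phi^{*} \frac{\partial \phi}{\partial t} - \phi \frac{\partial \phi^{*}}{\partial t}\right), \tag{1.47}
$$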
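we note that equation (1.45) can be written as a continuity equation for the probability current, namely,
$$
\partial_{\mu} J^{\mu} = \frac{\partial \rho}{\partial t} + \boldsymbol{\nabla} \cdot \mathbf{J} = 0. \tag{1.48}
$$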
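The probability current density,
$$
\mathbf{J} = \frac{1}{2im} \left(\phi^{*} \boldsymbol{\nabla} \phi - \phi \boldsymbol{\nabla} \phi^{*}\right), \tag{1.49}
$$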
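of course, has the same form as in non-relativistic quantum mechanics. However, we note that the form of the probability density (which results from the requirement of covariance)
$$
\rho = \frac{i}{2m} \left(\phi^{*} \frac{\partial \phi}{\partial t} - \phi \frac{\partial \phi^{*}}{\partial t}\right), \tag{1.50}
$$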
is quite different from that in non-relativistic quantum mechanics (namely, $\rho = \phi^{*} \phi$) and it is here that the problem of the negative energy states shows up. For example, even for the simplest of solutions, namely, plane waves of the form
$$
\phi(x) = e^{-i k \cdot x} = e^{-i (k^{0} t - \mathbf{k} \cdot \mathbf{x})}, \tag{1.51}
$$
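we obtain
$$
\rho = \frac{i}{2m} \left(-i k^{0} - i k^{0}\right) \phi^{*} \phi = \frac{k^{0}}{m} = \pm \frac{E}{m}. \tag{1.52}
$$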
Since energy can take both positive and negative values, it follows that ρ cannot truly represent the probability density which, by definition, has to be positive semi-definite. It is worth noting here that this problem really arises because the Klein-Gordon equation, unlike the time dependent Schrödinger equation, is second order in time derivatives. This has the consequence that the probability density involves a first order time derivative and that is how the problem of the negative energy states enters. (Note that if the equation is second order in the space derivatives, then covariance would require that it be second order in time derivative as well. This would, in turn, lead to the difficulty with the probability density being positive semi-definite.) One can, of course, ask whether we can restrict ourselves only to positive energy solutions in order to avoid the difficulty with the interpretation of ρ. Classically, we can do this. However, quantum mechanically, we cannot arbitrarily impose this for a variety of reasons. The simplest way to see this is to note that the positive energy solutions alone do not define a complete set of (basis) states in the Hilbert space and, consequently, even if we restrict the states to be of positive energy to begin with, negative energy states may be generated through quantum mechanical corrections. It is for these reasons that the Klein-Gordon equation was abandoned as a quantum mechanical equation for a relativistic single particle. However, as we will see later, this equation is quite meaningful as a relativistic field equation.
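The sign problem of the density can be made concrete with a short numerical sketch. It evaluates ρ for a plane wave with either sign of k⁰, restricted to one space dimension for brevity and with ħ = c = 1; the function name `kg_density` and the sample values of m, k, t and x are assumptions made only for this illustration.

```python
import cmath
import math


def kg_density(k0, k, m, t=0.3, x=0.7):
    """Klein-Gordon density rho = (i/2m)(phi* d_t phi - phi d_t phi*)
    for the plane wave phi = exp(-i (k0 t - k x)) in 1+1 dimensions."""
    phi = cmath.exp(-1j * (k0 * t - k * x))
    dphi_dt = -1j * k0 * phi  # exact time derivative of the plane wave
    rho = (1j / (2 * m)) * (phi.conjugate() * dphi_dt - phi * dphi_dt.conjugate())
    return rho.real  # the imaginary part vanishes identically


m, k = 1.0, 0.75
E = math.sqrt(k * k + m * m)  # both k0 = +E and k0 = -E solve the equation

print(kg_density(+E, k, m))  # equals +E/m
print(kg_density(-E, k, m))  # equals -E/m, a negative "probability density"
```

The result does not depend on the point (t, x) at which it is evaluated, since a plane wave has constant modulus; only the sign of k⁰ matters.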
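1.3.1 Klein paradox. Let us consider a charged scalar particle described by the Klein-Gordon equation (1.40) in an external electromagnetic field. We recall that the coupling of a charged particle to an electromagnetic field is given by the minimal coupling
$$
p_{\mu} \rightarrow p_{\mu} - e A_{\mu}, \quad \text{or,} \quad \partial_{\mu} \rightarrow \partial_{\mu} + i e A_{\mu}, \tag{1.53}
$$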
where we have used the coordinate representation for the momentum as in (1.33) and $A_{\mu}$ denotes the vector potential associated with the electromagnetic field. In this case, therefore, the scalar particle will satisfy the minimally coupled Klein-Gordon equation (e > 0, namely, the particles are chosen to carry positive charge)
$$
\left((\partial_{\mu} + i e A_{\mu})(\partial^{\mu} + i e A^{\mu}) + m^{2}\right) \phi(x) = 0. \tag{1.54}
$$
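As a result, the probability current density in (1.46) can be determined to have the form
$$
J^{\mu} = \frac{i}{2m} \left(\phi^{*} (\partial^{\mu} + i e A^{\mu}) \phi - \left((\partial^{\mu} - i e A^{\mu}) \phi^{*}\right) \phi\right) = \frac{i}{2m}\, \phi^{*} \overleftrightarrow{\partial^{\mu}} \phi - \frac{e}{m} A^{\mu} \phi^{*} \phi, \tag{1.55}
$$
where we have defined
$$
\phi^{*} \overleftrightarrow{\partial^{\mu}} \phi = \phi^{*} (\partial^{\mu} \phi) - (\partial^{\mu} \phi^{*}) \phi. \tag{1.56}
$$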
With this general description, let us consider the scattering of a charged scalar (Klein-Gordon) particle with positive energy from a constant electrostatic potential. In this case, therefore, we have
$$
\mathbf{A} = 0, \qquad A^{0} = \Phi. \tag{1.57}
$$
For simplicity, let us assume the constant electrostatic potential to be of the form
$$
\Phi(z) = \begin{cases} 0, & z < 0, \\ \Phi_{0}, & z > 0, \end{cases} \tag{1.58}
$$
and we assume that the particle is incident on the potential along the z-axis as shown in Fig. 1.2.
Figure 1.2: Klein-Gordon particle scattering from a constant electrostatic potential.
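The dynamical equations will now be different in the two regions, z < 0 (region I) and z > 0 (region II), and have the forms (see (1.54))
$$
\left(\Box + m^{2}\right) \phi_{I} = 0, \quad z < 0, \qquad \left(\Box + 2 i e \Phi_{0} \frac{\partial}{\partial t} - e^{2} \Phi_{0}^{2} + m^{2}\right) \phi_{II} = 0, \quad z > 0. \tag{1.59}
$$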
In region I, there will be an incident as well as a reflected (plane) wave so that we can write (remember that the incident particle has positive energy)
$$
\phi_{I}(t, z) = e^{-i E t} \left(e^{i p z} + A\, e^{-i p z}\right), \quad z < 0, \tag{1.60}
$$
while in region II, we only expect a transmitted wave (traveling to the right) of the form
$$
\phi_{II}(t, z) = B\, e^{-i E t}\, e^{i p' z}, \quad z > 0, \tag{1.61}
$$
where A, B are related respectively to reflection and transmission coefficients. We note here that the continuity of the wave function at the boundary z = 0 requires that the energy be the same in the two regions.
For the wave functions in (1.60) and (1.61) to satisfy the respective equations in (1.59), we must have
$$
E = \sqrt{p^{2} + m^{2}}, \qquad p' = \pm \sqrt{(E - e \Phi_{0})^{2} - m^{2}}. \tag{1.62}
$$
Here we have used the fact that the energy of the incident particle is positive and, therefore, the square root in the first equation in (1.62) is with a positive sign. However, the sign of the square root in the second relation remains to be fixed.
Let us note from the second relation in (1.62) that p′ is real for both E − eΦ0 > m (weak potential) and for E − eΦ0 < −m (strong potential). However, for a potential of intermediate strength satisfying −m < E − eΦ0 < m, we note that p′ is purely imaginary. Thus, the behavior of the transmitted wave depends on the strength of the potential. As a result, in this second case, we must have
$$
p' = i \sqrt{m^{2} - (E - e \Phi_{0})^{2}}, \tag{1.63}
$$
in order that the wave function is damped in region II. To determine the sign of the square root in the cases when p′ is real, let us note from the second relation in (1.62) that the group velocity of the transmitted wave is given by
$$
v_{\rm group} = \frac{\partial E}{\partial p'} = \frac{p'}{E - e \Phi_{0}}. \tag{1.64}
$$
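Since we expect the transmitted wave to be travelling to the right, we determine from (1.64) that
$$
p' > 0 \ \ \text{for} \ \ E - e \Phi_{0} > 0, \qquad p' < 0 \ \ \text{for} \ \ E - e \Phi_{0} < 0. \tag{1.65}
$$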
This, therefore, fixes the sign of the square root in the second relation in (1.62) for various cases.
Matching the wave functions in (1.60) and (1.61) and their first derivatives at the boundary z = 0, we determine
$$
1 + A = B, \qquad p\,(1 - A) = p' B, \tag{1.66}
$$
so that we determine
$$
A = \frac{p - p'}{p + p'}, \qquad B = \frac{2p}{p + p'}. \tag{1.67}
$$
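Let us next determine the probability current densities associated with the different waves. From (1.55) as well as the form of the potential in (1.58) we obtain
$$
\mathbf{J}_{\rm inc} = \frac{p}{m}\, \hat{\mathbf{z}}, \qquad \mathbf{J}_{\rm refl} = -\frac{p}{m} \frac{(p - p')(p - p'^{*})}{(p + p')(p + p'^{*})}\, \hat{\mathbf{z}}, \qquad \mathbf{J}_{\rm trans} = \frac{p' + p'^{*}}{2m} \frac{4 p^{2}}{(p + p')(p + p'^{*})}\, \hat{\mathbf{z}}, \tag{1.68}
$$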
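where we have used (1.67) as well as the fact that, while p is real and positive, p′ can be positive or negative or even imaginary depending on the strength of the potential (see (1.63) and (1.65)). We can now determine the reflection and the transmission coefficients simply as
$$
R = -\frac{J_{\rm refl}}{J_{\rm inc}} = \frac{(p - p')(p - p'^{*})}{(p + p')(p + p'^{*})}, \qquad T = \frac{J_{\rm trans}}{J_{\rm inc}} = \frac{2 p\, (p' + p'^{*})}{(p + p')(p + p'^{*})}. \tag{1.69}
$$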
We see from the reflection and the transmission coefficients that
$$
R + T = 1, \tag{1.70}
$$
so that the reflection and the transmission coefficients satisfy unitarity for all strengths of the potential.
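However, let us now analyze the different cases of the potential strengths individually. First, for the case, E − eΦ0 > m (weak potential), we see that p′ is real and positive and we have
$$
R = \left(\frac{p - p'}{p + p'}\right)^{2} < 1, \qquad T = \frac{4 p p'}{(p + p')^{2}} > 0, \tag{1.71}
$$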
which corresponds to the normal scenario in scattering. For the case of an intermediate potential strength, −m < E − eΦ0 < m, we note from (1.63) that p′ is purely imaginary in this case. As a result, it follows from (1.69) that
$$
R = 1, \qquad T = 0, \tag{1.72}
$$
so that the incident beam is totally reflected and there is no transmission in this case. The third case of the strong potential, E − eΦ0 < −m, is the most interesting. In this case, we note from (1.65) that p′ is real, but negative. As a result, from (1.69) we have
$$
R = \left(\frac{p - p'}{p + p'}\right)^{2} > 1, \qquad T = \frac{4 p p'}{(p + p')^{2}} < 0. \tag{1.73}
$$
Namely, even though unitarity is not violated, in this case the transmission coefficient is negative and the reflection coefficient exceeds unity. This is known as the Klein paradox and it contradicts our intuition from the one particle scatterings studied in non-relativistic quantum mechanics. On the other hand, if we go beyond the one particle description and assume that a sufficiently strong electrostatic potential can produce particle-antiparticle pairs, there is no paradox. For example, the antiparticles are attracted towards the barrier leading to a negative charged current moving to the right (remember that the particles are chosen to carry positive charge so that antiparticles carry negative charge) which explains the negative transmission coefficient. On the other hand, the particles are reflected from the barrier and add to the totally reflected incident particles (which is already seen for intermediate strength potentials) to give a reflection coefficient that exceeds unity.
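The three regimes can be explored numerically. The sketch below assumes the standard step-potential results A = (p − p′)/(p + p′), B = 2p/(p + p′), with R and T defined as the ratios of the reflected and transmitted currents to the incident current; the function names and the sample values (m = 1, E = 2, with ħ = c = 1) are illustrative choices and not taken from the text.

```python
import math


def transmitted_momentum(E, m, e_phi0):
    """Return p' in region II for incident energy E > m and step height e*Phi0.

    Imaginary p' (damped wave) in the intermediate regime; otherwise the sign
    of the real root is fixed by requiring a positive group velocity
    p' / (E - e*Phi0), i.e. a wave travelling to the right.
    """
    w = E - e_phi0
    disc = w * w - m * m
    if disc < 0:
        return 1j * math.sqrt(-disc)
    root = math.sqrt(disc)
    return root if w > 0 else -root


def reflection_transmission(E, m, e_phi0):
    """Return (R, T) from the current ratios; assumes p + p' != 0."""
    p = math.sqrt(E * E - m * m)
    pp = complex(transmitted_momentum(E, m, e_phi0))
    denom = ((p + pp) * (p + pp.conjugate())).real
    R = ((p - pp) * (p - pp.conjugate())).real / denom
    T = (2 * p * (pp + pp.conjugate())).real / denom
    return R, T


for label, e_phi0 in [("weak", 0.5), ("intermediate", 2.0), ("strong", 5.0)]:
    R, T = reflection_transmission(2.0, 1.0, e_phi0)
    print(label, R, T, R + T)
```

In each case R + T equals unity up to rounding, while only the strong potential gives T < 0 and R > 1. The special value eΦ0 = 2E, for which p′ = −p and the denominators vanish, is excluded from this sketch.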
1.4 Dirac equation
As we have seen, relativistic equations seem to imply the presence of both positive as well as negative energy solutions and that quantum mechanically, we need both these solutions to describe a physical system. Furthermore, as we have seen, the Klein-Gordon equation is second order in the time derivatives and this leads to the definition of the probability density which is first order in the time derivative. Together with the negative energy solutions, this implies that the probability density can become negative which is inconsistent with the definition of a probability density. It is clear, therefore, that even if we cannot avoid the negative energy solutions, we can still possibly obtain a consistent probability density provided we have a relativistic equation which is first order in the time derivative just like the time dependent Schrödinger equation. The difference, of course, is that Lorentz invariance (or covariance under Lorentz transformations) would require space and time to be treated on an equal footing and, therefore, such an equation, if we can find it, must be first order in both space and time derivatives. Clearly, this can be done provided we have a linear relation between energy and momentum operators. Let us recall that the Einstein relation gives
$$
E^{2} = \mathbf{p}^{2} + m^{2}. \tag{1.74}
$$
The positive square root of this gives
$$
E = \sqrt{\mathbf{p}^{2} + m^{2}}, \tag{1.75}
$$
which is far from a linear relation.
Although the naive square root of the Einstein relation does not lead to a linear relation between the energy and the momentum variables, a matrix square root may, in fact, lead to such a relation. This is exactly what Dirac proposed. Let us, for example, write the Einstein relation as
$$
p^{2} = p^{\mu} p_{\mu} = E^{2} - \mathbf{p}^{2} = m^{2}. \tag{1.76}
$$
Let us consider this as a matrix relation (namely, an n × n identity matrix multiplying both sides). Let us further assume that there exist four linearly independent n × n matrices $\gamma^{\mu}$, µ = 0, 1, 2, 3, which are space-time independent such that
$$
\slashed{p} = \gamma^{\mu} p_{\mu} \tag{1.77}
$$
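represents the matrix square root of $p^{2}$. If this is true, then, by definition, we have
$$
\slashed{p}\, \slashed{p} = \gamma^{\mu} p_{\mu} \gamma^{\nu} p_{\nu} = \gamma^{\mu} \gamma^{\nu} p_{\mu} p_{\nu} = \frac{1}{2} \left(\gamma^{\mu} \gamma^{\nu} + \gamma^{\nu} \gamma^{\mu}\right) p_{\mu} p_{\nu} = p^{2}\, \mathbb{1}. \tag{1.78}
$$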
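Here $\mathbb{1}$ denotes the identity matrix (in the appropriate matrix space, in this case, n dimensional) and we have used the fact that the matrices, $\gamma^{\mu}$, are constant to move them past the momentum operators. For the relation (1.78) to be true, it is clear that the matrices, $\gamma^{\mu}$, have to satisfy the algebra (µ = 0, 1, 2, 3)
$$
\left[\gamma^{\mu}, \gamma^{\nu}\right]_{+} = \gamma^{\mu} \gamma^{\nu} + \gamma^{\nu} \gamma^{\mu} = 2 \eta^{\mu\nu}\, \mathbb{1}. \tag{1.79}
$$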
Here the brackets with a subscript “+” stand for the anti-commutator of two quantities defined in (1.79) (sometimes it is also denoted by curly brackets which we will not use to avoid confusion with Poisson brackets) and this algebra is known as the Clifford algebra. We see that if we can find a set of four linearly independent constant matrices satisfying the Clifford algebra, then, we can obtain a matrix square root of $p^{2}$ which would be linear in energy and momentum.
Before going into an actual determination of such matrices, let us look at the consequences of such a possibility. In this case, the solutions of the equation (sign of the mass term is irrelevant and the wave function is a matrix in this case)
$$
\slashed{p}\, \psi = \gamma^{\mu} p_{\mu} \psi = m \psi \tag{1.80}
$$
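would automatically satisfy the Einstein relation. Namely,
$$
\slashed{p} \left(\slashed{p}\, \psi\right) = m\, \slashed{p}\, \psi, \quad \text{or,} \quad p^{2} \psi = m^{2} \psi. \tag{1.81}
$$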
Furthermore, since the new equation, (1.80), is linear in the energy and momentum variables, it will, consequently, be linear in the space and time derivatives. This is, of course, what we would like for a consistent definition of the probability density. The equation (1.80) (or its coordinate representation) is known as the Dirac equation.
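To determine the matrices, $\gamma^{\mu}$, and their dimensionality, let us note that the Clifford algebra in (1.79)
$$
\left[\gamma^{\mu}, \gamma^{\nu}\right]_{+} = 2 \eta^{\mu\nu}\, \mathbb{1}, \tag{1.82}
$$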
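can be written out explicitly as
$$
(\gamma^{0})^{2} = \mathbb{1}, \qquad (\gamma^{i})^{2} = -\mathbb{1} \ \ (i \text{ fixed}), \qquad \gamma^{0} \gamma^{i} + \gamma^{i} \gamma^{0} = 0, \qquad \gamma^{i} \gamma^{j} + \gamma^{j} \gamma^{i} = 0, \ \ i \neq j. \tag{1.83}
$$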
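We can choose any one of the matrices to be diagonal and without loss of generality, let us choose
$$
\gamma^{0} = \mathrm{diag}\,(b_{1}, b_{2}, \ldots, b_{n}). \tag{1.84}
$$
From the fact that $(\gamma^{0})^{2} = \mathbb{1}$ we conclude that each of the diagonal elements in $\gamma^{0}$ must be ±1, namely,
$$
b_{\alpha} = \pm 1, \qquad \alpha = 1, 2, \ldots, n. \tag{1.85}
$$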
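Let us next note that using the relations from the Clifford algebra in (1.83), for a fixed i, we obtain
$$
\mathrm{Tr}\, \gamma^{i} \gamma^{0} \gamma^{i} = \mathrm{Tr}\, \gamma^{i} \left(-\gamma^{i} \gamma^{0}\right) = -\mathrm{Tr}\, (\gamma^{i})^{2} \gamma^{0} = \mathrm{Tr}\, \gamma^{0}, \tag{1.86}
$$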
where “Tr” denotes trace over the matrix indices. On the other hand, the cyclicity property of the trace, namely,
$$
\mathrm{Tr}\, ABC = \mathrm{Tr}\, CAB = \mathrm{Tr}\, BCA, \tag{1.87}
$$
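leads to
$$
\mathrm{Tr}\, \gamma^{i} \gamma^{0} \gamma^{i} = \mathrm{Tr}\, (\gamma^{i})^{2} \gamma^{0} = -\mathrm{Tr}\, \gamma^{0}. \tag{1.88}
$$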
Thus, comparing Eqs. (1.86) and (1.88), we obtain
$$
\mathrm{Tr}\, \gamma^{0} = -\mathrm{Tr}\, \gamma^{0} = 0. \tag{1.89}
$$
For this to be true, we conclude that γ0 must have as many diagonal elements with value +1 as with −1. Consequently, the γµ matrices must be even dimensional.
Let us assume that n = 2N. The simplest nontrivial matrix structure would arise for N = 1 when the matrices would be two dimensional (namely, 2 × 2 matrices). We know that the three Pauli matrices along with the identity matrix define a complete basis for 2 × 2 matrices. However, as we know, they do not satisfy the Clifford algebra. Namely, if we define $\sigma^{\mu} = (\mathbb{1}, \boldsymbol{\sigma})$ then,
$$
\left[\sigma^{\mu}, \sigma^{\nu}\right]_{+} \neq 2 \eta^{\mu\nu}\, \mathbb{1}. \tag{1.90}
$$
In fact, we know that in two dimensions, there cannot exist four anti-commuting matrices.
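The next choice is N = 2 for which the matrices will be four dimensional (4 × 4 matrices). In this case, we can find a set of four linearly independent, constant matrices which satisfy the Clifford algebra. A particular choice of these matrices, for example, has the form
$$
\gamma^{0} = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, \qquad \gamma^{i} = \begin{pmatrix} 0 & \sigma_{i} \\ -\sigma_{i} & 0 \end{pmatrix}, \quad i = 1, 2, 3, \tag{1.91}
$$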
where each element of the 4 × 4 matrices represents a 2 × 2 matrix and the σi correspond to the three Pauli matrices. This particular choice of the Dirac matrices is commonly known as the Pauli-Dirac representation.
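It is straightforward to verify the Clifford algebra for this representation by brute force. The following is a minimal sketch in plain Python (nested lists, no external libraries), assuming the standard Pauli matrices and the block forms γ⁰ = diag(1, −1), γⁱ = [[0, σᵢ], [−σᵢ, 0]]; the helper names (`block`, `matmul`, `anticommutator`, `satisfies_clifford`) are introduced here only for illustration.

```python
# Pauli matrices and 2x2 building blocks.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA_DIAG = [1, -1, -1, -1]  # diagonal of the metric, signature (+, -, -, -)


def scale(c, a):
    """Multiply every element of the matrix a by the number c."""
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(a))] for i in range(len(a))]


# gamma^0 and gamma^i in the Pauli-Dirac representation.
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in SIGMA
]


def satisfies_clifford(gammas, tol=1e-12):
    """Check [gamma^mu, gamma^nu]_+ = 2 eta^{mu nu} 1 for all sixteen pairs."""
    n = len(gammas[0])
    for mu in range(4):
        for nu in range(4):
            ac = anticommutator(gammas[mu], gammas[nu])
            for i in range(n):
                for j in range(n):
                    target = 2 * ETA_DIAG[mu] if (mu == nu and i == j) else 0
                    if abs(ac[i][j] - target) > tol:
                        return False
    return True


print(satisfies_clifford(GAMMA))          # the 4x4 Pauli-Dirac matrices
print(satisfies_clifford([I2] + SIGMA))   # the 2x2 set (1, sigma) for comparison
```

The same function applied to the two dimensional set (𝟙, σ) fails the test, in line with the observation that the identity matrix commutes, rather than anti-commutes, with the Pauli matrices.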
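There are, of course, other representations for the $\gamma^{\mu}$ matrices. However, the physics of Dirac equation is independent of any particular representation for the $\gamma^{\mu}$ matrices. This can be easily seen by invoking Pauli’s fundamental theorem which says that if there are two sets of (constant) matrices $\gamma^{\mu}$ and $\gamma'^{\mu}$ satisfying the Clifford algebra, then, they must be related by a similarity transformation. Namely, if
$$
\left[\gamma^{\mu}, \gamma^{\nu}\right]_{+} = 2 \eta^{\mu\nu}\, \mathbb{1}, \qquad \left[\gamma'^{\mu}, \gamma'^{\nu}\right]_{+} = 2 \eta^{\mu\nu}\, \mathbb{1}, \tag{1.92}
$$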
then, there exists a constant, nonsingular matrix S such that (in fact, the similarity transformation is really a unitary transformation if we take the Hermiticity properties of the γ-matrices into account)
$$
\gamma'^{\mu} = S \gamma^{\mu} S^{-1}. \tag{1.93}
$$
Therefore, given the equation
$$
\left(\gamma'^{\mu} p_{\mu} - m\right) \psi' = 0, \tag{1.94}
$$
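we obtain
$$
\left(S \gamma^{\mu} S^{-1} p_{\mu} - m\right) \psi' = 0, \quad \text{or,} \quad S \left(\gamma^{\mu} p_{\mu} - m\right) S^{-1} \psi' = 0, \quad \text{or,} \quad \left(\gamma^{\mu} p_{\mu} - m\right) \psi = 0, \tag{1.95}
$$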
with $\psi = S^{-1} \psi'$. (The matrix $S^{-1}$ can be moved past the momentum operator since it is assumed to be constant.) This shows that different representations of the $\gamma^{\mu}$ matrices are equivalent and merely correspond to a change in the basis of the wave function. As we know, a change of basis does not change physics.
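To obtain the Hamiltonian for the Dirac equation, let us go to the coordinate representation where the Dirac equation (1.80) takes the form (remember ħ = 1)
$$
\left(i \gamma^{\mu} \partial_{\mu} - m\right) \psi = 0, \quad \text{or,} \quad \left(i \gamma^{0} \frac{\partial}{\partial t} + i \boldsymbol{\gamma} \cdot \boldsymbol{\nabla} - m\right) \psi = 0. \tag{1.96}
$$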
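Multiplying with $\gamma^{0}$ from the left and using the fact that $(\gamma^{0})^{2} = \mathbb{1}$ we obtain
$$
i \frac{\partial \psi}{\partial t} = -i \gamma^{0} \boldsymbol{\gamma} \cdot \boldsymbol{\nabla} \psi + m \gamma^{0} \psi. \tag{1.97}
$$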
Conventionally, one denotes
$$
\beta = \gamma^{0}, \qquad \boldsymbol{\alpha} = \gamma^{0} \boldsymbol{\gamma}. \tag{1.98}
$$
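In terms of these matrices, then, we can write (1.97) as
$$
i \frac{\partial \psi}{\partial t} = \left(-i \boldsymbol{\alpha} \cdot \boldsymbol{\nabla} + \beta m\right) \psi = \left(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m\right) \psi. \tag{1.99}
$$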
This is a first order equation (in time derivative) like the Schrödinger equation and we can identify the Hamiltonian for the Dirac equation with (recall the time dependent Schrödinger equation (1.37))
$$
H = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m. \tag{1.100}
$$
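In the particular representation of the $\gamma^{\mu}$ matrices in (1.91), we note that
$$
\beta = \gamma^{0} = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, \qquad \boldsymbol{\alpha} = \gamma^{0} \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}. \tag{1.101}
$$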
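We can now determine either from the definition in (1.98) and (1.79) or from the explicit representation in (1.101) that the matrices α, β satisfy the anti-commutation relations
$$
\left[\alpha_{i}, \alpha_{j}\right]_{+} = 2 \delta_{ij}\, \mathbb{1}, \qquad \left[\alpha_{i}, \beta\right]_{+} = 0, \tag{1.102}
$$
with $\beta^{2} = \mathbb{1}$. We can, of course, directly check from this explicit representation that both β and α are Hermitian matrices. But, independently, we also note from the form of the Hamiltonian in (1.100) that, in order for it to be Hermitian, we must have
$$
\beta^{\dagger} = \beta, \qquad \boldsymbol{\alpha}^{\dagger} = \boldsymbol{\alpha}. \tag{1.103}
$$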
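In terms of the $\gamma^{\mu}$ matrices, this translates to
$$
(\gamma^{0})^{\dagger} = \gamma^{0}, \qquad (\gamma^{i})^{\dagger} = -\gamma^{i}. \tag{1.104}
$$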
Equivalently, we can write
$$
(\gamma^{\mu})^{\dagger} = \gamma^{0} \gamma^{\mu} \gamma^{0}. \tag{1.105}
$$
Namely, independent of the representation, the $\gamma^{\mu}$ matrices must satisfy the Hermiticity properties in (1.105). (With a little more analysis, it can be seen that, in general, the Hermiticity properties of the $\gamma^{\mu}$ matrices are related to the choice of the metric tensor and this particular choice is associated with the Bjorken-Drell metric.) In the next chapter, we will study the plane wave solutions of the first order Dirac equation.
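The Hermiticity properties discussed in this section can be checked explicitly in the Pauli-Dirac representation. The sketch below is self-contained plain Python and assumes β = γ⁰, αᵢ = γ⁰γⁱ together with the block forms of the γ matrices in that representation; the helper names (`dagger`, `close`, and so on) are illustrative only.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate: transpose combined with complex conjugation."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(n) for j in range(n))


GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in SIGMA
]
BETA = GAMMA[0]
ALPHA = [matmul(GAMMA[0], g) for g in GAMMA[1:]]  # alpha_i = gamma^0 gamma^i

# beta and alpha_i Hermitian, as required for a Hermitian Hamiltonian.
hamiltonian_hermitian = close(dagger(BETA), BETA) and all(
    close(dagger(a), a) for a in ALPHA
)
# gamma^i anti-Hermitian.
space_gammas_anti_hermitian = all(close(dagger(g), scale(-1, g)) for g in GAMMA[1:])
# Compact form: (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0 for all mu.
compact_rule = all(
    close(dagger(g), matmul(matmul(GAMMA[0], g), GAMMA[0])) for g in GAMMA
)

print(hamiltonian_hermitian, space_gammas_anti_hermitian, compact_rule)
```

All three checks hold in this representation, and since other representations are related to it by a unitary similarity transformation, the compact rule carries over unchanged.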
1.5 References
The material presented in this chapter is covered in many standard textbooks and we list below only a few of them.
1. J. D. Bjorken and S. Drell, Relativistic Quantum Mechanics, McGraw-Hill, New York (1964).
2. A. Das, Lectures on Quantum Mechanics (second edition), Hindustan Publishing, India and World Scientific, Singapore (2011).
3. C. Itzykson and J.-B. Zuber, Quantum Field Theory, McGraw-Hill, New York (1980).
4. L. I. Schiff, Quantum Mechanics, McGraw-Hill, New York (1968).
